# How to Normalize Historical Data for Splits, Dividends, Etc.

If you’re using historical data to backtest trading algorithms, it is important to transform your data from actual to effective historical prices. Otherwise your algorithm may, for example, interpret a 2:1 stock split as a 50% overnight drop.

Luckily, if you’re using Yahoo! Finance as your source for historical data, it provides an easy way to account for such issues. Yahoo has a short write-up about this, though in my opinion it falls a bit short, so I’ll provide an example here.

To normalize the data, simply take the ratio of the stock’s “Adj. Close” to its “Close” for each row and multiply all of that row’s raw prices (Open, High, Low, and Close) by it.

For example, consider the following output from Yahoo’s historical database:

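Table 1

| Date | Open | High | Low | Close | Volume | Adj. Close |
|---|---|---|---|---|---|---|
| 4-Dec-06 | 30.40 | 31.12 | 30.24 | 30.84 | 1,455,900 | 30.84 |
| 1-Dec-06 | 30.36 | 30.94 | 30.00 | 30.36 | 1,503,700 | 30.36 |
| 1-Dec-06 | 3:2 Stock Split | | | | | |
| 30-Nov-06 | 45.51 | 45.56 | 44.96 | 45.47 | 1,155,800 | 30.31 |
| 29-Nov-06 | 45.70 | 46.54 | 45.61 | 46.51 | 1,381,600 | 31.01 |
| 29-Nov-06 | \$0.09 Dividend | | | | | |
| 28-Nov-06 | 46.50 | 46.51 | 45.54 | 45.60 | 3,001,500 | 30.31 |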

If we multiply each price in each row (everything except volume) by the ratio of that row’s Adj. Close to its Close, we get the effective historical prices:

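Table 2

| Date | Open | High | Low | Close | Volume | Adj. Close |
|---|---|---|---|---|---|---|
| 4-Dec-06 | 30.40 | 31.12 | 30.24 | 30.84 | 1,455,900 | 30.84 |
| 1-Dec-06 | 30.36 | 30.94 | 30.00 | 30.36 | 1,503,700 | 30.36 |
| 30-Nov-06 | 30.34 | 30.37 | 29.97 | 30.31 | 1,155,800 | 30.31 |
| 29-Nov-06 | 30.47 | 31.03 | 30.41 | 31.01 | 1,381,600 | 31.01 |
| 28-Nov-06 | 30.91 | 30.92 | 30.27 | 30.31 | 3,001,500 | 30.31 |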

For example, we compute the effective Open of the 30-Nov-06 row by multiplying the original value, 45.51, by 30.31 / 45.47 resulting in 30.34.
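The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the ratio method described above, not the Matlab script mentioned in the comments; the function name `adjust_bar` and the rounding to two decimals are assumptions made for the example.

```python
def adjust_bar(open_, high, low, close, adj_close):
    """Scale a row's raw prices by that row's Adj. Close / Close ratio.

    Hypothetical helper for illustration; volume is deliberately left alone.
    """
    ratio = adj_close / close
    # Round to cents so the result is comparable with Table 2.
    return tuple(round(price * ratio, 2) for price in (open_, high, low, close))


# The 30-Nov-06 row from Table 1: Open, High, Low, Close, Adj. Close
adjusted = adjust_bar(45.51, 45.56, 44.96, 45.47, 30.31)
print(adjusted)  # (30.34, 30.37, 29.97, 30.31)
```

Note that each row uses its own ratio, so the ratio has to be recomputed per row rather than once for the whole series.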

## 22 thoughts on “How to Normalize Historical Data for Splits, Dividends, Etc.”

1. Ran says:

Hello,

I wonder why you don’t adjust the volume as well.
Looking at your example, on 28-Nov-2006 the volume was 3 million shares. At a share price that ranged from 45.5 to 46.5, this means that ~137M\$ were traded on this stock that day.

After the adjustment you lose this information; it seems like only 90M\$ were traded that day.

Or should volume be adjusted as well ???

With respect to Monthly and Weekly historicals, I am running into the issue of how to normalize them. Depending on the type of split and whether the open was higher or lower than the close (1:2 vs. 2:1, for example), if you simply adjust for the ratio of the Adjusted Close to the Close, the weekly/monthly high or low might not actually be the real high or low. I was trying to do this with AIG monthly/weekly data for my self-built stock app and discovered that the simple method applied to daily data didn’t work.

Would it be, in your opinion, simpler to adjust only the daily data and, as my app runs through the values to calculate the indicators/averages, aggregate the volumes and record monthly/weekly open/high/low/close from the adjusted daily data, rather than coming up with some sort of scheme to adjust the weekly/monthly data from Yahoo?

I think I see what you’re talking about. Yes, it sounds like it would be best to make the adjustments yourself on the daily data first and then compute your own monthly / weekly highs and lows from that.

One other thing – as pointed out by Ran in the other comment above – my script adjusts prices only, not volume. I don’t use volume but if you do you might want some additional split logic to adjust it as well.

– lumi
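A rough sketch of that aggregation, assuming the daily rows have already been adjusted (the sample rows are taken from Table 2). The function name `aggregate_bars`, the tuple layout, and the `period_key` parameter are all made up for this example.

```python
from datetime import date
from itertools import groupby


def aggregate_bars(daily, period_key=lambda d: (d.year, d.month)):
    """Build monthly (default) bars from adjusted daily rows.

    daily: iterable of (date, open, high, low, close, volume) in any order.
    period_key: maps a date to the period it belongs to.
    """
    rows = sorted(daily, key=lambda r: r[0])  # oldest first
    bars = []
    for period, group in groupby(rows, key=lambda r: period_key(r[0])):
        g = list(group)
        bars.append({
            "period": period,
            "open": g[0][1],                 # first open of the period
            "high": max(r[2] for r in g),    # highest adjusted high
            "low": min(r[3] for r in g),     # lowest adjusted low
            "close": g[-1][4],               # last close of the period
            "volume": sum(r[5] for r in g),  # raw volume, not split-adjusted
        })
    return bars


# Adjusted daily rows from Table 2
daily = [
    (date(2006, 12, 4), 30.40, 31.12, 30.24, 30.84, 1455900),
    (date(2006, 12, 1), 30.36, 30.94, 30.00, 30.36, 1503700),
    (date(2006, 11, 30), 30.34, 30.37, 29.97, 30.31, 1155800),
    (date(2006, 11, 29), 30.47, 31.03, 30.41, 31.01, 1381600),
    (date(2006, 11, 28), 30.91, 30.92, 30.27, 30.31, 3001500),
]

monthly = aggregate_bars(daily)
# Weekly bars: group by ISO (year, week number) instead of (year, month).
weekly = aggregate_bars(daily, lambda d: tuple(d.isocalendar())[:2])
```

Because the highs and lows are taken after the per-day adjustment, a split in the middle of a week or month cannot distort the period's range.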

4. Joe says:

Hi there,

I have used your formula to normalise the data from yahoo for JPM and have used Excel to do this.

However when I compare my JPM chart to the yahoo interactive chart, I find some discrepancies. For example, on 8/10/1998, my chart shows a low of \$16.533, whereas on the yahoo interactive chart JPM does not even close below \$25 in 1998.

Do you know why this is?
Thanks

5. Hi Joe,

First off, I think one potential discrepancy is that the Yahoo Finance chart only adjusts for splits, whereas when you use “adjusted close” you’re normalizing for both splits and dividends.

That being said, I have a Matlab script to do this automatically (click here to get it) and I don’t get values close to \$16.533 either way.

The raw Yahoo historical data (un-normalized) shows a range for the day of \$67.75 to \$69.31. The Yahoo Finance chart for that day shows \$45.667. This is because they’re adjusting the average daily price of about \$68.50 by 2/3 for the split that happened on June 12, 2000.

When I run the Matlab script I get a high and low of \$31.93 and \$31.21 respectively, which is including the effects of splits and dividends since 8/10/98. Your number seems to be half that – so it’s almost like the normalization is happening twice in your code.

– Lumi

6. marzieh says:

hi, i have a big problem, i want to use anfis to predict the future but i couldn’t enter my time series data into matlab, how can i import data into matlab? i don’t know how i can save data and then load it to use…please help me, thanks

7. Valeri says:

Just a small comment. Volume is “the number of shares or contracts traded in a security or an entire market during a given period of time” (from Wikipedia). So volume doesn’t need to be adjusted by the closeadj/close ratio.

8. Jayant Kalghatgi says:

Hi,
Please let me know any database where I could get adjusted closing price for stocks listed on BSE India.

9. Andrey says:

I think the volume needs to be adjusted. Imagine you had volumes and close prices like this:

10, 12.4
10, 12.4
— split x2 here —
20, 6.2

After adjusting the prices only, you get:

10, 6.2
10, 6.2
20, 6.2

However, the volume is now incorrect if you want to get the currency amount traded from it (volume x price).

10. hi andrey – there has been some discussion of this before. you’re right – adjust for volume if you want currency amount traded. i think most people are just looking to do some forecasting of future price per share so they don’t bother with the volume. thanks!
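For anyone who does want the currency amount to survive the adjustment, here is a minimal sketch using Andrey's numbers. It handles a pure split only; the function name `split_adjust` and the convention that `split_factor` means new shares per old share are assumptions for the example.

```python
import math


def split_adjust(volume, close, split_factor):
    """Restate a pre-split bar in post-split share units.

    split_factor: new shares per old share (2.0 for a 2:1 split).
    Volume is scaled up and price scaled down, so volume x price is unchanged.
    """
    return volume * split_factor, close / split_factor


# Andrey's two pre-split bars as (volume, close), followed by a 2:1 split
raw = [(10, 12.4), (10, 12.4)]
adjusted = [split_adjust(v, c, 2.0) for v, c in raw]
print(adjusted)  # [(20.0, 6.2), (20.0, 6.2)]

# The currency amount traded is the same before and after the adjustment.
for (v, c), (av, ac) in zip(raw, adjusted):
    assert math.isclose(v * c, av * ac)
```

If dividends are folded into the price ratio as well, no share-count scaling preserves the traded amount exactly, so this only applies to split-only adjustments.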

11. Jose says:

Hi!

Apologies for this question:
Yahoo:
http://finance.yahoo.com/q/hp?s=IBM&a=08&b=2&c=1985&d=09&e=2&f=1985&g=d
I have:
26-set-85 123.75 123.88 122.25 123.75 8500000 17.47
If I do what you say, I have:
26-set-85 17.47 17.4883 17.2582 17.47 8500000 17.47

But if I am going to Google:
I have:
Sep 26, 1985 0.00 30.97 30.56 30.94 8,502,800
First, I do not have Open, but if I understand correctly the rest of the values are adjusted.

So, why the big difference?
Which is correct?

Thanks
Jose

12. Andrew says:

Good article and good comments. I agree with adjusting for splits, but I’m wondering about backtesting on dividend adjusted data. Seems to me, past prices, peaks, averages etc. all become inaccurate if you use dividend adjusted data. As an extreme example, the average of the last 200 monthly closing prices for GE is 29.87 (as of 5/6/14). For the dividend adjusted close, it’s 22.29. If I was looking for an algorithm using the 200 month average, wouldn’t I need to use split adjusted prices and account for dividends some other way?

Andrew

13. Jose – if memory serves, Google Finance adjusts only for splits (usually, but not always in my experience) whereas Yahoo adjusts for splits and dividends.

Andrew – yes, for 200 day you would want to adjust for splits only. I once wrote a script that did this. What I did was first pull the unadjusted prices from Yahoo Finance as usual, but then pulled the split information from the Basic Chart page for the stock and used that to do the splits. If you go to the Basic Chart page at Yahoo Finance for any stock, the split information is just below the chart, below volume. For example, for IBM I see:

Splits: 1964-05-18 [5:4], 1966-05-18 [3:2], 1968-04-23 [2:1], May 29, 1973 [5:4], Jun 1, 1979 [4:1], May 28, 1997 [2:1], May 27, 1999 [2:1]

on the page here:

http://finance.yahoo.com/q/bc?s=IBM+Basic+Chart

One note: the split information is not perfect. I’ve come across occasional errors.
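A sketch of what a split-only adjustment could look like, using the IBM split list quoted above. This is not the script mentioned in the comment; the names `IBM_SPLITS`, `split_factor`, and `split_adjust_price` are invented here, and the code assumes each listed date is the first day that trades at the post-split price (as with the 1-Dec-06 split in Table 1).

```python
from datetime import date
from fractions import Fraction

# IBM splits as listed on the Basic Chart page: (date, new shares per old share)
IBM_SPLITS = [
    (date(1964, 5, 18), Fraction(5, 4)),
    (date(1966, 5, 18), Fraction(3, 2)),
    (date(1968, 4, 23), Fraction(2, 1)),
    (date(1973, 5, 29), Fraction(5, 4)),
    (date(1979, 6, 1), Fraction(4, 1)),
    (date(1997, 5, 28), Fraction(2, 1)),
    (date(1999, 5, 27), Fraction(2, 1)),
]


def split_factor(bar_date, splits):
    """Cumulative price multiplier for a bar: the product of old/new
    over every split that took effect after bar_date."""
    factor = Fraction(1)
    for split_date, ratio in splits:
        if bar_date < split_date:  # assumption: split date is already post-split
            factor /= ratio
    return factor


def split_adjust_price(price, bar_date, splits):
    """Apply the cumulative split factor to one raw price (dividends ignored)."""
    return price * float(split_factor(bar_date, splits))


# IBM raw close on 26-Sep-85 from Jose's comment: only the 1997 and 1999
# splits came later, so the factor is 1/4.
print(split_adjust_price(123.75, date(1985, 9, 26), IBM_SPLITS))  # 30.9375
```

The 30.9375 result lines up with the 30.94 close Jose saw on Google, which is consistent with Google adjusting that series for splits only.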

14. Andrew says:

Lumi~

Many thanks for your answer. I didn’t know about the split information on the yahoo basic chart. I was guessing the splits from the yahoo prices and the dividend info, but that is tricky and can be error prone especially when prices get low. Now I can try using the split data off the basic chart and cross check it with my guessing logic to detect when the split data is imperfect.

Thanks again

Andrew

15. Ravi Reddy says:

i want to know what your data normalization is based on ..?
i mean open data or close data.. what is the scheme for data normalization, and what exactly do you get after classification ..?

16. hi ravi – the normalization is so that you get total return (dividends + price appreciation) over a given time period. if you just used “open” or “close” prices to compute return w/o the normalization, you miss the dividends and also might see a 50% drop in price which was actually due to a stock split.

in practice you can just compare two “Adjusted Close” prices from Yahoo Finance to get total return – but some people are downloading the historical data to do price charts, compute technical indicators, etc. so the script puts the effects of dividends and stock splits into all prices (high, low, open, close).
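as a quick illustration of that comparison, here is a small sketch using the 28-Nov-06 and 4-Dec-06 rows from Table 1. the function name `simple_return` is just for this example.

```python
def simple_return(start_price, end_price):
    """Fractional change from start_price to end_price."""
    return end_price / start_price - 1.0


# 28-Nov-06 -> 4-Dec-06, values from Table 1
total = simple_return(30.31, 30.84)  # Adj. Close: splits and dividends included
naive = simple_return(45.60, 30.84)  # raw Close: distorted by the 3:2 split

print(f"{total:.2%}")  # 1.75%
print(f"{naive:.2%}")  # -32.37%
```

the raw closes suggest a loss of about a third over four trading days, which is just the 3:2 split showing up as a price drop.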