How to Download Historical Stock Data into Matlab
Sunday, November 26th, 2006
Note: Please read the disclaimer. The author is not providing professional investing advice or recommendations.
There is a matlab script that goes with this article: get_hist_stock_data.m
Many of us in the engineering field use Matlab for our daily university or professional work. It’s a fast and easy tool for algorithm development and simulations of all sorts. And often, after becoming proficient in Matlab, and having used it to solve many difficult problems, the Big Idea™ dawns that goes something like this:
With all my background and experience in mathematics, programming, and problem solving, is there any reason that this skill set I’m drawing on right now to design a robust and efficient wireless communication system couldn’t also be used to develop a robust and efficient stock trading algorithm?
After all, isn’t a lot of engineering about recognizing and exploiting patterns or trends - or extracting meaningful information from signals corrupted with noise?
You can see it all now. Code up a trading algorithm, simulate it over historical stock data, tweak, adjust, and repeat until you have a system that would have historically delivered huge returns. Then unleash it upon an unsuspecting Wall Street…
… and pray that the future tracks the past…
So the first question is where to get historical stock data for your backtesting? I had seen Compustat mentioned as the data source in many investing books so I contacted them for a quote. Let’s just say that after paying for the license I wouldn’t have had any money left over to actually invest. As S&P kindly told me, it’s really for institutions, not private investors.
But it turns out there is a fairly good free database via Yahoo! Finance’s Historical Portal. It only has open, high, low, close, and volume (plus an adjusted close) - and it only has data for businesses that are still alive (survivorship bias) - but it’s at least a step in the right direction!
And depending upon the algorithm you have in mind, these historical prices and volume may be all you’re really after anyway. So the next step is how to easily get the data into Matlab.
The steps for downloading historical stock data from Yahoo! Finance are already available at the MathWorks site. But I’ll expand on those steps a bit.
It is easy to read historical data into Matlab because of Matlab’s Java interface. The calls involved are ordinary Java classes, so for those who wish to use Java instead of Matlab, the steps are very similar.
Step 1 is to create the proper URL name for the stock and time period you are after. This takes a little deciphering of the URL syntax used by Yahoo! for historical data - which you can do by clicking on various historical data options and noting what appears in the URL address. Here’s an example URL for some historical data (daily) in CSV format for Apple Computer, from November 15, 2005 through February 17, 2006:
http://ichart.finance.yahoo.com/table.csv?s=AAPL&a=10&b=15&c=2005&d=01&e=17&f=2006&g=d&ignore=.csv
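You’ll find that after s= comes the ticker symbol, after a= the start month (minus 1, so 10 means November), after b= the start day of the month, and after c= the start year. The end date follows the same pattern: d= is the end month (minus 1, so 01 means February), e= the end day, and f= the end year. The final g= parameter lets you choose between getting historical stock information on a daily, weekly, or monthly basis; g=d in the example requests daily data.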
So the first step is to create a character string with the desired URL name as described above, saved in a variable called, say, url_name:
url_name = ['http://ichart.finance.yahoo.com/table.csv?' ...
            's=AAPL&a=10&b=15&c=2005&d=01&e=17&f=2006&g=d&ignore=.csv'];
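Rather than hard-coding the string, the URL can be assembled from its parts. The following is only a sketch: the variable names ticker, start_date, end_date, and interval are my own choices for illustration, and the parameter layout is simply the one deciphered from the example URL above.

```matlab
% Sketch: build the historical-data URL from a ticker and two dates.
% Variable names here are illustrative, not part of any Yahoo! or Matlab API.
ticker     = 'AAPL';
start_date = [2005 11 15];   % [year month day]
end_date   = [2006  2 17];   % [year month day]
interval   = 'd';            % 'd' = daily, as in the example URL

% Months are sent as (month - 1), zero-padded to two digits as in the example.
url_name = sprintf(['http://ichart.finance.yahoo.com/table.csv?' ...
    's=%s&a=%02d&b=%d&c=%d&d=%02d&e=%d&f=%d&g=%s&ignore=.csv'], ...
    ticker, ...
    start_date(2) - 1, start_date(3), start_date(1), ...
    end_date(2) - 1,   end_date(3),   end_date(1), ...
    interval);

disp(url_name)
```

With the values shown, this should reproduce the example URL for Apple exactly, which makes a handy sanity check before changing the ticker or dates.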
Then we set up a buffer for reading from the URL:
buff_reader = java.io.BufferedReader(...
    java.io.InputStreamReader(openStream(java.net.URL(url_name))));
We then use this buffer to read the first line of the file (the header) and discard it:
dummy = readLine(buff_reader);
Next, inside a while loop, use the command below to read the file one line at a time, storing each line as a character string:
char_string = char(readLine(buff_reader));
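Here is one way the read loop might look. It is a sketch under some assumptions: to keep it runnable without a network connection, a java.io.StringReader holding a few made-up lines stands in for the URL stream, and the column order (Date, Open, High, Low, Close, Volume, Adj Close) is assumed, so check it against the real header line. The names sample, dates, prices, and next_line are illustrative. It uses str2double rather than str2num, since str2num evaluates its input as Matlab code.

```matlab
% Stand-in for the URL stream: made-up numbers, assumed column order.
sample = sprintf(['Date,Open,High,Low,Close,Volume,Adj Close\n' ...
                  '2006-02-17,70.00,71.00,69.50,70.50,1000000,35.25\n' ...
                  '2006-02-16,69.00,70.25,68.75,70.00,1200000,35.00\n']);
buff_reader = java.io.BufferedReader(java.io.StringReader(sample));

dummy  = readLine(buff_reader);   % read and discard the header line
dates  = {};                      % one date string per row
prices = zeros(0, 6);             % Open High Low Close Volume AdjClose

next_line = readLine(buff_reader);
while ~isempty(next_line)         % Java null comes back as [] at end of stream
    char_string = char(next_line);
    fields = strsplit(char_string, ',');           % split on commas
    dates{end+1, 1}  = fields{1};                  % keep the date as text
    prices(end+1, :) = str2double(fields(2:end));  % numeric columns
    next_line = readLine(buff_reader);
end
buff_reader.close();
```

To read live data, the StringReader line would be replaced by the InputStreamReader construction shown earlier; the loop itself stays the same. Growing the arrays row by row is fine for a few thousand rows, though preallocating is faster for large files.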
You’ll need to parse char_string after each read to extract the stock’s high, low, open, close, etc., and convert the price strings to doubles with str2double (str2num also works, but it evaluates its input as Matlab code, so str2double is the safer choice). Voila - you’re ready to develop your first simulation! Don’t forget to use the ratio of Adjusted Close to Close to normalize your data first; otherwise a 2:1 split could be seen by your algorithm as a 50% price drop, and so on.
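To make the normalization concrete, here is a small sketch with made-up prices around a hypothetical 2:1 split. The variable names are my own, and the approach (scaling every raw price by the ratio of Adjusted Close to Close for that day) is one common way to do it, not the only one. Keep in mind that the adjusted close typically folds in dividends as well as splits.

```matlab
% Made-up prices around a hypothetical 2:1 split (rows oldest to newest).
close_raw = [100; 102; 51; 52];        % raw Close; the split happens after day 2
adj_close = [ 50;  51; 51; 52];        % Adjusted Close for the same days
open_raw  = [ 99; 101; 50.5; 51.5];    % raw Open for the same days

% Per-day scaling factor: 0.5 before the split, 1 afterwards.
adj_factor = adj_close ./ close_raw;

% Apply the same factor to the other price columns.
open_adj = open_raw .* adj_factor;

% Day-over-day returns with and without the adjustment.
raw_returns = diff(close_raw) ./ close_raw(1:end-1);
adj_returns = diff(adj_close) ./ adj_close(1:end-1);

disp([raw_returns adj_returns])
```

On the split day the raw series shows a 50% drop while the adjusted series shows no change, which is exactly the artifact the normalization is meant to remove.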
Never forget that not only is past performance no guarantee of future results, it might have no correlation whatsoever! But if you do develop something that appears to work year in and year out, through expansion and recession, good times and bad, such as Joel Greenblatt’s Magic Formula, you might just be on to something.