How to Download Historical Stock Data into Matlab

November 26, 2006 · Filed Under Stock Market 

Note: Please read the disclaimer. The author is not providing professional investing advice or recommendations.

If you’ve arrived at this link just wanting to download a copy of my Matlab function that retrieves historical stock data and normalizes for splits & dividends, click here: get_hist_stock_data.m

If you want to understand a little more of the particulars, read on…

Many of us in the engineering field use Matlab for our daily university or professional work. It’s a fast and easy tool for algorithm development and simulations of all sorts. And often, after becoming proficient in Matlab, and having used it to solve many difficult problems, the Big Idea™ dawns that goes something like this:

With all my background and experience in mathematics, programming, and problem solving, is there any reason that this skill set I’m drawing on right now to design a robust and efficient wireless communication system couldn’t also be used to develop a robust and efficient stock trading algorithm?

After all, isn’t a lot of engineering about recognizing and exploiting patterns or trends – or extracting meaningful information from signals corrupted with noise?

You can see it all now. Code up a trading algorithm, simulate it over historical stock data, tweak, adjust, and repeat until you have a system that would have historically delivered huge returns. Then unleash it upon an unsuspecting Wall Street…

… and pray that the future tracks the past…

So the first question is where to get historical stock data for your backtesting? I had seen Compustat mentioned as the data source in many investing books so I contacted them for a quote. Let’s just say that after paying for the license I wouldn’t have had any money left over to actually invest. As S&P kindly told me, it’s really for institutions, not private investors.

But it turns out there is a fairly good free database via Yahoo! Finance’s Historical Portal. It only has high, low, open, close, adjusted close, and volume, and it only has data for businesses that are still alive (survivorship bias), but it’s at least a step in the right direction!

And depending upon the algorithm you have in mind, these historical prices and volume may be all you’re really after anyway. So the next step is how to easily get the data into Matlab.

The steps for downloading historical stock data from Yahoo! Finance are already available at the MathWorks site. But I’ll expand on those steps a bit.

It is easy to read historical data into Matlab because of Matlab’s Java interface, so the steps are very similar for those who wish to use Java instead of Matlab.

Step 1 is to create the proper URL name for the stock and time period you are after. This takes a little deciphering of the URL syntax used by Yahoo! for historical data – which you can do by clicking on various historical data options and noting what appears in the URL address. Here’s an example URL for some historical data (daily) in CSV format for Apple Computer, from November 15, 2005 through February 17, 2006:

http://ichart.finance.yahoo.com/table.csv?s=AAPL&a=10&b=15&c=2005&d=01&e=17&f=2006&g=d&ignore=.csv

You’ll find that after s= comes the ticker symbol, after a= the start month (minus 1), after b= the start day of the month, and after c= the start year. The d=, e=, and f= parameters give the end month (minus 1), end day, and end year in the same way. The final g= parameter lets you choose between getting historical stock information on a daily (d), weekly (w), or monthly (m) basis.

So the first step is to create a character string with the desired URL name as described above, saved in a variable called, say, url_name:

url_name = ['http://ichart.finance.yahoo.com/table.csv?' ...
    's=AAPL&a=10&b=15&c=2005&d=01&e=17&f=2006&g=d&ignore=.csv'];
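
Rather than hard-coding the string, the same URL could be assembled from a ticker and two dates. The following is only a sketch: the variable names (ticker, start_date, end_date, interval) are placeholders chosen for illustration and are not taken from get_hist_stock_data.m, and it assumes the parameter layout described above, with months counted from zero.

```matlab
% Illustrative inputs (hypothetical names, not from get_hist_stock_data.m)
ticker     = 'AAPL';
start_date = [2005 11 15];   % [year month day]
end_date   = [2006  2 17];   % [year month day]
interval   = 'd';            % 'd' as in the example URL above

% Yahoo! counts months from zero, hence the "- 1" on both month fields.
url_name = sprintf(['http://ichart.finance.yahoo.com/table.csv?' ...
    's=%s&a=%02d&b=%d&c=%d&d=%02d&e=%d&f=%d&g=%s&ignore=.csv'], ...
    ticker, ...
    start_date(2) - 1, start_date(3), start_date(1), ...
    end_date(2) - 1,   end_date(3),   end_date(1), ...
    interval);

disp(url_name)
```

With these inputs the result should match the example URL for Apple shown earlier, which makes a convenient sanity check before trying other tickers or date ranges.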

Then we set up a buffer for reading from the URL:

buff_reader = java.io.BufferedReader(...
    java.io.InputStreamReader(openStream(java.net.URL(url_name))));

We then use this buffer initially to read the first line of the file (the header) and discard it:

dummy = readLine(buff_reader);

Now we will set up a while loop and use the command below to read one line of the file at a time and store it in a character string:

char_string = char(readLine(buff_reader));
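
One way the while loop might be arranged is sketched below. To keep it runnable without a network connection, it reads from a java.io.StringReader wrapped around a few made-up CSV lines (the sample_csv text, its column header, and its prices are invented for illustration); with a live connection, the buff_reader created in the step above would take its place. The sketch assumes readLine hands back an empty numeric [] once the stream is exhausted, which is how Matlab represents a Java null.

```matlab
% Made-up sample data standing in for the downloaded file (illustration only)
sample_csv = sprintf(['Date,Open,High,Low,Close,Volume,Adj Close\n' ...
    '2006-02-17,70.00,71.00,69.00,70.50,1000,70.50\n' ...
    '2006-02-16,69.00,70.50,68.50,70.00,1200,70.00\n']);
buff_reader = java.io.BufferedReader(java.io.StringReader(sample_csv));

dummy = readLine(buff_reader);            % discard the header line

csv_lines = {};                           % one cell per data row
next_line = readLine(buff_reader);
while ~isnumeric(next_line)               % [] (numeric) marks end of stream
    csv_lines{end+1} = char(next_line);   %#ok<AGROW>
    next_line = readLine(buff_reader);
end
buff_reader.close();

fprintf('Read %d data lines\n', numel(csv_lines));
```

Testing for a numeric value rather than an empty one is a deliberate choice here: a blank line in the file would also be empty, and it should not end the loop early.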

You’ll need to parse char_string after each read to extract the stock’s high, low, open, close, etc. and use str2num to convert the price strings to doubles. Voilà – you’re ready to develop your first simulation! Don’t forget to use the relationship between Adjusted Close and Close to first normalize your data – otherwise 2:1 splits could be seen by your algorithm as 50% price drops and so on.
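
The parsing and the split adjustment could look roughly like this. It is a sketch under two assumptions that the text above does not spell out: that each row is laid out as Date, Open, High, Low, Close, Volume, Adj Close, and that scaling every raw price by the ratio Adj Close / Close is an acceptable normalization. The two rows are invented to mimic a 2:1 split, and str2double is used in place of str2num simply because it accepts a whole cell array of strings at once.

```matlab
% Two invented rows, newest first, straddling a hypothetical 2:1 split
rows = {'2006-02-17,50.00,51.00,49.00,50.00,2000,50.00', ...
        '2006-02-16,99.00,102.00,98.00,100.00,1000,50.00'};

n         = numel(rows);
hist_date = cell(n, 1);
prices    = zeros(n, 6);   % columns: Open High Low Close Volume AdjClose

for k = 1:n
    fields       = strsplit(rows{k}, ',');      % split the line on commas
    hist_date{k} = fields{1};                   % keep the date as text
    prices(k, :) = str2double(fields(2:7));     % convert the numeric fields
end

hist_open  = prices(:, 1);
hist_high  = prices(:, 2);
hist_low   = prices(:, 3);
hist_close = prices(:, 4);
adj_close  = prices(:, 6);

% Ratio of Adjusted Close to Close: 1 after the split, 0.5 before it
adj_factor = adj_close ./ hist_close;

% Scale every raw price so the whole series is on the post-split basis
norm_open  = hist_open  .* adj_factor;
norm_high  = hist_high  .* adj_factor;
norm_low   = hist_low   .* adj_factor;
norm_close = hist_close .* adj_factor;

disp([hist_close norm_close])
```

In this toy data the raw close falls from 100 to 50 across the split, while the normalized close stays at 50 on both days, so a backtest would no longer see a phantom 50% drop.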

And in case you missed it at the beginning of the article, here is a link again to my own function that puts all this together: get_hist_stock_data.m.

Never forget that not only is past performance no guarantee of future results, it might have no correlation whatsoever! But if you do develop something that appears to work year in and year out, through expansion and recession, good times and bad, such as Joel Greenblatt’s Magic Formula, you might just be on to something.


Comments

24 Responses to “How to Download Historical Stock Data into Matlab”

  1. Mr. B on August 20th, 2007 9:05 pm

    Hi,

    I have modified this Matlab code to download some ocean wave data from the web. Works very nice.

    Thanks.

  2. Lumilog on August 21st, 2007 5:46 pm

    Great!

    I read somewhere that researchers found a correlation between stock prices and the production levels of butter in Bangladesh. Maybe we should also incorporate ocean wave and sunspot data to build a better trading algorithm… :)

  3. Agu on November 25th, 2007 12:40 pm

    Hi,

    coool ! your tips on taking data from net to matlab.
    I am also an engineer with interests in Finance. Your
    attitude to share your CFA efforts, along the way, its
    amazing ! Keep it Up ! If you are interested, lets keep
    in touch, as i feel there’s lots in common in terms of
    approach and thoughts (from what i read in your blog).
    Anyways, good luck buddy.

  4. abdi on December 8th, 2007 5:50 pm

    Google finance has historical stock data

  5. Zuio on December 10th, 2008 8:30 am

    Thanks for sharing your code.

    I wonder how to best build historical portfolio data. I observed that one stock may have gaps or doubles (latter with google data). Cleaning then becomes an issue. My flow would be like this:
    - get the historical data from symbols in a list
    - set up empty database with working days
    - fill each day for every symbol
    - look for anomalies like spikes, gaps etc. – fill etc.

    Any ideas?

  6. Lumilog on December 10th, 2008 11:07 am

    Hi Zuio,

    You could look for anomalies and adjust for them but so far I’ve had good luck just using Yahoo’s database rather than Google’s b/c it has an “adjusted close” column that allows you to automate the accounting for splits & dividends (using ratio of “adjusted close” to “close”).

    - Lumi

  7. Zuio on December 10th, 2008 12:16 pm

    Hi Lumi,

    I am interested in setting up a simple Matlab code for a portfolio strategy. The proper alignment of cleaned time series is then mandatory. Any help is welcome and I would like to share results.

  8. Lumilog on December 12th, 2008 12:21 pm

    Hi Zuio,

    Here is a post where I talk about this:

    http://luminouslogic.com/how-to-normalize-historical-data-for-splits-dividends-etc.htm

    I haven’t done any time-series stuff in a while (hope to get back into it one day) but I wish you luck!

    - Lumi

  9. Istvan on March 21st, 2009 2:22 pm

    Hi, could you help me with how to download data into Matlab from a url such as this one: http://stooq.com/q/d/l/?s=zn.f&i=d

    I would like to use Matlab to save csv file, but Matlab function urlwrite does not work on it.
    Thanks
    Istvan

  10. Lumilog on March 22nd, 2009 11:37 am

    Istvan,

    I have very little time at the moment – I’ll put this on my To Do list and try to get back to you on this soon. It looks like we can use a similar script for stooq as was used for yahoo, but leaving out volume.

    Hope to be back in touch with you soon,
    -lumi

    UPDATE: I tried but it does not appear possible to access the CSV info from Stooq in Matlab. Sorry.

  11. pacca on April 14th, 2009 9:02 am

    do u know if there is a tool to download in Matlab … Google Finance Data ?

  12. Lumilog on April 14th, 2009 10:35 am

    I know there is (or was) a toolbox you could buy from Mathworks to download stock info from yahoo – maybe they have one for google.

    However, what particular elements are you after (historical prices, P/E, etc.)? I have a few different Matlab scripts I use to grab info from various financial sites. I could probably modify one to get whatever you want from google.

  13. pacca on April 15th, 2009 11:33 am

    thanks a lot Lumilog.
    I need Hist Prices …

    the other info are not a must at the moment.

    thx
    Pacca

  14. Lumilog on April 15th, 2009 9:35 pm

    just did a quick post with a link to the code here:

    http://luminouslogic.com/how-download-historical-stock-data-google-matlab.htm

    happy hunting!
    -lumi

  15. barron on June 10th, 2009 4:33 pm

    Hey, this is neat. I’m trying to exploit matlab for backtesting but I’m just now beginning. Are you rich and famous yet from your endeavors?

  16. Lumilog on June 11th, 2009 7:28 am

    yes! and you can buy my system for 3 easy payments of $999999.99!

    in all seriousness the CFA Program showed some serious flaws in my previous models so i’m starting again from scratch. i currently use matlab only to automate my own version of getting a current snapshot of my asset allocation profile (sort of like morningstar’s portfolio x-ray, but with a few extra bells and whistles).

    but soon i will unleash algorithm 2.0 and take over the world. you might be eligible for a position in my cabinet!

  17. How to Normalize Historical Data for Splits, Dividends, Etc (FROM luminouslogic) on July 23rd, 2009 3:23 am

    [...] If you’re using historical data to perform backtesting of trading algorithms it is important that your data be transformed from [...]

  18. How to Import Stock Statistics into Matlab (luminouslogic) | WhooL ! on July 23rd, 2009 5:44 pm

    [...] posts have dealt with how to download historical stock prices into Matlab and how to adjust for splits and dividends so that you can try your luck optimizing neural networks [...]

  19. div on November 21st, 2009 10:40 pm

    Hi Lumi

    Great posts. I am a grad student and i am working exactly on this… my background is control engineering.

    I wanted to know how to obtain Historical Data for
    1) Companies , P/E and other fundamental data… such as the data listed on the Key statistics page on yahoo.

    2) How to download other macro economic data such as the FED interest rate, unemployment rate… for this i guess we need to scrape the data from the web page..

    can you help me on this..
    thanks
    Divakar

  20. Lumilog on November 29th, 2009 1:18 pm

    Hi Div – thanks for writing.

    If you can find the metrics on the web, you can easily grab them with Matlab. Of course the trick is finding what you’re after.

    For most backtesting involving metrics beyond just price high, low, open, close, and volume, I’ve never been able to go back beyond about 10 years b/c that’s about all I can find for free.

    As I mentioned in the post, I once inquired about purchasing Compustat in order to have more than just historical price. But it’s outrageously expensive for a private individual. Perhaps your university business department has access?

  21. Divakar on January 3rd, 2010 3:15 am

    Hi Lumi

    Thanks for the reply.
    Yeah i found out that my university business dept has access to compustat. However since i am an eng student i wont have the access to that. I will find a way to work around that. Thanks for pointing it out. I would want to clarify more doubts. if you are ok , can u share ur email id so that i can write to you rather than posting it on ur posts? I know you are busy but your help will be of immense use to me.

    Thanks
    div – sohamm@gmail.com

  22. Sunil on January 30th, 2010 4:06 am

    Hi Lumilog,
    I tried to use your code to download IBM historical data. I ran into some error I couldn’t understand.
    I typed get_hist_stock_data('IBM') in the command window, and it created a 1530x1 cell of date strings from 2004-09-27 to date.
    Could you please inform where I have gone wrong ?

  23. Lumilog on January 30th, 2010 6:31 am

    Hi Sunil – you just need to call the function in a different way, otherwise you only get the first return argument and not all of them. Try:


    [hist_date, hist_high, hist_low, hist_open, hist_close, hist_vol] = get_hist_stock_data('IBM');

  24. Sunil on January 30th, 2010 7:25 am

    It works now. Thank you so much.
    It was a silly mistake and proves that I am getting rusty using Matlab.
