How to Download Historical Stock Data into Matlab

Greetings! You’ve arrived here most likely because you’ve just realized that you’ve been Wasting Your Talents at your job using Matlab to research new medicines, design fuel-efficient engines, or develop next-generation wireless communication technologies when – duh! – you could put those Matlab skills to much better, MUCH better use turning the stock market into your own ATM. From your boat!

Well here it is – the gateway drug – a few simple lines of code that will let you download Yahoo’s free historical stock price data right into Matlab. Then you’re on your own as to what to do with it, but I’ll bet you’ve learned quite a bit of signal processing mojo over the years that you’re salivating to put to good use: particle filters, neural networks, wavelet transforms, …maybe even Viterbi decoders!

Assuming you already speak Matlab well, there are really just a handful of new things to learn.

First, it might help to know exactly where to download from. Yahoo Finance provides a convenient historical data CSV file for each stock currently trading. Type in any ticker, click on Historical Prices, and scroll down to the bottom…

Tricky Move #1 is to crack the URL genome so that you can generate your own URL on the fly. The URLs look like this:

http://ichart.finance.yahoo.com/table.csv?s=AAPL&a=10&b=15&c=2005&d=01&e=17&f=2006&g=d&ignore=.csv

With a little (VERY little) trial & error, you might come to discover that after s= goes the ticker symbol, after a= the start month (minus 1), after b= the start day, and after c= the start year; d=, e=, and f= do the same for the end date. The final g= parameter lets you choose between getting historical stock information on a daily (d), weekly (w), or monthly (m) basis. Construct your desired URL as a Matlab character string (let’s call it url_string).
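A minimal sketch of building that string, assuming the parameter meanings described above. The variable names other than url_string are made up for illustration, and the dates are the ones from the example URL:

```matlab
% Illustrative inputs (names are hypothetical, not from the original script)
ticker     = 'AAPL';
start_date = [2005 11 15];   % [year month day]
end_date   = [2006  2 17];   % [year month day]
freq       = 'd';            % 'd' daily, 'w' weekly, 'm' monthly

% Months are zero-based in the URL, hence the "- 1"
url_string = sprintf(['http://ichart.finance.yahoo.com/table.csv' ...
    '?s=%s&a=%02d&b=%d&c=%d&d=%02d&e=%d&f=%d&g=%s&ignore=.csv'], ...
    ticker, start_date(2) - 1, start_date(3), start_date(1), ...
    end_date(2) - 1, end_date(3), end_date(1), freq);

disp(url_string)
```

With these inputs the result is character-for-character the example URL shown above.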

Tricky Move #2 is to tunnel your way directly to a webpage from within Matlab, no browser required! This is done by handing your URL string to Matlab’s Java interface.
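The snippet that belongs here appears to be missing from this copy of the post, so what follows is only a sketch of the usual pattern, assuming java.net.URL plus a BufferedReader. So that it runs without a network connection, it points at a temporary local file through a file: URL; for the real thing you would pass in the Yahoo url_string instead.

```matlab
% Stand-in source so this runs offline: a one-line CSV in a temp file,
% addressed with a file: URL. Swap in the Yahoo url_string for real use.
tmp_file = [tempname '.csv'];
fid = fopen(tmp_file, 'w');
fprintf(fid, 'Date,Open,High,Low,Close,Volume,Adj Close\n');
fclose(fid);
file_obj   = java.io.File(tmp_file);
file_uri   = file_obj.toURI();
file_url   = file_uri.toURL();
url_string = char(file_url.toString());

% Open the connection and wrap it in a line-oriented reader
url     = java.net.URL(url_string);
stream  = openStream(url);
ireader = java.io.InputStreamReader(stream);
breader = java.io.BufferedReader(ireader);

% ... read from breader here ...

breader.close();     % also closes the underlying stream
delete(tmp_file);    % clean up the stand-in file
```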

Tricky Move #3 is to use that connection to read in the individual lines of the webpage’s source code into a buffer.

It’s really no harder than that – just put that last tidbit in a while loop and parse the buffer each time around to grab the data and store it in a matrix. It’s especially easy here since we’re working with CSV. Parsing HTML from other websites is a little trickier, but I’ll bet you’ll graduate quickly (gateway drug, I tell you).
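Here is roughly what that loop could look like. This is a sketch, not the contents of get_hist_stock_data.m: the CSV text is made up and fed through a StringReader so the example runs offline, and the names hist_date and hist_data are illustrative.

```matlab
% Made-up CSV text standing in for the web stream (numbers are not real prices)
csv_text = sprintf(['Date,Open,High,Low,Close,Volume,Adj Close\n' ...
                    '2006-02-17,10,12,9,11,1000,11\n' ...
                    '2006-02-16,9,11,8,10,2000,10\n']);
breader = java.io.BufferedReader(java.io.StringReader(csv_text));

hist_date = {};   % date strings, one per row
hist_data = [];   % columns: open, high, low, close, volume, adj close

readLine(breader);                 % skip the header line
buffer = readLine(breader);
while ~isempty(buffer)             % readLine gives [] at end of stream
    fields = strsplit(char(buffer), ',');
    hist_date{end+1, 1} = fields{1};
    hist_data(end+1, :) = str2double(fields(2:end));
    buffer = readLine(breader);
end
breader.close();
```

Note that the isempty test also stops at a blank line, which is fine for a clean CSV but worth knowing if you point this at messier pages.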

Still sound too hard? OK, just click below to download my source code for free. See you in St. Maarten!

get_hist_stock_data.m

Just Florida. I’m still tweaking my algorithms…

122 thoughts on “How to Download Historical Stock Data into Matlab”

  1. Hi,

    I have modified this Matlab code to download some ocean wave data from the web. Works very nice.

    Thanks.

  2. Great!

    I read somewhere that researchers found a correlation between stock prices and the production levels of butter in Bangladesh. Maybe we should also incorporate ocean wave and sunspot data to build a better trading algorithm… 🙂

  3. Hi,

    coool ! your tips on taking data from net to matlab.
    I am also an engineer with interests in Finance. Your
    attitude to share your CFA efforts, along the way, its
    amazing ! Keep it Up ! If you are interested, lets keep
    in touch, as i feel there’s lots in common in terms of
    approach and thoughts (from what i read in your blog).
    Anyways, good luck buddy.

  4. Thanks for sharing your code.

    I wonder how to best build historical portfolio data. I observed that one stock may have gaps or doubles (latter with google data). Cleaning then becomes an issue. My flow would be like this:
    – get the historical data from symbols in a list
    – set up empty database with working days
    – fill each day for every symbol
    – look for anomalies like spikes, gaps etc. and fill or fix them

    Any ideas?

  5. Hi Zuio,

    You could look for anomalies and adjust for them but so far I’ve had good luck just using Yahoo’s database rather than Google’s b/c it has an “adjusted close” column that allows you to automate the accounting for splits & dividends (using ratio of “adjusted close” to “close”).
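    A quick sketch of that ratio trick, using made-up numbers (a pretend 2-for-1 split between the second and third rows, newest first). The variable names are illustrative:

    ```matlab
    % Made-up data, newest first; the split happens between rows 2 and 3
    hist_close     = [50; 51; 100; 98];
    hist_adj_close = [50; 51;  50; 49];
    hist_high      = [52; 52; 102; 99];

    % Per-day adjustment factor: 1 after the split, 0.5 before it
    adj_factor = hist_adj_close ./ hist_close;

    % Apply the same factor to any other price column
    adj_high = hist_high .* adj_factor;
    ```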

    – Lumi

  6. Hi Lumi,

    I am interested in setting up a simple Matlab code for a portfolio strategy. The proper alignment of cleaned time series is then mandatory. Any help is welcome and I would like to share results.

  7. Istvan,

    I have very little time at the moment – I’ll put this on my To Do list and try to get back to you on this soon. It looks like we can use a similar script for stooq as was used for yahoo, but leaving out volume.

    Hope to be back in touch with you soon,
    -lumi

    UPDATE: I tried but it does not appear possible to access the CSV info from Stooq in Matlab. Sorry.

  8. I know there is (or was) a toolbox you could buy from Mathworks to download stock info from yahoo – maybe they have one for google.

    However, what particular elements are you after (historical prices, P/E, etc.)? I have a few different Matlab scripts I use to grab info from various financial sites. I could probably modify one to get whatever you want from google.

  9. Hey, this is neat. I’m trying to exploit matlab for backtesting but I’m just now beginning. Are you rich and famous yet from your endeavors?

  10. yes! and you can buy my system for 3 easy payments of $999999.99!

    in all seriousness the CFA Program showed some serious flaws in my previous models so i’m starting again from scratch. i currently use matlab only to automate my own version of getting a current snapshot of my asset allocation profile (sort of like morningstar’s portfolio x-ray, but with a few extra bells and whistles).

    but soon i will unleash algorithm 2.0 and take over the world. you might be eligible for a position in my cabinet!

  11. Hi Lumi

    Great posts. I am a grad student and i am working exactly on this… my background is control engineering.

    I wanted to know how to obtain Historical Data for
    1) Companies , P/E and other fundamental data… such as the data listed on the Key statistics page on yahoo.

    2) How to download other macro economic data such as the FED interest rate, unemployment rate… for this i guess we need to scrape the data from the web page..

    can you help me on this..
    thanks
    Divakar

  12. Hi Div – thanks for writing.

    If you can find the metrics on the web, you can easily grab them with Matlab. Of course the trick is finding what you’re after.

    For most backtesting involving metrics beyond just price high, low, open, close, and volume, I’ve never been able to go back beyond about 10 years b/c that’s about all I can find for free.

    As I mentioned in the post, I once inquired about purchasing Compustat in order to have more than just historical price. But it’s outrageously expensive for a private individual. Perhaps your university business department has access?

  13. Hi Lumi

    Thanks for the reply.
    Yeah i found out that my university business dept has access to compustat. However since i am an eng student i won’t have access to that. I will find a way to work around that. Thanks for pointing it out. I would want to clarify more doubts. if you are ok , can u share ur email id so that i can write to you rather than posting it on ur posts? I know you are busy but your help will be of immense use to me.

    Thanks
    div – sohamm@gmail.com

  14. Hi Lumilog,
    I tried to use your code to download IBM historical data. I ran into an error I couldn’t understand.
    I typed get_hist_stock_data('IBM') in the command window, and it created a 1530x1 cell of date strings from 2004-09-27 until today.
    Could you please inform where I have gone wrong ?

  15. Hi Sunil – you just need to call the function in a different way, otherwise you only get the first return argument and not all of them. Try:


    [hist_date, hist_high, hist_low, hist_open, hist_close, hist_vol] = get_hist_stock_data('IBM');

  16. It works now. Thank you so much.
    It was a silly mistake and proves that I am getting rusty using Matlab.

  17. Hi Lumilog,

    I was looking for a website to download stock time series data going back 5 years when I came upon this great blog…Not long ago I did fMRI research to analyze time-series data, and realized while I was doing it that I could probably apply the same or similar statistical analyses to stock time series data to do forecasting. I never got hold of MATLAB to do my analyses, but I wonder where I could download a free copy of the latest version and then try your script for downloading time series data for technology sector stocks from yahoo.finance.

    In any case, it would be great to communicate with you through email to find out more about data pre-processing steps. For fMRI data, there are a series of steps I go through before analyzing the time series–I’m not sure if the assumptions (e.g. normal distribution, homogeneity of variance, etc.) for analysis of stock time series data are different from these and if they require different pre-processing steps as a result (detrending, removing outliers, etc.)

    I appreciate your blog and look forward to hearing from you soon, as your help will go a long way for me.

    Best,
    Lee

  18. Thanks for writing Lee. Unfortunately I know of no free Matlab versions. In the beginning of my research when I didn’t want to pay for matlab, I used a free Java compiler instead (netbeans). It worked but was a little cumbersome since I didn’t have lots of experience with Java. Ultimately I decided to bite the bullet and buy my own personal matlab license to save time.

    The preprocessing steps you mention for fMRI sound almost identical to those for analyzing market time series, so I guess they are common to all linear regression and autoregressive models. Other common steps include differencing the data or working with % change instead of $ change and taking natural logs to convert exponential growth rates into linear ones for proper regression.
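    For what it's worth, those three transforms are one-liners. A tiny sketch with made-up prices (oldest first); the variable names are illustrative:

    ```matlab
    % Made-up closing prices, oldest first
    price = [100; 110; 121];

    dollar_change = diff(price);                     % change in $ per step
    pct_change    = diff(price) ./ price(1:end-1);   % fractional change per step
    log_return    = diff(log(price));                % difference of natural logs
    ```

    Here both steps are a 10% gain, so the log returns come out equal even though the dollar changes (10 and 11) differ, which is the point of taking logs before a linear regression.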

    Please feel free to email me at any time. See my “About” page link at the top of this page to get my email address (trying to avoid spam by not posting it here directly).

    Best,
    Lumi

  19. could you please help as to how to load the fmri data on ma comp using spm toolbox in matlab

  20. Thank you for your short tutorial- I can’t wait to get off work now and mess around with Matlab. I had no idea (but I’m not surprised) that Matlab can use java so easily. My class was given a simple problem to convert different world currencies into US dollars, and with this method I can easily get the data from the web instead of relying upon a static index. I don’t know why this sounds like so much fun.

    Yesterday, I found out about sptool, today I find out about Java, tomorrow… maybe a winning lotto number function?

    Thanks again.

  21. karthik – no experience with spm toolbox. maybe lee will check back with us and respond…

    john – thanks so much for stopping by and leaving a comment. maybe you can fuse the signal processing with currency trading and make a bundle. i’m sure there is more than one way to use an FFT. keep us posted!

    -lumi

  22. Felt like I had to leave a comment after taking advantage of your great work. This is an excellent tool that I will make good use of in my Risk Management course at my university. I’ll work on a tool myself that takes two time series and matches the dates between them, and post a link to it here later.

    For example. I want to compare ^BVSP with ^DJI. In Brazil, they have different bank holidays than in the US. So, I’ll compare the two date vectors line by line and fill in the missing dates on the other.

    An example:

    change

    ^DJI ^BVSP
    2010-10-13, 55 2010-10-13, 66
    2010-10-12, 44 2010-10-11, 55
    2010-10-11, 33 2010-10-08, 44
    2010-10-08, 22 2010-10-07, 33

    to

    ^DJI ^BVSP
    2010-10-13, 55 2010-10-13, 66
    2010-10-12, 44 2010-10-12, 55
    2010-10-11, 33 2010-10-11, 55
    2010-10-08, 22 2010-10-08, 44
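    One way the fill step in that example could be sketched without any toolbox, assuming both series are stored newest first as in the tables above. The variable names are illustrative and the values are the ones from the example:

    ```matlab
    % Dates and values from the example above, newest first
    dji_date  = {'2010-10-13'; '2010-10-12'; '2010-10-11'; '2010-10-08'};
    bvsp_date = {'2010-10-13'; '2010-10-11'; '2010-10-08'; '2010-10-07'};
    bvsp_val  = [66; 55; 44; 33];

    dji_num  = datenum(dji_date,  'yyyy-mm-dd');
    bvsp_num = datenum(bvsp_date, 'yyyy-mm-dd');

    % For each DJI date, take the most recent BVSP value on or before it
    bvsp_aligned = nan(size(dji_num));
    for k = 1:numel(dji_num)
        idx = find(bvsp_num <= dji_num(k), 1, 'first');   % newest-first order
        if ~isempty(idx)
            bvsp_aligned(k) = bvsp_val(idx);
        end
    end
    ```

    The result lines up with the second ^BVSP column above: the 2010-10-12 holiday is filled with the value from 2010-10-11.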

  23. I realized it wasn’t necessary to do such a script. I just used

    todaily(fints(BVSP_date,BVSP_close))

    from Financial Toolbox, and it just puts NaN as data on the missing days. All the financial time series functions then ignore the NaNs.

  24. Simonize – thanks so much for taking the time to leave some comments as well as your code. Let me know if your Risk Mgmt course appears in iTunes U, would be fun to check out. 🙂

    lumi

  25. Hi Lumi.

    Found this blog and I must say, thank you! This is a good script. However, I have a question for you. I’m using your function to screen multiple stocks. I ran into a problem. When I tried retrieving the ticker “AGII”, I received an error message:

    Java exception occurred:
    ice.net.URLNotFoundException: Document not found on server

    at ice.net.HttpURLConnection.getInputStream(OEAB)

    at java.net.URL.openStream(Unknown Source)

    When I looked at this ticker in yahoo finance, there’s no historic data. No wonder it kicked me out. Is there a way to check whether the site has “historic data” and otherwise “skip” this ticker and move on to the other tickers? I’m not quite an expert on Matlab (and Java for this matter), and I’m actually using your scripts to learn. Any guidance will help.

    Thanks!

    Clarence

  26. Tsaiko – Welcome!

    Clarence – sorry late getting back to you, but glad you figured it out. I sometimes get an error similar to yours. It’s usually b/c I’m trying to look up a ticker that no longer exists, or has been moved to the pink sheets. Sometimes when the Wi-Fi is spotty I get that message too.
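    For anyone else screening a list of tickers, one option is to wrap each download in try/catch so a bad ticker gets skipped instead of stopping the run. A sketch only: the fetch function below is a made-up stand-in that fails for one ticker so the example runs offline, and you would swap in the real get_hist_stock_data call.

    ```matlab
    % Stand-in for the real download: errors for one ticker, as a missing page would.
    % For actual use, replace the body of the try block with get_hist_stock_data(...).
    fetch = @(ticker) assert(~strcmp(ticker, 'AGII'), 'Document not found on server');

    tickers = {'IBM', 'AGII', 'AAPL'};
    skipped = {};
    for k = 1:numel(tickers)
        try
            fetch(tickers{k});
        catch err
            % Note the failure and move on to the next ticker
            fprintf('Skipping %s: %s\n', tickers{k}, err.message);
            skipped{end+1} = tickers{k};
        end
    end
    ```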

    All best,
    Lumi

  27. Nice script. Unfortunately I get this error:

    ??? Undefined function or method ‘get_hist_stock_data’ for input arguments of type ‘char’.

    Any idea on what could be the problem?

  28. Lumilog, I have wanted something like this for a few years and was glad to find your script. Thank you for sharing it.

    I was wondering though, ‘How did you discover this?’

    And, ‘How can I also get the historical market capitalization for a stock?’

    Best Regards,
    Motes

  29. Hi Motes – maybe a bit tricky to get historical market cap. If I were after that I’d probably use the unadjusted closing price from this script and then multiply by the nearest date Shares Outstanding from MSN money as you can see in the example link below.

    http://moneycentral.msn.com/investor/invsub/results/statemnt.aspx?lstStatement=10YearSummary&Symbol=US%3aEXC&stmtView=Qtr

    Once you get comfortable doing this type of data retrieval via Matlab, you could quite easily automate the process by writing a script to retrieve the 10-year shares outstanding history also, and then even do a little linear interpolation to estimate what Shares Outstanding was on any individual date before multiplying by price per share to get market cap.
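    A rough sketch of that interpolation step, with made-up numbers and illustrative variable names (the two reporting dates are only ten days apart to keep the arithmetic easy to check):

    ```matlab
    % Made-up shares outstanding (in millions) at two reporting dates
    shares_date = [datenum(2010, 1, 1); datenum(2010, 1, 11)];
    shares_out  = [600; 700];

    % Made-up unadjusted closing prices on three dates
    price_date = [datenum(2010, 1, 1); datenum(2010, 1, 6); datenum(2010, 1, 11)];
    hist_close = [40; 42; 41];

    % Linear interpolation of shares outstanding onto the price dates
    shares_est = interp1(shares_date, shares_out, price_date, 'linear');

    % Estimated market cap, in millions of dollars
    market_cap = hist_close .* shares_est;
    ```

    Keep in mind that interp1 returns NaN for price dates outside the range of the shares data unless you ask it to extrapolate.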

    Not perfect but maybe better than nothing. Hope that helps!

  30. Lumi,

    I thought I left a comment here a couple of days ago, but it is not here somehow. So I will try again. I downloaded your Matlab program and it works great; I really like it. I am wondering if you have tried to get option data as well? I am interested in downloading daily option data and don’t know where to start yet. Can you share your experience if you have any?

    Thanks a lot, great blog and Matlab program 😀

  31. It worked this time, even though my original message was much nicer 😀

    Anyway, it is a great tool works real well and I am thinking if I can use it to download option data. I have experience with Matlab and FFT before and want to do the same thing, financial freedom journey 😀

  32. Hi Baixiao – I think your previous comment was left on a different post, because I remember responding to it. ?!

    Anyway, I don’t know of any free source for historical options prices – I wish I did. Let me know if you come across any – and thanks for taking the time to write. Now get those wavelet transforms and FFT engines humming and make us proud!

    -lumi

  33. Do you have any experience with other software like jmp (it is a front end for SAS)? I expect I could do the same thing as you are trying. I’m just starting. I have done regression with jmp and done PCA/PCR with another program called unscrambler for spectral data.
    To start I think I would like to use the algorithms to write covered calls and/or hedge a portfolio in a down market.
    I would think that it would be good to be able to run scripts on databases on their server and have subsets of data returned. Once you find specific correlations and probabilities then you only need to access those data sets or use a service for that, correct?

  34. Jaime – de nada!

    Bill – the only other language I’ve used to do this is Java (via Netbeans). The one problem you might run into is I know of no free site to get historical option data from. But I’ve certainly done my share of correlation-style backtesting to try out things like pairs trading. Best of luck to you!
