How to Download Historical Stock Data into Matlab

by Lumilog on November 26, 2006

Greetings! You’ve arrived here most likely because you’ve just realized that you’ve been Wasting Your Talents at your job using Matlab to research new medicines, design fuel-efficient engines, or develop next-generation wireless communication technologies when – duh! – you could put those Matlab skills to much better, MUCH better use turning the stock market into your own ATM. From your boat!

Well here it is – the gateway drug – a few simple lines of code that will let you download Yahoo’s free historical stock price data right into Matlab. Then you’re on your own as to what to do with it, but I’ll bet you’ve learned quite a bit of signal processing mojo over the years that you’re salivating to put to good use: particle filters, neural networks, wavelet transforms, …maybe even Viterbi decoders!

Assuming you already speak Matlab well, there are really just a handful of new things to learn.

First, it might help to know exactly where to download from. Yahoo Finance provides a convenient historical data CSV file for each stock currently trading. Type in any ticker, click on Historical Prices, and scroll down to the bottom…

Tricky Move #1 is to crack the URL genome so that you can generate your own URL on the fly. The URLs look like this:

http://ichart.finance.yahoo.com/table.csv?s=AAPL&a=10&b=15&c=2005&d=01&e=17&f=2006&g=d&ignore=.csv

With a little (VERY little) trial & error, you might come to discover that after s= goes the ticker symbol, after a= the start month (minus 1), after b= the start day, and after c= the start year; d=, e=, and f= do the same for the end date. The final g= parameter lets you choose between daily (d), weekly (w), or monthly (m) historical data. Construct your desired URL as a Matlab character string (let’s call it url_string).
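Here is a minimal sketch of building url_string from its pieces. Every variable name other than url_string is made up for illustration, and the zero-padded month simply mirrors the example URL above; this is one way to do it, not necessarily how the downloadable source does it.

```matlab
% Illustrative inputs (hypothetical names, not from the original source file)
stock_symbol = 'aapl';
start_date   = [2005 11 15];   % [year month day]
end_date     = [2006  2 17];   % [year month day]
interval     = 'd';            % 'd' daily, 'w' weekly, 'm' monthly

% Months are sent as (month - 1), so November becomes 10 and February 01
url_string = 'http://ichart.finance.yahoo.com/table.csv?';
url_string = [url_string, 's=', upper(stock_symbol)];
url_string = [url_string, sprintf('&a=%02d&b=%d&c=%d', ...
              start_date(2) - 1, start_date(3), start_date(1))];
url_string = [url_string, sprintf('&d=%02d&e=%d&f=%d', ...
              end_date(2) - 1, end_date(3), end_date(1))];
url_string = [url_string, '&g=', interval, '&ignore=.csv'];

disp(url_string)
```

With these inputs the string comes out identical to the example URL above.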

Tricky Move #2 is to tunnel your way directly to a webpage from within Matlab, no browser required! This is done via Matlab’s Java interface and your URL string.
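A minimal sketch of that connection, using the standard Java classes java.net.URL, java.io.InputStreamReader, and java.io.BufferedReader. So that it runs without a network connection, url_string here points at a tiny local file through a file: URL; that stand-in is my assumption, and an http URL string is opened the same way. The variable names are illustrative.

```matlab
% Stand-in target so the sketch runs offline: a one-line local CSV file
csv_path = [tempname, '.csv'];
fid = fopen(csv_path, 'w');
fprintf(fid, 'Date,Open,High,Low,Close,Volume,Adj Close\n');
fclose(fid);
f = java.io.File(csv_path);
url_string = char(f.toURI().toString());   % a file: URL; swap in your http URL

% Open the URL and wrap the byte stream in a line-oriented reader
url     = java.net.URL(url_string);
stream  = url.openStream();
ireader = java.io.InputStreamReader(stream);
breader = java.io.BufferedReader(ireader);

% Clean up once you are done reading
breader.close();
delete(csv_path);
```

Closing breader also closes the readers and stream it wraps.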

Tricky Move #3 is to use that connection to read in the individual lines of the webpage’s source code into a buffer.
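A minimal sketch of reading one line at a time. To keep it self-contained, breader wraps a java.io.StringReader holding made-up CSV text instead of a live connection; a reader built on a real URL stream is read the same way. The header text is the column layout I would expect from Yahoo's CSV, so treat it as an assumption.

```matlab
% Made-up CSV text standing in for the page source
sample = sprintf(['Date,Open,High,Low,Close,Volume,Adj Close\n', ...
                  '2006-02-17,10.0,12.0,9.0,11.0,1000,11.0\n']);
breader = java.io.BufferedReader(java.io.StringReader(sample));

% Each readLine call returns the next line without its newline;
% char() makes sure we end up with a Matlab character string
header_line = char(breader.readLine());
buffer      = char(breader.readLine());

% Past the last line readLine returns null, which char() turns into ''
at_end = isempty(char(breader.readLine()));
breader.close();
```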

It’s really no harder than that – just put that last tidbit in a while loop and parse the buffer each time around to grab the data and store it in a matrix. It’s especially easy here since we’re working with CSV. Parsing HTML from other websites is a little trickier, but I’ll bet you’ll graduate quickly (gateway drug, I tell you).
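The loop might look something like the sketch below. It assumes each data row reads date, open, high, low, close, volume, adjusted close, and it again uses made-up text behind a java.io.StringReader so it runs offline. Parsing with sscanf is my choice here, not necessarily what the downloadable source does.

```matlab
% Made-up CSV text standing in for the page source (newest row first)
sample = sprintf(['Date,Open,High,Low,Close,Volume,Adj Close\n', ...
                  '2006-02-17,10.0,12.0,9.0,11.0,1000,11.0\n', ...
                  '2006-02-16,9.5,10.5,9.0,10.0,2000,10.0\n']);
breader = java.io.BufferedReader(java.io.StringReader(sample));

header = char(breader.readLine());   % skip the column names
data   = zeros(0, 7);                % [datenum open high low close volume adj_close]

buffer = char(breader.readLine());
while ~isempty(buffer)               % empty means end of stream (or a blank line)
    % Pull year, month, day, and the six numeric fields out of the row
    vals = sscanf(buffer, '%d-%d-%d,%f,%f,%f,%f,%f,%f');
    data(end+1, :) = [datenum(vals(1), vals(2), vals(3)), vals(4:9)'];
    buffer = char(breader.readLine());
end
breader.close();
```

Each pass appends one row to data, so the matrix ends up in the same order as the file.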

Still sound too hard? OK, just click below to download my source code for free. See you in St. Maarten!

get_hist_stock_data.m

Just Florida. I’m still tweaking my algorithms…


{ 58 comments… read them below or add one }

Lefteris February 15, 2012 at 6:43 pm

Hello all,
first of all I’d like to thank Lumilog for this very interesting code he shared.
Also I wanted to ask if someone could explain a bit the part where he says that there must be a normalization between the close and the adjusted close.. What is the relationship between them? And can you explain why?

Also does anyone have experience with time series correlations? Can you suggest any metrics to use??
Thanks a lot in advance..
Regards

Lumilog February 15, 2012 at 6:47 pm

Thanks for stopping by Lefteris. The normalization accounts for 2 things: stock splits and dividends. If you don’t normalize then stock splits could be misinterpreted by any quant algorithm you develop as 50% overnight losses for example. Similarly, if you don’t include the effect of dividends any computation of rate of return will be price moves only – not total returns. Plus stocks (theoretically) drop in price by the amount of the dividend before it’s paid. Hope that helps!

lumi
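To make the normalization concrete, here is a small sketch with made-up prices around a 2-for-1 split. Scaling by the ratio of adjusted close to close is one common way to apply the adjustment and may not match the source file line for line; all names and numbers are illustrative.

```matlab
% Made-up prices: a 2-for-1 split takes effect between day 2 and day 3
close_px  = [100 102 51 52];
adj_close = [ 50  51 51 52];

% Per-day factor that maps raw prices onto the adjusted scale
adj_factor = adj_close ./ close_px;

% Day-over-day returns with and without the adjustment
raw_ret = diff(close_px)  ./ close_px(1:end-1);
adj_ret = diff(adj_close) ./ adj_close(1:end-1);

disp(raw_ret)   % the split shows up as a fake -50% day
disp(adj_ret)   % the same day is flat once prices are adjusted
```

The same adj_factor can be multiplied into the open, high, and low columns so that every price series sits on one scale.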

Lefteris February 16, 2012 at 2:43 pm

Hi again Lumi. First of all thanks a lot for your instant response. Yes I understood exactly what happens. I am an undergraduate computer engineer and now researching the area of stock market and prediction using large datasets. I have some questions concerning the correlation metrics used in order to determine if two time series are correlated or not. As the first step of my essay I am trying to build a Dynamic Bayesian Network that will map those metrics as probabilities that Xt-1 can determine Xt. I am thinking to use auto correlation but I have other thoughts too.. As I see you are a software engineer with knowledge of finance..so just what I need :)
Is there any way you can spend some time chatting with me so I can ask you for your guidance? Do you have skype or another means to reach you?
Thanks once more for your will to help.
Yours,
L.

Lumilog February 19, 2012 at 5:01 pm

Hi Lefteris – I only barely speak Bayesian but sent you an email so will wait to hear back from you that way.

- lumi

Sim Con March 27, 2012 at 3:33 am

If I can be allowed a plug for my own website, there are Excel spreadsheets and Mathcad worksheets for downloading stock data and Forex rates at http://investexcel.net/financial-web-services-kb/

Michael April 13, 2012 at 1:08 pm

Hi Lumi,

I had a problem and was just wondering if I am using a different Matlab version than you.

I put in the link for the url_string. However the line right below that where it should gather the stock symbol after the s, it gives me an error.

url_string = strcat(url_string, '&s=', upper(stock_symbol) );

It is this line of code. Should this line execute without a problem?

Lumilog April 13, 2012 at 1:43 pm

Hi Michael. Are you passing the stock_symbol as a char string too? For example, the argument should be passed as 'AAPL', not AAPL.

Michael April 13, 2012 at 6:23 pm

Hi Lumi,

Yep that was my problem. Sorry for such a trivial question. I am fairly new to anything past matrices in Matlab!

Thanks again,

-Michael
