Source: Keith De Lellis Gallery via Bloomberg

A portion of the financial internet spent a portion of yesterday discussing a pretty silly chart showing that the performance of the Dow from 2012 through now has been exactly like the performance of the Dow from 1928 through September 1929, and since those who do not learn from charts are doomed to keep drawing them, that means we're headed for an October 1929-style crash. Here is a link to the canonical presentation of the chart, but I have fallen into a bit of a rabbit hole and have recreated it myself, so here you can see my chart:1

That line looks a whole lot like that other line! Woooooooooo spooky.

You can read pretty much your choice of debunkings (here, here, here), which focus primarily on the fact that the axes of that chart -- both mine and the canonical one -- are manipulated in a fairly obvious way. The Dow went up way way more in 1928-1929 than it has in 2012-2014, and you only make the charts look the same by stretching out the current one something awful. Here is the sensible version of the chart, which is distinctly less spooky:2

As Business Insider puts it, "What is at stake for the S&P 500 in the unlikely event that it does end up following the 1929 pattern? A 24% decline from the January 15, 2014 peak — about half the size of 1929's initial 44% crash."

I take it as a given that no sensible person sees a graph that looks like another graph -- especially after having its y-axes manipulated -- and takes it as a guide to conducting his or her life. ("OMG a line looks like another line! Buy gold!") And I take it that you are a sensible person. So, debunking-wise, the end.

Nonetheless, there is an answer to this criticism, which is worth ... I mean, who's to say that anything is worth doing in this life, but I'm just going to arbitrarily assert that this answer is worth addressing, feel free to disagree, you know where the close tab button is. Anyway here's the answer, from MarketWatch's Mark Hulbert:

Another objection I heard two months ago was that there are entirely different scales on the left and right axes of the chart. The scale on the right, corresponding to the Dow’s movement in 1928 and 1929, extends from below 200 to more than 400—an increase of more than 100%. The left axis, in contrast, represents a percentage increase of less than 50%.

But there’s less to this objection than you might think. You can still have a high correlation coefficient between two data series even when their gyrations are of different magnitudes.

Because I am awful I thought, hmm, well, okay, what is the correlation coefficient between those two data series? That is a weirdly unanswerable question but here are three answers.

One, the correlation coefficient between those two data series is 97.7 percent. Which is high! Here, I even drew you a scatterplot, which looks pleasingly correlated-y:3

But there's, um, less to that scatterplot than you might think. Just for giggles I had Excel generate two series of 50 random prices and I got a 95.9 percent correlation between them on my first go. The trick is to make them look just a bit like stock prices, meaning that (1) each daily change is fairly small relative to the starting price (say a 0-3 percent change each way) and (2) over time they mostly go up.4 If you do that you'll get a high correlation. Even though they're random! Even though there's absolutely no reason that one series should predict the other in any way, because each series is just the independent product of an Excel random number generator.
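For anyone who would rather check this outside Excel, here is a minimal Python sketch of the experiment. It assumes the recipe in footnote 4 (each day add one uniform random amount and subtract another) and 50 prices per series. The function names, the fixed seed and the 1,000 repetitions are illustrative assumptions, not part of the original spreadsheet.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fake_prices(start, up, down, days, rng):
    """Random 'prices': each day add up*U1 and subtract down*U2, with U1, U2 uniform on [0, 1)."""
    prices = [start]
    for _ in range(days - 1):
        prices.append(prices[-1] + up * rng.random() - down * rng.random())
    return prices


rng = random.Random(0)  # arbitrary seed, only here so reruns match

# Repeat the "press the button" experiment many times.
# Series one: start 100, +5*rand, -3*rand. Series two: start 300, +15*rand, -6*rand.
trials = sorted(
    pearson(fake_prices(100, 5, 3, 50, rng), fake_prices(300, 15, 6, 50, rng))
    for _ in range(1000)
)

median = trials[len(trials) // 2]
print("median price correlation:", round(median, 3))
print("lowest price correlation:", round(trials[0], 3))
```

The two series never see each other's random draws, so any correlation in their price levels comes from the shared upward drift and nothing else.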

What you won't get is a high correlation of returns, which is what people normally mean when they talk about "correlations" in the stock market. Simplistically speaking, today Apple trades at about $535, and Google trades at about $1,182, so a Google is around two Apples. Two months ago, Google was at around $1,069, and Apple was at around $560, so a Google was around two Apples. So the relationship between Apple's price and Google's price is pretty stable, and their prices are pretty correlated. But of course over those two months, Google went up and Apple went down. What you want is not the correlation of prices, but the correlation of returns: You want to know if Apple usually goes up when Google goes up, or the reverse.
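To see the gap between the two kinds of correlation, here is a rough Python sketch. It assumes simple daily percentage returns (one possible definition; log returns would behave similarly) and uses made-up random series built on the footnote 4 recipe rather than real Apple or Google data. The 500-day length, the seed and all the names are illustrative assumptions.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def daily_returns(prices):
    """Simple returns: today's price over yesterday's, minus one."""
    return [today / yesterday - 1 for yesterday, today in zip(prices, prices[1:])]


def fake_prices(start, up, down, days, rng):
    """Independent random 'prices' that drift upward on average."""
    prices = [start]
    for _ in range(days - 1):
        prices.append(prices[-1] + up * rng.random() - down * rng.random())
    return prices


rng = random.Random(1)  # arbitrary seed for repeatability
series_a = fake_prices(100, 5, 3, 500, rng)
series_b = fake_prices(300, 15, 6, 500, rng)

# Same two series, two very different answers.
price_corr = pearson(series_a, series_b)
return_corr = pearson(daily_returns(series_a), daily_returns(series_b))

print("correlation of prices: ", round(price_corr, 3))
print("correlation of returns:", round(return_corr, 3))
```

Price levels inherit the shared trend, so they correlate; daily returns strip the level out and leave only the independent day-to-day noise.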

The correlation of daily returns between the 1928-1929 Dow and the 2012-2014 Dow is about 11.3 percent. Basically no correlation. The chart looks sort of nice from far away, but if you look at it up close, an up day in 1929 "corresponds" to an up day in 2013 barely more than half the time.5 For comparison, over the last 12 months, the correlation of returns of Apple and Google has been about 13 percent; over the last two months it's been 10.7 percent (as Apple has been down and Google up, remember). The correlation of Google and JPMorgan over the last year has been 30 percent. JPMorgan and Goldman, 72 percent.6

What does that tell you? That the spooky Dow chart is nonsense, sure, but you knew that already. Maybe one other thing it tells you is that correlation of daily returns isn't a perfect measure either. I mean -- those two lines on the Dow chart do look similar. The fact that their correlation is so low doesn't mean that they haven't moved together over the course of eighteen months. It just means that they didn't move together on a lot of days over those eighteen months.

But over longer terms, they've moved more closely together. The correlation of weekly returns for those two Dow periods was 23.3 percent, so a bit more than the daily correlation. This makes sense: Both series are mostly up! (I mean, the 1929 one eventually goes down, but I'm not using that in the math because we're not "there" yet in 2014 -- the comparison stops at September 30, 1929). So while each day is up or down, pretty much randomly, each week is mostly up. Sometimes they're down, though, and I guess every so often both series are down in the same week.

This really is not meant in any way as a defense of using that Dow overlay chart as a predictor of the future, or even as a meaningful description of the past. It's just a way to talk about the fact that it can be hard to measure what prices are like what other prices. In particular, correlation of daily returns has a lot of prominent uses in finance, including in hedging and in risk measurement. And it doesn't tell you the whole story. Two assets can be largely uncorrelated in their daily price moves, but still have roughly similar trends over longer periods of time. If you care about the longer term, the correlation may mislead you.

But everything may mislead you, if you're easily misled. If you looked only at the correlation of daily returns between the Dow in 1928-1929 and 2012-2014, you'd never know that, in a certain light, the one chart looked a lot like the other chart. Of course you might be happier not knowing that.

1 How did I build this chart? Data -- DJI HP -- from Bloomberg obviously. I set the time scales at July 2, 2012 = February 20, 1928, which I think is how it's done canonically, I don't know. Then the scaling. I just graph the 1928 Dow straight, so the scale is 1928 Dow points. The 2012 Dow, I graph the equation y = 0.0468x - 402.84, where x = the 2012 Dow. And that equation is just the linear regression of the 2012 Dow against the 1928 one, so it's the cheatingest graph you can get. Just letting Excel show two different y-axes would get you a pretty cheating graph, but not quite as cheating as this.

2 Same time series as the first graph, but I set July 2012 = February 1928 = 1 and just normalize to that. So "1.25" represents 25 percent above the starting point. This is probably what you'd call the non-cheating version though I guess anyone can quibble with anything.
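In code, that normalization is one division per data point. A tiny Python sketch, using made-up price levels rather than the actual Bloomberg data (the function name is also just an illustrative choice):

```python
def normalize(prices):
    """Rescale a price series so that its first value equals 1."""
    base = prices[0]
    return [p / base for p in prices]


# Made-up levels for illustration only, not actual Dow data.
dow_1928 = [200.0, 220.0, 250.0, 300.0]
dow_2012 = [12000.0, 12600.0, 13200.0, 15000.0]

# Both series now start at 1 and share one honest y-axis.
print(normalize(dow_1928))
print(normalize(dow_2012))
```

A value of 1.25 in either output list means that series is 25 percent above its own starting point, which is what lets the two periods share an axis.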

3 This is just each day's 2012-2014 Dow graphed against the "corresponding" day's 1928-1929 Dow. The R^2 in that graph is just the square of the correlation coefficient, or 95.5 percent. The regression equation is of course what I used to scale my first chart.
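A small Python sketch of both facts: fitting the scaling line by ordinary least squares, and checking that the regression's R^2 equals the squared correlation coefficient. It reads "regression of the 2012 Dow against the 1928 one" as fitting 1928 levels as a linear function of 2012 levels, which matches the form y = 0.0468x - 402.84 but is an interpretation. The data points are made up, so the fitted numbers will not match the real ones.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


# Made-up paired levels for illustration only, not actual Dow data.
dow_2012 = [12800.0, 13100.0, 13600.0, 14500.0, 15300.0, 16000.0]
dow_1928 = [195.0, 210.0, 240.0, 290.0, 330.0, 370.0]

slope, intercept = fit_line(dow_2012, dow_1928)
r = pearson(dow_2012, dow_1928)
r2 = r_squared(dow_2012, dow_1928, slope, intercept)

# Rescaling the 2012 series with the fitted line puts it in 1928 Dow points.
rescaled_2012 = [slope * x + intercept for x in dow_2012]

print("slope, intercept:", round(slope, 4), round(intercept, 2))
print("r squared vs R^2:", round(r * r, 6), round(r2, 6))
```

Because the line is chosen to minimize the squared gaps between the two series, this rescaling is by construction the one that makes the overlay look as tight as it possibly can.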

4 Series one started at 100 and each day added 5*rand() and subtracted 3*rand(). Series two started at 300 and each day added 15*rand() and subtracted 6*rand(). Obviously if I were a real quant my simulated daily returns would be normally distributed percentages of that day's price but ehhhhhh. The point is that series 1 in expectation goes up about 1 percent a day (2.5 minus 1.5, since rand() is distributed around 0.5), while series 2 in expectation goes up about 1.5 percent a day (7.5 minus 3 is 4.5 points, and it's out of 300, for no good reason, just to get different numbers, I don't know). So their long-term trends are different, though both up, and their short-term price moves are independent and random.

Here's a "price" chart:

And a scatterplot:

I really did get these numbers on the first or possibly second time pressing the button, rand() is jumpy at updating. Repeated casual button-pushing got me one result as low as 69 percent but most in the 80s and 90s, with some exceeding 98 percent. Obviously not lognormal blah blah blah but random, is the point.

5 Meanwhile, my random number series have a correlation of returns of 8 percent, since you asked.

6 Correlation of daily returns, via a handy Bloomberg CORR function: