This is A Trading Tale. It deals with what is called Data Mining, hence the title. The subtitle is “Fool’s Gold,” and I use it to help you remember the tale’s lesson. Fool’s Gold is the name given to a mineral that looks like gold but is actually worthless. The earliest origin of the term I can uncover is attributed to Martin Frobisher, who returned to England from a 1576 voyage in search of the Northwest Passage with a cargo of this supposed ‘gold mineral’. The term has come to denote any apparent treasure trove that turns out to be worthless. More than 430 years later we have our own version of fool’s gold: it’s called the financial markets. Just as the unknowledgeable prospectors of old thought they had hit the mother lode, so do many of today’s data prospectors, or data miners.
So what is Data Mining? Data Mining is a technique, applied in many disciplines, in which the researcher examines data and tries to find patterns. Humans are great at finding patterns, and with the aid of computers and fancy algorithms, never have so many patterns been found in financial data. Test enough hypotheses on a stream of data or a time series and you will find a pattern. What distinguishes the professional from the amateur pattern finder is discipline. The professional finds a pattern and then tries to prove the pattern is false. The amateur thinks they’ve discovered the best thing since sliced bread and goes about trading this pattern, known only to them, until (you guessed it) it stops working. They lose money, and off they go in search of new patterns.
Investing is probably the greatest laboratory in which the masses, as amateur scientists, can recognize patterns. In his book The Alchemy of Finance, the legendary investor George Soros describes the financial markets as a kind of laboratory where a person can express, with their money, their belief about how the world operates. In its most basic form, every time someone buys or sells something they are telling the world: I have found a pattern, and it tells me how the world works. The source of this belief is some type of data mining. It can come in many forms. If you fancy yourself a “Value Investor,” all you are saying is that you look for a particular type of pattern. A “Growth Investor” looks for a different pattern. The “Tactical Asset Allocator” seeks something else altogether. This tale will show you a pattern I’ve discovered that has worked so well since August 12, 1982 that the amateur trader might want to go out and trade it immediately. But beware: patterns shift. Beware of the Fool’s Gold.
What is this pattern, you might ask? It’s a simple one. You download data on the SP500 from August 12, 1982 through September 24, 2010 and ask your computer to test the following rule: buy the SP500 at yesterday’s high if and only if today’s price reaches yesterday’s high, and sell it at today’s closing price. This is a simple day trade. The results are remarkable. A $100,000 investment in this trading system grew to over $33 million over the roughly 28-year period. A simple Buy and Hold in the same SP500 over the same time frame grew to approximately $1.1 million. The system ended with about 30 times as much as Buy and Hold. To the amateur this seems easier than taking candy from a baby. It’s a no-brainer. But let’s look at it through a different microscope. Let’s try to see what’s wrong with this.
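For readers who want to see the rule spelled out, here is a minimal sketch in Python. It is not the code behind the numbers above. The `Bar` class, the function name, and the sample prices are all made up for illustration. It also assumes no commissions or slippage, and it assumes that when the market opens above yesterday’s high the order fills at the open rather than at yesterday’s high.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """One trading day of prices (hypothetical container for illustration)."""
    open: float
    high: float
    low: float
    close: float


def backtest_buy_prior_high(bars, start_equity=100_000.0):
    """Buy at yesterday's high if today's range reaches it; sell at today's close.

    Assumptions (not from the original text): all equity is invested on each
    trade, there are no costs, and a gap open above the trigger fills at the open.
    """
    equity = start_equity
    for prev, today in zip(bars, bars[1:]):
        if today.high >= prev.high:             # price reached yesterday's high
            entry = max(prev.high, today.open)  # assumed fill price
            equity *= today.close / entry       # exit at today's close
    return equity


# Made-up prices, oldest first; real use would load daily SP500 bars instead.
sample = [
    Bar(100.0, 102.0, 99.0, 101.0),
    Bar(101.0, 104.0, 100.0, 103.0),   # reaches 102: buy at 102, sell at 103
    Bar(103.0, 103.5, 101.0, 102.0),   # never reaches 104: no trade
]
final_equity = backtest_buy_prior_high(sample)
print(f"{final_equity:,.2f}")
```

The fill assumption matters more than it looks. A test that always fills at yesterday’s high, even on days that gap above it, will report better results than you could have traded.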
The first thing I do whenever I try to disprove a hypothesis is look at its logical antithesis. It’s just what I like to do; others have their own techniques. So I ask the question: buying at yesterday’s high if and only if today’s price gets to yesterday’s high, and selling at today’s close, grows $100,000 to more than $33 million. What if I instead bought at yesterday’s low, if and only if today’s price gets to yesterday’s low, and sold at today’s close? When I did this, the results reinforced the original finding. The same $100,000 shrank to $2,263 over the 28-year time frame. This is fantastic. Buying yesterday’s high and avoiding yesterday’s low has worked over the last 28 years. The next part is crucial. By now you have a hypothesis that you must test on what researchers call an out-of-sample data set. All this means is asking whether your hypothesis works over a time frame that you didn’t include while developing it.
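The two checks described here, the antithesis and the out-of-sample test, can be sketched together in Python. This is an illustration under assumptions, not the code that produced the figures in this tale. The function names `run_rule` and `split_by_date`, the row layout, and the sample prices are hypothetical. It assumes no trading costs and assumes that a gap through the trigger price fills at the open.

```python
from datetime import date


def run_rule(rows, trigger, start_equity=100_000.0):
    """rows: list of (day, open, high, low, close) tuples, oldest first.

    trigger="high" buys at yesterday's high once today's price reaches it;
    trigger="low" buys at yesterday's low once today's price reaches it.
    Either way the position is sold at today's close.
    """
    if trigger not in ("high", "low"):
        raise ValueError("trigger must be 'high' or 'low'")
    equity = start_equity
    for prev, today in zip(rows, rows[1:]):
        _, _, p_high, p_low, _ = prev
        _, t_open, t_high, t_low, t_close = today
        if trigger == "high" and t_high >= p_high:
            entry = max(p_high, t_open)   # assumed fill on a gap up
        elif trigger == "low" and t_low <= p_low:
            entry = min(p_low, t_open)    # assumed fill on a gap down
        else:
            continue                      # trigger not reached: no trade today
        equity *= t_close / entry
    return equity


def split_by_date(rows, cutoff):
    """Return (in_sample, out_of_sample): rows on/after cutoff, rows before it."""
    in_sample = [r for r in rows if r[0] >= cutoff]
    out_of_sample = [r for r in rows if r[0] < cutoff]
    return in_sample, out_of_sample


# Made-up prices around the cutoff date used in the text.
rows = [
    (date(1982, 8, 10), 100.0, 102.0, 99.0, 101.0),
    (date(1982, 8, 11), 101.0, 104.0, 100.0, 103.0),
    (date(1982, 8, 12), 103.0, 103.5, 98.0, 99.0),
    (date(1982, 8, 13), 99.0, 105.0, 97.0, 104.0),
]
in_sample, out_of_sample = split_by_date(rows, date(1982, 8, 12))
for label, segment in (("in-sample", in_sample), ("out-of-sample", out_of_sample)):
    for trigger in ("high", "low"):
        print(label, trigger, round(run_rule(segment, trigger), 2))
```

Each segment is run on its own, so the first day of a segment has no “yesterday” and is never traded. The point of the layout is that the out-of-sample rows are never looked at while the rule is being chosen.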
To test my hypothesis I ran the same trading rules over the period January 2, 1962 through the magical August 12, 1982. The results were startling, but just what any professional would have anticipated. The top-performing system from 1982 to the present, the one that grew $100,000 to over $33 million, lost money when run over the previous 20 years. In fact it was so bad that $100,000 was worth only $126 on August 12, 1982. This is a stunning 99.9% loss of your original capital. Not a very good system, is it?
As always, I was curious, so I tested the logical antithesis, the buy-the-low system that performed so poorly over the last 28 years, over the same January 2, 1962 through August 12, 1982 time frame. By now you can guess what happened. In this case $100,000 grew to over $24 million. The exact system that performed so poorly over the last 28 years was a gold mine over the previous 20 years.
I just depicted a classic case of data mining. How should someone interpret these results? Should you trade based on this information? The answer is up to you. What I think is most interesting is the extreme case. What if you could develop an approach that tells you when market patterns are shifting? What if you knew to buy the lows from 1962 through 1982 and then buy the highs from 1982 through today? In this case your roughly $24 million in 1982 would have grown about 330-fold, the same multiple that turned $100,000 into $33 million, to approximately $8 billion. Not too shabby for a 48-year period. Of course this is just fantasy and the classic data mining trap. What I find curious is the degree to which the SP500 has changed character since 1962. This change in character is at the heart of all research, and it is why any approach to investing that you develop must be rooted in sound principles. There is much work being done in this area, which I characterize as disentanglement. At the root of disentanglement, or market characterization, is what you are trading (in this example, the SP500) and how it relates to other variables. If you can determine when a market will change character, you have truly found gold. Otherwise, you need to stick with sound principles and trading techniques.
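The arithmetic behind the fantasy figure is worth checking, because it comes from chaining two growth multiples. The short Python sketch below uses only the rounded figures quoted in this tale (“over $33 million” and “over $24 million”), so the output is approximate. The 48.7-year span is my own assumption, counted from January 2, 1962 to September 24, 2010.

```python
# Rounded figures quoted in the text; the true ending values are somewhat higher.
START = 100_000.0
END_BUY_HIGH_1982_2010 = 33_000_000.0   # buy-the-high rule, 1982-2010
END_BUY_LOW_1962_1982 = 24_000_000.0    # buy-the-low rule, 1962-1982

# Growth multiple of the 1982-2010 leg: ending value divided by starting value.
multiple_1982_2010 = END_BUY_HIGH_1982_2010 / START

# Chain the legs: the 1962-1982 ending value becomes the 1982 starting value.
combined = END_BUY_LOW_1962_1982 * multiple_1982_2010

# Assumption: about 48.7 years between January 2, 1962 and September 24, 2010.
YEARS = 48.7
annualized = (combined / START) ** (1 / YEARS) - 1

print(f"1982-2010 multiple: {multiple_1982_2010:.0f}x")
print(f"combined ending value: ${combined:,.0f}")
print(f"implied annual growth: {annualized:.1%}")
```

Turning $100,000 into $33 million is a multiple of about 330, and applying that multiple to $24 million is what produces a figure near $8 billion.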
Earlier I said that I like to look at a logical antithesis before I do out-of-sample testing. I go there first because there are so many data mining traps that you must be ever diligent. For example, had someone back-tested the period 1996-2010 using the buy-yesterday’s-high rule, they would have found a great approach. Had they then used the period 1982-1996 as their out-of-sample test, they would have confirmed the result. They might have jumped right in and started trading that way based on this “compelling” market truth. However, they would have had no idea whether they were about to hit a period just like the one from 1962-1982. They would have had no idea whether, a year or two or more from now, the markets would change character. So this is my caution, or as I like to say, “Everything works, until it doesn’t.”
The last thought I have on this topic is best explained by referring to A Tale of Identity. While I believe that people can find trading approaches that are sound and have as their basis universal truths that take advantage of human frailty, that doesn’t mean they are capable of executing their discovery. Said differently, even if someone shows you a system that works, it doesn’t mean it will work for you. You will only make money over a lifetime if the system or approach you use is real and if you can execute it. Anything other than this combination is either luck or spells disaster. My suggestion is to find someone who knows where the gold is.