Hi,
One should wonder what the purpose of this forum is: is it trying to find the next new TA indicator that seems to correlate with price on a certain timeframe and currency pair? Is it discussing new technologies in trading? Is it about theoretical background? I think the latest article on the mql4.com website is a great way of finding trading rules that actually work, and it goes more in the direction of data mining than the average forum discussion about indicators etc.
I don't want this thread to be about trading rules, but about optimization, and optimization only.
The fact of the matter is that in-sample optimization is a matter of software ability: the better your technique (be it genetic algorithms to optimize technical trading rules, or neural networks), the better you are able to fit your trading model to the in-sample data (the testing history). The problem for 99.99% of systems is that once they go out-of-sample (real trading on live data, or testing the optimized system on a new period), the performance drops drastically.
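To make the in-sample vs. out-of-sample gap concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy momentum rule, the names rule_pnl and optimize_lookback, and the returns, which are pure random noise with no edge to find. Whatever lookback comes out best in-sample has only been fitted to noise, so its out-of-sample P&L has no reason to hold up.

```python
import random


def rule_pnl(returns, lookback):
    """P&L of a toy momentum rule (hypothetical, for illustration only):
    hold +1 / -1 for the next bar according to the sign of the sum of
    the previous `lookback` returns."""
    pnl = 0.0
    for t in range(lookback, len(returns)):
        signal = sum(returns[t - lookback:t])
        position = 1 if signal > 0 else -1 if signal < 0 else 0
        pnl += position * returns[t]
    return pnl


def optimize_lookback(returns, candidates):
    """Brute-force 'optimizer': pick the lookback with the highest P&L."""
    return max(candidates, key=lambda lb: rule_pnl(returns, lb))


random.seed(0)
# Pure noise: by construction there is no real pattern to discover.
returns = [random.gauss(0.0, 0.01) for _ in range(2000)]
in_sample, out_of_sample = returns[:1000], returns[1000:]

candidates = range(1, 51)
best = optimize_lookback(in_sample, candidates)
is_pnl = rule_pnl(in_sample, best)
oos_pnl = rule_pnl(out_of_sample, best)

print(f"best lookback: {best}")
print(f"in-sample P&L: {is_pnl:.4f}")
print(f"out-of-sample P&L: {oos_pnl:.4f}")
```

The more candidates the optimizer is allowed to try, the better the in-sample number can be made to look, which is exactly the software-ability effect described above.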
Maybe we can find a quantitative methodology to express the difference between curve-fitting (over-optimization) and optimizing (adapting settings and rules to get better performance, without capturing too many properties of the in-sample time series).
I once read somewhere that to avoid curve fitting, you should look at the sum of squares, and once the graph starts turning back up, you should stop to avoid over-optimization. Something else I read is that you should test the variances (standard deviation squared), and once the difference between the variance of the in-sample results and the variance of the test results on an out-of-sample time series is statistically significant (with an ANOVA test), you have over-optimized.
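Both ideas can be sketched roughly in Python. This is only my reading of them, under stated assumptions: I take the "sum of squares" to be the sum of squared errors on held-out data recorded after each optimization step, and I use a `patience` parameter (my own addition) so that a single noisy uptick does not trigger the stop. For the variance comparison I have swapped in a permutation test on the absolute difference of sample variances instead of ANOVA, because it needs only the standard library; it assumes both series have roughly the same mean. The function names are hypothetical.

```python
import random
from statistics import variance


def early_stop_index(validation_sse, patience=3):
    """Index of the lowest held-out error seen before the error failed
    to improve for `patience` consecutive steps."""
    best_idx = 0
    for i in range(1, len(validation_sse)):
        if validation_sse[i] < validation_sse[best_idx]:
            best_idx = i
        elif i - best_idx >= patience:
            break  # the curve has turned up: stop optimizing here
    return best_idx


def variance_permutation_test(sample_a, sample_b, n_perm=2000, seed=0):
    """Approximate p-value for 'both samples have the same variance',
    obtained by shuffling the pooled data and recomputing the absolute
    difference of the two sample variances."""
    rng = random.Random(seed)
    observed = abs(variance(sample_a) - variance(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(variance(pooled[:n_a]) - variance(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # +1 in numerator and denominator keeps the estimate away from zero.
    return (hits + 1) / (n_perm + 1)


# Made-up held-out error curve: falls, then turns back up.
sse_curve = [10, 8, 6, 5, 5.5, 6, 7, 4]
stop_at = early_stop_index(sse_curve, patience=3)
print(f"stop after optimization step {stop_at}")

# Made-up per-trade results for the two periods.
rng = random.Random(1)
in_sample_results = [rng.gauss(0.0, 1.0) for _ in range(100)]
out_of_sample_results = [rng.gauss(0.0, 2.0) for _ in range(100)]
p_value = variance_permutation_test(in_sample_results, out_of_sample_results)
print(f"p-value for equal variances: {p_value:.4f}")
```

A small p-value would be the signal, in the sense of the second rule above, that the out-of-sample results no longer behave like the in-sample ones. Whether that alone proves over-optimization is exactly the kind of thing I would like to discuss.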
I am looking forward to hearing feedback and opinions on quantitative (statistical) methods to avoid over-optimization!
(Mean square error, regression values, ...)