
Archived Discussions

Recent member discussions

The Algorithmic Traders' Association prides itself on providing a forum for the publication and dissemination of its members' white papers, research, reflections, works in progress, and other contributions. Please note that archive searches and some of our members' publications are reserved for members only, so please log in or sign up to gain the most from our members' contributions.

Best way for independent confirmation?

 Joe Ellsworth, CTO, trading strategist and principal research scientist at Bayes Analytic, DBA

 Tuesday, November 18, 2014

I have been experimenting with a new predictive algorithm. When trained against 1 minute bar data for the first 80% of the bars in 2014, it had over 70% prediction accuracy when tested against the remaining 20%. It was attempting to predict whether a given stock would be at a higher price 120 minutes in the future. I am immediately suspicious of success like this in what is supposed to be a random data set.

I cannot disclose the details and still chase investors who will want unique IP, but I am interested in collaborating with others to prove or disprove that the algorithm is doing what I think it is. Your ideas on how to structure this kind of collaboration so it is interesting to other engineers while protecting the secret sauce would be most appreciated.

I ultimately want to use this and my prior 4 years of stock algorithms to attract a major investor with 4 million dollars. The plan is to spend 2 million on operations and development over a 4 year period and trade with the other 2 million while building a 3 to 5 year track record. I am currently targeting exit by acquisition, which is where we all get the big paycheck. I have to guard the IP but enjoy interaction with other engineers.


3 comments on article "Best way for independent confirmation?"

 Bharath Rao, Entrepreneur

 Friday, November 21, 2014



Joe,

You are right. It's a very good idea to be suspicious of success rates like this. We are a research firm too. You can write to me at bharath@alphamatters.com and let me know your ideas about how we can collaborate.


 Joe Ellsworth, CTO, trading strategist and principal research scientist at Bayes Analytic, DBA

 Friday, November 21, 2014



I am mostly interested in ideas that would allow fellow quant professionals to validate the predictive accuracy of others' models without sharing the underlying secret sauce. There has to be a way to set up test harnesses that would make it difficult for even honest professionals to accidentally cheat. Do you have any ideas?


 Joe Ellsworth, CTO, trading strategist and principal research scientist at Bayes Analytic, DBA

 Saturday, November 22, 2014



OK, here is an idea. What if I set up a web service where somebody could post a training data set? They can remove the symbol name and shift the bar DateTimes to a different year or a different month, as long as the relationships between the bars remain constant. They do need to be 1 minute bars in sequence to test this algorithm, and I need about 60K real-life bars for training.
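As a minimal sketch of that anonymization step, assuming a simple (timestamp, symbol, open, high, low, close, volume) row layout (the column order here is an assumption, not a published format): shift every timestamp by a fixed offset and drop the symbol, so the 1 minute spacing between bars is preserved but the original period cannot be identified.

```python
from datetime import datetime, timedelta

def anonymize_bars(rows, shift_days=365):
    """Shift every bar's timestamp by a fixed offset and drop the symbol.
    Inter-bar spacing (the 1-minute sequence) is preserved, but the
    original symbol and period can no longer be identified.
    Assumed row layout: (timestamp, symbol, open, high, low, close, volume).
    """
    out = []
    for ts, sym, o, h, l, c, v in rows:
        dt = datetime.strptime(ts, "%Y-%m-%d %H:%M") - timedelta(days=shift_days)
        out.append([dt.strftime("%Y-%m-%d %H:%M"), o, h, l, c, v])
    return out
```

The key property is that only the offset changes; the minute-by-minute relationships between bars stay intact, which is all the model needs.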

I would build a test model for them and give them back a model ID. This algorithm predicts 19 minutes into the future, but since I would not know the symbol or the exact time frame the bars were originally sourced from, it would be tough for me to cheat. I do want real market data, as market simulation data doesn't reliably produce the same results.

I run my internal splits with either 90% for training and 10% for testing, or 80% for training and 20% for testing. I am currently working with 2014 for most of the testing, but it has worked just as well for 2013 and 2012, where I have the 1 minute data. I don't have 1 minute data further back to test.
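A sketch of that split, assuming the point of it is that the bars are sequential and must not be shuffled, so the test period strictly follows the training period:

```python
def time_split(bars, train_frac=0.8):
    """Split sequential bars into train/test without shuffling.
    With train_frac=0.8 this is the 80/20 split; pass 0.9 for 90/10.
    The test bars strictly follow the training bars in time, so no
    future information leaks into training."""
    cut = int(len(bars) * train_frac)
    return bars[:cut], bars[cut:]
```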

They can then call the web service with the model ID, posting 18 bars to add data to the existing set, and get a prediction for those 18 bars. Since I am predicting 20 minutes into the future, I will return predictions for which bars will rise during that 20 minute period. It should be impossible for the software to look into the future. If they repeat this cycle, sending the next 19 bars each time, they can record my predictions and determine the precision and recall rates. My current 90/10 split has about 6,500 rows, so they would have to call the service about 340 times to test a similar amount of data in a way that I could not possibly cheat.
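Once the tester has recorded the predictions and watched what the bars actually did, scoring is a straightforward precision/recall calculation. A minimal sketch, assuming the recorded data is two parallel lists of booleans (predicted-to-rise vs. actually-rose per bar):

```python
def precision_recall(predicted_up, actually_up):
    """Score recorded rise-predictions against realized outcomes.
    predicted_up / actually_up: parallel lists of booleans, one per bar.
    Precision: of the bars predicted to rise, how many actually rose.
    Recall: of the bars that actually rose, how many were predicted."""
    tp = sum(p and a for p, a in zip(predicted_up, actually_up))
    fp = sum(p and not a for p, a in zip(predicted_up, actually_up))
    fn = sum(a and not p for p, a in zip(predicted_up, actually_up))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Because the tester computes these numbers themselves from predictions made before the outcomes were known, the model owner never touches the scoring.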

I could run it in a mode that updates the model after each post, which is closest to how we would use it for live trading, or I could leave the model alone and just predict forward. The first is more accurate, while the second is faster.

Now the real question is: why would other engineers do the work to help me test in this way? What benefit would they get? What benefit would they want? Will this kind of test help close the sale with the investors I want to attract, or will it be too technical for them to understand?

They could even strip the bar DateTime off entirely, as long as the rows remain in the correct sequence in the CSV file. I would have to know to expect this, or it would break my CSV parser. It seems like this should not violate their data contracts if they have removed the identifying symbol and dates from the data. I would promise to delete the data after the test anyway.

I could even offer Python or Node.js code to read the CSV, make the posts, and accumulate the results, so all they have to provide is their CSV file. But then I could possibly cheat and post data through on a back channel (unless they audited my code).
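A sketch of what such a client could look like, assuming a JSON-over-HTTP service; the endpoint URL, the model_id field, and the response shape are all hypothetical here, since no API was ever published:

```python
import json
import urllib.request

API_URL = "https://example.invalid/predict"  # hypothetical endpoint

def batches(rows, size=18):
    """Yield consecutive, non-overlapping groups of `size` bars, in order,
    dropping any short remainder at the end."""
    for i in range(0, len(rows) - size + 1, size):
        yield rows[i:i + size]

def post_batch(model_id, bars):
    """POST one batch of bars and return the service's predictions.
    The request/response shape is an assumption, not a published API."""
    body = json.dumps({"model_id": model_id, "bars": bars}).encode()
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["predictions"]
```

Since the client is only a thin batching-and-posting loop, it is short enough for a skeptical tester to audit line by line, which addresses the back-channel concern.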

TRADING FUTURES AND OPTIONS INVOLVES SUBSTANTIAL RISK OF LOSS AND IS NOT SUITABLE FOR ALL INVESTORS
Terms Of Use | Privacy Statement | Copyright 2018 Algorithmic Traders Association