Recurrent neural networks are well suited for temporal data. In sequence-to-sequence modeling there is abundant work for the case where the training data form well-defined sequences on both the encoding and the decoding side. Consider a language task such as sentiment analysis: a sentence is a well-defined unit that maps into the encoder, and on the decoding side a single prediction is made (positive, neutral, or negative). In financial data, an input sequence can go back 10 order book updates, 131 of them, or even 1,000. The same issue arises on the decoding side: do we want to predict the next one second, one minute, or one hour in increments of 5 minutes?
The length of the input sequence remains a challenging problem and is subject to trial-and-error.
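To make the trade-off concrete, below is a minimal PyTorch sketch of a fixed encoder-decoder RNN for order-book data. It is an illustration, not our actual model; the lookback window, prediction horizon, feature count, and hidden size are hypothetical choices, and the point is simply that both the input and the output length must be fixed up front.

```python
# Minimal sketch (assumed architecture, not the model described above):
# both the lookback window and the prediction horizon are fixed hyperparameters.
import torch
import torch.nn as nn

class Seq2SeqPriceModel(nn.Module):
    def __init__(self, n_features, hidden_size, horizon):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden_size, batch_first=True)
        self.decoder = nn.GRU(1, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)
        self.horizon = horizon  # number of future steps to predict (fixed)

    def forward(self, x):
        # x: (batch, lookback, n_features).  The lookback length is baked into
        # the training data; 10, 131, or 1,000 updates all "work", and choosing
        # one is exactly the trial-and-error step mentioned above.
        _, h = self.encoder(x)
        step_in = torch.zeros(x.size(0), 1, 1)
        outputs = []
        for _ in range(self.horizon):
            # Autoregressive decoding for a fixed number of future steps.
            out, h = self.decoder(step_in, h)
            pred = self.head(out)
            outputs.append(pred)
            step_in = pred
        return torch.cat(outputs, dim=1)  # (batch, horizon, 1)

# Illustrative usage: encode 131 order-book updates with 8 features each and
# predict the next 12 five-minute increments (one hour ahead).
model = Seq2SeqPriceModel(n_features=8, hidden_size=64, horizon=12)
preds = model(torch.randn(32, 131, 8))
```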
We were able to make advances on outputting only confident predictions in a dynamic fashion. In a very volatile market, a model should reliably make only short-term recommendations, while in a stable one, confidence should increase and more predictions should be made. This is precisely the behavior of our new model.
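One way to obtain this behavior, sketched below purely as an illustration, is to let the decoder emit an uncertainty estimate alongside each prediction and stop decoding as soon as the uncertainty crosses a threshold. The Gaussian-style output head and the hard cut-off are our own illustrative assumptions, not necessarily the exact mechanism of the model.

```python
# Hedged sketch of confidence-gated decoding (assumed mechanism): the decoder
# keeps emitting future predictions only while its uncertainty stays small.
import torch
import torch.nn as nn

class ConfidenceGatedDecoder(nn.Module):
    def __init__(self, hidden_size, max_horizon=12, max_std=0.5):
        super().__init__()
        self.cell = nn.GRUCell(1, hidden_size)
        self.mean_head = nn.Linear(hidden_size, 1)
        self.logvar_head = nn.Linear(hidden_size, 1)
        self.max_horizon = max_horizon
        self.max_std = max_std  # confidence cut-off (illustrative assumption)

    def forward(self, h):
        # h: (batch, hidden) encoder summary of the recent order book.
        step_in = torch.zeros(h.size(0), 1)
        preds = []
        for _ in range(self.max_horizon):
            h = self.cell(step_in, h)
            mean = self.mean_head(h)
            std = torch.exp(0.5 * self.logvar_head(h))
            # In a volatile regime the predicted std grows quickly and decoding
            # stops after a few steps; in a calm regime it runs out the horizon.
            if (std > self.max_std).all():
                break
            preds.append(mean)
            step_in = mean
        return preds  # variable-length list of confident predictions
```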
Standard models have a fixed number of layers (think of the number of neurons at each time step). In a challenging market one should spend more time exploring the patterns and learning, while in easy times one can just skim and move on. There is no reason why a model should not follow the same strategy. Another family of models we discuss, adaptive computation time, addresses some of these challenges of time series data naturally. These models dynamically allocate the number of layers at each time step, so the amount of computation spent at each step is controlled. First, data scientists do not need to fine-tune the number of layers, and, second, the model allocates a lot of computation to hard portions of a sequence and just one layer/neuron to easy parts.
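The sketch below shows the halting mechanism in the spirit of adaptive computation time: the recurrent cell is applied repeatedly to the same input, a halting unit accumulates probability mass, and pondering stops once that mass reaches 1 - epsilon. The GRU cell, ponder limit, and epsilon value are illustrative assumptions, not the exact architecture used here.

```python
# Minimal adaptive-computation-time sketch (illustrative assumptions throughout):
# each time step may use between 1 and max_ponder applications of the cell.
import torch
import torch.nn as nn

class ACTCell(nn.Module):
    def __init__(self, n_features, hidden_size, max_ponder=10, eps=0.01):
        super().__init__()
        self.cell = nn.GRUCell(n_features, hidden_size)
        self.halt = nn.Linear(hidden_size, 1)
        self.max_ponder = max_ponder
        self.eps = eps

    def forward(self, x, h):
        # x: (batch, n_features) one order-book update; h: (batch, hidden).
        halted = torch.zeros(x.size(0), 1)   # accumulated halting probability
        pondered_h = torch.zeros_like(h)     # halting-weighted hidden state
        for n in range(self.max_ponder):
            h = self.cell(x, h)
            p = torch.sigmoid(self.halt(h))
            if n == self.max_ponder - 1:
                is_last = torch.ones_like(p, dtype=torch.bool)
            else:
                is_last = (halted + p) >= 1 - self.eps
            # Weight each ponder iteration by its halting probability, or by the
            # leftover mass on the last iteration so the weights sum to one.
            w = torch.where(is_last, 1 - halted, p)
            w = w.clamp(min=0) * (halted < 1 - self.eps).float()
            pondered_h = pondered_h + w * h
            halted = halted + w
            if bool((halted >= 1 - self.eps).all()):
                break
        # Easy time steps halt after one iteration; hard ones ponder longer.
        return pondered_h

# Illustrative usage: step over a sequence of order-book updates.
act = ACTCell(n_features=8, hidden_size=64)
h = torch.zeros(32, 64)
for t in range(131):
    h = act(torch.randn(32, 8), h)
```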
All these novel aspects have been tested on a few financial data sets predicting prices of ETFs and commodities. The prediction power is drastically improved by these enhancements. One vexing challenge remains in evaluation: how does better prediction of prices translate into actual trading and P&L? Stay tuned; it remains to be seen.