Abstract
In this paper, we address the Optimal Trade Execution (OTE) problem under the limit order book mechanism, i.e., how best to trade a given block of shares at minimal cost or for maximal return. To this end, we propose a solution based on deep reinforcement learning. Although reinforcement learning has been applied to the OTE problem before, this paper is the first work to explore deep reinforcement learning for it, and it achieves state-of-the-art performance. Concretely, we develop a deep deterministic policy gradient (DDPG) framework that can effectively exploit comprehensive features over multiple periods of the real, volatile market. Experiments on three real market datasets show that the proposed approach significantly outperforms existing methods, including the Submit & Leave (SL) policy (as a baseline), the Q-learning algorithm, and the latest hybrid method that combines the Almgren-Chriss model with reinforcement learning.
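To make the approach concrete, the following is a minimal PyTorch sketch of the DDPG actor-critic update the abstract refers to. The state dimension, network sizes, reward, and the encoding of the action as the fraction of remaining inventory to trade in [0, 1] are illustrative assumptions, not details taken from the paper.

```python
# Minimal DDPG sketch for trade execution (illustrative only; the paper's
# actual state features, architecture, and hyperparameters are not shown here).
import copy
import torch
import torch.nn as nn

STATE_DIM = 8    # assumed: e.g. remaining inventory, elapsed time, LOB features
ACTION_DIM = 1   # assumed action: fraction of remaining shares to trade now

class Actor(nn.Module):
    """Deterministic policy mu(s) -> a in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Sigmoid(),  # keeps the action in [0, 1]
        )

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Action-value estimate Q(s, a)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
target_actor, target_critic = copy.deepcopy(actor), copy.deepcopy(critic)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma = 0.99

# One gradient step on a dummy replay batch (s, a, r, s'); in practice the
# batch is sampled from a replay buffer filled by interacting with the market.
s, a = torch.randn(32, STATE_DIM), torch.rand(32, ACTION_DIM)
r, s2 = torch.randn(32, 1), torch.randn(32, STATE_DIM)

# Critic update: regress Q(s, a) toward the TD target built from the
# slowly updated target networks.
with torch.no_grad():
    y = r + gamma * target_critic(s2, target_actor(s2))
critic_loss = nn.functional.mse_loss(critic(s, a), y)
opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

# Actor update: ascend the critic's value along the deterministic policy.
actor_loss = -critic(s, actor(s)).mean()
opt_a.zero_grad(); actor_loss.backward(); opt_a.step()
```

In full DDPG (Lillicrap et al., cited below), the target networks are also soft-updated after each step and exploration noise is added to the actor's output when collecting transitions; those standard details are omitted here for brevity.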
References
Akbarzadeh, N., Tekin, C., van der Schaar, M.: Online learning in limit order book trade execution. IEEE Trans. Signal Process. 66(17), 4626–4641 (2018). https://doi.org/10.1109/TSP.2018.2858188
Almgren, R., Chriss, N.: Optimal execution of portfolio transactions. J. Risk 3, 5–40 (2001)
Barto, A.G., Mahadevan, S.: Recent advances in hierarchical reinforcement learning. Discrete Event Dyn. Syst. 13(4), 341–379 (2003)
Bertsimas, D., Lo, A.: Optimal control of execution costs - a study of government bonds with the same maturity date. J. Financ. Mark. 1, 1–50 (1998)
Cont, R., Larrard, A.D.: Price dynamics in a Markovian limit order market. SIAM J. Financ. Math. 4(1), 1–25 (2013)
Cont, R., Stoikov, S., Talreja, R.: A stochastic model for order book dynamics. Oper. Res. 58(3), 549–563 (2010)
Feng, Y., Palomar, D.P., Rubio, F.: Robust optimization of order execution. IEEE Trans. Signal Process. 63(4), 907–920 (2015)
Hendricks, D., Wilcox, D.: A reinforcement learning extension to the Almgren-Chriss framework for optimal trade execution. In: 2014 IEEE Conference on Computational Intelligence for Financial Engineering & Economics (CIFEr), pp. 457–464. IEEE (2014)
Huang, W., Lehalle, C.A., Rosenbaum, M.: Simulating and analyzing order book data: the queue-reactive model. J. Am. Stat. Assoc. 110(509), 107–122 (2015)
Jiang, Z., Xu, D., Liang, J.: A deep reinforcement learning framework for the financial portfolio management problem. arXiv preprint arXiv:1706.10059 (2017)
Johnson, J.D., Li, J., Chen, Z.: Book review of "Reinforcement Learning: An Introduction" by R.S. Sutton and A.G. Barto, MIT Press, Cambridge, MA, 1998, 322 pp., ISBN 0-262-19398-1. Neurocomputing 35(1), 205–206 (2000)
Kearns, M., Nevmyvaka, Y.: Machine learning for market microstructure and high frequency trading. High Frequency Trading: New Realities for Traders, Markets, and Regulators (2013)
Liang, Z., Jiang, K., Chen, H., Zhu, J., Li, Y.: Deep reinforcement learning in portfolio management. arXiv preprint arXiv:1808.09940 (2018)
Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
Nevmyvaka, Y., Feng, Y., Kearns, M.: Reinforcement learning for optimized trade execution. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 673–680. ACM (2006)
Palguna, D., Pollak, I.: Non-parametric prediction in a limit order book. In: 2013 IEEE Global Conference on Signal and Information Processing (GlobalSIP), p. 1139. IEEE (2013)
Palguna, D., Pollak, I.: Mid-price prediction in a limit order book. IEEE J. Sel. Topics Signal Process. 10(6), 1 (2016)
Perold, A.F.: The implementation shortfall. J. Portfolio Manag. 33(1), 25–30 (1988)
Rosenberg, G., Haghnegahdar, P., Goddard, P., Carr, P., Wu, K., De Prado, M.L.: Solving the optimal trading trajectory problem using a quantum annealer. IEEE J. Sel. Top. Signal Process. 10(6), 1053–1060 (2016)
Sherstov, A.A., Stone, P.: Three automated stock-trading agents: a comparative study. In: Faratin, P., Rodríguez-Aguilar, J.A. (eds.) AMEC 2004. LNCS (LNAI), vol. 3435, pp. 173–187. Springer, Heidelberg (2006). https://doi.org/10.1007/11575726_13
Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., Riedmiller, M.: Deterministic policy gradient algorithms. In: Proceedings of the 31st International Conference on International Conference on Machine Learning, ICML 2014, vol. 32, pp. I-387–I-395. JMLR.org (2014)
Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: Advances in Neural Information Processing Systems, pp. 1057–1063 (2000)
Xiong, Z., Liu, X.Y., Zhong, S., Walid, A., et al.: Practical deep reinforcement learning approach for stock trading. In: NeurIPS Workshop on Challenges and Opportunities for AI in Financial Services: The Impact of Fairness, Explainability, Accuracy, and Privacy (2018)
Ye, Z., Huang, K., Zhou, S., Guan, J.: Gaussian weighting reversion strategy for accurate on-line portfolio selection. In: 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 929–936 (2017)
Acknowledgement
This work was supported in part by the Science and Technology Commission of Shanghai Municipality Project (No. 19511120700). Jihong Guan was partially supported by the Program of Science and Technology Innovation Action of the Science and Technology Commission of Shanghai Municipality under Grant No. 17511105204, and by the Special Fund for Shanghai Industrial Transformation and Upgrading of the Shanghai Municipal Commission of Economy and Informatization under Grant No. 18XI-05.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ye, Z., Deng, W., Zhou, S., Xu, Y., Guan, J. (2020). Optimal Trade Execution Based on Deep Deterministic Policy Gradient. In: Nah, Y., Cui, B., Lee, S.W., Yu, J.X., Moon, Y.S., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12112. Springer, Cham. https://doi.org/10.1007/978-3-030-59410-7_42
DOI: https://doi.org/10.1007/978-3-030-59410-7_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59409-1
Online ISBN: 978-3-030-59410-7
eBook Packages: Mathematics and Statistics (R0)