TY - GEN
T1 - Risk-Aware Linear Bandits with Application in Smart Order Routing
AU - Ji, Jingwei
AU - Xu, Renyuan
AU - Zhu, Ruihao
N1 - Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/11/2
Y1 - 2022/11/2
AB - Motivated by practical considerations in machine learning for financial decision-making, such as risk aversion and large action spaces, we initiate the study of risk-aware linear bandits. Specifically, we consider regret minimization under the mean-variance measure when facing a set of actions whose rewards can be expressed as linear functions of (initially) unknown parameters. We first propose the Risk-Aware Explore-then-Commit (RISE) algorithm driven by the variance-minimizing G-optimal design. Then, we rigorously analyze its regret upper bound to show that, by leveraging the linear structure, the algorithm can dramatically reduce the regret when compared to existing methods. Finally, we demonstrate the performance of the RISE algorithm by conducting extensive numerical experiments in a synthetic smart order routing setup. Our results show that the RISE algorithm can outperform the competing methods, especially when the decision-making scenario becomes more complex.
KW - algorithmic trading
KW - bandit
KW - machine learning theory
KW - mean-variance
KW - online learning
KW - regret analysis
KW - risk-aware decision-making
KW - smart order routing
UR - http://www.scopus.com/inward/record.url?scp=85142502536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142502536&partnerID=8YFLogxK
U2 - 10.1145/3533271.3561692
DO - 10.1145/3533271.3561692
M3 - Conference contribution
AN - SCOPUS:85142502536
T3 - Proceedings of the 3rd ACM International Conference on AI in Finance, ICAIF 2022
SP - 334
EP - 342
BT - Proceedings of the 3rd ACM International Conference on AI in Finance, ICAIF 2022
PB - Association for Computing Machinery, Inc
T2 - 3rd ACM International Conference on AI in Finance, ICAIF 2022
Y2 - 2 November 2022 through 4 November 2022
ER -