Abstract
We analyze a fixed-point algorithm for reinforcement learning (RL) of optimal portfolio choice under mean-variance preferences in the setting of multivariate generalized autoregressive conditional heteroskedasticity (MGARCH) with a small penalty on trading. A numerical solution is obtained using a neural network (NN) architecture within a recursive RL loop. A fixed-point theorem proves that the NN approximation error has a big-O bound that we can reduce by increasing the number of NN parameters. The functional form of the trading penalty has a parameter ϵ > 0 that controls the magnitude of transaction costs. When ϵ is small, we can implement an NN algorithm based on the expansion of the solution in powers of ϵ. This expansion has a base term equal to a myopic solution with an explicit form, and a first-order correction term that we compute in the RL loop. Our expansion-based algorithm is stable, allows for fast computation, and outputs a solution that shows positive testing performance.
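A minimal conceptual sketch of this recursive structure is given below, under toy assumptions: the simulator `simulate_mgarch`, the myopic rule `myopic_policy`, the operator `fixed_point_operator`, and the network `correction_net` are illustrative placeholders, not the paper's actual model or operator. The sketch only shows the pattern described in the abstract: an NN correction term is refit to the image of its previous iterate under a contraction-type operator, and the policy is assembled as the explicit myopic base plus ϵ times the learned correction.

```python
# Hypothetical sketch of a fitted fixed-point loop with an epsilon-expansion ansatz.
# All model components below are toy stand-ins, not the paper's MGARCH specification.
import torch
import torch.nn as nn

torch.manual_seed(0)
d, eps = 3, 0.01                               # number of assets, trading-penalty scale

def simulate_mgarch(n):
    # Placeholder for an MGARCH return simulator (i.i.d. Gaussian stand-in).
    return 0.01 * torch.randn(n, d)

def myopic_policy(s):
    # Stand-in for the explicit zeroth-order (eps = 0) myopic mean-variance solution.
    return torch.full((s.shape[0], d), 1.0 / d)

def fixed_point_operator(s, s_next, corr_fn):
    # Toy contraction: shrinks the previous correction evaluated at successor states
    # toward a state-dependent term. NOT the operator derived in the paper.
    with torch.no_grad():
        return 0.5 * corr_fn(s_next) + 0.5 * (s - s.mean(dim=0))

correction_net = nn.Sequential(nn.Linear(d, 32), nn.Tanh(), nn.Linear(32, d))
opt = torch.optim.Adam(correction_net.parameters(), lr=1e-3)

for outer in range(30):                        # recursive RL / fixed-point iterations
    s, s_next = simulate_mgarch(512), simulate_mgarch(512)
    target = fixed_point_operator(s, s_next, correction_net)   # image of previous iterate
    for _ in range(50):                        # refit the NN to the new iterate
        loss = nn.functional.mse_loss(correction_net(s), target)
        opt.zero_grad(); loss.backward(); opt.step()

# First-order expansion: explicit myopic base plus eps-scaled learned correction.
test_states = simulate_mgarch(5)
weights = myopic_policy(test_states) + eps * correction_net(test_states)
print(weights)
```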
Original language | English (US) |
---|---|
Pages (from-to) | 16774-16792 |
Number of pages | 19 |
Journal | IEEE Access |
Volume | 11 |
State | Published - 2023 |
Keywords
- Heteroskedasticity
- MGARCH
- deep neural networks
- fixed-point algorithms
- reinforcement learning
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering
- Electrical and Electronic Engineering