
Dataset Information


Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets.


ABSTRACT: We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilize it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximize its trading return in this environment, we use Deep Dueling Double Q-learning with the APEX (asynchronous prioritized experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently from a concrete forecasting algorithm, we study the performance of our approach utilizing synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placing that outperforms a heuristic benchmark trading strategy having access to the same signal.
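
Below is a minimal Python sketch of the synthetic alpha signal construction mentioned in the abstract: a forward-looking mid-price return perturbed with noise of varying magnitude. The Gaussian noise model, the look-ahead horizon, and all function and variable names are illustrative assumptions for this sketch, not taken from the paper's implementation.

import numpy as np

def synthetic_alpha_signal(mid_prices, horizon=100, noise_std=0.5, seed=0):
    """Noisy directional forecast of the forward mid-price return (illustrative).

    mid_prices : mid-price series reconstructed from order book messages
    horizon    : look-ahead window (in book updates) for the forward return
    noise_std  : noise scale relative to the forward-return volatility;
                 larger values yield a less informative signal
    """
    rng = np.random.default_rng(seed)
    # Forward-looking log return over the chosen horizon (perfect foresight).
    fwd_return = np.log(mid_prices[horizon:]) - np.log(mid_prices[:-horizon])
    # Perturb with zero-mean Gaussian noise scaled to the return's own std (assumption).
    noise = rng.normal(0.0, noise_std * fwd_return.std(), size=fwd_return.shape)
    return fwd_return + noise

# Example: signals of decreasing quality on a simulated price path.
prices = 100.0 * np.exp(np.cumsum(1e-4 * np.random.default_rng(1).standard_normal(10_000)))
for sigma in (0.0, 0.5, 2.0):
    alpha = synthetic_alpha_signal(prices, noise_std=sigma)
    true_ret = np.log(prices[100:]) - np.log(prices[:-100])
    print(f"noise_std={sigma}: corr(signal, forward return) = {np.corrcoef(alpha, true_ret)[0, 1]:.2f}")

Sweeping noise_std from zero (perfect foresight) upwards produces signals of decreasing quality, which is how the abstract describes evaluating the RL agent independently of any concrete forecasting algorithm.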

SUBMITTER: Nagy P 

PROVIDER: S-EPMC10561243 | biostudies-literature | 2023

REPOSITORIES: biostudies-literature


Publications

Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets.

Peer Nagy, Jan-Peter Calliess, Stefan Zohren

Frontiers in Artificial Intelligence, 2023-09-25


Similar Datasets

| S-EPMC8255954 | biostudies-literature
| S-EPMC3635219 | biostudies-literature
| S-EPMC8961101 | biostudies-literature
| S-EPMC5533438 | biostudies-literature
| S-EPMC6296528 | biostudies-literature
| S-EPMC6899644 | biostudies-literature
| S-EPMC4226548 | biostudies-literature
| S-EPMC8093280 | biostudies-literature
| S-EPMC5333634 | biostudies-literature
| S-EPMC6710819 | biostudies-literature