
Deep Learning For Options Trading: An End-To-End Approach

This document presents a novel deep learning approach for options trading that directly learns optimal trading signals from market data, bypassing the need for traditional pricing models and assumptions about market dynamics. The authors demonstrate that their end-to-end framework significantly outperforms existing rules-based strategies in managing a portfolio of options, particularly when incorporating turnover regularization to enhance performance amidst high transaction costs. The study utilizes a comprehensive dataset of S&P 100 equity options and provides evidence of improved risk-adjusted returns over a decade of backtesting.


Deep Learning for Options Trading: An End-To-End Approach
Wee Ling Tan∗ (weeling@robots.ox.ac.uk), Stephen Roberts (sjrob@robots.ox.ac.uk), Stefan Zohren (stefan.zohren@eng.ox.ac.uk)
Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom
arXiv:2407.21791v1 [q-fin.PM] 31 Jul 2024

∗ Corresponding Author

ABSTRACT
We introduce a novel approach to options trading strategies using a highly scalable and data-driven machine learning algorithm. In contrast to traditional approaches that often require specifications of underlying market dynamics or assumptions on an option pricing model, our models depart fundamentally from the need for these prerequisites, directly learning non-trivial mappings from market data to optimal trading signals. Backtesting on more than a decade of option contracts for equities listed on the S&P 100, we demonstrate that deep learning models trained according to our end-to-end approach exhibit significant improvements in risk-adjusted performance over existing rules-based trading strategies. We find that incorporating turnover regularization into the models leads to further performance enhancements at prohibitively high levels of transaction costs.

KEYWORDS
Options, Derivatives, Trading Strategies, Machine Learning, Momentum, Mean-Reversion

1 INTRODUCTION
Options form a class of derivatives known as contingent claims, granting one of the counterparties the right to transact an underlying asset at a certain time in the future and at a specific price. For instance, the buyer of a call (put) stock option pays a premium to the seller for the right to buy (sell) the underlying stock at a specified strike price by a specified expiration date. Unlike in cash markets that typically exhibit linear payoffs, the ability to generate synthetic and highly non-linear exposures has made options an increasingly important tool among investors as trading instruments alongside other derivatives [4].

The options market has continued to grow significantly over the last decade. According to the Options Clearing Corporation, the average daily combined volume of US equity and non-equity options has steadily risen and roughly tripled over the past decade from about 16.3 million in 2013 to 44.2 million contracts in 2023 [8]. Given the increasing popularity of options trading by market participants conducted in modern electronic exchanges, there exists an unprecedented opportunity to engage in large-scale data analysis of options trading from the perspective of an active investor.

Classical option pricing models and their parametric variants, stemming from the seminal works of Black and Scholes [5] and Merton [25], regard options as redundant assets. While these approaches offer a tractable no-arbitrage pricing framework, they necessitate thorough specifications of underlying dynamics and require assumptions such as a frictionless market and the ability of market makers to perfectly hedge their exposures. Practically, these models are often vulnerable to model misspecification and are usually too simplistic to sufficiently account for empirically observed variations in option returns [6]. Recent works have challenged this framework, demonstrating that option prices are influenced by additional risks beyond the underlying asset's exposure [9, 13, 15], with several studies documenting the existence of mispricing in the options market [1, 12].

Despite the accelerating growth of the options market and the evidence of options mispricing, there is a surprising lack of research on a scalable machine learning-based strategy that is able to directly leverage the abundance of data to conduct options trading on behalf of an active investor. In this work, we address this gap by introducing a novel class of deep learning models capable of managing and trading a portfolio of options in a highly data-driven approach.

Previous research has conventionally approached the complexities of managing portfolios of options by developing methods for optimal hedging and accurately pricing options. Our models depart fundamentally from the need for these prerequisites, and are directly optimized to learn highly non-trivial mappings from observed data to optimal trading decisions based on risk-adjusted performance. The resulting end-to-end framework does not depend on specific market dynamics, and can be extended broadly across instruments where market data is available.

To illustrate our approach, we construct the framework by training end-to-end neural networks of different complexities and show that our models are able to outperform existing rules-based strategies in managing a portfolio of options, demonstrating strong risk-adjusted returns over a backtest period of over a decade. We subsequently extend our models to incorporate turnover regularization during optimization, leading to further performance enhancements in the presence of high transaction costs.

2 RELATED WORK
To place our work within the context of existing research on options, we briefly review the broad literature in three areas: replication and hedging, pricing and valuation, and predicting option returns. We subsequently demonstrate how our work contributes to existing systematic trading strategies within options markets.

2.1 Options Literature
Traditionally, options have been extensively studied due to their significance in both replication and hedging strategies. The works of [7] and [22] apply reinforcement learning techniques towards approximating policies for optimal hedging in the presence of trading costs. While these works are pertinent to a product or liquidity provider needing to accurately hedge its exposures to books of options and other complex derivatives, our analysis fundamentally differs in its focus on options trading from the standpoint of an active investor seeking to profit from an options trading strategy and whose main objective is not predominantly about neutralizing exposures. Furthermore, hedging techniques which are based on reinforcement learning often require generating or simulating possible market paths to arrive at an optimal trading strategy. In contrast, the solution we offer does not require any assumption or simulation of market processes, and scales with available historical data and compute.

There has also been significant emphasis placed on the need to accurately price or value an option in order to facilitate optimal hedging and trading, with notable contributions from the classical Black-Scholes-Merton option pricing model for European-style options [5, 25]. Non-parametric models have also been developed, with [18] and [19] using neural networks to approximate the market's option pricing function. Conversely, our framework focuses on automatically extracting features and making trading decisions directly from available data, effectively circumventing the need for engineering an option pricing or valuation model.

Several studies have also employed machine learning for predicting option returns, often framing the trading strategy as a standard regression problem and subsequently determining the direction of price movements based on forecasted returns. Bali et al. [1] use both linear and nonlinear models and ex-ante option-based and stock-based characteristics to predict monthly returns of delta-hedged options, demonstrating high out-of-sample forecast accuracy. However, in the case of a trend-following strategy, [24] shows that models that accurately predict returns do not necessarily guarantee a positive or superior strategy performance, given that the overall profitability of trading strategies is influenced by other factors including the distribution of returns, position sizes and the presence of risk adjustments like volatility targeting [14]. Taking this into consideration, our work effectively integrates both trend prediction and optimal position sizing simultaneously within a single end-to-end function, eliminating the need to forecast option returns in making trading decisions.

2.2 Systematic Strategies and Risk Premia in Options Markets
Our work contributes to a series of research documenting systematic strategies originating from studies demonstrating additional risk factors [6, 9, 13, 15, 34], and the existence of mispricing [1, 12] within the options market. Considering the broad range of possible risk factors and strategies, we concentrate our analysis on systematic trend-based strategies. Given their simplicity of being based primarily on historical price trends and returns, and unlike volatility-based strategies, trend-based strategies often do not require additional assumptions on any underlying option pricing model or specifications of market dynamics.

2.3 Momentum, Mean-Reversion and Applications of Machine Learning
Trend-based strategies consist primarily of momentum and mean-reversion strategies. Momentum strategies operate on the principle that asset returns exhibit a tendency to persist in the same direction and aim to trade in the direction of the trend [20, 26]. On the other hand, mean-reversion strategies adopt an opposite and contrarian view, taking on positions that bet on the eventual breakdown and correction of overextended trends [11, 29]. Recently, machine learning models have been increasingly utilized in trend-based trading strategies [24, 27, 28, 32, 35, 36].

While momentum strategies have been extensively documented in a range of asset classes from cash equities [30] to futures markets [26], these strategies have received relatively less attention in options markets. Heston et al. [15] document strong evidence of the presence of momentum in options markets and propose a series of rules-based momentum strategies. However, these rules-based strategies often require explicit specifications for the trading rule, often with insufficient evidence to justify selecting a certain rule over another. In light of these observations, we employ deep neural networks to automatically learn risk-adjusted trading rules in a data-driven approach, and provide a thorough comparison between these trend-based strategies.

3 OVERVIEW OF DATASET
Our dataset consists of option contracts sourced from the OptionMetrics Ivy DB database, comprising end-of-day bid-ask prices, implied volatility and Greeks of individual options. We independently verify the accuracy of the provided implied volatility and Greeks using a binomial tree model of Cox et al. [10] only for the purpose of initial delta hedging (refer to Section 4), but otherwise we do not assume any underlying option pricing model throughout this work. We focus our analysis on option contracts of equities listed on the S&P 100 Index, as these companies span a wide range of sectors, represent major large-cap optionable companies in the US market, and are associated with higher option liquidity. We obtain underlying stock prices from the Center for Research in Security Prices (CRSP), which we use for computing option moneyness and accounting for corporate actions such as stock splits. We perform backtesting with the most recent market data (as of this work) from 2010 to 2023, which notably includes the market selloff following the COVID-19 pandemic.

We impose a series of data filters established in the literature to ensure the consistency of our analysis. We consider only standard monthly option contracts that expire on the third Friday of the month and exclude any special settlements due to corporate actions. We then exclude options that contain price observations that breach American-style option bounds, that have a bid price of zero, or where the ask price is smaller than or equal to the bid price, and require options to have a positive open interest on the day of portfolio formation. In addition, we disregard options that contain one or more missing observations between the day of portfolio formation and expiration.
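As a rough illustration of the quote-level filters described above, the following sketch applies them to a single end-of-day record. The `Quote` class and its field names are hypothetical and do not reflect the OptionMetrics schema; the checks for American-style price bounds and for missing observations between formation and expiration are omitted because they need pricing inputs and a full price history.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """One end-of-day option quote (illustrative field names, not a vendor schema)."""
    bid: float
    ask: float
    open_interest: int
    third_friday_expiry: bool   # standard monthly contract
    special_settlement: bool    # adjusted for a corporate action


def passes_filters(q: Quote, on_formation_day: bool) -> bool:
    """Return True if the quote survives the simple filters listed in the text."""
    if not q.third_friday_expiry or q.special_settlement:
        return False            # keep only standard monthly contracts
    if q.bid <= 0.0:
        return False            # zero bid
    if q.ask <= q.bid:
        return False            # ask smaller than or equal to bid
    if on_formation_day and q.open_interest <= 0:
        return False            # open interest must be positive at formation
    return True
```

A contract would then be kept only if every one of its daily quotes passes, which is one plausible reading of "options that contain price observations" in the paragraph above.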

On the expiration day of each month and for each stock, we form portfolios of static delta-neutral straddle options by selecting a pair of call and put contracts with identical strike prices that are closest to at-the-money (ATM) and expire in the following month. Since it is usually not possible to select options with moneyness exactly equal to 1.0 (moneyness of call = $S/K$, put = $K/S$, where $K$ = strike price, $S$ = stock price), we select the pair of options that are closest to ATM within a moneyness range of 0.95 to 1.05.

Our resulting universe consists of 29984 options that are traded, with a total of 603068 daily returns observations throughout the backtest period. The returns of straddle options have positive means (1.41% monthly), large standard deviations (90.85% monthly), and significant positive skewness as indicated by a low median (-15.61% monthly). We provide an in-depth explanation on portfolio formation and returns computation in Section 4.

4 SYSTEMATIC OPTIONS TRADING STRATEGIES
Let $i = 1, 2, \cdots, N_t$ denote individual underlying stocks. For a given portfolio of (1-month, ATM, static delta-neutral) straddle options of these stocks that is rebalanced daily, the overall returns of a strategy that equally diversifies over $N_t$ straddles at day $t$ can be expressed as follows:

$$ r^{\text{STRATEGY}}_{t,t+1} = \frac{1}{N_t} \sum_{i=1}^{N_t} X_t^{(i,\text{straddle})} \left( \frac{\sigma_{\text{tgt}}}{\sigma_t^{(i,\text{straddle})}} \right) r_{t,t+1}^{(i,\text{straddle})} \tag{1} $$

whereby

$$ r_{t,t+1}^{(i,\text{straddle})} = \frac{p_{t+1}^{(i,\text{straddle})} - p_t^{(i,\text{straddle})}}{p_t^{(i,\text{straddle})}} \tag{2} $$

$$ p_t^{(i,\text{straddle})} = w_{\text{norm}}^{(i,\text{call})} \, p_t^{(i,\text{call})} + w_{\text{norm}}^{(i,\text{put})} \, p_t^{(i,\text{put})} $$

$$ w_{\text{norm}}^{(i,\text{call})} = \frac{w^{(i,\text{call})}}{w^{(i,\text{call})} + w^{(i,\text{put})}}, \qquad w_{\text{norm}}^{(i,\text{put})} = 1 - w_{\text{norm}}^{(i,\text{call})} $$

$$ w^{(i,\text{call})} = -\Delta_0^{(i,\text{put})}, \qquad w^{(i,\text{put})} = \Delta_0^{(i,\text{call})} $$

On the day of portfolio formation ($t = 0$), we construct static delta-neutral straddle options by holding the call and put options with weights $-\Delta_0^{(i,\text{put})}$ and $\Delta_0^{(i,\text{call})}$ respectively, with $\Delta_0$ representing initial deltas. We subsequently normalize both weights to sum to one, resulting in a respective weightage of the call and put options that is generally close to 50-50. In evaluating the price of the straddle, we take $p_t^{(i,\text{call})}$ and $p_t^{(i,\text{put})}$ to be the bid-ask midpoints of the call and put options.

Consistent with other works [13, 15], we focus on static delta-neutral straddles as these instruments are on average invariant to movements in the underlying. [33] demonstrate that performing one-time delta hedging at initiation neutralizes the overall directional risks associated with an option by about 70%. As such, we opt for delta hedging at initiation and following similar works, we disregard the early exercise premium embedded in the American-style options and therefore ignore the possibility of early assignment of short options.

Here, $X_t^{(i,\text{straddle})} \in [-1, 1]$ denotes the trading signal or position for the straddle option of stock $i$ at day $t$, and $r_{t,t+1}^{(i,\text{straddle})}$ is the realized returns of the straddle from $t$ to $t + 1$. We find evidence that straddle options in the cross-section of individual constituents of the S&P 100 exhibit different levels of volatility using a Levene's test at a significance level of 1%. Given these differences, we include volatility targeting at the level of individual straddle options to scale the realized returns $r_{t,t+1}^{(i,\text{straddle})}$ by their volatility in order to target equal assignments of risk. We set the annualized volatility target $\sigma_{\text{tgt}}$ to be 15% and estimate the ex-ante volatility $\sigma_t^{(i,\text{straddle})}$ with a 20-day exponentially weighted moving standard deviation of daily straddle returns.

All of the following trend-based benchmarks which we incorporate in our work adhere to this general framework and are concerned with constructing an accurate trading signal $X_t^{(i,\text{straddle})}$:

Long Only (Short Only). This strategy takes a maximum long (short) position $X_t^{(i,\text{straddle})} = 1$ or $-1$ for all straddle options in the portfolio. Since the performance of a short only strategy is the exact opposite of a long only strategy, we focus only on the long only strategy.

TSMOM (TSMR). Following the time-series momentum strategy (TSMOM) of Moskowitz et al. [26], we adopt the strategy with a monthly horizon. The position taken for a straddle option is based on the sign of the option's returns over the past 20 days: $X_t^{(i,\text{straddle})} = \text{sgn}(r_{t-20,t}^{(i,\text{straddle})})$. In the time-series mean reversion strategy (TSMR), we modify TSMOM to take on a negative loading on past returns, where $X_t^{(i,\text{straddle})} = -\text{sgn}(r_{t-20,t}^{(i,\text{straddle})})$. This strategy takes a contrarian approach by taking on a long (short) position for straddles with negative (positive) returns over the past 20 days.

MACD (MACDMR). We use volatility normalised moving average convergence divergence (MACD) indicators based on Baz et al. [2] in place of the sign of returns for estimating the trading signal:

$$ \text{MACD}(i, t, S, L) = m(i, t, S) - m(i, t, L) $$

$$ \text{MACD}_{\text{norm}}(i, t, S, L) = \frac{\text{MACD}(i, t, S, L)}{\text{std}(p_{t-5:t})} $$

$$ Y_t^{(i,\text{straddle})} = \frac{\text{MACD}_{\text{norm}}(i, t, S, L)}{\text{std}(\text{MACD}_{\text{norm}}(i, t-20:t, S, L))} $$

$$ X_t^{(i,\text{straddle})} = \frac{1}{3} \sum_{k=1}^{3} \phi\left( Y_t^{(i,\text{straddle})}(S_k, L_k) \right) \tag{3} $$

where $\text{MACD}(i, t, S, L)$ denotes the MACD value of the straddle option of stock $i$ at day $t$ with a short time scale $S$ and long time scale $L$. $m(i, t, j)$ is defined as the exponentially weighted moving average of the straddle's prices at day $t$, with a time scale $j$ corresponding to a half-life of $HL = \log(0.5)/\log(1 - 1/j)$. We combine MACD signals in an equally weighted sum over multiple short and long time scales $S_k \in \{2, 4, 8\}$, $L_k \in \{8, 16, 32\}$ to construct a
position $X_t^{(i,\text{straddle})}$ where $\phi(y) = \frac{y \exp(-y^2/4)}{0.89}$. Likewise, we consider both the momentum $X_t^{(i,\text{straddle})}$ (MACD) and mean reversion $-X_t^{(i,\text{straddle})}$ (MACDMR) strategies. We refer the reader to [2] for further details on the strategy implementation.

TSHestonMOM (TSHestonMR). Following Heston et al. [15], we consider a strategy that constructs the trading signal for the straddle option of stock $i$ on the day of portfolio formation ($t = 0$), holding the position unchanged to expiry:

$$ Y_{0:t}^{(i,\text{straddle})} = \bar{r}_{0-nM,0}^{(i,\text{straddle})} \tag{4} $$

$$ X_{0:t}^{(i,\text{straddle})} = \text{sgn}(Y_{0:t}^{(i,\text{straddle})}) \tag{5} $$

where $\bar{r}_{0-nM,0}^{(i,\text{straddle})}$ is the average return of the series of (1-month, ATM, static delta-neutral) straddle options for stock $i$ over a lookback period of $n$ months from the day of portfolio formation. A key distinction between TSMOM and TSHestonMOM is that the momentum signals identified in the latter strategy are more accurately referred to as cross-serial correlations, in that the options used to compute the $n$-period average returns $\bar{r}_{0-nM,0}^{(i,\text{straddle})}$ are different from those used to construct the trading signal for the options to be traded following $t = 0$. In other words, for TSHestonMOM, all types of momentum signals involve past returns of a set of options predicting the future returns on an entirely new set. Similar to [15], we consider multiple strategies based on average monthly straddle returns over lookback periods of $n = 1, 3, 6, 12$ months, corresponding to monthly, quarterly, semiannual and annual returns. We again consider both the momentum $X_{0:t}^{(i,\text{straddle})}$ (TSHestonMOM) and mean reversion $-X_{0:t}^{(i,\text{straddle})}$ (TSHestonMR) strategies.

CSHestonMOM (CSHestonMR). Based on the long-short approach of [15] and [20], we implement a cross-sectional momentum strategy that scores and ranks a stock based on its average straddle returns computed as per TSHestonMOM. On the day of portfolio formation, the strategy utilizes a high-minus-low decile portfolio, taking a maximum long and short position for the top and bottom 10% of ranked stocks respectively and holding the positions to expiry. We first calculate raw momentum scores $Y_{0:t}^{(i,\text{straddle})} = \bar{r}_{0-nM,0}^{(i,\text{straddle})}$ according to Equation (4). Then, we rank stocks by their raw momentum scores:

$$ X_{0:t}^{(i,\text{straddle})} \in \{+1, -1, 0\} \tag{6} $$

where $X_{0:t}^{(i,\text{straddle})} = +1$ for stocks ranked in the top 10%, $-1$ for stocks ranked in the bottom 10%, and 0 otherwise. We examine both the momentum $X_{0:t}^{(i,\text{straddle})}$ (CSHestonMOM) and mean reversion $-X_{0:t}^{(i,\text{straddle})}$ (CSHestonMR) strategies.

5 DEEP LEARNING FOR OPTIONS TRADING
5.1 General End-To-End Framework
We frame the problem of generating optimal trading decisions $X_t^{(i,\text{straddle})}$ for a portfolio of options with a model $f$ as an end-to-end framework. Given time $t$ and a straddle option of stock $i$, we have an input $\mathbf{u}_t^{(i,\text{straddle})} \in \mathbb{R}^d$ of option features. We learn a model $f$ with trainable parameters $\boldsymbol{\theta}$:

$$ X_t^{(i,\text{straddle})} = f(\mathbf{u}_t^{(i,\text{straddle})}; \boldsymbol{\theta}) \tag{7} $$

In this framework, trading signals $X_t^{(i,\text{straddle})}$ are directly computed using the point-in-time snapshot of the option's features, integrating both trend prediction and optimal position sizing within a single function $f$. Unlike standard supervised learning paradigms, a distinctive characteristic of our end-to-end framework is the unavailability of ground-truth labels for the optimal trading signal $X_t^{(i,\text{straddle})}$ of a given option at any point in time. This necessitates the learning of a non-trivial mapping from an option's features to optimal trading signals. We discuss the choice of architectures and optimization of these models in the following sections.

5.2 Network Architectures
Given the problem of learning a mapping from option features to trading signals, it is not immediately clear which choice of architecture would best suit an end-to-end model, warranting the need to consider multiple options. Taking this into account, we examine various choices of neural networks for $f$.

Linear. We begin with the elementary case of a neural network comprising a single fully connected layer:

$$ X_t^{(i,\text{straddle})} = g(\mathbf{W}^\top \mathbf{u}_t^{(i,\text{straddle})} + b) \tag{8} $$

where $\mathbf{W} \in \mathbb{R}^m$, $\mathbf{u}_t^{(i,\text{straddle})} := \mathbf{u}_{t-\tau+1:t}^{(i,\text{straddle})} \in \mathbb{R}^m$ with $m = \tau d$, $b \in \mathbb{R}$ and $g = \tanh$ is the activation function. The model computes a linear combination of the input features prior to the tanh activation function, which constrains the trading signal to the finite range $[-1, 1]$. To factor in the immediate temporal history of observations for making predictions, we concatenate features from the past $\tau = 5$ days from time $t$ into a single input vector.

Multilayer Perceptron (MLP). We add a hidden layer to the Linear model, enhancing the model's depth:

$$ X_t^{(i,\text{straddle})} = g\left[ \mathbf{W}^{[2]\top} \sigma\left( \mathbf{W}^{[1]\top} \mathbf{u}_t^{(i,\text{straddle})} + b^{[1]} \right) + b^{[2]} \right] \tag{9} $$

with $g = \sigma = \tanh$ corresponding to the activation functions of each layer.

Convolutional Neural Networks (CNN). Modified for time series data, CNNs have been designed to incorporate causal convolutions that utilize only past information for forecasting [3], maintaining the autoregressive ordering of temporal features. We consider a 1-D autoregressive CNN:

$$ \mathbf{h}_t^{(i,\text{straddle})} = P \, \sigma\left[ \mathbf{W}_c^{[2]} * \sigma\left( \mathbf{W}_c^{[1]} * \mathbf{u}_t^{(i,\text{straddle})} + \mathbf{b}_c^{[1]} \right) + \mathbf{b}_c^{[2]} \right] $$

$$ X_t^{(i,\text{straddle})} = g\left[ \mathbf{W}^{[2]\top} \sigma\left( \mathbf{W}^{[1]\top} \mathbf{h}_t^{(i,\text{straddle})} + b^{[1]} \right) + b^{[2]} \right] \tag{10} $$

where $\mathbf{W}_c^{[l]}$ represent convolutional kernels with bias terms $\mathbf{b}_c^{[l]}$ and activation functions $g = \sigma = \tanh$, and $*$ represents the causal convolution operator. Subsequently, we perform average pooling with $P$ prior to passing the activations $\mathbf{h}_t$ to a fully connected neural network.
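To make two of the building blocks above concrete, the following sketch implements the Linear model of Equation (8) and a single-channel causal 1-D convolution of the kind used in the CNN. It is a plain-Python illustration rather than the authors' implementation: the function names are hypothetical, and zero left-padding (so that the output keeps the input length) is an assumption, since the paper does not state its padding scheme.

```python
import math


def linear_signal(u, w, b):
    """Eq. (8): tanh of an affine map of the flattened feature window u."""
    return math.tanh(sum(wi * ui for wi, ui in zip(w, u)) + b)


def causal_conv1d(x, kernel, bias=0.0):
    """Causal 1-D convolution over a single channel.

    out[t] depends only on x[t-k+1], ..., x[t]; kernel[-1] multiplies x[t].
    The input is left-padded with zeros (an assumption) to preserve length.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    out = []
    for t in range(len(x)):
        window = padded[t:t + k]          # covers x[t-k+1 .. t]
        out.append(sum(kj * xj for kj, xj in zip(kernel, window)) + bias)
    return out
```

Because each output only sees current and past inputs, changing a later observation leaves all earlier outputs unchanged, which is the property the text refers to as maintaining the autoregressive ordering of temporal features.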

Long Short-term Memory (LSTM). Recurrent neural networks (RNNs) have traditionally found applications in sequence modelling and time series forecasting [23]. Given the sequential nature of our prediction task, it is natural to consider recurrent architectures. We implement a single layer LSTM model [16] that takes in an input sequence of option features $\mathbf{u}_t^{(i,\text{straddle})} := \mathbf{u}_{t-\tau+1:t}^{(i,\text{straddle})} \in \mathbb{R}^m$ with $m = \tau d$ where $\tau$ represents the length of a trajectory, and subdivide the time series into trajectories of $\tau = 20$ during backpropagation. We omit the technical equations for the LSTM architecture for brevity.

5.3 Training Details
5.3.1 Loss Function. To facilitate the learning of a non-trivial mapping from option features to optimal trading signals that effectively balances both risk and reward, we directly calibrate the models using the Sharpe ratio [31], a risk-adjusted performance metric. Given a set of contemporaneous option features and their respective trading signals $D_\Omega = \{(\mathbf{u}_t^{(i,\text{straddle})}, X_t^{(i,\text{straddle})} = f(\mathbf{u}_t^{(i,\text{straddle})}; \boldsymbol{\theta}))\}$ with $\Omega = \{(i, t) \mid i = 1, \cdots, N_t, \; t = 1, \cdots, T\}$ denoting all straddle-time pairs, we define the loss $\mathcal{L}_{\text{sharpe}}(\boldsymbol{\theta})$ over $D_\Omega$ as the negative annualized Sharpe ratio:

$$ \mathcal{L}_{\text{sharpe}}(\boldsymbol{\theta}) = - \frac{\frac{1}{|\Omega|} \sum_{\Omega} R_i(t) \times \sqrt{252}}{\sqrt{\frac{1}{|\Omega|} \sum_{\Omega} R_i(t)^2 - \left[ \frac{1}{|\Omega|} \sum_{\Omega} R_i(t) \right]^2}} \tag{11} $$

$$ R_i(t) = X_t^{(i,\text{straddle})} \left( \frac{\sigma_{\text{tgt}}}{\sigma_t^{(i,\text{straddle})}} \right) r_{t,t+1}^{(i,\text{straddle})} \tag{12} $$

5.3.2 Optimization. Within each in-sample window, we perform a train-validation split with the earlier 90% of data used for calibrating the models and the most recent 10% reserved for validation. To calibrate the models, we perform backpropagation using minibatch stochastic gradient descent with Adam [21], and trigger early stopping with a patience of 25 epochs based on the validation loss. In order to select optimal candidates for each machine learning model, we conduct hyperparameter optimization with 100 iterations of random search. We refer the reader to Appendix A for the detailed description of hyperparameter search ranges. Model calibration was performed on a server equipped with an AMD EPYC7713 CPU and multiple NVIDIA L40 GPUs.

6 PERFORMANCE EVALUATION
6.1 Backtest Details
Following an expanding window approach, we train all models with every block of 5 additional years. In each block, we fix the weights and hyperparameters of the trained models and evaluate the models out-of-sample in the following 5-year window. We perform model calibration over multiple seeded runs and present the aggregated out-of-sample results in Section 6.3.

6.2 Option Features
To construct an input of option features $\mathbf{u}_t^{(i,\text{straddle})} \in \mathbb{R}^d$ as described in Equation (7), we include a combination of the predictors used in the strategies outlined in Section 4:

I. Normalized Returns: we use $r_{t-k,t}^{(i,\text{straddle})} / (\sigma_t^{(i,\text{straddle})} \sqrt{k})$, representing straddle returns normalized by daily volatility estimates scaled to a time scale $k \in \{1, 5, 10, 15, 20\}$, which corresponds to daily, weekly, biweekly, triweekly and monthly returns.
II. MACD Indicators: we take volatility normalised MACD signals $Y_t^{(i,\text{straddle})}(S_k, L_k)$ from Equation (3) with short and long time scales $S_k \in \{2, 4, 8\}$ and $L_k \in \{8, 16, 32\}$.
III. Option Momentum Features: we expand our set of predictors to include the momentum features as defined in Equation (4), taking the average returns of straddle options for the stock over lookback periods of $n = 1, 3, 6, 12$ months.
IV. Core Option Features: to facilitate comparability with the trend-based benchmarks outlined in Section 4, and to focus on the predictive power of the features above, we maintain a parsimonious set of core option features which includes the log-moneyness (of both call and put options forming the straddle) and days to expiry (DTE, in years). Given that the moneyness and DTE of a straddle option change over time, these core contract features are necessary in identifying the option at particular stages of its lifespan. Distinctively, we exclude other features such as option implied volatility or sensitivity measures such as Greeks which would require making an assumption of an underlying option pricing model such as the Black-Scholes or the binomial model. Given our focus on delta-neutral straddles, we also exclude underlying stock characteristics and stock returns from our core set of features.

6.3 Results and Discussion
We use the following annualized metrics to evaluate the out-of-sample performance of all strategies:

I. Profitability Measures: Expected Returns (E[Returns]), Hit Rate
II. Risk Measures: Volatility (Vol.), Downside Deviation, Maximum Drawdown (MDD)
III. Performance Ratios: Sharpe, Sortino and Calmar Ratios, Average Profit over Loss (Ave. P / Ave. L)

We present the aggregated out-of-sample performance metrics of all strategies computed using the overall returns according to Equation (1). Firstly, we present the performance of all strategies from their raw signal outputs in Table 1. We then apply to all strategies (excluding unprofitable strategies) an additional layer of volatility scaling to target an annualized volatility of 15% at the portfolio level and report the performance in Table 2 and plot their cumulative returns in Figure 1. This adjustment at the portfolio level facilitates comparison between individual strategy returns in line with our 15% volatility target. For each Heston portfolio (TSHestonMOM, TSHestonMR, CSHestonMOM, CSHestonMR), we report results for the best performing lookback period for the sake of brevity. In this section, we report the performance of all strategies without factoring in transaction costs to evaluate their raw predictive ability. In Section 6.4, we include an analysis of the impact of transaction costs and the effect of turnover regularization.
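As a rough illustration of the training objective in Equations (11) and (12), the sketch below computes the volatility-scaled return of one straddle-time pair and the negative annualized Sharpe ratio over a pooled list of such returns. It is written in plain Python for readability and is not differentiable; an actual training loop would express the same arithmetic in an autodiff framework. The function names are hypothetical, and the sketch assumes `sigma` and `sigma_tgt` are quoted in the same units.

```python
import math

TRADING_DAYS = 252  # annualization factor used in Eq. (11)


def scaled_return(signal, realized_return, sigma, sigma_tgt=0.15):
    """Eq. (12): R_i(t) = X * (sigma_tgt / sigma) * r."""
    return signal * (sigma_tgt / sigma) * realized_return


def sharpe_loss(scaled_returns):
    """Eq. (11): negative annualized Sharpe ratio of the pooled returns R_i(t)."""
    n = len(scaled_returns)
    mean = sum(scaled_returns) / n
    variance = sum(r * r for r in scaled_returns) / n - mean * mean
    if variance <= 0.0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return -(mean * math.sqrt(TRADING_DAYS)) / math.sqrt(variance)
```

Since the loss is a ratio of mean to standard deviation, it is unchanged when every signal is multiplied by the same positive constant, so the tanh output range of the networks, rather than the loss itself, is what pins down the scale of the positions.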

Table 1: Performance Metrics – Raw Signal Outputs

Strategy      E[Return]  Vol.    Downside Deviation  MDD     Sharpe   Sortino  Calmar   Hit Rate  Ave. P / Ave. L
Benchmarks
Long Only     0.055      0.103   0.058               0.188   0.534    0.940    0.291    0.451     1.341
TSMOM         -0.076     0.091   0.077               0.504   -0.827   -0.982   -0.150   0.553     0.693
TSMR          0.057*     0.088   0.049               0.168   0.646    1.160    0.340    0.445     1.406*
MACD          -0.038     0.056   0.047               0.308   -0.671   -0.800   -0.123   0.559     0.697
MACDMR        0.029      0.054   0.030               0.106   0.529    0.945    0.271    0.443     1.388
TSHestonMOM   0.002      0.053   0.035               0.150   0.043    0.065    0.015    0.495     1.027
TSHestonMR    0.023      0.048   0.031               0.085   0.476    0.741    0.266    0.484     1.163
CSHestonMOM   -0.000     0.015   0.010               0.031   -0.005   -0.007   -0.002   0.507     0.973
CSHestonMR    0.009      0.015*  0.010*              0.028*  0.565    0.862    0.301    0.500     1.109
Deep Learning Models
Linear        0.021      0.018   0.014               0.036   1.258    1.672    0.754    0.561     1.073
MLP           0.016      0.017   0.013               0.034   0.954    1.262    0.504    0.580*    0.939
CNN           0.009      0.016   0.012               0.035   0.534    0.717    0.240    0.507     1.199
LSTM          0.026      0.019   0.014               0.031   1.399*   1.917*   0.850*   0.557     1.231
(* denotes the best performing strategy for each column)
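The columns of Table 1 can be computed from a series of periodic strategy returns along the lines of the sketch below. The exact conventions (annualization factor, sample versus population standard deviation, how drawdown is measured) are not stated in this section, so every such choice here is an assumption. The ratio definitions themselves are consistent with the table: for Long Only, 0.055 / 0.103 ≈ 0.534, which matches the reported Sharpe ratio.

```python
import numpy as np


def performance_metrics(returns, periods_per_year=252):
    """Summary statistics for a series of periodic strategy returns.

    Assumed conventions: daily data (252 periods per year), sample standard
    deviation, downside deviation measured around zero, and maximum drawdown
    taken on the compounded wealth curve.
    """
    r = np.asarray(returns, dtype=float)
    ann_return = r.mean() * periods_per_year
    ann_vol = r.std(ddof=1) * np.sqrt(periods_per_year)
    # Downside deviation: root-mean-square of the negative part of returns.
    downside = np.sqrt(np.mean(np.minimum(r, 0.0) ** 2) * periods_per_year)
    # Maximum drawdown relative to the running peak, starting from wealth 1.
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + r)))
    mdd = np.max(1.0 - wealth / np.maximum.accumulate(wealth))
    gains, losses = r[r > 0], r[r < 0]
    return {
        "expected_return": ann_return,
        "vol": ann_vol,
        "downside_deviation": downside,
        "mdd": mdd,
        "sharpe": ann_return / ann_vol,
        "sortino": ann_return / downside,
        "calmar": ann_return / mdd,
        "hit_rate": np.mean(r > 0),
        "avg_profit_over_avg_loss": gains.mean() / abs(losses.mean()),
    }


# Toy return series, for illustration only.
m = performance_metrics([0.01, -0.02, 0.03, -0.01, 0.02])
```

The function assumes the series contains at least one gain and one loss; a production version would need to guard against empty `gains` or `losses` and a zero drawdown.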

Table 2: Performance Metrics – Rescaled to Target Volatility

Strategy      E[Return]  Vol.    Downside Deviation  MDD     Sharpe   Sortino  Calmar   Hit Rate  Ave. P / Ave. L
Benchmarks
Long Only     0.112      0.160   0.088               0.292   0.697    1.275    0.383    0.451     1.384
TSMR          0.123      0.161   0.087*              0.254   0.762    1.414    0.484    0.445     1.441*
MACDMR        0.106      0.162   0.088               0.264   0.655    1.207    0.402    0.443     1.424
TSHestonMR    0.098      0.157*  0.100               0.191*  0.626    0.984    0.514    0.484     1.193
CSHestonMR    0.091      0.159   0.104               0.318   0.573    0.878    0.286    0.500     1.112
Deep Learning Models
Linear        0.245      0.190   0.146               0.276   1.290    1.690    0.917    0.561     1.094
MLP           0.167      0.186   0.141               0.344   0.895    1.199    0.523    0.580*    0.929
CNN           0.119      0.217   0.163               0.435   0.551    0.736    0.284    0.507     1.216
LSTM          0.266*     0.200   0.144               0.274   1.329*   1.862*   0.974*   0.557     1.212
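The portfolio-level rescaling behind Table 2 can be illustrated with the sketch below, which scales each period's return by the ratio of the 15% target to an ex-ante volatility estimate. The estimator used here (a trailing 60-period sample standard deviation that excludes the current period) is an assumption made for illustration; this section of the paper does not specify how the authors estimate portfolio volatility.

```python
import numpy as np


def rescale_to_target_vol(returns, target_vol=0.15, window=60, periods_per_year=252):
    """Scale portfolio returns towards an annualized volatility target.

    The volatility estimate for period t uses only returns from the
    `window` periods before t, so the scaling factor is known in advance.
    The first `window` entries are left as NaN (no estimate available).
    """
    r = np.asarray(returns, dtype=float)
    scaled = np.full_like(r, np.nan)
    for t in range(window, len(r)):
        ex_ante_vol = r[t - window:t].std(ddof=1) * np.sqrt(periods_per_year)
        if ex_ante_vol > 0:
            scaled[t] = r[t] * target_vol / ex_ante_vol
    return scaled


# Synthetic low-volatility returns, for illustration only.
rng = np.random.default_rng(0)
raw = rng.normal(loc=0.0, scale=0.001, size=2000)
scaled = rescale_to_target_vol(raw)
realized_vol = np.nanstd(scaled, ddof=1) * np.sqrt(252)
print(f"realized annualized volatility: {realized_vol:.3f}")
```

Because the scaling factor is estimated with noise, realized volatility lands near the target rather than exactly on it, which is also visible in Table 2, where realized volatilities range from about 0.16 to 0.22.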

Table 3: Impact of Transactions Costs on Sharpe Ratio – Rescaled to Target Volatility

Transaction Costs (bps)  0.0      1.0      2.0      3.0      5.0      10.0     20.0     50.0
Benchmarks
Long Only                0.697    0.695    0.692    0.690    0.685    0.673    0.650    0.578
TSMR                     0.762    0.757    0.752    0.747    0.737    0.713    0.663    0.514
MACDMR                   0.655    0.646    0.638    0.629    0.612    0.568    0.481    0.219
TSHestonMR               0.626    0.621    0.615    0.610    0.600    0.574    0.522    0.367
CSHestonMR               0.573    0.570    0.567    0.563    0.557    0.541    0.508    0.410
Deep Learning Models
LSTM                     1.329*   1.310*   1.291*   1.272*   1.235*   1.140    0.952    0.388
LSTM + TC Reg.           1.282    1.270    1.259    1.247    1.223    1.164*   1.045*   0.689*

[Figure 1: Cumulative Returns - Rescaled to Target Volatility. Y-axis: Cumulative Returns (Log Scale); x-axis: Date, 2015 to 2023. Series: Long Only, Linear, MLP, CNN, LSTM, TSMR, MACDMR, TSHestonMR, CSHestonMR.]

From Table 1, we find that the Long Only straddle portfolio was a profitable strategy over the backtest period. We find this observation interesting, running contrary to [17] who show that retail and institutional investors executing short volatility strategies tend to perform well. Furthermore, a Long Only options portfolio would typically benefit from limited downside exposures as opposed to Short Only. Turning our attention to trend-based strategies, we observe that mean-reversion portfolios (TSMR, MACDMR, TSHestonMR, CSHestonMR) exhibit positive performances compared to their opposite momentum counterparts (TSMOM, MACD, TSHestonMOM, CSHestonMOM), which were generally unprofitable over the backtest period. We note that both TSMR and CSHestonMR exhibited only slight performance improvements over Long Only. We obtain similar results for the Heston portfolios as reported in [15], who document significant reversals in option returns at short-term horizons, and our best performing Heston models correspond mostly with strategies adopting the shortest $n = 1$ month lookback period. Moving on to our deep learning models, apart from the CNN, we observe that the Linear, MLP and LSTM exhibit a clear disparity in performance above all other strategies as seen from their higher performance ratios.

Referring to Table 2, with the implementation of volatility targeting at the portfolio level, we observe modest improvements in performance across all profitable benchmarks. Portfolio volatility targeting resulted in minimal changes to the deep learning models, allowing them to retain a large gap in their performance ratios over the benchmarks. In particular, we see that the Linear and LSTM models exhibited the best performances, with Sharpe ratios of 1.290 and 1.329 respectively, roughly twice those of the benchmarks. We note that the simplest Linear model was able to outperform the MLP, most likely due to our introduction of an L1 regularization penalty only for the Linear model during training. Examining the period during the COVID-19 market selloff, we see clearly from Figure 1 that Long Only exhibited sharp gains during the COVID-19 market selloff at the start of 2020, demonstrating the profitability of long straddles during periods of high market volatility. While we observe that the deep learning models experienced brief drawdowns during the selloff, performance of the models swiftly recovered following the market rebound.

6.4 Transaction Costs and Turnover Regularization

Some of the key challenges of rebalancing an options portfolio include market microstructure considerations arising from market liquidity and bid-ask spreads, which can result in high transaction costs. To examine the impact of transaction costs on the profitability of all strategies, we first compute Sharpe ratios adjusted for transaction costs by taking into account turnover-adjusted returns:

$$\tilde{r}^{\text{STRATEGY}}_{t,t+1} = \frac{1}{N_t} \sum_{i=1}^{N_t} \left( R_i(t) - c \cdot \sigma_{\text{tgt}} \left| \frac{X_t^{(i,\text{straddle})}}{\sigma_t^{(i,\text{straddle})}} - \frac{X_{t-1}^{(i,\text{straddle})}}{\sigma_{t-1}^{(i,\text{straddle})}} \right| \right) \tag{13}$$

where $c$ represents a measure of average transaction costs in basis points. Focusing on the LSTM model, we observe from Table 3 that the model maintains superior risk-adjusted performance over the best performing benchmark, TSMR, up to transaction costs of $c = 20$ bps, deteriorating at higher transaction costs of $c = 50$ bps. Following the methodology detailed in [24, 32], we modify the training loss to utilize turnover-adjusted returns as defined in Equation (13), which in effect results in optimizing the Sharpe ratio while regularizing for the turnover generated by the trading signals of the LSTM. Based on Table 3, we see that turnover regularization further enhances the performance of the LSTM for high transaction costs of $c = 10$ to 50 bps, allowing the regularized model to outperform other strategies at prohibitively high levels of transaction costs.
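The turnover-adjusted return of Equation (13), and a Sharpe-ratio objective built on it, can be sketched in NumPy as follows. Several details are assumptions made for illustration: $R_i(t)$ is taken to be the volatility-scaled position return $\sigma_{\text{tgt}} \, X_t \, r_{t,t+1} / \sigma_t$ (its definition sits outside this section), the cost term uses the absolute change in scaled position, costs in basis points are converted with a factor of $10^{-4}$, and the function names are hypothetical. In actual training the same computation would be written in an automatic differentiation framework so the loss can be backpropagated.

```python
import numpy as np


def turnover_adjusted_returns(positions, asset_returns, asset_vols, cost_bps, target_vol=0.15):
    """Portfolio returns net of turnover costs, in the spirit of Equation (13).

    positions, asset_returns, asset_vols: arrays of shape (T, N), where row t
    holds X_t, r_{t,t+1} and sigma_t for each of the N straddles.
    Returns an array of length T - 1 (the first row has no previous position).
    """
    x = np.asarray(positions, dtype=float)
    r = np.asarray(asset_returns, dtype=float)
    sigma = np.asarray(asset_vols, dtype=float)
    scaled_pos = x / sigma                                   # X_t / sigma_t
    gross = target_vol * scaled_pos[1:] * r[1:]              # assumed form of R_i(t)
    turnover = np.abs(scaled_pos[1:] - scaled_pos[:-1])      # change in scaled position
    cost = cost_bps * 1e-4 * target_vol * turnover
    return (gross - cost).mean(axis=1)                       # average over the N assets


def negative_sharpe(portfolio_returns, periods_per_year=252):
    """Loss to minimize: the negative annualized Sharpe ratio."""
    r = np.asarray(portfolio_returns, dtype=float)
    return -np.sqrt(periods_per_year) * r.mean() / r.std()


# Toy example: one asset, constant volatility, position flipping every period.
T = 6
rets = np.full((T, 1), 0.01)
vols = np.full((T, 1), 0.2)
flipping = np.array([[1.0], [-1.0], [1.0], [-1.0], [1.0], [-1.0]])
constant = np.ones((T, 1))

no_cost = turnover_adjusted_returns(flipping, rets, vols, cost_bps=0.0)
with_cost = turnover_adjusted_returns(flipping, rets, vols, cost_bps=10.0)
held = turnover_adjusted_returns(constant, rets, vols, cost_bps=50.0)
```

In the toy example a position that never changes pays no cost regardless of `cost_bps`, while the flipping position pays the same charge every period. Minimizing `negative_sharpe` on cost-adjusted returns therefore penalizes signals that trade more than their gross returns justify, which is the regularizing effect described in the text.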

7 CONCLUSIONS
We present a general end-to-end framework for trading options using a highly data-driven machine learning algorithm, adopting the point of view of an active investor seeking to profit from options trading. Departing from conventional approaches that typically rely on specific market dynamics or option pricing models, we train end-to-end neural networks to directly learn mappings from options data to optimal trading positions, removing the need to simulate market processes, price options or predict option returns. Backtesting our approach on portfolios of delta-neutral equity options, our models demonstrate significant improvements in risk-adjusted performance when calibrated using the Sharpe ratio. Crucially, our framework is agnostic to specific underlying market assumptions, potentially allowing for further extensions to a broader set of derivatives or complex instruments where data is available.

ACKNOWLEDGMENTS
We would like to thank the Oxford-Man Institute of Quantitative Finance for providing compute resources. Wee Ling Tan thanks Bryan Lim, Martin Luk, Guillaume Andrieux and Hans Buehler for their insightful comments.

REFERENCES
[1] Turan G Bali, Heiner Beckmeyer, Mathis Moerke, and Florian Weigert. 2023. Option Return Predictability with Machine Learning and Big Data. The Review of Financial Studies 36, 9 (2023), 3548–3602. https://doi.org/10.1093/rfs/hhad017
[2] Jamil Baz, Nicolas Granger, Campbell R Harvey, Nicolas Le Roux, and Sandy Rattray. 2015. Dissecting Investment Strategies in the Cross Section and Time Series. SSRN 2695101 (2015).
[3] Mikolaj Binkowski, Gautier Marti, and Philippe Donnat. 2018. Autoregressive Convolutional Neural Networks for Asynchronous Time Series. In International Conference on Machine Learning. PMLR, 580–589.
[4] Fischer Black. 1975. Fact and Fantasy in the Use of Options. Financial Analysts Journal 31, 4 (1975), 36–41. https://doi.org/10.2469/faj.v31.n4.36
[5] Fischer Black and Myron Scholes. 1973. The Pricing of Options and Corporate Liabilities. Journal of Political Economy 81, 3 (1973), 637–654. http://www.jstor.org/stable/1831029
[6] Matthias Büchner and Bryan Kelly. 2022. A Factor Model for Option Returns. Journal of Financial Economics 143, 3 (2022), 1140–1161. https://doi.org/10.1016/j.jfineco.2021.12.007
[7] Hans Buehler, Lukas Gonon, Josef Teichmann, and Ben Wood. 2019. Deep Hedging. Quantitative Finance 19, 8 (2019), 1271–1291. https://doi.org/10.1080/14697688.2019.1571683
[8] Options Clearing Corporation. 2024. OCC - Historical Volume Statistics. https://www.theocc.com/market-data/market-data-reports/volume-and-open-interest/historical-volume-statistics.
[9] Joshua D Coval and Tyler Shumway. 2001. Expected Option Returns. The Journal of Finance 56, 3 (2001), 983–1009. https://doi.org/10.1111/0022-1082.00352
[10] John C Cox, Stephen A Ross, and Mark Rubinstein. 1979. Option Pricing: A Simplified Approach. Journal of Financial Economics 7, 3 (1979), 229–263. https://doi.org/10.1016/0304-405X(79)90015-1
[11] Werner FM De Bondt and Richard Thaler. 1985. Does the Stock Market Overreact? The Journal of Finance 40, 3 (1985), 793–805. https://doi.org/10.2307/2327804
[12] Assaf Eisdorfer, Ronnie Sadka, and Alexei Zhdanov. 2022. Maturity Driven Mispricing of Options. Journal of Financial and Quantitative Analysis 57, 2 (2022), 514–542. https://doi.org/10.1017/S002210902100003X
[13] Amit Goyal and Alessio Saretto. 2009. Cross-section of Option Returns and Volatility. Journal of Financial Economics 94, 2 (2009), 310–326. https://doi.org/10.1016/j.jfineco.2009.01.001
[14] Campbell R Harvey, Edward Hoyle, Russell Korgaonkar, Sandy Rattray, Matthew Sargaison, and Otto Van Hemert. 2018. The Impact of Volatility Targeting. The Journal of Portfolio Management 45, 1 (2018), 14–33. https://doi.org/10.3905/jpm.2018.45.1.014
[15] Steven L Heston, Christopher S Jones, Mehdi Khorram, Shuaiqi Li, and Haitao Mo. 2023. Option Momentum. The Journal of Finance 78, 6 (2023), 3141–3192. https://doi.org/10.1111/jofi.13279
[16] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Computation 9, 8 (1997), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
[17] Jianfeng Hu, Antonia Kirilova, Seongkyu Park, and Doojin Ryu. 2023. Who Profits from Trading Options? Management Science 70, 7 (2023), 4742–4761. https://doi.org/10.1287/mnsc.2023.4916
[18] James M Hutchinson, Andrew W Lo, and Tomaso Poggio. 1994. A Nonparametric Approach to Pricing and Hedging Derivative Securities Via Learning Networks. The Journal of Finance 49, 3 (1994), 851–889. https://doi.org/10.2307/2329209
[19] Codruț-Florin Ivașcu. 2021. Option Pricing using Machine Learning. Expert Systems with Applications 163 (2021), 113799. https://doi.org/10.1016/j.eswa.2020.113799
[20] Narasimhan Jegadeesh and Sheridan Titman. 1993. Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency. The Journal of Finance 48, 1 (1993), 65–91. https://doi.org/10.1111/j.1540-6261.1993.tb04702.x
[21] Diederik P Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv:1412.6980 (2014).
[22] Petter N Kolm and Gordon Ritter. 2019. Dynamic Replication and Hedging: A Reinforcement Learning Approach. The Journal of Financial Data Science 1, 1 (2019), 159–171. https://doi.org/10.3905/jfds.2019.1.1.159
[23] Bryan Lim and Stefan Zohren. 2021. Time Series Forecasting With Deep Learning: A Survey. Philosophical Transactions of the Royal Society A 379, 2194 (2021), 20200209. https://doi.org/10.1098/rsta.2020.0209
[24] Bryan Lim, Stefan Zohren, and Stephen Roberts. 2019. Enhancing Time-Series Momentum Strategies Using Deep Neural Networks. The Journal of Financial Data Science 1, 4 (2019), 19–38. https://doi.org/10.3905/jfds.2019.1.015
[25] Robert C Merton. 1973. Theory of Rational Option Pricing. The Bell Journal of Economics and Management Science 4, 1 (1973), 141–183. https://doi.org/10.2307/3003143
[26] Tobias J Moskowitz, Yao Hua Ooi, and Lasse Heje Pedersen. 2012. Time Series Momentum. Journal of Financial Economics 104, 2 (2012), 228–250. https://doi.org/10.1016/j.jfineco.2011.11.003
[27] Daniel Poh, Bryan Lim, Stefan Zohren, and Stephen Roberts. 2021. Building Cross-Sectional Systematic Strategies by Learning to Rank. The Journal of Financial Data Science 3, 2 (2021), 70–86. https://doi.org/10.3905/jfds.2021.1.060
[28] Daniel Poh, Bryan Lim, Stefan Zohren, and Stephen Roberts. 2022. Enhancing Cross-Sectional Currency Strategies by Context-Aware Learning to Rank with Self-Attention. The Journal of Financial Data Science 4, 3 (2022), 89–107. https://doi.org/10.3905/jfds.2022.1.099
[29] James M Poterba and Lawrence H Summers. 1988. Mean Reversion in Stock Prices: Evidence and Implications. Journal of Financial Economics 22, 1 (1988), 27–59. https://doi.org/10.1016/0304-405X(88)90021-9
[30] K Geert Rouwenhorst. 1998. International Momentum Strategies. The Journal of Finance 53, 1 (1998), 267–284.
[31] William F Sharpe. 1994. The Sharpe Ratio. The Journal of Portfolio Management 21, 1 (1994), 49–58. https://doi.org/10.3905/jpm.1994.409501
[32] Wee Ling Tan, Stephen Roberts, and Stefan Zohren. 2023. Spatio-Temporal Momentum: Jointly Learning Time-Series and Cross-Sectional Strategies. The Journal of Financial Data Science 5, 3 (2023), 107–129. https://doi.org/10.3905/jfds.2023.1.130
[33] Meng Tian and Liuren Wu. 2023. Limits of Arbitrage and Primary Risk-Taking in Derivative Securities. The Review of Asset Pricing Studies 13, 3 (2023), 405–439. https://doi.org/10.1093/rapstu/raad003
[34] Aurelio Vasquez. 2017. Equity Volatility Term Structures and the Cross Section of Option Returns. Journal of Financial and Quantitative Analysis 52, 6 (2017), 2727–2754. https://doi.org/10.1017/S002210901700076X
[35] Kieran Wood, Sven Giegerich, Stephen Roberts, and Stefan Zohren. 2023. Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture. arXiv:2112.08534, Risk (2023).
[36] Kieran Wood, Stephen Roberts, and Stefan Zohren. 2022. Slow Momentum with Fast Reversion: A Trading Strategy Using Deep Learning and Changepoint Detection. The Journal of Financial Data Science 4, 1 (2022), 111–129. https://doi.org/10.3905/jfds.2021.1.081

A HYPERPARAMETER OPTIMIZATION

Table 4: Hyperparameter Search Range

Hyperparameters     Search Grid
Minibatch Size      32, 64, 128, 256
Dropout Rate        0.1, 0.2, 0.3, 0.4, 0.5
Hidden Layer Size   5, 10, 20, 40, 80, 160
Learning Rate       10^-5, 10^-4, 10^-3, 10^-2, 10^-1, 10^0
Max Gradient Norm   10^-4, 10^-3, 10^-2, 10^-1, 10^0, 10^1
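The search range in Table 4 can be written down as a plain Python dictionary, as in the sketch below. The values are copied from the table; the key names, the `sample_config` helper, and the use of random sampling are assumptions made for illustration, since this section does not describe how the grid is traversed.

```python
import random

# Values taken from Table 4; the key names are illustrative.
SEARCH_GRID = {
    "minibatch_size": [32, 64, 128, 256],
    "dropout_rate": [0.1, 0.2, 0.3, 0.4, 0.5],
    "hidden_layer_size": [5, 10, 20, 40, 80, 160],
    "learning_rate": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0],
    "max_gradient_norm": [1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1],
}

# Size of the full Cartesian grid: 4 * 5 * 6 * 6 * 6.
n_configs = 1
for values in SEARCH_GRID.values():
    n_configs *= len(values)


def sample_config(rng):
    """Draw one configuration uniformly at random from the grid (assumed procedure)."""
    return {name: rng.choice(values) for name, values in SEARCH_GRID.items()}


rng = random.Random(0)
trial = sample_config(rng)
print(n_configs, trial)
```

With 4,320 combinations in the full grid, evaluating every configuration for every model and training block would be expensive, which is why a sampled search is a common choice for grids of this size.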
