Data Science For Financial Markets
Table of Contents
Introduction (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#introduction)
yfinance (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#yfinance)
Quantstats (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#quantstats)
PyPortfolioOpt (https://www.kaggle.com/code/lusfernandotorres/data-science-
for-financial-markets#pyportfolio)
TA (https://www.kaggle.com/code/lusfernandotorres/data-science-for-financial-
markets#talib)
Histograms (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#histograms)
Kurtosis (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#kurtosis)
Skewness (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#skewness)
Prior (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#prior)
Views (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#views)
Confidences (https://www.kaggle.com/code/lusfernandotorres/data-science-
for-financial-markets#confidences)
4 | Backtesting (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#backtesting)
Conclusion (https://www.kaggle.com/code/lusfernandotorres/data-science-for-
financial-markets#conclusion)
Introduction
Data Science is a rapidly growing field that combines the power of statistical and
computational techniques to extract valuable insights and knowledge from data. It brings
together multiple disciplines such as mathematics, statistics, computer science, and
domain-specific knowledge to create a multi-faceted approach to understanding complex
data patterns.
The goal of Data Science is to provide a complete picture of data and transform it into
actionable information that can inform business decisions, scientific breakthroughs, and
even public policy. With the increasing amount of data being generated every day, Data
Science is becoming an increasingly vital part of our data-driven world.
When it comes to financial markets, Data Science can be applied in various ways, such as:
1. Predictive Models: Data Science professionals can use historical data to create
predictive models that can identify trends and make predictions about future market
conditions.
2. Algorithmic Trading: The use of algorithms that execute buy and sell orders
autonomously, based on mathematical models through the analysis of price, volume, and
volatility, among many others.
3. Portfolio Optimization: Algorithms and other mathematical models can be used to
optimize portfolios, aiming for maximization of returns and risk minimization.
4. Fraud Detection: Data scientists can use machine learning algorithms to identify
fraudulent activities in financial transactions.
5. Risk Management: Data science can be used to quantify and manage various types
of financial risks, including market risk, credit risk, and operational risk.
6. Customer Analysis: Financial institutions can use data science to analyze customer
data and gain insights into customer behavior and preferences, which can be used to
improve customer engagement and retention.
In this notebook, I aim to demonstrate how Data Science, as well as Python, can be
powerful tools in extracting crucial insights from financial markets. I will demonstrate
how these tools can be leveraged to build and optimize portfolios, develop effective
trading strategies, and perform detailed stock analysis. This will showcase the versatility
and usefulness of Data Science and Python in the finance industry and provide a valuable
resource for those interested in utilizing these techniques to make informed investment
decisions.
Essential Libraries
While developing this notebook, we will use four essential libraries specifically designed
for handling financial data.
I will provide a brief introduction to each library and guide you through the steps
required to install them in any Python environment.
yfinance
yfinance is probably the most popular Python library for extracting data from financial
markets. It allows you to obtain and analyze historical market data from Yahoo! Finance.
It offers an easy-to-use API that lets users fetch data for publicly traded
companies, indices, ETFs, cryptocurrencies, and forex pairs.
yfinance also provides tools for adjusting the data for dividends and splits, as well as for
visualizing the data in different ways. Its simple interface and reliable data make it an
excellent tool for analyzing financial data, which explains why it is one of the most used
libraries among traders and investors alike.
You can copy the code cell below in any Python environment to install it.
In [1]:
# Installing yfinance
!pip install yfinance
Quantstats
Quantstats is a Python library used for quantitative financial analysis. This library
provides various tools to obtain financial data from different sources, conduct technical
and fundamental analyses, and create and test different investment strategies. It is also
possible to use visualization tools to analyze individual stocks and portfolios. It is a
simple and easy tool for any type of quantitative finance-oriented analysis, and that's why
it will be essential for this study.
In [2]:
# Installing Quantstats
!pip install quantstats
In the code cell below, you find how to install PyPortfolioOpt in your Python
environment.
In [3]:
# installing PyPortfolioOpt
!pip install pyportfolioopt
The TA (Technical Analysis) library is a powerful tool for conducting technical analysis
using Python. It provides a wide range of technical indicators, such as moving averages,
Bollinger bands, MACD, and the Relative Strength Index to analyze market trends,
momentum, and volatility.
The TA library is extremely easy to use and allows users to customize their analysis based
on their preferred indicators and parameters. With its extensive range of technical
analysis tools, it is a valuable resource for both traders and analysts looking to gain
valuable insights on market behavior.
Here's how you can install the TA library in your own Python environment:
In [4]:
# Installing the TA (Technical Analysis) library
!pip install ta
Now that you've had a brief introduction to the most essential financial libraries in this
notebook, we can move on to importing all the specific libraries we'll be using.
In [5]:
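# Importing Libraries
# Data handling and modeling (used in later cells as pd, np, and LinearRegression)
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression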
# Data visualization
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
# Financial data
import quantstats as qs
import ta
import yfinance as yf
Daily Returns
The first thing we're going to look at is the daily returns. A stock's daily return is the
percentage change in price over a single day. You calculate it by subtracting the previous
day's closing price from the current day's closing price, dividing the result by the
previous day's closing price, and multiplying it by 100.
For instance, if a stock closes at 100 dollars on Monday, and it closes at 102 dollars on
Tuesday, its daily return would be calculated as:
$$\frac{102 - 100}{100} \times 100 = 2\%$$
This shows that the stock increased in value by 2% over the course of one day. On the
other hand, if the stock had closed at 98 dollars on Tuesday, the daily return would be
calculated as:
$$\frac{98 - 100}{100} \times 100 = -2\%$$
Which means that the stock has decreased in value by 2% over the course of one day.
Daily returns are relevant for investors because they provide a quick way to check the
performance of a stock over a short period.
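As a minimal sketch of the calculation described above, assuming a small hypothetical series of closing prices (the values and the name `closes` are illustrative and not part of the notebook's data), pandas' `pct_change` gives the same result as the manual formula:

```python
import pandas as pd

# Hypothetical closing prices for three consecutive trading days
closes = pd.Series([100.0, 102.0, 99.96], index=["Mon", "Tue", "Wed"])

# Manual formula: (close_today - close_yesterday) / close_yesterday * 100
manual_pct = (closes - closes.shift(1)) / closes.shift(1) * 100

# pandas shortcut: pct_change() returns the same ratio as a decimal
daily_returns_pct = closes.pct_change() * 100

print(daily_returns_pct.round(2))
```

The first day has no previous close, so its return is missing (NaN). This is why return series usually start one observation after the price series.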
With Quantstats, it's possible to plot daily returns charts, which are graphical
representations of the daily percentage changes in stocks, allowing investors to visualize
the ups and downs of the stock's daily performance over time and extract information on
volatility and consistency of returns.
In [6]:
# Getting daily returns for 4 different US stocks in the same time window
aapl = qs.utils.download_returns('AAPL')
aapl = aapl.loc['2010-07-01':'2023-02-10']
tsla = qs.utils.download_returns('TSLA')
tsla = tsla.loc['2010-07-01':'2023-02-10']
dis = qs.utils.download_returns('DIS')
dis = dis.loc['2010-07-01':'2023-02-10']
amd = qs.utils.download_returns('AMD')
amd = amd.loc['2010-07-01':'2023-02-10']
We now have the daily returns from July 1st, 2010, to February 10th, 2023, for four
different US stocks from distinct industries: Apple, Tesla, The Walt Disney Company, and
AMD.
We can now plot the daily returns chart for each of them using Quantstats.
In [7]:
# Converting timezone
aapl.index = aapl.index.tz_convert(None)
tsla.index = tsla.index.tz_convert(None)
dis.index = dis.index.tz_convert(None)
amd.index = amd.index.tz_convert(None)
In [8]:
# Plotting Daily Returns for each stock
print('\n')
print('\nApple Daily Returns Plot:\n')
qs.plots.daily_returns(aapl)
print('\n')
print('\n')
print('\nTesla Inc. Daily Returns Plot:\n')
qs.plots.daily_returns(tsla)
print('\n')
print('\n')
print('\nThe Walt Disney Company Daily Returns Plot:\n')
qs.plots.daily_returns(dis)
print('\n')
print('\n')
print('\nAdvanced Micro Devices, Inc. Daily Returns Plot:\n')
qs.plots.daily_returns(amd)
[Plots: daily returns for Apple, Tesla, The Walt Disney Company, and Advanced Micro Devices]
On the other hand, Disney's and Apple's stocks seem more stable and predictable
investment options at first glance.
Cumulative Returns
To calculate a stock's cumulative return, the first thing to do is to determine the stock's
initial price and its final price at the end of the specified period. Then subtract the initial
price from the final price, add any dividends or other income received, and divide the
result by the initial price. This gives us the cumulative return as a decimal, which can be
multiplied by 100 to express it as a percentage.
It's important to note that cumulative return takes into account the effects of
compounding, meaning that any gains from a previous period are reinvested and
contribute to additional gains in future periods, which can result in a larger cumulative
return than the simple average of the individual returns over the specified period.
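The effect of compounding can be illustrated with a short sketch. The three daily returns below are hypothetical, and the approach (chaining `1 + r` with `cumprod`) is one common way to compound a return series, not necessarily the exact routine Quantstats uses internally:

```python
import pandas as pd

# Hypothetical daily returns as decimals: +10%, -5%, +20%
daily = pd.Series([0.10, -0.05, 0.20])

# Compounded cumulative return: grow 1 unit by each day's return in turn
cumulative = (1 + daily).cumprod() - 1

# Naive sum of the daily returns, which ignores compounding
simple_sum = daily.sum()

print(round(cumulative.iloc[-1], 4))  # compounded total
print(round(simple_sum, 4))           # uncompounded total
```

Here the compounded result is 25.4%, slightly above the 25% obtained by simply adding the three returns, because each gain or loss applies to the value left by the previous day.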
Below, we can see line charts displaying the cumulative return for each one of the stocks
we've downloaded since July, 2010.
In [9]:
# Plotting Cumulative Returns for each stock
print('\n')
print('\nApple Cumulative Returns Plot\n')
qs.plots.returns(aapl)
print('\n')
print('\n')
print('\nTesla Inc. Cumulative Returns Plot\n')
qs.plots.returns(tsla)
print('\n')
print('\n')
print('\nThe Walt Disney Company Cumulative Returns Plot\n')
qs.plots.returns(dis)
print('\n')
print('\n')
print('\nAdvanced Micro Devices, Inc. Cumulative Returns Plot\n')
qs.plots.returns(amd)
[Plots: cumulative returns for Apple, Tesla, The Walt Disney Company, and Advanced Micro Devices]
Of course, when analyzing stock data, we don't make an investment decision by looking only
at the cumulative returns. It's crucial to look at other indicators and evaluate the risks of
the investment. Besides, 650% returns are still significant, and in the stock market, slow
but steady growth can be just as valuable as explosive returns.
A variety of strategies must be taken into account in order to build a robust portfolio.
Histograms
Histograms of daily returns help investors identify patterns, such as the
range of daily returns of an asset over a certain period, which indicates its level of
stability and volatility.
In [10]:
# Plotting histograms for daily returns
print('\n')
print('\nApple Daily Returns Histogram')
qs.plots.histogram(aapl, resample = 'D')
print('\n')
print('\nTesla Inc. Daily Returns Histogram')
qs.plots.histogram(tsla, resample = 'D')
print('\n')
print('\nThe Walt Disney Company Daily Returns Histogram')
qs.plots.histogram(dis, resample = 'D')
print('\n')
print('\nAdvanced Micro Devices, Inc. Daily Returns Histogram')
qs.plots.histogram(amd, resample = 'D')
[Plots: daily returns histograms for Apple, Tesla, The Walt Disney Company, and Advanced Micro Devices]
Disney's stocks have more balanced returns with values ranging from -15% to 15%, while
most returns are closer to the mean.
Through histograms, we can extract some valuable statistics such as kurtosis and
skewness.
Kurtosis
A high kurtosis value for daily returns may indicate frequent fluctuations in price that
deviate significantly from the average returns of that investment, which can lead to
increased volatility and risk associated with the stock.
A kurtosis value above 3.0 defines a leptokurtic distribution, characterized by outliers and
more values that are distant from the average, which reflects in the histogram as
stretching of the horizontal axis. Stocks with a leptokurtic distribution are generally
associated with a higher level of risk but also offer the potential for higher returns due to
the substantial price movements that have occurred in the past.
In the image below, it's possible to see the difference between a negative kurtosis on the
left and a positive kurtosis on the right. The distribution on the left displays a lower
probability of extreme values and a lower concentration of values around the mean, while
the distribution on the right shows a higher concentration of values near the mean, but
also the existence (and thus a higher probability of occurrence) of extreme values.
Kurtosis measures the concentration of observations in the tails versus the center of a
distribution. In finance, a high level of excess kurtosis, or "tail risk," represents the
chance of a loss occurring as a result of a rare event. This type of risk is important for
investors to consider when making investment decisions, as it may impact the potential
returns and stability of a particular stock.
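One practical detail when reading kurtosis numbers is the scale being used. The sketch below uses synthetic data (a normal sample and a heavy-tailed Student-t sample, both assumptions for illustration) to show that pandas' `Series.kurt` reports excess kurtosis, for which a normal distribution scores about 0 instead of 3. If `qs.stats.kurtosis` relies on the same pandas method, which is an assumption worth checking in your installed version, its output should be compared against 0 and not against 3.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic "returns": one normal sample and one heavy-tailed Student-t sample
normal_returns = pd.Series(rng.normal(0.0, 0.01, 100_000))
fat_tail_returns = pd.Series(rng.standard_t(5, 100_000) * 0.01)

# pandas reports excess kurtosis (Fisher definition), so a normal sample is near 0
normal_excess = normal_returns.kurt()
fat_tail_excess = fat_tail_returns.kurt()

# Adding 3 converts excess kurtosis to the scale where a normal distribution is 3
print("normal:", round(normal_excess, 2), "->", round(normal_excess + 3, 2))
print("heavy-tailed:", round(fat_tail_excess, 2), "->", round(fat_tail_excess + 3, 2))
```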
In [11]:
# Using quantstats to measure kurtosis
print('\n')
print("Apple's kurtosis: ", qs.stats.kurtosis(aapl).round(2))
print('\n')
print("Tesla's kurtosis: ", qs.stats.kurtosis(tsla).round(2))
print('\n')
print("Walt Disney's kurtosis: ", qs.stats.kurtosis(dis).round(3))
print('\n')
print("Advanced Micro Devices' kurtosis: ", qs.stats.kurtosis(amd).round(3))
However, AMD has the highest kurtosis, with a value of 17.125, which indicates that AMD
is subject to an extremely high level of volatility and tail risk, with a large concentration of
extreme price movements. On the other hand, Disney has a kurtosis of 11.033, which is
still higher than a typical value for a normal distribution, but not as extreme as AMD's.
Skewness
Skewness is a metric that quantifies the asymmetry of returns. It reflects the shape of the
distribution and determines if it is symmetrical, skewed to the left, or skewed to the right.
Below, it is possible to see two different asymmetrical distributions. On the left, it shows
an example of a positively skewed distribution, with a long right tail, indicating a
substantial probability of extremely positive daily returns compared to a normal
distribution. On the other hand, a negatively skewed distribution would most likely
resemble the distribution on the right, with a long tail representing more frequency of
outliers on the negative side of returns.
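A common moment-based definition of skewness is given below, with N denoting the number of observations. It is offered as a reconstruction consistent with the symbols explained next, and not necessarily the exact expression originally shown in this notebook:

$$\text{Skewness} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i - \mu}{\sigma}\right)^{3}$$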
Where x represents the set of returns data, μ represents the mean of the returns, and σ
represents the standard deviation of the returns. This formula results in a single
numerical value that summarizes the skewness of returns.
In [12]:
# Measuring skewness with quantstats
print('\n')
print("Apple's skewness: ", qs.stats.skew(aapl).round(2))
print('\n')
print("Tesla's skewness: ", qs.stats.skew(tsla).round(2))
print('\n')
print("Walt Disney's skewness: ", qs.stats.skew(dis).round(3))
print('\n')
print("Advanced Micro Devices' skewness: ", qs.stats.skew(amd).round(3))
Apple, Tesla, and Disney are just slightly skewed, and Disney's slight skewness can be
seen by looking at the range of the x-axis of its histogram, where it is pretty much
balanced between -15% and 15%.
AMD stocks are strongly skewed, which can also be easily identified by looking at the
range between -20% and 50% in its histogram. AMD has a lot of outliers on the positive
tail, which could've been a good thing for those who bought its shares but it also suggests
higher volatility and risk to this investment.
Standard Deviation
Standard deviation is a widely used statistical metric that quantifies the variability of the
dataset. When applied to a stock's daily returns, it can indicate the risk level associated
with investing in that particular stock. A stock exhibiting high daily return volatility,
characterized by a high standard deviation, is considered riskier when compared to one
with low daily return volatility, represented by a low standard deviation.
$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^{2}}$$
Where x represents the set of returns data, x̄ is the mean of the returns data, and N is the
number of observations. Standard deviation enables investors to assess the risk level and
to compare the volatility of different stocks. For instance, if two assets have similar
average returns, but one has a higher standard deviation, it is usually considered a riskier
investment. Hence, standard deviation serves as a useful tool in helping investors to make
informed decisions regarding their investment choices and portfolio management.
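The comparison described above can be sketched with two hypothetical return series that share the same average but differ in dispersion. The helper `sample_std` is an illustrative name that implements the formula with the N - 1 denominator, and pandas' `Series.std` uses the same denominator by default:

```python
import math
import pandas as pd

# Hypothetical daily returns (decimals) for two assets with the same mean
calm = pd.Series([0.01, -0.01, 0.02, -0.02, 0.00])
wild = pd.Series([0.05, -0.05, 0.10, -0.10, 0.00])

def sample_std(x):
    """Sample standard deviation with the N - 1 denominator from the formula."""
    mean = x.mean()
    return math.sqrt(((x - mean) ** 2).sum() / (len(x) - 1))

# pandas' Series.std() uses the same N - 1 denominator by default (ddof=1)
print(round(sample_std(calm), 4), round(calm.std(), 4))
print(round(sample_std(wild), 4), round(wild.std(), 4))
```

Both assets average a 0% daily return, yet the second one has five times the standard deviation, so it would usually be treated as the riskier of the two.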
In [13]:
# Calculating Standard Deviations
print('\n')
print("Apple's Standard Deviation from 2010 to 2023: ", aapl.std().round(3))
print('\n')
print("\nTesla's Standard Deviation from 2010 to 2023: ", tsla.std().round(3))
print('\n')
print("\nDisney's Standard Deviation from 2010 to 2023: ", dis.std().round(3))
print('\n')
print("\nAMD's Standard Deviation from 2010 to 2023: ", amd.std().round(3))
Based on the values above, we can say that Apple and Disney are less volatile than Tesla
and AMD, suggesting that Apple and Disney are safer investment options, exhibiting
more stable price fluctuations in the market.
Pairplots and Correlation Matrix
Correlation analysis in the stock market allows for interesting investment strategies. A
widely known strategy in the market is called Long-Short, which is the act of buying
shares of a company, while selling shares of another company, believing that both assets
will have opposite directions in the market. That is, when one goes up, the other goes
down. To develop Long-Short strategies, investors rely on correlation analysis between
stocks.
Correlation analysis is not only useful for Long-Short strategies, but it's also crucial to
avoid systemic risk, which is described as the risk of the breakdown of an entire system
rather than simply the failure of individual parts. Put simply, if your portfolio holds
stocks that are highly correlated, or that all belong to the same industry, an event that
hits that industry can drag down all of your stocks at once and cause greater
financial losses.
Pairplots and correlation matrices are useful tools to visualize correlation among assets.
In the correlation matrix, values range between -1 and 1, where -1 represents a perfect
negative correlation and 1 represents a perfect positive correlation. Keep in mind that,
when assets are positively correlated, they tend to go up and down simultaneously in the
market, while the opposite is true for those that are negatively correlated.
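To make the range of correlation values concrete, here is a small sketch on synthetic data. The three "assets" are hypothetical: two are built to follow a shared market factor and one is built to move against it, so the matrix should show both positive and negative entries.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
market = rng.normal(0.0, 0.01, 1_000)  # shared synthetic "market" factor

# Hypothetical assets: two follow the market, one moves against it
returns = pd.DataFrame({
    "follows_a": market + rng.normal(0.0, 0.005, 1_000),
    "follows_b": market + rng.normal(0.0, 0.005, 1_000),
    "hedge": -market + rng.normal(0.0, 0.005, 1_000),
})

corr = returns.corr()  # Pearson correlation, every entry lies in [-1, 1]
print(corr.round(2))
```

The diagonal is always 1, since each series is perfectly correlated with itself, which is why heatmaps of correlation matrices often mask the diagonal and the redundant upper triangle.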
In [14]:
# Merging daily returns into one dataframe
merged_df = pd.concat([aapl, tsla, dis, amd], join = 'outer', axis = 1)
merged_df.columns = ['aapl', 'tsla', 'dis', 'amd']
merged_df # Displaying dataframe
Out[14]:
The dataframe above has dates serving as the index and each stock is represented as a
column, displaying their respective returns for each specific day. This dataframe will be
used to calculate the correlation between these stocks and to create a pairplot
visualization.
In [15]:
# Pairplots
sns.pairplot(merged_df, kind = 'reg')
plt.show()
In [16]:
# Correlation Matrix
corr = merged_df.corr()
mask = np.zeros_like(corr, dtype=bool)  # np.bool was removed in recent NumPy releases
mask[np.triu_indices_from(mask)] = True
sns.heatmap(corr, annot=True, mask = mask)
plt.show()
The strongest correlation among the assets above is between Disney and Apple. However,
a correlation of 0.42 is not a strong one.
It's important to note that there is no negative correlation among the assets above,
which indicates that none of them acts to limit losses. In the financial market, a hedge is
an investment position intended to offset potential losses by investing in assets that may
have a negative correlation with the others in a portfolio. Many investors buy gold to
serve as protection for riskier investments, such as stocks, and when the market as a
whole goes into a bear market, gold tends to increase in value, limiting potential
losses for the overall portfolio.
Beta and Alpha
Beta and Alpha are two key metrics used in finance to evaluate the performance of a stock
relative to the overall market. Beta is a measure of a stock's volatility compared to the
market. A Beta of 1 means that the stock is as volatile as the market, a Beta greater than 1
indicates higher volatility than the market, and a Beta less than 1 suggests lower volatility.
Alpha, on the other hand, is a measurement of a stock's excess return relative to its
expected performance based on its Beta. A positive Alpha indicates that a stock has
outperformed its expected performance based on its Beta, while a negative Alpha suggests
underperformance. By analyzing the Beta and Alpha values of stocks, investors can get a
better understanding of the risk and potential returns of the stock compared to the
market, and make informed investment decisions accordingly.
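A minimal sketch of how the two metrics relate to a regression line is shown below. It uses synthetic returns in which the "stock" is deliberately constructed with a beta of 1.5 and a small positive alpha, so those numbers are assumptions of the example and not estimates for any real company. Beta is the slope of the stock's returns against the market's returns, and alpha is the intercept.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic daily returns: the stock is built with beta 1.5 and alpha 0.0002
market = rng.normal(0.0005, 0.01, 5_000)
stock = 0.0002 + 1.5 * market + rng.normal(0.0, 0.005, 5_000)

# Beta is the regression slope: covariance with the market over market variance
beta = np.cov(stock, market, ddof=1)[0, 1] / np.var(market, ddof=1)

# Alpha is the intercept: the average return left after removing beta * market
alpha = stock.mean() - beta * market.mean()

# Cross-check with an ordinary least-squares line fit (slope, intercept)
slope, intercept = np.polyfit(market, stock, 1)

print(round(beta, 3), round(alpha, 5))
```

The covariance formula and the least-squares fit give the same slope and intercept, which is why a linear regression model can be used to extract both values.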
To determine Beta and Alpha, we require data from the SP500, which acts as the
benchmark, to fit a linear regression model between the stocks and the index. This will
enable us to extract the Beta and Alpha values of the stocks.
Out[17]:
Date
2010-07-01 04:00:00 -0.003240
2010-07-02 04:00:00 -0.004662
2010-07-06 04:00:00 0.005359
2010-07-07 04:00:00 0.031331
2010-07-08 04:00:00 0.009413
...
2023-02-06 05:00:00 -0.006140
2023-02-07 05:00:00 0.012873
2023-02-08 05:00:00 -0.011081
2023-02-09 05:00:00 -0.008830
2023-02-10 05:00:00 0.002195
Name: Close, Length: 3176, dtype: float64
In [18]:
# Removing indexes
sp500_no_index = sp500.reset_index(drop = True)
aapl_no_index = aapl.reset_index(drop = True)
tsla_no_index = tsla.reset_index(drop = True)
dis_no_index = dis.reset_index(drop = True)
amd_no_index = amd.reset_index(drop = True)
In [19]:
sp500_no_index # Daily returns for the SP500
Out[19]:
0 -0.003240
1 -0.004662
2 0.005359
3 0.031331
4 0.009413
...
3171 -0.006140
3172 0.012873
3173 -0.011081
3174 -0.008830
3175 0.002195
Name: Close, Length: 3176, dtype: float64
In [20]:
aapl_no_index # Daily returns for Apple stocks without index
Out[20]:
0 -0.012126
1 -0.006197
2 0.006844
3 0.040381
4 -0.002242
...
3171 -0.017929
3172 0.019245
3173 -0.017653
3174 -0.006912
3175 0.002456
Name: Close, Length: 3176, dtype: float64
We can use Scikit-Learn's LinearRegression model to extract Beta and Alpha for
the analyzed stocks.
In [21]:
# Fitting linear relation among Apple's returns and Benchmark
X = sp500_no_index.values.reshape(-1,1)
y = aapl_no_index.values.reshape(-1,1)
linreg = LinearRegression().fit(X, y)
beta = linreg.coef_[0]
alpha = linreg.intercept_
print('\n')
print('AAPL beta: ', beta.round(3))
print('\nAAPL alpha: ', alpha.round(3))
In [22]:
# Fitting linear relation among Tesla's returns and Benchmark
X = sp500_no_index.values.reshape(-1,1)
y = tsla_no_index.values.reshape(-1,1)
linreg = LinearRegression().fit(X, y)
beta = linreg.coef_[0]
alpha = linreg.intercept_
print('\n')
print('TSLA beta: ', beta.round(3))
print('\nTSLA alpha: ', alpha.round(3))
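In [23]:
# Fitting linear relation among Disney's returns and Benchmark
X = sp500_no_index.values.reshape(-1,1)
y = dis_no_index.values.reshape(-1,1)
linreg = LinearRegression().fit(X, y)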
beta = linreg.coef_[0]
alpha = linreg.intercept_
print('\n')
print('Walt Disney Company beta: ', beta.round(3))
print('\nWalt Disney Company alpha: ', alpha.round(4))
In [24]:
# Fitting linear relation among AMD's returns and Benchmark
X = sp500_no_index.values.reshape(-1,1)
y = amd_no_index.values.reshape(-1,1)
linreg = LinearRegression().fit(X, y)
beta = linreg.coef_[0]
alpha = linreg.intercept_
print('\n')
print('AMD beta: ', beta.round(3))
print('\nAMD alpha: ', alpha.round(4))
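Another widely used metric is the Sharpe ratio, which relates an investment's excess return to its volatility. Its conventional definition, reconstructed here from the symbols explained in the next paragraph, is:

$$\text{Sharpe Ratio} = \frac{R_p - R_f}{\sigma_p}$$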
where Rp is the average return of the investment, Rf is the risk-free rate of return, and σp
is the standard deviation of the returns. The average excess return is the difference
between the average return of the investment and the risk-free rate of return, typically
represented by a government bond. The standard deviation is a measurement of the
volatility of returns.
A higher Sharpe ratio indicates that an investment provides higher returns for a given
level of risk compared to other investments with a lower Sharpe ratio. A Sharpe ratio of 0
means that the investment's average return is equal to the risk-free rate of return, and a
negative value means it falls below that rate.
As a rule of thumb, a Sharpe ratio under 1.0 is considered poor, 1.0 or higher is considered
acceptable or good, 2.0 or higher is rated as very good, and 3.0 or higher is considered
excellent.
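The calculation can be sketched in a few lines. The function name `annualized_sharpe`, the 252 trading days per year, and the even spreading of the annual risk-free rate across days are all assumptions of this example. Quantstats may handle these details differently, so treat this as an illustration of the formula and not as a description of `qs.stats.sharpe`.

```python
import math
import pandas as pd

def annualized_sharpe(returns, risk_free_annual=0.0, periods=252):
    """Annualized Sharpe ratio from periodic returns given as decimals."""
    # Spread the annual risk-free rate evenly across the periods (a simplification)
    excess = returns - risk_free_annual / periods
    # Scale the per-period ratio by sqrt(periods) to annualize it
    return excess.mean() / excess.std() * math.sqrt(periods)

# Hypothetical daily returns
daily = pd.Series([0.004, -0.002, 0.003, 0.001, -0.001, 0.002])

print(round(annualized_sharpe(daily), 2))
print(round(annualized_sharpe(daily, risk_free_annual=0.03), 2))
```

Raising the risk-free rate lowers the ratio, because a larger share of the return could have been earned without taking any risk.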
In [25]:
# Calculating Sharpe ratio
print('\n')
print("Sharpe Ratio for AAPL: ", qs.stats.sharpe(aapl).round(2))
print('\n')
print("Sharpe Ratio for TSLA: ", qs.stats.sharpe(tsla).round(2))
print('\n')
print("Sharpe Ratio for DIS: ", qs.stats.sharpe(dis).round(2))
print('\n')
print("Sharpe Ratio for AMD: ", qs.stats.sharpe(amd).round(2))
Apple and Tesla have the highest Sharpe ratios among the stocks analyzed, 0.97 and 0.95,
respectively, indicating that these investments offer a better risk-return relationship.
However, none of the stocks has a Sharpe ratio above 1, which suggests modest
compensation per unit of risk rather than returns below the risk-free rate (that would
require a negative Sharpe ratio).
It's important to note that the Sharpe ratio is an annualized metric and, since the beginning of
2022, the market, in general, has been bearish, with prices going down over the past year.
Initial Conclusions
Some initial conclusions can be drawn via the analysis of the metrics above:
- Apple and Tesla have the best Sharpe ratios, which indicates a better risk-return
relationship;
- Tesla has the highest returns of them all, but it's also more volatile than Apple and
Disney;
- Apple has high returns and low volatility compared to the other assets. It has the best
Sharpe ratio, low beta, low standard deviation, and low asymmetry of returns;
- AMD is the riskiest and most volatile investment option of the four. Its returns
distribution is highly asymmetric, and it has a high standard deviation and a high beta;
- Disney stocks may be a good option for investors that are sensitive to risk, considering
they had a steady and stable return over the period.
It's possible to say that, of all the assets analyzed, Apple offers the best risk-return
relationship, with high profitability and lower risk than the other options.
2 | Building and Optimizing Portfolios
What is a Portfolio?
To build a portfolio, investors must select a combination of assets that are expected to
perform well under different economic and market conditions. The allocation of funds to
each asset is determined by the investor's risk tolerance and investment goals. This
process involves analyzing the investor's financial situation, objectives, time horizon, and
risk tolerance, as well as researching and analyzing individual securities and market
trends. Portfolios are dynamic and should be reviewed and adjusted periodically to reflect
changes in market conditions and in the investor's financial situation, or goals.
The weights in a portfolio refer to the percentage of the total value allocated to each
individual asset. Allocating weights is a critical aspect of portfolio building because it
determines the level of risk and return characteristics of the portfolio. The weight
assigned to an asset reflects the investor's confidence in the asset's ability to generate
returns and their willingness to accept its associated risk. Weights can be determined by
analyzing an asset's historical performance, future growth prospects, sector exposure, and
diversification benefits. Portfolio managers may use various techniques, such as modern
portfolio theory and factor-based investing, to determine optimal weightings. Getting the
weightings right is crucial to achieving the desired outcomes and is a key factor in the
success of any investment strategy.
To start exploring portfolio construction and optimization, we will build a portfolio
consisting of the four stocks that have been analyzed so far, with an initial weighting of
25% each.
In [26]:
weights = [0.25, 0.25, 0.25, 0.25] # Defining weights for each stock
# Creating the portfolio by multiplying each stock's returns by its respective weight
portfolio = aapl*weights[0] + tsla*weights[1] + dis*weights[2] + amd*weights[3]
portfolio # Displaying portfolio's daily returns
Out[26]:
Date
2010-07-01 04:00:00 -0.020338
2010-07-02 04:00:00 -0.041286
2010-07-06 04:00:00 -0.040348
2010-07-07 04:00:00 0.028905
2010-07-08 04:00:00 0.026538
...
2023-02-06 05:00:00 -0.007087
2023-02-07 05:00:00 0.018110
2023-02-08 05:00:00 -0.001937
2023-02-09 05:00:00 -0.001783
2023-02-10 05:00:00 -0.022371
Name: Close, Length: 3176, dtype: float64
With Quantstats you can easily create a report to compare the portfolio's performance
and its level of risk with a benchmark, which in this case is the SP500. The library
provides various metrics and useful visualizations to analyze the portfolio's performance
and risk.
In [27]:
# Generating report on portfolio performance from July 1st, 2010 to February 10th, 2023
qs.reports.full(portfolio, benchmark = sp500)
Performance Metrics
                   Strategy    Benchmark
-----------------  ----------  -----------
Start Period       2010-07-01  2010-07-01
End Period         2023-02-10  2023-02-10
Risk-Free Rate     0.0%        0.0%
Time in Market     100.0%      100.0%
Beta               1.28        -
Alpha              0.17        -
Correlation        73.94%      -
Treynor Ratio      2682.16%    -
[Report sections not reproduced here: 5 Worst Drawdowns, Strategy Visualization]
We have a range of metrics and plots available to look at. Firstly, the Cumulative Return
of the portfolio is higher than the benchmark, at 3,429.9% compared to 296.86% for the
SP500. The Sharpe Ratio and Sortino Ratio of the portfolio are also higher, indicating
that it generates better returns for the level of risk taken. In addition, the portfolio has
higher expected daily, monthly and annual returns than the SP500, and its best day,
month, and year outperform the benchmark's best day, month, and year.
However, the portfolio's maximum drawdown is greater than the benchmark, at -52.21%
compared to -33.92%. This indicates that the portfolio has experienced larger losses at
times than the benchmark. The annualized volatility of the portfolio is also higher, at
around 30.59% compared to the benchmark's 17.96%.
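Both risk figures mentioned above can be derived from a series of daily returns. The sketch below uses five hypothetical returns and assumes 252 trading days per year for the annualization. The report's own figures may rest on slightly different conventions.

```python
import math
import pandas as pd

# Hypothetical daily returns (decimals)
daily = pd.Series([0.10, -0.20, -0.10, 0.05, 0.30])

# Growth of 1 unit invested, then the running peak of that curve
wealth = (1 + daily).cumprod()
running_peak = wealth.cummax()

# Drawdown: how far the curve sits below its highest point so far
drawdown = wealth / running_peak - 1
max_drawdown = drawdown.min()

# Annualized volatility, assuming 252 trading days per year
annual_vol = daily.std() * math.sqrt(252)

print(round(max_drawdown, 4), round(annual_vol, 4))
```

In this example the value peaks at 1.10 after the first day and bottoms out at 0.792, so the maximum drawdown is -28%, even though the series ends close to its earlier peak.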
While the portfolio has higher returns on its best day, month, and year, it also has bigger
losses on its worst day, month, and year compared to the benchmark. The beta of 1.28
shows that the portfolio tends to move about 28% more than the overall market, and its
73.94% correlation indicates a strong positive relationship between the portfolio and the
benchmark, suggesting that they tend to move in the same direction, which could increase
the portfolio's exposure to market-wide (systemic) risk.
Overall, the portfolio has generated impressive returns, but it also comes with a higher
degree of risk and volatility. This prompts the question of whether or not it's possible to
optimize the portfolio to reduce risk and volatility while also increasing returns.
Optimizing Portfolio
Portfolio optimization is the process of selecting the optimal combination of assets and
weights to maximize returns and minimize risk. This process involves selecting the most
appropriate weights for each asset, by taking into account the historical performance of
the assets, their correlations with each other, and other relevant factors such as market
conditions and economic outlook. The main goal is to create a well-diversified portfolio
that balances risk and returns, and that aligns with the investor's risk tolerance.
To start the optimization process, we must have a pandas dataframe containing the
adjusted closing prices of the stocks, with dates as the index and each column representing
one stock. This dataframe will serve as input to optimize the weighting of the stocks in
the portfolio.
In [28]:
# Getting dataframes info for Stocks using yfinance
aapl_df = yf.download('AAPL', start = '2010-07-01', end = '2023-02-11')
tsla_df = yf.download('TSLA', start = '2010-07-01', end = '2023-02-11')
dis_df = yf.download('DIS', start = '2010-07-01', end = '2023-02-11')
amd_df = yf.download('AMD', start = '2010-07-01', end = '2023-02-11')
[*********************100%***********************] 1 of 1 completed
[*********************100%***********************] 1 of 1 completed
[*********************100%***********************] 1 of 1 completed
[*********************100%***********************] 1 of 1 completed
In [29]:
# Extracting Adjusted Close for each stock
aapl_df = aapl_df['Adj Close']
tsla_df = tsla_df['Adj Close']
dis_df = dis_df['Adj Close']
amd_df = amd_df['Adj Close']
In [30]:
# Merging and creating an Adj Close dataframe for stocks
df = pd.concat([aapl_df, tsla_df, dis_df, amd_df], join = 'outer', axis = 1)
df.columns = ['aapl', 'tsla', 'dis', 'amd']
df # Visualizing dataframe for input
Out[30]:
The dataframe above will be used as input for the algorithms to optimize the portfolio.
In [31]:
# Importing libraries for portfolio optimization
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import risk_models
from pypfopt import expected_returns
Markowitz Mean-Variance Optimization Model
First, we need to have expected returns for each of the assets in the portfolio.
PyPortfolioOpt provides the expected_returns module, which estimates expected returns
from historical prices. The mean_historical_return function used below takes daily prices
as input and produces annualized expected returns as output, based on the mean of the
daily percentage changes (recent versions of the library compound these returns
geometrically by default, while the arithmetic mean is available as an option). More
information on this topic is available here
(https://pyportfolioopt.readthedocs.io/en/latest/ExpectedReturns.html).
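To make the idea of an annualized historical return more concrete, here is a minimal sketch that computes both the arithmetic and the geometric (compounded) annualized mean from a short list of daily prices. The function name, the toy prices, and the 252-trading-day convention are assumptions made for illustration; this is not PyPortfolioOpt's own implementation.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year


def annualized_mean_returns(prices, frequency=TRADING_DAYS):
    """Return (arithmetic, geometric) annualized mean of daily returns."""
    # Daily simple returns: p_t / p_{t-1} - 1
    daily = [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]
    # Arithmetic: average daily return scaled up to one year
    arithmetic = sum(daily) / len(daily) * frequency
    # Geometric: compound the daily returns, then annualize the growth factor
    growth = math.prod(1 + r for r in daily)
    geometric = growth ** (frequency / len(daily)) - 1
    return arithmetic, geometric


# Hypothetical prices, for illustration only
toy_prices = [100.0, 101.0, 99.0, 102.0, 103.0]
arith, geom = annualized_mean_returns(toy_prices)
```

With so few observations the annualized figures are not meaningful as estimates; the point is only to show how the two averaging conventions differ.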
Secondly, we need to choose a risk model that quantifies the level of risk in each asset.
The most commonly used risk model is the covariance matrix, which describes the
volatilities of assets and the degree to which they are co-dependent. Choosing an
appropriate risk model is critical, because the optimizer relies on it to reduce risk by
spreading the allocation across assets that are not strongly correlated. PyPortfolioOpt
offers a range of risk models to choose from, including the annualized sample covariance
matrix of daily returns, semicovariance matrix, and exponentially-weighted covariance
matrix. Further information on risk models can be found here
(https://pyportfolioopt.readthedocs.io/en/latest/RiskModels.html#risk-models).
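As a rough sketch of what an annualized sample covariance matrix contains, the snippet below builds one with NumPy from a toy price table and derives the volatilities and correlations from it. The function name, the toy prices, and the 252-day annualization factor are assumptions for illustration only.

```python
import numpy as np


def annualized_sample_cov(prices, frequency=252):
    """Annualized sample covariance of daily simple returns.

    prices: 2-D array with one row per day and one column per asset.
    """
    prices = np.asarray(prices, dtype=float)
    daily = prices[1:] / prices[:-1] - 1          # daily simple returns
    return np.cov(daily, rowvar=False) * frequency


# Hypothetical prices for two assets, for illustration only
toy = np.array([
    [100.0, 50.0],
    [101.0, 49.5],
    [99.0, 50.5],
    [102.0, 50.0],
    [103.0, 51.0],
])
cov = annualized_sample_cov(toy)
vols = np.sqrt(np.diag(cov))            # annualized volatilities (diagonal)
corr = cov / np.outer(vols, vols)       # correlation matrix (off-diagonal co-movement)
```

The diagonal carries each asset's variance, while the off-diagonal entries capture how the assets move together.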
In [32]:
# Calculating the annualized expected returns and the annualized sample covariance matrix
mu = expected_returns.mean_historical_return(df) # Expected returns
S = risk_models.sample_cov(df) # Covariance matrix
In [33]:
# Visualizing the annualized expected returns
mu
Out[33]:
aapl 0.268385
tsla 0.475549
dis 0.114966
amd 0.209862
dtype: float64
In [34]:
# Visualizing the covariance matrix
S
Out[34]:
The PyPortfolioOpt library provides the EfficientFrontier class, which takes the
covariance matrix and expected returns as inputs. The weights variable stores the
optimized weights for each asset based on the specified objective, which in this case is the
maximization of the Sharpe ratio, achieved by using the max_sharpe method.
PyPortfolioOpt offers various other optimization objectives, such as weights optimized for
minimum volatility, maximum returns for a given target risk, maximum quadratic utility,
and many others. To read more on optimization objectives, click here
(https://pyportfolioopt.readthedocs.io/en/latest/MeanVariance.html).
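To illustrate what maximizing the Sharpe ratio means, the sketch below evaluates the Sharpe ratio of every long-only two-asset mix in 1% steps and keeps the best one. It is a brute-force illustration and not the convex optimization PyPortfolioOpt performs; the covariance matrix and the 2% risk-free rate are hypothetical, and only the two expected returns come from Out[33].

```python
import math


def portfolio_stats(weights, mu, cov, risk_free_rate=0.02):
    """Return (expected return, volatility, Sharpe ratio) for the given weights."""
    n = len(weights)
    ret = sum(w * m for w, m in zip(weights, mu))
    var = sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))
    vol = math.sqrt(var)
    return ret, vol, (ret - risk_free_rate) / vol


mu = [0.268385, 0.475549]            # aapl and tsla expected returns from Out[33]
cov = [[0.08, 0.04], [0.04, 0.32]]   # hypothetical annualized covariance matrix

# Long-only candidate weights in 1% steps that always sum to 1
candidates = [(i / 100, 1 - i / 100) for i in range(101)]
best = max(candidates, key=lambda w: portfolio_stats(w, mu, cov)[2])
best_return, best_vol, best_sharpe = portfolio_stats(best, mu, cov)
```

A grid like this does not scale beyond a handful of assets, which is why a dedicated optimizer is used in practice.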
In [35]:
# Optimizing for maximal Sharpe ratio
ef = EfficientFrontier(mu, S) # Providing expected returns and covariance matrix as input
weights = ef.max_sharpe() # Optimizing weights for Sharpe ratio maximization
Out[35]:
OrderedDict([('aapl', 0.70828), ('tsla', 0.29172), ('dis', 0.0), ('amd', 0.0)])
The optimizer returned a portfolio in which 70.83% of the allocation is invested in Apple
stock and the remaining 29.17% in Tesla stock. No allocation was made to Disney or AMD.
With the optimized weights in hand, we can construct a new portfolio and use Quantstats
to compare its performance to that of the previously constructed portfolio.
In [36]:
# Creating new portfolio with optimized weights
new_weights = [0.70828, 0.29172]
optimized_portfolio = aapl*new_weights[0] + tsla*new_weights[1]
optimized_portfolio # Visualizing daily returns
Out[36]:
Date
2010-07-01 04:00:00 -0.031481
2010-07-02 04:00:00 -0.041054
2010-07-06 04:00:00 -0.042102
2010-07-07 04:00:00 0.022988
2010-07-08 04:00:00 0.029061
...
2023-02-06 05:00:00 -0.005359
2023-02-07 05:00:00 0.016701
2023-02-08 05:00:00 -0.005863
2023-02-09 05:00:00 0.003844
2023-02-10 05:00:00 -0.012936
Name: Close, Length: 3176, dtype: float64
In [37]:
# Displaying new reports comparing the optimized portfolio to the first portfolio constructed
qs.reports.full(optimized_portfolio, benchmark = portfolio)
Performance Metrics
Strategy Benchmark
------------------------- ---------- -----------
Start Period 2010-07-01 2010-07-01
End Period 2023-02-10 2023-02-10
Risk-Free Rate 0.0% 0.0%
Time in Market 100.0% 100.0%
Beta 0.86 -
Alpha 0.07 -
Correlation 86.12% -
Treynor Ratio 5623.07% -
[Report output not reproduced: 5 Worst Drawdowns table and Strategy Visualization plots]
Based on the report above, the optimized portfolio appears to have performed better than
the original portfolio. Here are some key conclusions that can be drawn by looking at the
metrics and plots in the report:
• CAGR: The compound annual growth rate (CAGR) of the optimized portfolio is higher, at
36.18% compared to 32.62% for the original portfolio. This suggests that the optimized
portfolio has generated a higher rate of return per year over the entire investment period.
• Sharpe Ratio: The optimized portfolio has a slightly higher Sharpe ratio of 1.17
compared to 1.08 for the original portfolio, indicating that it has generated a better
risk-adjusted return.
• Drawdown: The maximum drawdown for the optimized portfolio is smaller, at -45.96%
compared to -52.21% for the original portfolio. This means that the optimized portfolio
experienced smaller losses during its worst period of performance.
• Recovery Factor: The recovery factor for the optimized portfolio is much higher, at
105.12 compared to 65.7 for the original portfolio. Since the recovery factor relates total
returns to the maximum drawdown, this suggests that the optimized portfolio earned more
for each unit of drawdown it went through.
• Win Rates: The optimized portfolio has slightly higher win rates for win days, win
months, win quarters, and win years, indicating that it had a higher probability of
generating positive returns over these periods.
• Beta: The optimized portfolio's beta of 0.86 is measured against the original portfolio,
which is the benchmark in this report, rather than against the overall market. It indicates
that the optimized portfolio reacts less strongly to the swings of the original portfolio.
• Annual Volatility: The optimized portfolio has a slightly lower annual volatility than
the original portfolio, with 30.52% compared to 30.59%, respectively.
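For readers who want to see how some of these metrics are put together, here is a minimal sketch of CAGR, Sharpe ratio, and maximum drawdown computed from a list of daily returns. The function names and the 252-day convention are assumptions, and Quantstats may use slightly different conventions (for example, calendar days when annualizing), so the numbers will not necessarily match the report exactly.

```python
import math
import statistics


def cagr(returns, periods_per_year=252):
    """Compound annual growth rate from periodic simple returns."""
    growth = math.prod(1 + r for r in returns)
    years = len(returns) / periods_per_year
    return growth ** (1 / years) - 1


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized mean excess return divided by annualized volatility."""
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (negative)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1)
    return worst


# Hypothetical daily returns, for illustration only
sample = [0.10, -0.50, 0.20]
sample_drawdown = max_drawdown(sample)
```

The recovery factor mentioned above can then be approximated as the total return divided by the absolute value of the maximum drawdown.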
Black-Litterman Allocation Model
In 1992, Fischer Black and Robert Litterman introduced the Black-Litterman Allocation
Model, which takes a Bayesian approach to asset allocation. It combines a prior estimate
of returns with the investor's particular views on expected returns to generate an
optimal allocation. Multiple sources of information can be used to establish the prior
estimate of returns, and the model allows investors to provide a confidence level for their
views, which is then used to optimize allocation.
The Black-Litterman formula calculates a weighted average between the prior estimate of
returns and the views, with the weighting determined by the level of confidence for each
view.
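For reference, the posterior expected returns are commonly written as below. The notation is the one usually found in the literature and in the PyPortfolioOpt documentation, not something defined earlier in this notebook: Π is the prior estimate of returns, Σ is the covariance matrix (S in the code), P is the picking matrix, Q is the vector of views, Ω is the matrix holding the uncertainty of each view, and τ is a small scalar that scales the uncertainty of the prior.

```latex
E(R) = \left[ (\tau \Sigma)^{-1} + P^{T} \Omega^{-1} P \right]^{-1}
       \left[ (\tau \Sigma)^{-1} \Pi + P^{T} \Omega^{-1} Q \right]
```

The first bracket normalizes the weights, while the second blends the prior Π and the views Q, each weighted by the inverse of its uncertainty, which is why more confident views (smaller Ω) pull the result closer to Q.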
Prior
A commonly used approach for determining a prior estimate of returns involves relying
on the market's expectations, which are reflected in the asset's market capitalization.
To do this, we first need to estimate the level of risk aversion among market participants,
represented by a parameter known as delta, which we calculate using the closing prices
of the SP500. The higher the value for delta, the greater the market's risk aversion.
With this information, we can calculate the prior expected returns for each stock based on
its market capitalization, the delta, and the covariance matrix S, which we obtained
before optimizing our portfolio with the Markowitz Mean-Variance Model. These prior
expected returns give us a starting point for the expected returns before we incorporate
any of our views as investors.
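The two steps above can be sketched with NumPy as follows. This is an illustration under stated assumptions and not the library's code: the function names are hypothetical, the risk-free rate defaults to zero, and the market returns, market caps, and covariance matrix are toy values.

```python
import numpy as np


def implied_risk_aversion(market_returns, risk_free_rate=0.0, frequency=252):
    """delta = (annualized market return - risk-free rate) / annualized variance."""
    r = np.asarray(market_returns, dtype=float)
    return (r.mean() * frequency - risk_free_rate) / (r.var(ddof=1) * frequency)


def implied_prior_returns(market_caps, delta, cov, risk_free_rate=0.0):
    """Pi = delta * Sigma @ w_mkt + risk-free rate, with market-cap weights w_mkt."""
    caps = np.asarray(market_caps, dtype=float)
    w_mkt = caps / caps.sum()
    return delta * np.asarray(cov, dtype=float) @ w_mkt + risk_free_rate


# Hypothetical inputs, for illustration only
toy_delta = implied_risk_aversion([0.01, -0.01, 0.02, 0.0])
toy_cov = [[0.04, 0.01], [0.01, 0.09]]
toy_prior = implied_prior_returns([3.0, 1.0], delta=2.5, cov=toy_cov)
```

Larger companies receive a larger market weight, so their covariance with the rest of the universe has more influence on every asset's prior return.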
Views
In the Black-Litterman model, investors can express their views as either absolute or
relative. Absolute views involve statements like "AAPL will return 10%", while relative
views are represented by statements such as "AMZN will outperform AMD by 10%".
These views must be specified in the vector Q and mapped into each asset via the picking
matrix P.
As an example, consider a universe of the following 10 assets:
1. TSLA
2. AAPL
3. NVDA
4. MSFT
5. META
6. AMZN
7. AMD
8. HD
9. GOOGL
10. BRKa
And then consider two absolute views, on TSLA and on AAPL, and two relative views: HD
outperforming META, and GOOGL and BRKa outperforming MSFT and AMZN.
The views vector Q would be formed by listing the expected return of each of these four
views, in that order.
The picking matrix would then be used to link the views on the 8 assets mentioned above
to the universe of 10 assets, allowing us to propagate our expectations into the model:
P = np.array([
[1,0,0,0,0,0,0,0,0,0],
[0,1,0,0,0,0,0,0,0,0],
[0,0,0,0,-1,0,0,1,0,0],
[0,0,0,-0.5,0,-0.5,0,0,0.5,0.5],
])
Absolute views have a single 1 in the column corresponding to the asset's order in the
asset universe, while relative views have a positive number in the outperforming asset
column, and a negative number in the underperforming asset column. Each row for
relative views in P must sum up to 0, and the order of views in Q must correspond to the
order of rows in P.
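One way to check a picking matrix is to translate each row back into words. The sketch below does that for the matrix P shown above and the 10-asset universe; the helper function is hypothetical and written only for this illustration. The size of each view belongs in Q and is not encoded in P.

```python
universe = ["TSLA", "AAPL", "NVDA", "MSFT", "META", "AMZN", "AMD", "HD", "GOOGL", "BRKa"]

P = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, -1, 0, 0, 1, 0, 0],
    [0, 0, 0, -0.5, 0, -0.5, 0, 0, 0.5, 0.5],
]


def describe_view(row, tickers):
    """Describe one row of the picking matrix in plain words."""
    winners = [t for t, w in zip(tickers, row) if w > 0]
    losers = [t for t, w in zip(tickers, row) if w < 0]
    if not losers:
        return "absolute view on " + " and ".join(winners)
    return "relative view: " + " and ".join(winners) + " over " + " and ".join(losers)


descriptions = [describe_view(row, universe) for row in P]
# Rows that express relative views must sum to zero
relative_row_sums = [sum(row) for row in P if any(w < 0 for w in row)]
```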
Confidences
The confidence matrix expresses how certain we are about each view, which in turn
influences the final allocation in each stock. It can be implemented using Idzorek's
method, allowing investors to specify their confidence level in each of their views as a
percentage. The values in the confidence matrix range from 0 to 1, where 0 indicates a low
level of confidence in the view, and 1 indicates a high level of confidence.
By using the confidence matrix, investors can better understand the potential impact of
their views on their allocations. For example, if an investor has a high level of confidence
in their view on a particular asset, they may choose to allocate a larger portion of their
portfolio to that asset. On the other hand, if an investor has a low level of confidence in
their view, they may choose to allocate a smaller portion of their portfolio or avoid the
asset altogether.
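To see how confidence levels change the result, here is a minimal NumPy sketch of the Black-Litterman posterior in which each view's uncertainty is scaled by (1 - confidence) / confidence. This closed-form shortcut is my understanding of how Idzorek-style confidences are commonly translated into an uncertainty matrix, so treat it as an assumption and not as the library's exact code. The function names, τ = 0.05, and all numbers are hypothetical.

```python
import numpy as np


def idzorek_omega(cov, P, confidences, tau=0.05):
    """Diagonal view-uncertainty matrix from confidences in the interval (0, 1]."""
    diag = []
    for p_row, conf in zip(P, confidences):
        alpha = (1 - conf) / conf              # equals 0 when fully confident
        diag.append(tau * alpha * (p_row @ cov @ p_row))
    return np.diag(diag)


def bl_posterior(prior, cov, P, Q, omega, tau=0.05):
    """Posterior mean: Pi + tau*S*P' (P*tau*S*P' + Omega)^-1 (Q - P*Pi)."""
    tau_cov_pt = tau * cov @ P.T
    A = P @ tau_cov_pt + omega
    return prior + tau_cov_pt @ np.linalg.solve(A, Q - P @ prior)


# Hypothetical two-asset example with one absolute view on the first asset
cov = np.array([[0.04, 0.01], [0.01, 0.09]])
prior = np.array([0.08, 0.10])
P = np.array([[1.0, 0.0]])
Q = np.array([0.12])

confident = bl_posterior(prior, cov, P, Q, idzorek_omega(cov, P, [1.0]))
doubtful = bl_posterior(prior, cov, P, Q, idzorek_omega(cov, P, [0.01]))
```

With full confidence the first asset's expected return moves all the way from the prior of 8% to the view of 12%, while with 1% confidence it barely leaves the prior. The second asset also shifts a little because it is correlated with the first.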
For more information on the Black-Litterman Allocation Model, I highly suggest you read
this section (https://pyportfolioopt.readthedocs.io/en/latest/BlackLitterman.html) of
the PyPortfolioOpt documentation.
In [38]:
# Mapping assets
assets = ['AAPL', 'TSLA', 'DIS', 'AMD']
In [39]:
# Obtaining market cap for stocks
market_caps = data.get_quote_yahoo(assets)['marketCap']
market_caps # Visualizing market caps for stocks
Out[39]:
AAPL 2518055124992
TSLA 624150380544
DIS 176707338240
AMD 155483013120
Name: marketCap, dtype: int64
In [40]:
# Obtaining closing prices for the SP500
market_prices = yf.download("^GSPC", start = '2010-07-01', end = '2023-02-11')['Adj Close']
market_prices # Visualizing closing prices for the SP500
[*********************100%***********************] 1 of 1 completed
Out[40]:
Date
2010-07-01 1027.369995
2010-07-02 1022.580017
2010-07-06 1028.060059
2010-07-07 1060.270020
2010-07-08 1070.250000
...
2023-02-06 4111.080078
2023-02-07 4164.000000
2023-02-08 4117.859863
2023-02-09 4081.500000
2023-02-10 4090.459961
Name: Adj Close, Length: 3176, dtype: float64
In [41]:
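# Importing the Black-Litterman tools from PyPortfolioOpt
from pypfopt import black_litterman, BlackLittermanModel
# Obtaining market-implied risk aversion, the delta
delta = black_litterman.market_implied_risk_aversion(market_prices)
delta # Visualizing delta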
Out[41]:
3.3668161617990653
In [42]:
# Visualizing Covariance Matrix
S
Out[42]:
In [43]:
# Changing columns and index to uppercase so it matches market_caps
S.index = S.index.str.upper()
S.columns = S.columns.str.upper()
S
Out[43]:
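In [44]:
# Obtaining prior estimates
prior = black_litterman.market_implied_prior_returns(market_caps, delta, S)
prior # Visualizing prior expected returns
Out[44]: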
AAPL 0.269523
TSLA 0.384137
DIS 0.141240
AMD 0.293677
dtype: float64
Now that we have our prior estimates for each stock, we can provide the model with our
views on these stocks and our confidence levels in those views.
In [45]:
# AAPL will rise by 5%
# TSLA will rise by 10%
# AMD will outperform Disney by 15%
Q = np.array([0.05, 0.10, 0.15]) # Vector of views
In [46]:
# Linking views to assets
P = np.array([
[1,0,0,0], # AAPL = 0.05
[0,1,0,0], # TSLA = 0.10
[0,0,-1,1] # AMD > DIS by 0.15
])
In [47]:
# Providing confidence levels
# Closer to 0.0 = Low confidence
# Closer to 1.0 = High confidence
confidences = [0.5,
0.4,
0.8]
In [48]:
# Creating model
bl = BlackLittermanModel(S, # Covariance matrix
                         pi = prior, # Prior expected returns
                         Q = Q, # Vector of views
                         P = P, # Matrix mapping the views
                         omega = 'idzorek', # Idzorek's method: view uncertainty derived from the confidences
                         view_confidences = confidences) # Confidences
In [49]:
rets = bl.bl_returns() # Calculating Expected returns
ef = EfficientFrontier(rets, S) # Optimizing asset allocation
In [50]:
ef.max_sharpe() # Optimizing weights for maximal Sharpe ratio
weights = ef.clean_weights() # Cleaning weights
weights # Printing weights
Out[50]:
OrderedDict([('AAPL', 0.63718),
('TSLA', 0.18636),
('DIS', 0.01442),
('AMD', 0.16204)])
In [51]:
# Building Black-Litterman portfolio
black_litterman_weights = [0.62588,
0.19951,
0.016,
0.15861]
black_litterman_portfolio = (aapl*black_litterman_weights[0] + tsla*black_litterman_weights[1]
                             + dis*black_litterman_weights[2] + amd*black_litterman_weights[3])
After obtaining prior expected returns and providing our views, as well as our confidence
levels, we have an optimized portfolio. The weights used to build it, which differ slightly
from the optimizer output printed in Out[50], are:
Apple: 62.59%
Tesla: 19.95%
Disney: 1.6%
AMD: 15.86%
In [52]:
# Black-Litterman Portfolio daily returns
black_litterman_portfolio
Out[52]:
Date
2010-07-01 04:00:00 -0.021734
2010-07-02 04:00:00 -0.033732
2010-07-06 04:00:00 -0.030528
2010-07-07 04:00:00 0.030036
2010-07-08 04:00:00 0.019225
...
2023-02-06 05:00:00 -0.010763
2023-02-07 05:00:00 0.018628
2023-02-08 05:00:00 -0.008738
2023-02-09 05:00:00 -0.001324
2023-02-10 05:00:00 -0.012131
Name: Close, Length: 3176, dtype: float64
We can now compare the Black-Litterman optimized portfolio to the original portfolio.
In [53]:
# Comparing Black-Litterman portfolio to the original portfolio
qs.reports.full(black_litterman_portfolio, benchmark = portfolio)
Performance Metrics
Strategy Benchmark
------------------------- ---------- -----------
Start Period 2010-07-01 2010-07-01
End Period 2023-02-10 2023-02-10
Risk-Free Rate 0.0% 0.0%
Time in Market 100.0% 100.0%
Beta 0.91 -
Alpha 0.04 -
Correlation 94.09% -
Treynor Ratio 4497.23% -
[Report output not reproduced: 5 Worst Drawdowns table and Strategy Visualization plots]
By using the Black-Litterman Allocation Model, we were able to improve our investment
portfolio's performance metrics compared to the original portfolio, where each asset was
allocated a uniform weight of 25%. The Black-Litterman optimized portfolio
outperformed the original portfolio in several key metrics. First, it generated higher
cumulative return and CAGR, indicating a stronger overall performance. Additionally, the
Sharpe and Sortino ratios were higher, demonstrating greater risk-adjusted returns.
Moreover, the Black-Litterman portfolio had a lower maximum drawdown and annual
volatility compared to the original portfolio, implying less downside risk and more
stability for the optimized portfolio. In terms of expected returns, the Black-Litterman
portfolio had higher daily, monthly, and yearly returns, and the Daily Value-at-Risk was
lower, indicating a lower risk of significant losses in a day.
The Black-Litterman portfolio also had a lower average drawdown and a higher recovery
factor, meaning that it earned more relative to its worst losses, and its beta of 0.91,
measured against the original portfolio, shows that it reacts less strongly to the swings
of the original portfolio. Overall, the Black-Litterman optimized portfolio achieved
higher returns at lower risk.
Both the Markowitz Mean-Variance Model and the Black-Litterman Allocation Model
effectively enhanced the performance and reduced the risk of our original portfolio by
optimizing the allocation weights of Apple, Tesla, Disney, and AMD stocks.
The Markowitz optimization resulted in a portfolio that primarily invested in Apple and a
smaller portion in Tesla, with no allocation in Disney and AMD. On the other hand, the
Black-Litterman optimization allocated funds into all four stocks, but still favored Apple
with the majority of the allocation.
The preference for Apple in both optimizations is not coincidental. Our initial analysis did
reveal that Apple had the highest Sharpe ratio, lowest beta, and demonstrated superior
performance with lower risk compared to the other stocks.
It's also interesting to compare the two optimized portfolios. The Markowitz optimized
portfolio outperformed the Black-Litterman portfolio in terms of cumulative returns,
CAGR, Sharpe ratio, Profit Factor, Recovery Factor, and overall performance. On the
other hand, the Black-Litterman portfolio demonstrated some advantages, such as lower
maximum and average drawdowns, lower annual volatility, and better performance on its
worst day and worst month, although the Markowitz optimization still had lower losses
than the Black-Litterman optimization did on its worst year.
In conclusion, portfolio optimization is a very important step to improve the risk-return
relationship of a portfolio by adjusting its asset allocation. By using various mathematical
models and optimization techniques, it's possible to efficiently improve performance and
reduce the exposure to risk.
While there are different approaches to portfolio optimization, including the Markowitz
Mean-Variance model and the Black-Litterman allocation model, there is no one-size-
fits-all solution. The choice of the model to use depends on the investor's risk tolerance,
investment goals, and overall market conditions.
Technical analysis is a very popular method used by many traders and investors to
evaluate stocks and other assets based on historical price and volume data. It is an
approach used to identify trends, or lack of trends, and help traders and investors to
make decisions based on what they believe the future price movements will be. The
underlying assumption of technical analysis is that past patterns and price movements
tend to repeat themselves, so those can be used to predict future movements. Therefore,
technical analysts examine charts and look for opportunities in patterns and indicators.
In technical analysis, price charts can display a variety of data, including open, high, low,
and close prices, as well as various timeframes such as daily, weekly, and monthly charts.
Volume is another important factor in technical analysis because it can provide insight
into the strength of a trend or an indicator.
Technical analysts use a variety of indicators, which are basically statistical calculations
based on price and volume that can provide additional information about trends and
potential opportunities. Some of the most commonly used indicators in technical analysis
include moving averages, relative strength index (RSI), Bollinger Bands, and many
others.
Despite its popularity among traders, the use of technical analysis may be controversial.
Some critics argue that technical analysis relies too heavily on subjective interpretations
of chart patterns and that it lacks a clear theoretical foundation. They also argue that
technical analysis is prone to false signals and that traders who rely on technical analysis
may miss out on important fundamental factors that can influence the price of stocks.
Fundamental analysis, on the other hand, focuses on analyzing a company's financial and
economic fundamentals, such as revenue, earnings, and market share. Unlike technical
analysis, fundamental analysis is based on the belief that a company's intrinsic value can
be determined by analyzing these factors. Fundamental analysts look at financial
statements, industry trends, and other relevant data to make investment decisions.
It can be said that technical analysis tends to be more appropriate for short-term trading,
whereas fundamental analysis may be better suited for long-term investing. Fundamental
analysis provides investors with a more comprehensive understanding of a company's
financial health and long-term growth prospects.
It can also be said that, while humans tend to operate better with fundamental analysis,
as it requires a deep understanding of the underlying factors that drive a company's value,
computers may operate better with technical analysis, as it relies heavily on quantitative
data that can be analyzed quickly and efficiently. Evidence of this is that the
use of automated trading bots that trade based on technical analysis has become
increasingly popular in recent years. These bots use algorithms to identify patterns and
trends in price data and make trades based on technical signals.
In Python, there are quite a few libraries that are suited for both technical and
fundamental analysis, and we're going to explore how to use these indicators, not only in
conventional ways but also to feed machine learning models.
Technical Indicators
There is a wide range of technical indicators used by traders and investors alike, each
with its own unique set of benefits and drawbacks. Some of the most commonly used
indicators include moving averages, which can be either simple or exponential, as well as
Bollinger Bands, RSI, ATR, historic volatility, and many others.
To help you better understand these indicators, I will briefly introduce some of the most
commonly used ones and display them on a candlestick chart.
Moving Averages
Moving averages are commonly used to smooth out price fluctuations and identify trends
in the price action of stocks, commodities, forex, crypto, and many others.
A moving average is calculated by taking the average price of an asset over a certain
period of time. This period can be as short as nine days or as long as 200 days, depending on the
trader's preference and the asset being analyzed. The resulting line that represents the
moving average is then plotted on a chart, and traders use this line to gauge the direction
and strength of the trend.
There are two types of moving averages: simple moving averages (SMA) and exponential
moving averages (EMA). SMA calculates the average price of the asset over the specified
period of time and gives equal weight to each data point. So, for instance, a 20-day SMA
would take the sum of the closing prices over the past 20 days and divide by 20 to get the
average price.
EMA, on the other hand, gives more weight to recent price action. The formula for
calculating EMA involves using a multiplier that gives more weight to the most recent
price data. The formula for calculating a 9-day EMA, for example, would be:
$$EMA_{today} = Price_{today} \times \frac{2}{9 + 1} + EMA_{yesterday} \times \left(1 - \frac{2}{9 + 1}\right)$$
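A minimal pure-Python sketch of both averages is shown below. It assumes the usual smoothing multiplier of 2 / (N + 1) for an N-period EMA and seeds the EMA with the first price, which is the same recursion pandas applies with ewm(span=N, adjust=False). The function names and the sample values are made up for illustration.

```python
def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def ema(values, span):
    """Exponential moving average seeded with the first value."""
    alpha = 2 / (span + 1)          # weight given to the newest price
    out = [values[0]]
    for price in values[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


# Hypothetical closing prices, for illustration only
sma_3 = sma([1, 2, 3, 4, 5], window=3)
ema_9 = ema([10.0, 20.0], span=9)
```

For the 9-period EMA the multiplier is 2 / 10 = 0.2, so each new price contributes 20% and the previous EMA contributes the remaining 80%.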
Traders often use moving averages to identify dynamic support and resistance levels, as
well as to spot potential trend changes. When the price of an asset is above its moving
average, it is considered to be in an uptrend, while a price below the moving average is
considered to be in a downtrend. Traders also look for crossovers between different
moving averages, which can signal a change in trend direction, such as the Golden Cross
(when the 50-day moving average crosses above the 200-day moving average) and Death
Cross (when the 50-day moving average crosses below the 200-day moving average).
Moving averages can also be used to set stop loss and take profit levels for trades. For
example, a trader may place a stop loss order below the moving average to limit their
losses if the trend reverses.
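Crossovers like these can be detected by checking where the difference between a fast and a slow average changes sign. The sketch below is a hypothetical helper working on two equally long lists of already computed averages; applied to the 50-day and 200-day SMAs, a "bullish" signal would correspond to a Golden Cross and a "bearish" one to a Death Cross.

```python
def find_crossovers(fast, slow):
    """Return (index, label) pairs where the fast average crosses the slow one."""
    signals = []
    for i in range(1, len(fast)):
        prev_diff = fast[i - 1] - slow[i - 1]
        diff = fast[i] - slow[i]
        if prev_diff <= 0 < diff:
            signals.append((i, "bullish"))   # fast average moves above the slow one
        elif prev_diff >= 0 > diff:
            signals.append((i, "bearish"))   # fast average moves below the slow one
    return signals


# Hypothetical averages, for illustration only
signals = find_crossovers(fast=[1, 2, 3, 2, 1], slow=[2, 2, 2, 2, 2])
```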
Bollinger Bands
Bollinger Bands are an extremely popular tool used by traders and analysts to measure
volatility and to identify potential buy and sell opportunities.
The Bollinger Bands consist of three lines on a chart: a simple moving average (SMA) in
the middle, typically a 20-day moving average, and two bands that are set at a distance of
two standard deviations away from the SMA. When the market is volatile, the bands
widen, and when the market is less volatile, the bands contract. The distance between the
bands can therefore be used as an indicator of volatility.
Usually, traders use these bands to identify possible buying and selling entries. When the
price touches or crosses below the lower band, it may be considered oversold and a
potential buy signal. Conversely, when the price touches or crosses above the upper band,
it may be considered overbought and a potential sell signal.
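To obtain the Bollinger Bands values, there are three important steps:
1. Calculate the 20-day SMA of the stock's price.
2. Calculate the standard deviation of the stock's price over the past 20 periods.
3. Calculate the upper and lower bands by adding or subtracting two standard deviations
from the 20-day SMA.
The formula to calculate the upper band is:
$$Upper\ Band = SMA_{20} + 2 \times \sigma_{20}$$
And the formula to calculate the lower band is:
$$Lower\ Band = SMA_{20} - 2 \times \sigma_{20}$$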
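In code, the upper band is the moving average plus two standard deviations and the lower band is the moving average minus two standard deviations. The sketch below is a minimal pure-Python version of those steps; the function name is hypothetical, and it uses the population standard deviation, which is an assumption (some libraries, including pandas rolling windows, default to the sample standard deviation).

```python
import statistics


def bollinger_bands(closes, window=20, num_std=2):
    """Return a list of (lower, middle, upper) tuples, one per full window."""
    bands = []
    for end in range(window, len(closes) + 1):
        chunk = closes[end - window : end]
        middle = statistics.fmean(chunk)                 # simple moving average
        spread = num_std * statistics.pstdev(chunk)      # band half-width
        bands.append((middle - spread, middle, middle + spread))
    return bands


# Hypothetical closing prices with a short window, for illustration only
toy_bands = bollinger_bands([2.0, 4.0, 4.0, 6.0], window=4)
flat_bands = bollinger_bands([5.0] * 20)
```

When prices do not move at all, the standard deviation is zero and the three lines collapse into one, which mirrors the way the bands contract in quiet markets.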
Relative Strength Index (RSI)
The Relative Strength Index (RSI) is another popular technical analysis tool used by
traders and analysts in financial markets. It measures the strength of price action and
can be used to identify possible buy and sell opportunities.
The RSI is calculated by comparing the average gains and losses of a stock over a specific
period of time. The formula for calculating the RSI involves two steps:
First, you calculate the Relative Strength (RS) of the stock, which is the ratio of the
average gains to the average losses over a lookback window, typically the last 14 days.
$$RS = \frac{Average\ Gain}{Average\ Loss}$$
In order to obtain the average gain or loss, the difference between the closing price of the
current day and the previous day is taken. If that difference is positive, then it is
considered a gain, but if that difference is negative, it is considered a loss.
The next step is to obtain the RSI using the RS value. The RSI ranges from 0 to 100 and is
plotted on a chart.
$$RSI = 100 - \frac{100}{1 + RS}$$
When the RSI is above 70, it is considered to be in an overbought region and could signal
a potential opportunity for selling. When it is below 30, it is considered to be in an
oversold region and could signal a potential opportunity for buying low.
The RSI is also popularly used to identify divergences between the RSI and price, which
can indicate a loss of momentum, i.e. strength, in the current trend and suggest a
possible reversal.
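The two steps described above can be sketched in a few lines of Python. This version uses plain averages of the gains and losses over the last `period` price changes, exactly as described in the text; many charting tools use Wilder's smoothed averages instead, so their values can differ. The function name and the sample prices are hypothetical.

```python
def rsi(closes, period=14):
    """RSI of the most recent `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    recent = closes[-(period + 1):]
    changes = [curr - prev for prev, curr in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0                      # no losses in the window
    rs = avg_gain / avg_loss              # Relative Strength
    return 100 - 100 / (1 + rs)


# Hypothetical closing prices with a short lookback, for illustration only
balanced = rsi([1, 2, 1, 2, 1], period=4)     # equal gains and losses
mostly_up = rsi([1, 2, 3, 4, 3], period=4)    # gains three times the losses
```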
Average True Range (ATR)
The Average True Range (ATR) is yet another common indicator used to measure the
volatility of price action and help traders to identify potential buy and sell opportunities.
It is calculated by averaging the true range of a stock over a specific period of time. The
true range is the maximum value of the three calculations below:
1. The absolute difference between the current high and the previous close.
2. The absolute difference between the current low and the previous close.
3. The difference between the current high and the current low.
Before obtaining the ATR, we need to first obtain the true range through the following
formula:
$$TR = \max\left(H - L,\ |H - C_{prev}|,\ |L - C_{prev}|\right)$$
Where:
H is the current high.
L is the current low.
C_prev is the previous close.
The ATR ranges from 0 to infinity, and the higher its value, the higher may be the
volatility in price action, while the opposite is true for a lower ATR value.
Traders and analysts may use the ATR to set stop-loss and take-profit levels, as well as to
spot changes in market conditions. For example, if the ATR value is increasing, it indicates
that volatility is expanding, which can accompany the start of a new move or a trend reversal.
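A minimal sketch of the true range and the ATR is shown below. It takes a plain average of the most recent true ranges; Wilder's original ATR uses a smoothed average, so this is a simplification. The function names, the default 14-period window, and the sample prices are assumptions for illustration.

```python
def true_ranges(highs, lows, closes):
    """True range for every bar except the first, which has no previous close."""
    ranges = []
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        ranges.append(max(
            highs[i] - lows[i],             # current high minus current low
            abs(highs[i] - prev_close),     # current high versus previous close
            abs(lows[i] - prev_close),      # current low versus previous close
        ))
    return ranges


def atr(highs, lows, closes, period=14):
    """Plain average of the last `period` true ranges."""
    ranges = true_ranges(highs, lows, closes)
    if len(ranges) < period:
        raise ValueError("not enough bars for the requested period")
    return sum(ranges[-period:]) / period


# Hypothetical bars, for illustration only
toy_highs = [10.0, 12.0, 11.0]
toy_lows = [9.0, 10.0, 8.0]
toy_closes = [9.5, 11.0, 9.0]
toy_atr = atr(toy_highs, toy_lows, toy_closes, period=2)
```

In the second bar the gap up from the previous close (2.5) is larger than the bar's own high-low range (2.0), which is exactly the situation the true range is designed to capture.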
In [54]:
# Downloading Apple Stocks
aapl = yf.download('AAPL', start = '2010-07-01', end = '2023-02-11')
[*********************100%***********************] 1 of 1 completed
In [55]:
aapl # Displaying data
Out[55]:
                  Open        High         Low       Close   Adj Close    Volume
Date
2023-02-06  152.570007  153.100006  150.779999  151.729996  151.498688  69858300
2023-02-07  150.639999  155.229996  150.639999  154.649994  154.414230  83322600
2023-02-08  153.880005  154.580002  151.169998  151.919998  151.688400  64120100
2023-02-09  153.779999  154.330002  150.419998  150.869995  150.639999  56007100
2023-02-10  149.460007  151.339996  149.220001  151.009995  151.009995  57450700
In [56]:
aapl.dtypes # Printing data types
Out[56]:
Open float64
High float64
Low float64
Close float64
Adj Close float64
Volume int64
dtype: object
After obtaining historical data of Apple stocks with yfinance, we may use Plotly to plot a
Candlestick chart containing information on price and volume.
In [57]:
# Plotting candlestick chart without indicators
fig = make_subplots(rows=2, cols=1, shared_xaxes=True, vertical_spacing=0.05, row_heights = [0.7, 0.3])
fig.add_trace(go.Candlestick(x=aapl.index,
open=aapl['Open'],
high=aapl['High'],
low=aapl['Low'],
close=aapl['Adj Close'],
name='AAPL'),
row=1, col=1)
# Plotting annotation
fig.add_annotation(text='Apple (AAPL)',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
# Configuring layout
fig.update_layout(title='AAPL Candlestick Chart From July 1st, 2010 to February 10th, 2023',
yaxis=dict(title='Price (USD)'),
height=1000,
template = 'plotly_dark')
# Plotting volume bars on the second row
fig.add_trace(go.Bar(x=aapl.index, y=aapl['Volume'], name='Volume'), row=2, col=1)
fig.show()
[Figure: AAPL candlestick chart with the "Apple (AAPL)" watermark; y-axis: Price (USD), with the volume panel below]
Above, it's possible to see a daily candlestick chart containing price and volume of Apple
stocks traded from July 1st, 2010 to February 10th, 2023. The candlestick chart makes it
easy to see information on the opening, closing, high, and low prices of each trading day,
as well as the overall trend in the last 13 years.
The use of the 'Adj Close' column instead of the 'Close' column to plot the chart above is
due to the adjustment of historical prices for dividends and stock splits. The 'Adj Close'
value represents the closing price adjusted for these factors, which allows for a more
accurate representation of the stock's true price over time.
The volume bars below the candlestick chart display the number of shares traded
each day.
After taking a brief look at a candlestick chart without indicators, we can go on and add a
few indicators to our Apple stocks dataframe and display them on the candlestick chart.
In [58]:
# Adding Moving Averages
aapl['EMA9'] = aapl['Adj Close'].ewm(span = 9, adjust = False).mean() # Exponential 9-Period Moving Average
aapl['SMA20'] = aapl['Adj Close'].rolling(window=20).mean() # Simple 20-Period Moving Average
aapl['SMA50'] = aapl['Adj Close'].rolling(window=50).mean() # Simple 50-Period Moving Average
aapl['SMA100'] = aapl['Adj Close'].rolling(window=100).mean() # Simple 100-Period Moving Average
aapl['SMA200'] = aapl['Adj Close'].rolling(window=200).mean() # Simple 200-Period Moving Average
                  Open        High         Low       Close   Adj Close     Volume  EMA
Date
2022-11-30  141.399994  148.720001  140.550003  148.029999  147.804321  111380900  146
2022-12-01  148.210007  149.130005  146.610001  148.309998  148.083893   71250400  146
2022-12-02  145.960007  148.000000  145.649994  147.809998  147.584656   65447400  147
2022-12-05  147.770004  150.919998  145.770004  146.630005  146.406464   68826400  146
2022-12-06  147.070007  147.300003  141.919998  142.910004  142.692139   64727200  146
2022-12-07  142.190002  143.369995  140.000000  140.940002  140.725143   69721100  145
2022-12-08  142.360001  143.520004  141.100006  142.649994  142.432526   62128300  144
2022-12-09  142.339996  145.570007  140.899994  142.160004  141.943283   76097000  143
2022-12-12  142.699997  144.500000  141.059998  144.490005  144.269730   70462700  144
2022-12-13  149.500000  149.970001  144.240005  145.470001  145.248230   93886200  144
2022-12-14  145.350006  146.660004  141.160004  143.210007  142.991684   82291200  144
2022-12-15  141.110001  141.800003  136.029999  136.500000  136.291901   98931900  142
2022-12-16  136.690002  137.649994  133.729996  134.509995  134.304932  160156900  140
2022-12-19  135.110001  135.199997  131.320007  132.369995  132.168198   79592600  139
2022-12-20  131.389999  133.250000  129.889999  132.300003  132.098312   77432800  137
2022-12-21  132.979996  136.809998  132.750000  135.449997  135.243500   85928000  137
2022-12-22  134.350006  134.559998  130.300003  132.229996  132.028412   77852100  136
2022-12-23  130.919998  132.419998  129.639999  131.860001  131.658981   63814900  135
2022-12-27  131.380005  131.410004  128.720001  130.029999  129.831772   69007800  134
2022-12-28  129.669998  131.029999  125.870003  126.040001  125.847855   85438400  132
2022-12-29  127.989998  130.479996  127.730003  129.610001  129.412415   75703700  131
2022-12-30  128.410004  129.949997  127.430000  129.929993  129.731918   77034200  131
2023-01-03  130.279999  130.899994  124.169998  125.070000  124.879326  112117500  130
2023-01-04  126.889999  128.660004  125.080002  126.360001  126.167366   89113600  129
2023-01-05  127.129997  127.769997  124.760002  125.019997  124.829399   80962700  128
2023-01-06  126.010002  130.289993  124.889999  129.619995  129.422394   87754700  128
2023-01-09  130.470001  133.410004  129.889999  130.149994  129.951584   70790800  128
2023-01-10  130.259995  131.259995  128.119995  130.729996  130.530701   63896200  129
2023-01-11  131.250000  133.509995  130.460007  133.490005  133.286499   69458900  130
2023-01-12  133.880005  134.259995  131.440002  133.410004  133.206619   71379600  130
2023-01-13  132.029999  134.919998  131.660004  134.759995  134.554550   57809700  131
2023-01-17  134.830002  137.289993  134.130005  135.940002  135.732758   63646600  132
2023-01-18  136.820007  138.610001  135.029999  135.210007  135.003876   69672800  132
2023-01-19  134.080002  136.250000  133.770004  135.270004  135.063782   58280400  133
2023-01-20  135.279999  138.020004  134.220001  137.869995  137.659805   80223600  134
2023-01-23  138.119995  143.320007  137.899994  141.110001  140.894882   81760300  135
2023-01-24  140.309998  143.160004  140.300003  142.529999  142.312714   66435100  136
2023-01-25  140.889999  142.429993  138.809998  141.860001  141.643738   65799300  137
2023-
143.169998 144.250000 141.899994 143.960007 143.740540 54105100 139
01-26
2023-
143.160004 147.229996 143.080002 145.929993 145.707520 70555800 140
01-27
2023-
144.960007 145.550003 142.850006 143.000000 142.781998 64015300 140
01-30
2023-
142.699997 144.339996 142.279999 144.289993 144.070023 65874500 141
01-31
2023-
143.970001 146.610001 141.320007 145.429993 145.208282 77663600 142
02-01
2023-
148.899994 151.179993 148.169998 150.820007 150.590088 118339000 143
02-02
2023-
148.029999 157.380005 147.830002 154.500000 154.264465 154357300 145
02-03
2023-
152.570007 153.100006 150.779999 151.729996 151.498688 69858300 147
02-06
Open High Low Close Adj Close Volume EMA
Date
2023-
150.639999 155.229996 150.639999 154.649994 154.414230 83322600 148
02-07
2023-
153.880005 154.580002 151.169998 151.919998 151.688400 64120100 149
02-08
2023-
153.779999 154.330002 150.419998 150.869995 150.639999 56007100 149
02-09
2023-
149.460007 151.339996 149.220001 151.009995 151.009995 57450700 149
02-10
In [60]:
# Plotting Candlestick charts with indicators
fig = make_subplots(rows=4, cols=1, shared_xaxes=True, vertical_spacing=0.05,
                    row_heights=[0.6, 0.10, 0.10, 0.20])
# Candlestick
fig.add_trace(go.Candlestick(x=aapl.index,
open=aapl['Open'],
high=aapl['High'],
low=aapl['Low'],
close=aapl['Adj Close'],
name='AAPL'),
row=1, col=1)
# Moving Averages
fig.add_trace(go.Scatter(x=aapl.index,
y=aapl['EMA9'],
mode='lines',
line=dict(color='#90EE90'),
name='EMA9'),
row=1, col=1)
fig.add_trace(go.Scatter(x=aapl.index,
y=aapl['SMA20'],
mode='lines',
line=dict(color='yellow'),
name='SMA20'),
row=1, col=1)
fig.add_trace(go.Scatter(x=aapl.index,
y=aapl['SMA50'],
mode='lines',
line=dict(color='orange'),
name='SMA50'),
row=1, col=1)
fig.add_trace(go.Scatter(x=aapl.index,
y=aapl['SMA100'],
mode='lines',
line=dict(color='purple'),
name='SMA100'),
row=1, col=1)
fig.add_trace(go.Scatter(x=aapl.index,
y=aapl['SMA200'],
mode='lines',
line=dict(color='red'),
name='SMA200'),
row=1, col=1)
# Bollinger Bands
fig.add_trace(go.Scatter(x=aapl.index,
y=aapl['BB_UPPER'],
mode='lines',
line=dict(color='#00BFFF'),
name='Upper Band'),
row=1, col=1)
fig.add_trace(go.Scatter(x=aapl.index,
y=aapl['BB_LOWER'],
mode='lines',
line=dict(color='#00BFFF'),
name='Lower Band'),
row=1, col=1)
fig.add_annotation(text='Apple (AAPL)',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
# Volume
fig.add_trace(go.Bar(x=aapl.index,
y=aapl['Volume'],
name='Volume',
marker=dict(color='orange', opacity=1.0)),
row=4, col=1)
# Layout
fig.update_layout(title='AAPL Candlestick Chart From July 1st, 2010 to February 10th, 2023',
                  yaxis=dict(title='Price (USD)'),
                  height=1000,
                  template='plotly_dark')
[Figure: AAPL Candlestick Chart From July 1st, 2010 to February 10th, 2023. Panels: Price (USD), RSI, ATR, Volume.]
Finally, the Candlestick chart above is a full representation of the price and volume over
time, along with additional indicators such as moving averages, Bollinger Bands, Relative
Strength Index, and ATR.
The number of indicators on screen may appear overwhelming at first glance, but this is
done on purpose! With Plotly's interactive features, you can customize the chart to suit
your preferences by selecting the time period and the indicators you wish to view.
In the first chart below, we have displayed only the 9-period exponential moving average
and the 20-period simple moving average on the price row. This simplifies the visual
representation of the data and allows you to focus only on these two indicators.
In the next chart, we have chosen to display only the Bollinger Bands with the 20-period
simple moving average on the price row.
Feel free to experiment with the Candlestick chart as you like!
You've seen that, with Python, it is easy not only to calculate but also to visualize a wide
variety of technical indicators for stocks and other assets. Beyond that, these indicators
can be used as features for machine learning models, and Python's libraries make it easy
to backtest strategies based on technical analysis and see how well we would have
performed buying and selling stocks based on them, which is something we will approach
later on in this notebook.
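The chart above plots BB_UPPER and BB_LOWER, but the cell that computes them is not shown in this section. As a rough, dependency-free sketch of how the moving averages and Bollinger Bands could be derived, the functions below assume a 20-period window, 2 standard deviations, and a population standard deviation; the function names and the sample prices are hypothetical, and the notebook itself most likely relied on pandas or the TA library instead.

```python
from statistics import mean, pstdev


def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    return [
        mean(prices[i + 1 - window:i + 1]) if i + 1 >= window else None
        for i in range(len(prices))
    ]


def ema(prices, span):
    """Recursive EMA with alpha = 2 / (span + 1), seeded with the first price.

    This mirrors the recursion of pandas' ewm(span=span, adjust=False).mean().
    """
    alpha = 2 / (span + 1)
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def bollinger_bands(prices, window=20, num_std=2):
    """Return (middle, upper, lower); entries are None during the warm-up."""
    middle = sma(prices, window)
    upper, lower = [], []
    for i, mid in enumerate(middle):
        if mid is None:
            upper.append(None)
            lower.append(None)
            continue
        # Population standard deviation of the same window (an assumption).
        spread = num_std * pstdev(prices[i + 1 - window:i + 1])
        upper.append(mid + spread)
        lower.append(mid - spread)
    return middle, upper, lower


# Hypothetical closing prices, not real market data
closes = [100 + (i % 7) - 0.5 * (i % 3) for i in range(30)]
middle, upper, lower = bollinger_bands(closes, window=20, num_std=2)
print(round(lower[-1], 2), round(middle[-1], 2), round(upper[-1], 2))
```

Plain lists keep the arithmetic visible; on a DataFrame the same idea is usually expressed with rolling(window).mean() and rolling(window).std().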
Fundamental Indicators
Just as there is a broad range of technical indicators, there is also a broad range of
fundamental indicators that investors rely on when deciding whether or not to invest in a
company.
The most commonly used fundamental indicators include earnings per share (EPS),
price-to-earnings ratio (P/E ratio), return on equity (ROE), debt-to-equity ratio, and
dividend yield. These indicators provide insights into the company's profitability,
valuation, efficiency, financial leverage, and dividend payments. Overall, fundamental
analysis involves examining a company's financial and economic data to evaluate its
financial health and potential for growth.
I'll also briefly present some fundamental indicators, so we can compare the stocks we've
been working with so far.
Earnings per Share (EPS)
Earnings per share (EPS) is an indicator that measures a company's net profit per
outstanding share of its stock. It is calculated by dividing the net profit by the total
number of shares outstanding. The EPS provides a way for investors to evaluate a
company's profitability on a per-share basis. A higher EPS indicates a company that is
more profitable, which may lead to increased demand for its shares and a higher stock
price in the future.
Price-to-Earnings Ratio (P/E)
The P/E ratio compares a company's stock price to its earnings per share (EPS). It is
calculated by dividing the current stock price by the EPS. The P/E ratio provides investors
with insight into how much they are paying for each dollar of earnings. A higher P/E ratio
indicates that investors are willing to pay more for each dollar of earnings, which may
suggest that they expect the company to grow in the future.
Return on Equity (ROE)
The ROE indicator measures a company's profitability by expressing its net income as a
percentage of shareholders' equity. It is calculated by dividing the net income by the
shareholders' equity. The ROE provides a way for investors to evaluate how effectively a
company is using its equity to generate profits. A higher ROE indicates a more efficient
use of shareholders' equity, which can lead to increased demand for shares and a higher
stock price, as well as an increase in the company's profits in the future.
Debt-to-Equity Ratio
The debt-to-equity ratio measures a company's financial leverage. It is calculated by
dividing the company's total liabilities by its shareholders' equity.
Dividend Yield
The dividend yield is an indicator that measures the annual dividend income generated
by a company's stock relative to its current market price. It is calculated by dividing the
annual dividend per share by the current market price per share. The dividend yield
provides a way for investors to evaluate the income potential of a company's stock. A
higher dividend yield means that the stock pays more in dividends relative to its price,
which can be attractive to investors looking for passive income.
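The definitions in this section reduce to a handful of divisions. The sketch below illustrates them with a hypothetical helper, fundamental_ratios, and made-up inputs; none of the figures describe a real company, and the debt-to-equity line assumes the usual definition of total liabilities divided by shareholders' equity.

```python
def fundamental_ratios(net_income, shares_outstanding, share_price,
                       shareholders_equity, total_liabilities,
                       annual_dividend_per_share):
    """Compute the fundamental indicators discussed above from raw figures."""
    eps = net_income / shares_outstanding
    return {
        "eps": eps,
        # The P/E ratio is not meaningful when earnings are zero or negative.
        "pe_ratio": share_price / eps if eps > 0 else None,
        "roe_pct": 100 * net_income / shareholders_equity,
        "debt_to_equity": total_liabilities / shareholders_equity,
        "dividend_yield_pct": 100 * annual_dividend_per_share / share_price,
    }


# Hypothetical company, not real data
ratios = fundamental_ratios(
    net_income=1_000_000,
    shares_outstanding=500_000,
    share_price=30.0,
    shareholders_equity=4_000_000,
    total_liabilities=2_000_000,
    annual_dividend_per_share=0.60,
)
for name, value in ratios.items():
    print(f"{name}: {value}")
```

Returning a dictionary keeps the indicators together, which makes it straightforward to compare two companies side by side.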
Important Note: As of March 2023, yfinance and other libraries appear to be affected by the
"Exception: yfinance failed to decrypt Yahoo data response" issue. I've tried other methods to obtain
these indicators, but none of them worked, or the data obtained didn't seem reliable. I'll therefore
add the indicators manually and print them below. I'll also leave a code cell demonstrating how you
would usually perform this extraction with yfinance, so you can reproduce it once the issue is
solved.
In [61]:
# Getting AAPL data
# aapl_ticker = yf.Ticker("AAPL")  # separate name, so the aapl price DataFrame is not overwritten
# aapl_eps = aapl_ticker.info['trailingEps']
# aapl_pe_ratio = aapl_ticker.info['trailingPE']
# aapl_roe = aapl_ticker.info['returnOnEquity']*100
# aapl_dy = aapl_ticker.info['dividendYield']*100
aapl_eps = 5.89
aapl_pe_ratio = 26.12
aapl_roe = 147.94
aapl_dy = 0.60
# Printing data
print('\n')
print('Apple (AAPL) Fundamental Indicators: ')
print('\n')
print('Earnings per Share (EPS): ',aapl_eps)
print('Price-to-Earnings Ratio (P/E): ', aapl_pe_ratio)
print('Return on Equity (ROE): ', aapl_roe,"%")
print('Dividend Yield: ', aapl_dy,"%")
print('\n')
print('AMD Fundamental Indicators: ')
print('\n')
print('Earnings per Share (EPS): ',amd_eps)
print('Price-to-Earnings Ratio (P/E): ', amd_pe_ratio)
print('Return on Equity (ROE): ',amd_roe,"%")
print('Dividend Yield: ', amd_dy,"%")
print('\n')
Apple (AAPL) Fundamental Indicators:
To briefly draw some conclusions from the indicators above: compared to AMD, AAPL has
a higher EPS, indicating that it generates a higher net profit per share. AAPL also has a
lower P/E ratio, meaning that the stock is less expensive per unit of earnings. It also has
a very high ROE of 147.94%, which suggests that it generates a large profit relative to its
shareholders' equity. AMD, on the other hand, has a relatively low ROE, indicating that
the company is not as profitable relative to its equity as AAPL. AAPL also has a modest
dividend yield of 0.6%, indicating that it pays dividends to shareholders, while AMD does
not pay any dividends at all.
Overall, the indicators above suggest that AAPL is a more profitable company with a
more mature business model than AMD.
4 | Backtesting
Below, we're going to backtest two different types of trading strategies, the RSI strategy
and the moving average crossover strategy, on the EUR/USD currency pair, one of the
most traded pairs in Forex, in three different timeframes.
RSI Backtesting
The Relative Strength Index (RSI), as mentioned before, can be used to identify
overbought and oversold regions: traders may sell when it is above 70 (overbought) and
buy when it is below 30 (oversold).
This kind of strategy can be seen as a countertrend strategy, as traders are looking for
good entry points to open a position that goes against the current trend.
For this backtesting, we will buy whenever the RSI goes below 30 and sell whenever it
goes above 70.
Keep in mind that this backtest is deliberately simple and is only meant to point you in a
direction for your own backtests. In practice, it is also important to account for taxes,
slippage, and the other costs you'll incur while trading. Here, we're talking about gross
profits and losses only.
For this backtest, let's consider an initial capital of 100.00 dollars. Then, we're going
to print the initial capital, the number of trades, and the final capital after the backtest.
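The rules above can be sketched in plain Python as follows. This is not the notebook's own implementation: the helper names are hypothetical, the RSI uses Wilder's smoothing (which may differ slightly from the TA library's values), the strategy is long-only with all capital invested on each buy, trades are filled at the closing price of the bar that generates the signal, and every buy or sell counts as one trade.

```python
def rsi(prices, period=14):
    """Wilder-style RSI; the first `period` entries are None (warm-up)."""
    if len(prices) <= period:
        return [None] * len(prices)
    changes = [b - a for a, b in zip(prices, prices[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]

    def to_rsi(avg_gain, avg_loss):
        if avg_loss == 0:
            return 100.0  # no losses in the window: RSI is pinned at 100
        return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)

    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [None] * period + [to_rsi(avg_gain, avg_loss)]
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out.append(to_rsi(avg_gain, avg_loss))
    return out


def backtest_rsi(prices, period=14, oversold=30, overbought=70,
                 initial_capital=100.0):
    """Buy when RSI < oversold, sell when RSI > overbought (gross of costs)."""
    cash, units, num_trades = initial_capital, 0.0, 0
    for price, value in zip(prices, rsi(prices, period)):
        if value is None:
            continue
        if value < oversold and units == 0:
            units, cash = cash / price, 0.0  # invest all available cash
            num_trades += 1
        elif value > overbought and units > 0:
            cash, units = units * price, 0.0  # close the whole position
            num_trades += 1
    final_capital = cash + units * prices[-1]  # mark any open position
    return num_trades, final_capital


# Hypothetical price path: a steady decline followed by a steady rally
prices = [100.0 - i for i in range(20)] + [81.0 + 2 * i for i in range(1, 21)]
num_trades, final_capital = backtest_rsi(prices)
total_return = (final_capital / 100.0 - 1) * 100
print(f"Number of trades: {num_trades}")
print(f"Final capital: ${final_capital:.2f}")
print(f"Total return: {total_return:.2f}%")
```

Filling at the signal bar's close is a simplification; a stricter backtest would execute on the following bar to avoid acting on information that is only known once the bar has closed.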
Hourly Data
In [63]:
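# Loading hourly data for the EUR/USD pair
end_date = dt.datetime.now()  # Current date and time (March 21st, 2023 when this cell was run)
start_date = end_date - dt.timedelta(days=729)  # Loading hourly data for the last 729 days
# Assumed download call, mirroring the daily and weekly cells in this section
hourly_eur_usd = yf.download('EURUSD=X', start=start_date, end=end_date, interval='1h')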
[*********************100%***********************] 1 of 1 completed
Out[63]:
Datetime                   Open      High      Low       Close     Adj Close  Volume
2021-03-22 19:00:00+00:00  1.194458  1.194600  1.193887  1.194030  1.194030   0
2021-03-22 20:00:00+00:00  1.194030  1.194458  1.193602  1.193602  1.193602   0
2021-03-22 21:00:00+00:00  1.193602  1.194315  1.193460  1.193887  1.193887   0
2021-03-22 22:00:00+00:00  1.193745  1.194172  1.193460  1.193745  1.193745   0
2021-03-22 23:00:00+00:00  1.193745  1.194172  1.193317  1.193602  1.193602   0
...                        ...       ...       ...       ...       ...        ...
2023-03-21 15:00:00+00:00  1.077935  1.078516  1.076890  1.077470  1.077470   0
2023-03-21 16:00:00+00:00  1.077354  1.077935  1.076542  1.077122  1.077122   0
2023-03-21 17:00:00+00:00  1.077238  1.077470  1.076426  1.076658  1.076658   0
2023-03-21 18:00:00+00:00  1.076658  1.077006  1.076079  1.077006  1.077006   0
2023-03-21 19:00:00+00:00  1.076774  1.077354  1.076542  1.076890  1.076890   0
Out[64]:
Datetime                   Open      High      Low       Close     Adj Close  Volume  rsi
2021-03-22 19:00:00+00:00  1.194458  1.194600  1.193887  1.194030  1.194030   0       NaN
2021-03-22 20:00:00+00:00  1.194030  1.194458  1.193602  1.193602  1.193602   0       NaN
2021-03-22 21:00:00+00:00  1.193602  1.194315  1.193460  1.193887  1.193887   0       NaN
2021-03-22 22:00:00+00:00  1.193745  1.194172  1.193460  1.193745  1.193745   0       NaN
2021-03-22 23:00:00+00:00  1.193745  1.194172  1.193317  1.193602  1.193602   0       NaN
2023-03-21 15:00:00+00:00  1.077935  1.078516  1.076890  1.077470  1.077470   0       71.456319
2023-03-21 16:00:00+00:00  1.077354  1.077935  1.076542  1.077122  1.077122   0       68.781848
2023-03-21 17:00:00+00:00  1.077238  1.077470  1.076426  1.076658  1.076658   0       65.274986
2023-03-21 18:00:00+00:00  1.076658  1.077006  1.076079  1.077006  1.077006   0       66.648418
2023-03-21 19:00:00+00:00  1.076774  1.077354  1.076542  1.076890  1.076890   0       65.714443
# Adding annotation
fig.add_annotation(text='EUR/USD 1HR',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
# Configuring subplots
fig.update_xaxes(rangeslider_visible=False)
fig.update_yaxes(title_text='Price', row = 1)
fig.update_yaxes(title_text='RSI', row = 2)
fig.show()
[Figure: EUR/USD Hourly Candlestick Chart from 2021 to 2023. Panels: Price, RSI.]
In [66]:
# Defining the parameters for the RSI strategy
rsi_period = 14
overbought = 70
oversold = 30
print('\n')
print(f"Number of trades: {num_trades}")
print(f"Initial capital: ${initial_capital}")
print(f"Final capital: ${final_capital:.2f}")
print(f"Total return: {total_return:.2f}%")
print('\n')
fig.add_trace(go.Scatter(x=hourly_eur_usd.index,
y=hourly_eur_usd['portfolio_value'].round(2),
mode='lines',
line=dict(color='#00BFFF'),
name='Portfolio Value'))
fig.show()
Number of trades: 359
Initial capital: $100
Final capital: $100.70
Total return: 0.70%
[Figure: portfolio value over time for the hourly RSI backtest; y-axis: Value ($).]
Daily Data
In [67]:
# Loading daily EUR/USD pair data for the last eight years
daily_eur_usd = yf.download('EURUSD=X', start='2015-03-13', end='2023-03-13', interval='1d')
[*********************100%***********************] 1 of 1 completed
In [68]:
daily_eur_usd # Daily dataframe
Out[68]:
Date        Open      High      Low       Close     Adj Close  Volume
2015-03-13  1.062598  1.063000  1.048630  1.062631  1.062631   0
2015-03-16  1.048449  1.061400  1.048449  1.048163  1.048163   0
2015-03-17  1.057284  1.064900  1.055298  1.057295  1.057295   0
2015-03-18  1.059591  1.064600  1.058050  1.059805  1.059805   0
Out[69]:
Date        Open      High      Low       Close     Adj Close  Volume  rsi
2015-03-13  1.062598  1.063000  1.048630  1.062631  1.062631   0       NaN
2015-03-16  1.048449  1.061400  1.048449  1.048163  1.048163   0       NaN
2015-03-17  1.057284  1.064900  1.055298  1.057295  1.057295   0       NaN
2015-03-18  1.059591  1.064600  1.058050  1.059805  1.059805   0       NaN
# Annotation
fig.add_annotation(text='EUR/USD 1D',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
# Configuring subplots
fig.update_xaxes(rangeslider_visible=False)
fig.update_yaxes(title_text='Price', row = 1)
fig.update_yaxes(title_text='RSI', row = 2)
fig.show()
[Figure: EUR/USD Daily Candlestick Chart from 2015 to 2023. Panels: Price, RSI.]
In [71]:
# Defining the parameters for the RSI strategy
rsi_period = 14
overbought = 70
oversold = 30
print('\n')
print(f"Number of trades: {num_trades}")
print(f"Initial capital: ${initial_capital}")
print(f"Final capital: ${final_capital:.2f}")
print(f"Total return: {total_return:.2f}%")
print('\n')
fig.add_trace(go.Scatter(x=daily_eur_usd.index,
y=daily_eur_usd['portfolio_value'].round(2),
mode='lines',
line=dict(color='#00BFFF'),
name='Portfolio Value'))
fig.show()
Number of trades: 55
Initial capital: $100
Final capital: $97.99
Total return: -2.01%
[Figure: portfolio value over time for the daily RSI backtest; y-axis: Value ($).]
Weekly Data
In [72]:
# Weekly time frame
weekly_eur_usd = yf.download('EURUSD=X', start='2015-03-13', end='2023-03-13', interval='1wk')
[*********************100%***********************] 1 of 1 completed
In [73]:
# Calculating the RSI with the TA library
weekly_eur_usd["rsi"] = ta.momentum.RSIIndicator(weekly_eur_usd["Adj Close"], window=14).rsi()
weekly_eur_usd
Out[73]:
Date        Open      High      Low       Close     Adj Close  Volume  rsi
2015-03-09  1.062598  1.063000  1.046003  1.048163  1.048163   0       NaN
2015-03-16  1.048449  1.091405  1.048449  1.082661  1.082661   0       NaN
2015-03-23  1.082603  1.104520  1.076950  1.088495  1.088495   0       NaN
# Annotations
fig.add_annotation(text='EUR/USD 1W',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
# Configuring subplots
fig.update_xaxes(rangeslider_visible=False)
fig.update_yaxes(title_text='Price', row = 1)
fig.update_yaxes(title_text='RSI', row = 2)
fig.show()
[Figure: EUR/USD Weekly Candlestick Chart from 2015 to 2023. Panels: Price, RSI.]
In [75]:
# Defining the parameters for the RSI strategy
rsi_period = 14
overbought = 70
oversold = 30
print('\n')
print(f"Number of trades: {num_trades}")
print(f"Initial capital: ${initial_capital}")
print(f"Final capital: ${final_capital:.2f}")
print(f"Total return: {total_return:.2f}%")
print('\n')
fig.add_trace(go.Scatter(x=weekly_eur_usd.index,
y=weekly_eur_usd['portfolio_value'].round(2),
mode='lines',
line=dict(color='#00BFFF'),
name='Portfolio Value'))
fig.show()
Number of trades: 8
Initial capital: $100
Final capital: $100.32
Total return: 0.32%
[Figure: portfolio value over time for the weekly RSI backtest; y-axis: Value ($).]
Several conclusions can be drawn from the results above. First, the hourly timeframe is
characterized by high volatility, leading to frequent swings of the RSI across the 30 and 70
levels and resulting in a significantly higher number of trades (359) compared to the daily
(55) and weekly (8) timeframes.
Overall, the backtesting results indicate that the strategy was not very successful, with the
portfolio losing 2.01% of its initial value on the daily timeframe and generating only small
returns of 0.32% and 0.70% on the weekly and hourly timeframes, respectively. It's
important to note, however, that this backtesting did not account for slippage and other
additional costs, which could have a significant impact on the overall profitability of the
strategy. Therefore, it's important to carefully consider these factors when applying this
kind of strategy in real-world scenarios.
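To get a feeling for how much trading costs matter, the sketch below scales the gross results reported above by a proportional cost charged on every trade. The helper name and the 0.01% cost per trade are hypothetical, and the calculation assumes the whole position is traded each time, so it is only a rough approximation.

```python
def net_final_capital(gross_final, num_trades, cost_per_trade):
    """Scale a gross result by a proportional cost charged on every trade.

    Assumes the full position is traded each time, so every trade keeps a
    fraction (1 - cost_per_trade) of the capital.
    """
    return gross_final * (1 - cost_per_trade) ** num_trades


# Gross final capital and trade counts from the RSI backtests above;
# the 0.01% cost per trade is a hypothetical figure.
results = {"hourly": (100.70, 359), "daily": (97.99, 55), "weekly": (100.32, 8)}
for timeframe, (gross_final, num_trades) in results.items():
    net = net_final_capital(gross_final, num_trades, cost_per_trade=0.0001)
    print(f"{timeframe}: gross ${gross_final:.2f} -> net ${net:.2f}")
```

Because the cost compounds with the number of trades, the hourly strategy, with by far the most trades, is the most sensitive to this assumption.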
Moving Average Crossover Backtesting
The moving average crossover strategy, in contrast to the RSI strategy, is a trend-
following strategy. This strategy is based on two different moving averages, one
representing a shorter period and another representing a longer period.
When the shorter moving average crosses above the longer moving average, it's a signal to
buy, and when it crosses below the longer moving average, it's a signal to sell.
For this backtesting, I used an exponential moving average of 9 periods for the shorter
moving average and a simple moving average of 20 periods for the longer moving
average. I chose an exponential moving average for the shorter period because it gives
more weight to recent prices, which makes it more responsive to market changes than
the simple moving average.
Once again, we'll consider an initial capital of 100.00 dollars for this backtesting.
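As a rough illustration of the crossover rules described above, the sketch below backtests a 9-period EMA against a 20-period SMA in plain Python. It is not the notebook's own implementation: the function names are hypothetical, the EMA is seeded with the first price, the strategy is long-only with all capital invested on each buy, and trades are filled at the close of the bar on which the cross is detected.

```python
from statistics import mean


def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    return [
        mean(prices[i + 1 - window:i + 1]) if i + 1 >= window else None
        for i in range(len(prices))
    ]


def ema(prices, span):
    """Recursive EMA with alpha = 2 / (span + 1), seeded with the first price."""
    alpha = 2 / (span + 1)
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def backtest_crossover(prices, short_span=9, long_window=20,
                       initial_capital=100.0):
    """Buy when the short EMA crosses above the long SMA, sell on the reverse."""
    short = ema(prices, short_span)
    long_ = sma(prices, long_window)
    cash, units, num_trades = initial_capital, 0.0, 0
    for i in range(1, len(prices)):
        if long_[i - 1] is None:
            continue  # the long average is still warming up
        prev_diff = short[i - 1] - long_[i - 1]
        diff = short[i] - long_[i]
        if prev_diff <= 0 < diff and units == 0:
            units, cash = cash / prices[i], 0.0  # cross above: buy
            num_trades += 1
        elif prev_diff >= 0 > diff and units > 0:
            cash, units = units * prices[i], 0.0  # cross below: sell
            num_trades += 1
    return num_trades, cash + units * prices[-1]


# Hypothetical price path: decline, rally, decline
prices = ([100.0 - i for i in range(30)]
          + [72.0 + i for i in range(30)]
          + [100.0 - i for i in range(30)])
num_trades, final_capital = backtest_crossover(prices)
print(f"Number of trades: {num_trades}")
print(f"Final capital: ${final_capital:.2f}")
```

On a smooth path like this one the crossover captures the middle of the rally; on choppy data the same rules tend to produce many small losing trades, which is worth keeping in mind when reading the results below.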
Hourly Data
In [76]:
# Creating the 9-period exponential moving average with the TA library
hourly_eur_usd['ema9'] = ta.trend.ema_indicator(hourly_eur_usd['Adj Close'], window=9)
Out[76]:
Datetime                   Open      High      Low       Close     ...  ema9      sma20
2021-03-22 19:00:00+00:00  1.194458  1.194600  1.193887  1.194030  ...  NaN       NaN
2021-03-22 20:00:00+00:00  1.194030  1.194458  1.193602  1.193602  ...  NaN       NaN
2021-03-22 21:00:00+00:00  1.193602  1.194315  1.193460  1.193887  ...  NaN       NaN
2021-03-22 22:00:00+00:00  1.193745  1.194172  1.193460  1.193745  ...  NaN       NaN
2021-03-22 23:00:00+00:00  1.193745  1.194172  1.193317  1.193602  ...  NaN       NaN
...                        ...       ...       ...       ...       ...  ...       ...
2023-03-21 15:00:00+00:00  1.077935  1.078516  1.076890  1.077470  ...  1.076459  1.073970
2023-03-21 16:00:00+00:00  1.077354  1.077935  1.076542  1.077122  ...  1.076591  1.074195
2023-03-21 17:00:00+00:00  1.077238  1.077470  1.076426  1.076658  ...  1.076605  1.074409
2023-03-21 18:00:00+00:00  1.076658  1.077006  1.076079  1.077006  ...  1.076685  1.074634
2023-03-21 19:00:00+00:00  1.076774  1.077354  1.076542  1.076890  ...  1.076726  1.074871
# 9 EMA
fig.add_trace(go.Scatter(x=hourly_eur_usd.index,
y=hourly_eur_usd['ema9'],
mode='lines',
line=dict(color='yellow'),
name='EMA 9'))
# 20 SMA
fig.add_trace(go.Scatter(x=hourly_eur_usd.index,
y=hourly_eur_usd['sma20'],
mode='lines',
line=dict(color='green'),
name='SMA 20'))
# Annotation
fig.add_annotation(text='EUR/USD 1HR',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
# Layout
fig.update_layout(title='EUR/USD Hourly Candlestick Chart from 2021 to 2023',
                  yaxis=dict(title='Price'),
                  height=1000,
                  template='plotly_dark')
fig.update_xaxes(rangeslider_visible=False)
fig.show()
[Figure: EUR/USD Hourly Candlestick Chart from 2021 to 2023, with EMA 9 and SMA 20 overlays; y-axis: Price.]
In [78]:
# Defining the parameters for the moving average crossover strategy
short_ma = 'ema9'
long_ma = 'sma20'
fig.add_trace(go.Scatter(x=hourly_eur_usd.index,
y=hourly_eur_usd['portfolio_value'].round(2),
mode='lines',
line=dict(color='#00BFFF'),
name='Portfolio Value'))
fig.show()
Number of trades: 777
Initial capital: $100
Final capital: $101.48
Total return: 1.48%
[Figure: portfolio value over time for the hourly moving average crossover backtest; y-axis: Value ($).]
Daily Data
In [79]:
daily_eur_usd['ema9'] = ta.trend.ema_indicator(daily_eur_usd['Adj Close'], window=9)
Out[79]:
Date
2015-03-13 1.062598 1.063000 1.048630 1.062631 NaN NaN
fig.add_trace(go.Scatter(x=daily_eur_usd.index,
y=daily_eur_usd['ema9'],
mode='lines',
line=dict(color='yellow'),
name='EMA 9'))
fig.add_trace(go.Scatter(x=daily_eur_usd.index,
y=daily_eur_usd['sma20'],
mode='lines',
line=dict(color='green'),
name='SMA 20'))
fig.add_annotation(text='EUR/USD 1D',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
fig.update_xaxes(rangeslider_visible=False)
fig.show()
[Figure: EUR/USD Daily Candlestick Chart from 2015 to 2023, with EMA 9 and SMA 20 overlays; y-axis: Price.]
In [81]:
# Defining the parameters for the moving average crossover strategy
short_ma = 'ema9'
long_ma = 'sma20'
fig.add_trace(go.Scatter(x=daily_eur_usd.index,
y=daily_eur_usd['portfolio_value'].round(2),
mode='lines',
line=dict(color='#00BFFF'),
name='Portfolio Value'))
fig.show()
Number of trades: 131
Initial capital: $100
Final capital: $93.95
Total return: -6.05%
[Figure: portfolio value over time for the daily moving average crossover backtest; y-axis: Value ($).]
Weekly Data
In [82]:
weekly_eur_usd['ema9'] = ta.trend.ema_indicator(weekly_eur_usd['Adj Close'], window=9)
Out[82]:
fig.add_trace(go.Scatter(x=weekly_eur_usd.index,
y=weekly_eur_usd['ema9'],
mode='lines',
line=dict(color='yellow'),
name='EMA 9'))
fig.add_trace(go.Scatter(x=weekly_eur_usd.index,
y=weekly_eur_usd['sma20'],
mode='lines',
line=dict(color='green'),
name='SMA 20'))
fig.add_annotation(text='EUR/USD 1WK',
font=dict(color='white', size=40),
xref='paper', yref='paper',
x=0.5, y=0.65,
showarrow=False,
opacity=0.2)
fig.update_xaxes(rangeslider_visible=False)
fig.show()
[Figure: EUR/USD Weekly Candlestick Chart from 2015 to 2023, with EMA 9 and SMA 20 overlays.]
In [84]:
# Defining the parameters for the moving average crossover strategy
short_ma = 'ema9'
long_ma = 'sma20'
fig.add_trace(go.Scatter(x=weekly_eur_usd.index,
y=weekly_eur_usd['portfolio_value'].round(2),
mode='lines',
line=dict(color='#00BFFF'),
name='Portfolio Value'))
fig.show()
Number of trades: 24
Initial capital: $100
Final capital: $93.20
Total return: -6.80%
[Figure: portfolio value over time for the weekly moving average crossover backtest; y-axis: Value ($).]
Although the plots of the total portfolio values for the RSI strategies appeared to be
stationary, the portfolio evolution for the moving average crossover strategy exhibits
some sort of trend, especially in the hourly data. As can be seen, the portfolio experienced
gradual losses from April 2021 to October 2021, after which the losses accelerated.
However, since May 2022, the strategy has been generating gains again and has been
performing quite well in recent months, at least on the 1-hour timeframe.
Despite this, the strategy generated only a 1.48% return after 777 trades on the hourly
chart, without accounting for taxes, slippage, and other costs. On the daily timeframe, the
strategy resulted in 131 trades with a total loss of 6.05%, closing at 93.95 dollars.
Similarly, on the weekly timeframe, the strategy made 24 trades and ended with a total
loss of 6.80%, closing at 93.20 dollars. And, of course, additional costs and factors like
slippage could further reduce the final returns.
Overall, the backtests presented here are intended to encourage you to perform your own
tests. I highly suggest that you include additional factors such as slippage, broker costs,
and taxes in your analysis, and try combining multiple strategies simultaneously.
In this notebook, we have only tested the RSI and the moving average crossover strategies
individually. You can, however, combine these strategies, or use the RSI in conjunction
with other indicators such as Bollinger Bands, to develop more complex strategies.
Ultimately, the key to successful backtesting is to continuously refine and optimize your
strategies based on your own experiences and conditions.
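As one possible way of combining indicators, the sketch below only acts when the RSI and the Bollinger Bands agree. The function name and the thresholds are illustrative assumptions, not a rule taken from this notebook.

```python
def combined_signal(price, rsi_value, lower_band, upper_band,
                    oversold=30, overbought=70):
    """Return 'buy', 'sell', or 'hold' when RSI and Bollinger Bands agree."""
    if rsi_value < oversold and price < lower_band:
        return "buy"   # oversold and trading below the lower band
    if rsi_value > overbought and price > upper_band:
        return "sell"  # overbought and trading above the upper band
    return "hold"      # the two indicators do not confirm each other


# Hypothetical readings, not real market data
print(combined_signal(price=1.05, rsi_value=25.0, lower_band=1.06, upper_band=1.10))
print(combined_signal(price=1.08, rsi_value=25.0, lower_band=1.06, upper_band=1.10))
```

Requiring both conditions usually produces fewer signals than either indicator alone, so it is worth checking how many trades remain before judging the results.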
Conclusion
This notebook provided an introduction to using Python and Data Science for Financial
Markets. You learned how to analyze financial assets, build portfolios, and optimize them
for the best risk-adjusted returns. You also learned how to evaluate portfolios and
strategies, calculate technical indicators, and plot them on a candlestick chart with Plotly.
Furthermore, you learned how to obtain some of the most widely used fundamental
indicators and backtest technical strategies on historical data.
However, this is just the beginning of your journey into the world of financial data
analysis. To avoid overwhelming you with information, I've decided to conclude this
introductory notebook here.
But don't worry! Soon, I'll post the second part of this notebook, which will be entirely
dedicated to using Machine Learning algorithms for financial markets. So, stay tuned for
that!