entropy

Article
Predicting Bitcoin Prices Using Machine Learning
Athanasia Dimitriadou 1 and Andros Gregoriou 2, *

1 College of Business, Law and Social Sciences, University of Derby, Lonsdale House, Quaker Way,
Derby DE1 3HD, UK
2 School of Business and Law, University of Brighton, Elm House, Lewes Road, Brighton BN2 4AT, UK
* Correspondence: a.gregoriou@brighton.ac.uk

Abstract: In this paper we predict Bitcoin movements by utilizing a machine-learning framework. We compile a dataset of 24 potential explanatory variables that are often employed in the finance
literature. Using daily data from 2 December 2014 to 8 July 2019, we build forecasting models
that utilize past Bitcoin values, other cryptocurrencies, exchange rates and other macroeconomic
variables. Our empirical results suggest that the traditional logistic regression model outperforms
the linear support vector machine and the random forest algorithm, reaching an accuracy of 66%.
Moreover, based on the results, we provide evidence that points to the rejection of weak form
efficiency in the Bitcoin market.

Keywords: Bitcoin; machine learning; linear support vector machine; random forest

Citation: Dimitriadou, A.; Gregoriou, A. Predicting Bitcoin Prices Using Machine Learning. Entropy 2023, 25, 777. https://doi.org/10.3390/e25050777
Academic Editors: Andreia Dionísio, Paulo Ferreira, Dora Almeida and Isabel Vieira
Received: 12 April 2023
Revised: 6 May 2023
Accepted: 7 May 2023
Published: 10 May 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
Does Bitcoin respond to financial, cryptocurrency, and macroeconomic shocks? Should Bitcoin follow the efficient market hypothesis? Do the other cryptocurrencies affect the volatility of Bitcoin prices? Bitcoin emerged in 2009 as the world's first cryptocurrency, attracting new investors due to high returns. This is reflected by the returns of Bitcoin, as quoted on Coinbase, increasing by more than 120% from 2016 to 2017, reaching USD 20,000 from USD 900 for the purchase of a single Bitcoin token. In early 2017, the market capitalization of Bitcoin grew significantly from around USD 18 billion to nearly USD 600 billion at the end of that year. As an investment asset, Bitcoin was originally in the retail sector but has now become the benchmark for all other digital currencies that have emerged, such as Ethereum, XRP and Litecoin, among others.
Prior research has compared Bitcoin to gold due to its low correlation with other financial assets [1]. In a similar vein to gold, Bitcoin can be used to hedge against inflation or economic uncertainty, using futures contracts (Bakkt) and unregulated cryptocurrency derivatives exchanges, such as BitMex, Huobi and OKex [2,3]. The motivation behind this is that Bitcoin has a fixed supply, so it does not suffer from the devaluation problem of paper money that occurs through quantitative easing.
Although there are also some studies that focus on predicting stock market price movements, it is important to consider the cryptocurrency market, which, according to Ferreira et al. [4], is characterized by high volatility, no closed trading periods, relatively smaller capitalization, and high market data availability. However, in an efficient market [5], prices of securities in financial markets fully reflect all available information. Given that the future is unknown, prices should follow a random walk; that is, future changes in stock (security) prices should, for all practical purposes, be unpredictable. In the weak-form efficiency case, future returns cannot be predicted based on past price changes. Therefore, in the long run, one cannot outperform the market by using publicly available information.
However, Bitcoin and other digital assets are not backed by any tangible assets. In general, Bitcoin and other cryptocurrencies are known to react to certain public market announcements [6,7]. In this regard, the cryptocurrency market is highly efficient, with prices reflecting accessible real-world information almost instantly.
Various types of modeling methodologies have been applied in an attempt to forecast Bitcoin prices. Among the most prominent techniques are: random forest [8], artificial neural networks [9,10], Bayesian neural networks [11], and deep learning chaotic neural networks [12]. However, irrational and unexpected factors such as sentiment have been favored more in empirical research on the Bitcoin market [13–15]. Kraaijeveld and de Smedt [14] study to what extent public Twitter sentiment can be used to predict price returns for the nine largest cryptocurrencies, including Bitcoin. Nevertheless, some researchers have examined unexpected US monetary policy announcements, considering that these exercise a significant impact on Bitcoin [16], while others establish that there is a connection between cryptocurrencies and news, more broadly through macroeconomic news announcements. Corbet et al. [16] report that positive news after employment and durable goods announcements results in a decrease in Bitcoin returns. However, an increase in the percentage of negative news surrounding these announcements is linked with an increase in Bitcoin returns.
Akyildirim et al. [17] focus on the prediction of cryptocurrency returns by collecting daily data on the twelve most liquid cryptocurrencies and applying machine-learning classification algorithms, including the support vector machine (SVM), logistic regression models, artificial neural networks, and random forest. They find an average classification accuracy close to 50%
for all these techniques. Finally, they observe that the SVM gives superior and more
consistent results compared to those of logistic regression, artificial neural networks, and
random forest classification algorithms. Jaquart et al. [18] also apply machine-learning
techniques to predict high-frequency (one minute to 60 min) Bitcoin prices over the period
4 March 2019 to 10 December 2019. They discover that all tested models make statistically
viable predictions, forecasting the binary market movement with accuracies ranging from
50.9% to 56.0%. Chen et al. [19] apply several machine-learning methods to forecast
high-frequency (5-min intervals) Bitcoin prices. The authors collected daily data between
17 July 2017 and 17 January 2018. For daily forecasting, they observe that statistical methods
and machine learning achieve 66% and 65% accuracy, respectively, which outperforms
benchmark methods.
In our study, we attempt to uncover the potential relationship between cryptocurrencies and other financial variables using a machine-learning framework on weekly data. To accomplish this, we compile a pool of 24 potential regressors based on economic theory and prior research. Using two machine-learning techniques, an SVM model with a linear kernel and a random forest algorithm, we examine the directional forecasting performance of our models in comparison to the commonly used logistic regression model. The innovation of our work stems from the application of state-of-the-art machine-learning methodology and the
macroeconomic variables. We also specifically test the relationship between Bitcoin prices,
exchange rates, and interest rates as a possible empirical validation of the Efficient Market
Hypothesis (EMH) under a machine-learning framework.
The results of the empirical investigation provide evidence that the returns on Bitcoin
are independent of returns on other cryptocurrencies or macroeconomic determinants.
This reveals that Bitcoin is a special asset independent of monetary policy or other digital
currencies. According to this, investors could be able to utilize Bitcoin as a hedge against
regulatory frameworks affecting interest rates and inflation. Given its ability to act as
a hedge and its resistance to quantitative easing due to its limited supply, Bitcoin has
the potential to flourish and strongly influence alternative investments for several years
to come.
The remainder of the paper is organized in the following way: In the next section, we
describe the data. Section 3 outlines the methodology that we use in our research. Section 4
reports our empirical findings. Section 5 summarizes and concludes.

2. Data
We developed a binary classifier based on SVM to predict the price movements of Bitcoin. The data was collected daily from Coinlore.com, a website providing high-frequency cryptocurrency data. The macroeconomic variables and interest rates were obtained from the Federal Reserve Bank of St. Louis (FRED), and the selected exchange rates were acquired from Yahoo Finance. The data spans from 2 December 2014 to 8 July 2019. We compiled a dataset of 24 variables, which included the economic policy uncertainty (EPU) index as a factor, given that an increase in the EPU will change investors' sentiments for the worse, according to Yen and Cheng [20] (Panel A). We included the exchange rates of various currencies, such as EUR, GBP, JPY, and AUD, against the USD to check whether these currencies affect Bitcoin movements (Panel B). We also assembled the main interest rates that were used as benchmarks for the US and the European economy (Panel C). Moreover, following the literature on the drivers of Bitcoin's movements, we considered that other cryptocurrencies [21] could influence Bitcoin's volatility (Panel D). Finally, in Panel E, we created three different variables: the momentum for each of 5, 10 and 15 days from the start of the dataset, giving more information to the model.
Overall, more than 700 observations were collected, but because the stock exchange is closed on weekends and there were many missing values, we applied a filtering process to the data. After we filtered the data, the final sample consisted of 239 observations. Financial returns $r_t = \Delta P_t - \Delta P_{t-1}$ were calculated, with $P$ denoting the closing prices of each variable in our sample. All the variables in our data, along with summary statistics, are displayed in Table 1. The JPY/USD exchange rate and the cryptocurrency Deutsche eMark (DEM), with values of 0.000436 and 0.000019, respectively, have the smallest standard deviations, both close to zero. This indicates that these two factors have the lowest volatility. For the target (output), we modeled the return of Bitcoin using a binary dependent variable coded as 0 or 1, where 0 indicates that the return of Bitcoin is negative (the value decreased from the previous day) and 1 indicates that the return of Bitcoin is positive (the value increased from the previous day).
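As a minimal sketch of the target coding described above, the snippet below labels each day 1 if the closing price rose relative to the previous day and 0 otherwise. The function name `direction_labels` and the toy prices are hypothetical, and coding an unchanged price as 0 is an assumption, since the text does not say how ties are handled.

```python
def direction_labels(prices):
    """Return 1 where the price rose versus the previous day, else 0.

    Assumption: an unchanged price is coded 0; the text does not specify ties.
    """
    return [1 if today > yesterday else 0
            for yesterday, today in zip(prices, prices[1:])]


# Hypothetical closing prices, for illustration only.
closing = [100.0, 102.5, 101.0, 101.0, 105.0]
labels = direction_labels(closing)
print(labels)
```

The first observation has no previous day, so the label series is one element shorter than the price series.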

Table 1. Descriptive statistics of 18 cryptocurrencies and exchange rates.

| Variables | Name | Std | Mean | Skew | Kurt |
|---|---|---|---|---|---|
| TARGET | Bitcoin | 0.491265 | 0.401674 | 0.403676 | −1.852619 |
| Panel A: Macroeconomic Variables | | | | | |
| USEPUINDXD | Economic Policy Uncertainty Index for United States | 44.333749 | 87.418201 | 1.409511 | 4.538997 |
| Panel B: Exchange Rates | | | | | |
| EUR/USD | EUR/USD | 0.045994 | 1.133789 | 0.455684 | −0.134286 |
| GBP/USD | GBP/USD | 0.106192 | 1.370556 | 0.548099 | −1.066357 |
| JPY/USD | JPY/USD | 0.000436 | 0.008880 | 0.010883 | −0.234481 |
| AUD/USD | AUD/USD | 0.031169 | 0.749203 | 0.203230 | −0.407285 |
| Panel C: Interest Rates | | | | | |
| TB3MS | 3-Month Treasury Bill Secondary Market Rate, Discount Basis | 44.33288 | 87.412762 | 1.409956 | 4.540290 |
| DFII10 | Market Yield on U.S. Treasury Securities at 10-Year Constant Maturity | 0.414795 | 2.325826 | 0.085033 | −0.639574 |
| Panel D: Cryptocurrencies | | | | | |
| BTC Real | Bitcoin Real Price | 3783.646 | 3371.3188 | 1.320277 | 1.466443 |
| DOGE | Dogecoin | 3781.398 | 3375.7582 | 1.320594 | 1.469247 |
| MAID | MaidSafeCoin | 0.197630 | 0.195004 | 1.814227 | −0.328242 |
| XRP | XRP | 0.349752 | 0.231999 | 3.025756 | 13.524215 |
| NVC | Novacoin | 1.998961 | 1.882093 | 1.768265 | 2.927803 |
| NMC | Namecoin | 0.937335 | 0.955977 | 2.516421 | 8.185247 |
| LTC | Litecoin | 60.35639 | 43.998117 | 2.018088 | 4.684777 |
| GLC | Goldcoin | 0.071203 | 0.053806 | 2.404576 | 7.640745 |
| DASH | Dash | 218.0163 | 142.67451 | 2.471539 | 6.954688 |
| DEM | Deutsche eMark | 0.000019 | 0.000009 | 3.678487 | 15.999390 |
| ABY | ArtByte | 0.005136 | 0.002809 | 3.464867 | 15.451262 |
| DIME | Dimecoin | 0.011868 | 0.009364 | 3.020989 | 10.973465 |
| ORB | Orbitcoin | 0.163214 | 0.139930 | 2.294509 | 6.639723 |
| GRS | Groestlcoin | 0.380965 | 0.246300 | 2.019766 | 4.325836 |
| Panel E: Momentum Variables | | | | | |
| MOM5 | Momentum 5-Days | 1.146598 | 0.246300 | 0.412184 | −0.195015 |
| MOM10 | Momentum 10-Days | 1.591595 | 3.979079 | 0.349793 | −0.117683 |
| MOM15 | Momentum 15-Days | 2.102353 | 6.016736 | 0.311525 | −0.487390 |

3. Methodology
3.1. Logistic Regression Model
Undertaking directional forecasting requires that the dependent variable be binary
and take two states: 0 or 1, expressing the next negative and positive Bitcoin return
values, respectively. The basic drawback of the ordinary least squares (OLS) regression
methodology is that the nature of the dependent variable makes OLS regression results
irrelevant due to the heteroskedasticity of the estimated errors and the hypothesis violations
in the asymptotic efficiency of the estimated coefficients. To solve this issue, we estimated the probability $P_i = E(y_i = 1 \mid x_i) = \frac{e^{x_i \beta}}{1 + e^{x_i \beta}}$ that the dependent variable is equal to 1. Given the conversion of the dependent variable to binary, the logarithm of the odds of being in state 1 rather than state 0 is obtained from the following equation, which is called the "logit," where $x_i$ is the vector of the independent regressors and $\beta$ is a vector of the estimated coefficients.

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = x_i \beta$$
If the estimated Li is above 1, we classify it as belonging to class 1, while if it is below
1, we classify it in class 0.
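The relationship between the probability and the logit can be checked numerically. The sketch below is illustrative only: the regressor and coefficient values are hypothetical, and the classification cut-off is passed in explicitly so that the value stated in the text can be used. For reference, a logit of 0 corresponds to a probability of 0.5.

```python
import math


def probability(x, beta):
    """P_i = exp(x.beta) / (1 + exp(x.beta)) for one observation x."""
    z = sum(xj * bj for xj, bj in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)


def logit(p):
    """L_i = ln(P_i / (1 - P_i)), the log-odds of class 1."""
    return math.log(p / (1.0 - p))


def classify(logit_value, cutoff):
    """Assign class 1 when the logit exceeds the chosen cut-off, else class 0."""
    return 1 if logit_value > cutoff else 0


# Hypothetical regressors and coefficients, for illustration only.
x_i = [1.0, 0.5, -2.0]
beta = [0.2, 1.0, 0.1]
p_i = probability(x_i, beta)
l_i = logit(p_i)  # recovers x_i . beta
print(p_i, l_i, classify(l_i, cutoff=1.0))
```

Applying `logit` to the estimated probability returns the linear index, which is why the model is linear in the log-odds even though it is nonlinear in the probability.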

3.2. Support Vector Machine

Data classification and regression tasks usually include the use of the SVM, a super-
vised machine-learning methodology. It has gained great popularity due to its ability to
provide highly accurate prediction results without making a priori assumptions concerning
the phenomenon under investigation. Finding the ideal hyperplane that maximizes the
distance between the two classes and the highest level of accuracy enables the SVM to
classify the data into two classes [22]. A tiny minority of data points known as support
vectors (SV) that were found using a minimization technique define the hyperplane. This process is shown visually in Figure 1. In our study, the initial dataset is split into two subsamples: the training set and the testing set. The training step, in which the hyperplane is established, receives 80% of the data. The remaining 20% of the total sample forms the testing set, on which the generalization ability of the model is tested using the small part of the dataset that was set aside during training.
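A minimal sketch of the 80/20 split follows. It assumes the observations are kept in time order, with the last 20% held out; the text does not state whether the split was ordered or shuffled, so this is an assumption, and the function name is hypothetical.

```python
def train_test_split_ordered(rows, train_fraction=0.8):
    """Split rows into a leading training part and a trailing testing part.

    Assumption: observations stay in time order (no shuffling); the text
    does not state how the 80/20 split was drawn.
    """
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


observations = list(range(239))  # stand-ins for the 239 observations
train, test = train_test_split_ordered(observations)
print(len(train), len(test))
```

With 239 observations this rule gives 191 training and 48 testing observations.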

Figure 1. Hyperplane selection and support vectors. The pronounced red circles represent the SVs, thus defining the margins with the dashed lines. The dotted line describes the separating hyperplane.
The hyperplane is defined as:

$$\hat{w} = \sum_{i=1}^{N} a_i y_i x_i$$

$$\hat{b} = \hat{w}^{T} x_i - y_i, \quad i \in V$$

where $V = \{i : 0 < a_i < C\}$ is the set of support vector indices.
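To make the hyperplane definition concrete, the sketch below computes the weight vector and the offset from a given set of multipliers and then classifies new points by the sign of w·x − b. The two training points, their multipliers and the value of C are hypothetical toy values (chosen so that both points lie exactly on the margin), and the class labels are written as −1/+1, the usual SVM convention, rather than the 0/1 coding of the target variable.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def hyperplane(alphas, labels, points, C):
    """Return w = sum_i a_i y_i x_i, b = w.x_i - y_i for some i in V, and V."""
    dim = len(points[0])
    w = [sum(a * y * x[d] for a, y, x in zip(alphas, labels, points))
         for d in range(dim)]
    support = [i for i, a in enumerate(alphas) if 0 < a < C]
    i = support[0]  # at the exact optimum every index in V gives the same b
    b = dot(w, points[i]) - labels[i]
    return w, b, support


def predict(w, b, x):
    """Classify x by the side of the hyperplane it falls on."""
    return 1 if dot(w, x) - b >= 0 else -1


# Hypothetical toy problem: one point per class, both are support vectors.
points = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]
w, b, support = hyperplane(alphas, labels, points, C=1.0)
print(w, b, support, predict(w, b, (2.0, 0.0)))
```

Only observations with a non-zero multiplier contribute to w, which is why the fitted model can be stored compactly.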
The SVM with a linear kernel has become widespread, given that it possesses faster training and classification speeds with significantly fewer memory requirements than nonlinear kernels due to the SVM's compact representation of the decision function. In our research, we also examine the linear kernel, which detects the separating hyperplane in the original dimensional space of the dataset. The mathematical representation of the Radial Basis Function (RBF) kernel is the following:

$$\text{RBF}: \quad K(x_1, x_2) = e^{-\gamma \lVert x_1 - x_2 \rVert^{2}}$$
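The two kernels can be compared with a few lines of code. This is an illustrative sketch with hypothetical input vectors and a hypothetical value of γ; it only evaluates the kernel functions and does not train an SVM.

```python
import math


def linear_kernel(x1, x2):
    """Inner product in the original feature space."""
    return sum(a * b for a, b in zip(x1, x2))


def rbf_kernel(x1, x2, gamma):
    """K(x1, x2) = exp(-gamma * ||x1 - x2||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * sq_dist)


# Hypothetical points: identical, one unit apart, and five units apart.
same = rbf_kernel((1.0, 2.0), (1.0, 2.0), gamma=0.5)
near = rbf_kernel((1.0, 2.0), (1.0, 3.0), gamma=0.5)
far = rbf_kernel((1.0, 2.0), (4.0, 6.0), gamma=0.5)
print(same, near, far)
```

The RBF value equals 1 for identical points and decays toward 0 as the distance grows, with γ controlling how quickly similarity falls off.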

Over-fitting is a common issue that appears in the training set, where the model "learns" to accurately describe the training data, while giving worse performance on the test set. This concern is described in the literature as the "low bias–high variance" [23,24]. To avoid over-fitting, we use a cross-validation framework, displayed in Figure 2. The initial training set is split into n equal-sized parts. The training step is performed n times, each time training the model on n − 1 parts and holding out a different part for test purposes. This process is reiterated n times with the same set of parameters until every part has served as the test part, and the average accuracy over all n parts is used to evaluate the model performance for that set of hyperparameters. In our study, we use a 5-fold cross-validation procedure 5 times, applying and evaluating its accuracy on the sample of 20% of the data.
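The cross-validation loop described above can be sketched as follows. The fold construction is the standard one; the model used here is a hypothetical stand-in that always predicts the majority class of its training labels, and the toy data are illustrative only.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each fold is held out exactly once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def cross_val_accuracy(fit, predict, X, y, k=5):
    """Average held-out accuracy over k folds for one hyperparameter setting."""
    scores = []
    for train, val in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in val)
        scores.append(hits / len(val))
    return sum(scores) / k


# Hypothetical stand-in model: predicts the majority class of its training labels.
def fit_majority(X_train, y_train):
    return 1 if sum(y_train) * 2 > len(y_train) else 0


def predict_majority(model, x):
    return model


X = [[float(i)] for i in range(10)]
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]
folds = list(k_fold_indices(len(X), 5))
score = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
print(score)
```

In a hyperparameter search, `cross_val_accuracy` would be called once per candidate setting and the setting with the highest average retained.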
Figure 2. Overview of a 3-fold cross-validation training scheme. It shows that each fold is used as a testing sample, while the remaining folds are used for training the model for each parameter's value combination.
3.3. Random Forests
Random forest is an ensemble technique that combines the idea of decision trees with the bootstrapping and aggregating procedure to create a diversified pool of individual regression systems [25]. The random forest algorithm is referred to in the literature by many researchers as a method commonly used to avoid overfitting issues that may arise in decision trees by combining multiple decision trees into a setup called random forest [26,27]. Each tree is constructed from a bootstrap subsample of size n, the same size as the initial dataset, drawn with replacement. The observations that were not selected in the bootstrapping process form the out-of-bag (OOB) set used for testing the generalization ability of the trained model. To reduce the dependence of the models on the training set, each tree uses a randomly selected subset of the explanatory variables (features). Normally, we use the square root of the total number of features. The system aggregates the classification of each tree and retains the most popular class.

3.4. Performance Matrix
Our study uses four separate performance indicators to illustrate how effectively the machine-learning categorization models execute detailed forecasting. The confusion matrix is created as shown in Table 2, where the predictive scores are binary and just one single confusion matrix can analyze it. Each category of the confusion matrix (TN, FN, FP, TP) is evaluated separately. Specifically, the TN expresses the number of predictions that were correctly classified in the negative category, while the FP implies the number of predictions that were incorrectly classified in the positive category. Also, the FN expresses the number of predictions that were incorrectly classified in the negative category, while the TP declares the number of predictions that were correctly classified in the positive category.
Table 2. Classification Results using Confusion Matrix.

| | Predicted Label 0 | Predicted Label 1 |
|---|---|---|
| Actual 0 | TN (True Negatives) | FP (False Positives) |
| Actual 1 | FN (False Negatives) | TP (True Positives) |

Based on the results of the confusion matrix, the following performance metrics are
computed to evaluate the models.

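$$\text{Recall} = \frac{TP}{TP + FN} \tag{1}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$

$$\text{F1-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$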
All performance metrics range from 0 to 1. In our research, accuracy is the key performance metric to evaluate and compare the machine-learning models, as the models do not face imbalance problems between the two classes of the target variable. Accuracy is expressed as the ratio of all the true predictions (positives and negatives) to the total number of predictions. Moreover, accuracy is considered a significant performance metric in classification problems [28,29]. However, when the dataset has unbalanced data, a high value of accuracy can be misleading, since the models tend to choose the majority class, achieving extremely high accuracy ("Accuracy Paradox") [30].
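The accuracy paradox is easy to quantify with a baseline that always predicts the larger class. The sketch below uses the class counts reported in Section 4 (96 upward and 143 downward observations); the function name is hypothetical, and the calculation is an illustration rather than a result from the paper.

```python
def majority_baseline(n_positive, n_negative):
    """Accuracy and recall of a model that always predicts the larger class."""
    total = n_positive + n_negative
    predicts_positive = n_positive > n_negative
    correct = n_positive if predicts_positive else n_negative
    accuracy = correct / total
    recall = 1.0 if predicts_positive else 0.0  # share of positives recovered
    return accuracy, recall


# Class counts reported in the text: 96 upward and 143 downward observations.
accuracy, recall = majority_baseline(96, 143)
print(round(accuracy, 3), recall)
```

A model with no predictive content would therefore score close to 60% accuracy on the full sample while never identifying an upward move, which is why recall, precision and the F1-Score are reported alongside accuracy.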
Precision is the ratio of true positive cases to all cases predicted as positive (both true and false positives): the denominator counts how many times our model predicted the positive class, and the numerator counts how many of those predictions were actually positive. Recall is the fraction of true positive cases among all actual positive cases (true positives and false negatives): the numerator counts how many of the positive cases were correctly predicted by our model. The F1-Score is the harmonic mean of precision and recall (sensitivity).
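The four metrics can be computed directly from the confusion-matrix counts using their standard definitions. In the sketch below, the counts are hypothetical: they are not reported in the paper and were chosen only because they reproduce the logistic-regression row of Table 3 to the printed precision.

```python
def classification_metrics(tp, tn, fp, fn):
    """Recall, accuracy, precision and F1-Score from confusion-matrix counts."""
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "accuracy": accuracy,
            "precision": precision, "f1": f1}


# Hypothetical counts (not reported in the paper), chosen because they
# reproduce the logistic-regression row of Table 3.
metrics = classification_metrics(tp=7, tn=25, fp=6, fn=10)
print({name: round(value, 6) for name, value in metrics.items()})
```

Note that precision divides by the number of predicted positives (TP + FP), whereas recall divides by the number of actual positives (TP + FN), so the two can differ substantially for the same model.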

4. Empirical Results
Given the scope of this study, we apply a coarse-to-fine grid search scheme on the training set to obtain the optimal values of the hyperparameters that maximize the predictive ability of the SVM and random forest models. To accomplish this, we use a 5-fold cross-validation process, avoiding overfitting issues. Given the balanced nature of our dataset, the procedure continued to identify the best parameters of the optimal model. The optimal hyperparameters of the SVM model with an RBF kernel are c = 0.0001 and γ = 100, while the optimal hyperparameter for the random forest model that we tested was n-estimators = 75 (total number of decision trees).
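A coarse-to-fine grid search can be sketched as two passes over a logarithmic grid: a wide, sparse pass to locate a promising region, then a narrow, dense pass around the coarse winner. The grid ranges, step sizes and the scoring function below are all assumptions for illustration; in practice the score would be the cross-validated accuracy of the model for each (c, γ) pair.

```python
import math


def grid_search(score, c_values, gamma_values):
    """Return (best_score, best_c, best_gamma) over the Cartesian grid."""
    return max((score(c, g), c, g) for c in c_values for g in gamma_values)


def log_grid(center_exp, half_width, step):
    """Powers of ten around 10**center_exp, spaced `step` apart in the exponent."""
    n = int(round(2 * half_width / step))
    return [10 ** (center_exp - half_width + i * step) for i in range(n + 1)]


def coarse_to_fine(score):
    # Coarse pass: exponents -4..4 in steps of 2 for both hyperparameters.
    _, c0, g0 = grid_search(score, log_grid(0, 4, 2), log_grid(0, 4, 2))
    # Fine pass: one decade either side of the coarse winner, in steps of 0.5.
    return grid_search(score,
                       log_grid(math.log10(c0), 1, 0.5),
                       log_grid(math.log10(g0), 1, 0.5))


# Hypothetical smooth score with its peak at c = 10**0.5, gamma = 10**-1.5;
# it stands in for a cross-validated accuracy.
def toy_score(c, gamma):
    return -((math.log10(c) - 0.5) ** 2 + (math.log10(gamma) + 1.5) ** 2)


best_score, best_c, best_gamma = coarse_to_fine(toy_score)
print(best_c, best_gamma)
```

The coarse pass evaluates 25 combinations and the fine pass another 25, far fewer than a single dense grid covering the same range at the finer resolution.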
The generalization ability of the trained model is evaluated using the testing dataset, which includes 239 observations. Of these, 96 observations present an upward trend in Bitcoin's price, while 143 observations have a negative direction. We employ four different performance metrics: recall (sensitivity), accuracy, precision, and F1-Score.
According to Table 3, the random forest and the SVM with RBF kernel achieve the same accuracy of 58%. However, the Logit model achieves a higher predictive performance in terms of accuracy and precision. Its accuracy is the highest, at 0.66, which means that 66% of all predictions (positive and negative) were correct. Its precision is likewise the highest across the models (53%). This means that, of the cases in which the model forecasts an increase in Bitcoin return (true positives + false positives), 53% are actual increases (true positives), so they were correctly anticipated when the model predicted this category.
Table 3. Performance metrics of the three methodologies.

| | Recall | Accuracy | Precision | F1-Score |
|---|---|---|---|---|
| Logistic Regression Model | 0.411765 | 0.66667 | 0.538462 | 0.466667 |
| SVM Linear Kernel | 0.058824 | 0.583333 | 0.200000 | 0.090909 |
| Random Forest | 0.588235 | 0.583333 | 0.434783 | 0.500000 |

Notes: Three different performance metrics evaluated and analyzed using the Logistic Regression model, support vector machine, and random forest technique.
5. Conclusions
Bitcoin has evolved rapidly over the past decade and is attracting strong attention from investors, who see this as part of the alternative investment space. With this significantly growing attention from the investment community, Bitcoin is an important asset class for researchers and traders alike. The objective of our paper is to construct a model which predicts Bitcoin movements and to investigate whether Bitcoin follows an efficient market hypothesis or a random walk. To achieve this, we collect a large dataset consisting of 24 variables that includes exchange rates, interest rates, macroeconomic variables, another 13 cryptocurrencies, and four auxiliary variables, spanning the period from 2 December 2014 to 8 July 2019. The dataset includes 239 observations (5-day frequency), divided into two subsamples: in-sample and out-of-sample. Two different machine-learning techniques and a traditional regression model are used, namely, logistic regression, the support vector machine and the random forest algorithm, which demonstrate the predictability of the upward or downward price moves. For the machine-learning models, the optimal values of the respective hyperparameters were initially found using five-fold cross-validation and out-of-bag methods to avoid overfitting.
Figure 3 summarizes the results of the three forecasting methodologies used. A traditional logit model achieved the best performance (66% accuracy) for Bitcoin movements. However, the other performance metrics show broadly similar and lower results.

[Figure 3: bar chart titled "Compare Performance Metrics", showing recall, accuracy, precision and F1-score for the three models. The values displayed with the chart are listed below.]

| | Recall | Accuracy | Precision | F1-score |
|---|---|---|---|---|
| Logistic Regression Model | 0.411765 | 0.66667 | 0.538462 | 0.466667 |
| SVM Linear Kernel | 0.294118 | 0.604167 | 0.416667 | 0.344828 |
| Random Forest | 0.529412 | 0.5625 | 0.409091 | 0.461538 |

Figure 3. Aggregated results and comparison of proposed methodologies.

The empirical analysis confirms that the returns of Bitcoin are not affected by the returns of other cryptocurrencies or macroeconomic variables. This implies that Bitcoin is a unique asset that is not related to economic policy or other digital currencies. This suggests that investors can use Bitcoin as a hedge against government policy on inflation and interest rates. Given its hedging qualities and its robustness to quantitative easing due to its fixed supply, Bitcoin has the ability to continue to grow and make an important contribution to alternative investments for years to come.

Author Contributions: Writing—original draft, A.D. and A.G. All authors have read and agreed to
the published version of the manuscript.
Funding: This research received no external funding.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Henriques, I.; Sadorsky, P. Can Bitcoin Replace Gold in an Investment Portfolio? J. Risk Financ. Manag. 2018, 11, 48. [CrossRef]
2. Junttila, J.; Pesonen, J.; Raatikainen, J. Commodity market based hedging against stock market risk in times of financial crisis: The
case of crude oil and gold. J. Int. Financ. Mark. Inst. Money 2018, 56, 255–280. [CrossRef]
3. Tronzano, M. Financial Crises, Macroeconomic Variables, and Long-Run Risk: An Econometric Analysis of Stock Returns
Correlations (2000 to 2019). J. Risk Financ. Manag. 2021, 14, 127. [CrossRef]
4. Ferreira, M.; Rodrigues, S.; Reis, C.I.; Maximiano, M. Blockchain: A Tale of Two Applications. Appl. Sci. 2018, 8, 1506. [CrossRef]
5. Fama, E.F. Efficient Capital Markets: A Review of Theory and Empirical Work. J. Financ. 1970, 25, 383–417. [CrossRef]
6. Corbet, S.; Larkin, C.; Lucey, B.; Meegan, A.; Yarovaya, L. Cryptocurrency reaction to FOMC Announcements: Evidence of
heterogeneity based on blockchain stack position. J. Financ. Stab. 2020, 46, 100706. [CrossRef]
7. Joo, M.H.; Nishikawa, Y.; Dandapani, K. Announcement effects in the cryptocurrency market. Appl. Econ. 2020, 52, 4794–4808.
[CrossRef]
8. Basher, S.A.; Sadorsky, P. Forecasting Bitcoin price direction with random forests: How important are interest rates, inflation, and
market volatility? Mach. Learn. Appl. 2022, 9, 100355. [CrossRef]
9. Adcock, R.; Gradojevic, N. Non-fundamental, non-parametric Bitcoin forecasting. Phys. A Stat. Mech. Its Appl. 2019, 531, 121727.
[CrossRef]
10. Nakano, M.; Takahashi, A.; Takahashi, S. Bitcoin technical trading with artificial neural network. Phys. A Stat. Mech. Its Appl.
2018, 510, 587–609. [CrossRef]
11. Jang, H.; Lee, J. An Empirical Study on Modeling and Prediction of Bitcoin Prices with Bayesian Neural Networks Based on
Blockchain Information. IEEE Access 2018, 6, 5427–5437. [CrossRef]
12. Lahmiri, S.; Bekiros, S. Cryptocurrency forecasting with deep learning chaotic neural networks. Chaos Solitons Fractals 2019, 118,
35–40. [CrossRef]
13. Jain, A.; Tripathi, S.; Dwivedi, H.D.; Saxena, P. Forecasting Price of Cryptocurrencies Using Tweets Sentiment Analysis. In
Proceedings of the 2018 Eleventh International Conference on Contemporary Computing (IC3), Noida, India, 2–4 August 2018;
pp. 1–7. [CrossRef]
14. Kraaijeveld, O.; De Smedt, J. The predictive power of public Twitter sentiment for forecasting cryptocurrency prices. J. Int. Financ.
Mark. Inst. Money 2020, 65, 101188. [CrossRef]
15. Valencia, F.; Gómez-Espinosa, A.; Valdés-Aguirre, B. Price Movement Prediction of Cryptocurrencies Using Sentiment Analysis
and Machine Learning. Entropy 2019, 21, 589. [CrossRef]
16. Corbet, S.; Larkin, C.; Lucey, B.M.; Meegan, A.; Yarovaya, L. The impact of macroeconomic news on Bitcoin returns. Eur. J. Financ.
2020, 26, 1396–1416. [CrossRef]
17. Akyildirim, E.; Goncu, A.; Sensoy, A. Prediction of cryptocurrency returns using machine learning. Ann. Oper. Res. 2021, 297,
3–36. [CrossRef]
18. Jaquart, P.; Dann, D.; Weinhardt, C. Short-term bitcoin market prediction via machine learning. J. Financ. Data Sci. 2021, 7, 45–66.
[CrossRef]
19. Chen, Z.; Li, C.; Sun, W. Bitcoin price prediction using machine learning: An approach to sample dimension engineering.
J. Comput. Appl. Math. 2020, 365, 112395. [CrossRef]
20. Yen, K.-C.; Cheng, H.-P. Economic Policy Uncertainty and Cryptocurrency Volatility. Financ. Res. Lett. 2021, 38, 101428. [CrossRef]
21. Zięba, D.; Kokoszczyński, R.; Śledziewska, K. Shock transmission in the cryptocurrency market. Is Bitcoin the most influential?
Int. Rev. Financ. Anal. 2019, 64, 102–125. [CrossRef]
22. Vapnik, V. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995.
23. Mehta, P.; Bukov, M.; Wang, C.-H.; Day, A.G.; Richardson, C.; Fisher, C.K.; Schwab, D.J. A high-bias, low-variance introduction to
Machine Learning for physicists. Phys. Rep. 2019, 810, 1–124. [CrossRef] [PubMed]
24. Russo, D.; Zou, J. How Much Does Your Data Exploration Overfit? Controlling Bias via Information Usage. IEEE Trans. Inf.
Theory 2020, 66, 302–323. [CrossRef]
25. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [CrossRef]
26. Lang, L.; Tiancai, L.; Shan, A.; Xiangyan, T. An improved random forest algorithm and its application to wind pressure prediction.
Int. J. Intell. Syst. 2021, 36, 4016–4032. [CrossRef]
27. Mishina, Y.; Murata, R.; Yamauchi, Y.; Yamashita, T.; Fujiyoshi, H. Boosted Random Forest. IEICE Trans. Inf. Syst. 2015, 98,
1630–1636. [CrossRef]
28. Fernández, J.C.; Carbonero, M.; Gutiérrez, P.A.; Hervás-Martínez, C. Multi-objective evolutionary optimization using the
relationship between F1 and accuracy metrics in classification tasks. Appl. Intell. 2019, 49, 3447–3463. [CrossRef]
29. Vujovic, Ž.Ð. Classification Model Evaluation Metrics. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 599–606. [CrossRef]
30. Valverde-Albacete, F.J.; Peláez-Moreno, C. 100% Classification Accuracy Considered Harmful: The Normalized Information
Transfer Factor Explains the Accuracy Paradox. PLoS ONE 2014, 9, e84217. [CrossRef] [PubMed]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.

Entropy 2023, 25, 777. https://doi.org/10.3390/e25050777 https://www.mdpi.com/journal/entropy
