International Journal of Scientific & Engineering Research Volume 13, Issue 7, July-2022 520

ISSN 2229-5518

Algorithmic Trading Bot


Arjun Khanijau

Abstract- We aim to dive into the field of financial machine learning by building an algorithmic trading bot that supports better investment decisions and seeks out more profitable trades. Traders spend a lot of time continuously monitoring transactions. The trading bot can monitor these activities instead, saving the cost and time of constant supervision and thereby reducing transaction costs. It would also greatly reduce the possibility of human error while placing trades. The bot can be used by institutional investors and big brokerage houses, both for stocks such as Apple, Microsoft, and Tesla (AAPL, MSFT, TSLA) and for cryptocurrency investments such as Bitcoin and Ethereum (BTCUSD, ETHUSD). We have implemented two machine learning algorithms using the concepts of ensemble learning and support vector machines. A random forest regressor is integrated into the buy-and-hold trading strategy for long-term investments, and a support vector regressor is integrated into the scalping trading strategy for short-term investments. We have also used backtesting to validate the models before deployment.

Index Terms— Ensemble learning, financial machine learning, hyperparameter optimization, random forests, regression, support vector
machine, trading algorithms

——————————  ——————————

1 INTRODUCTION

“Artificial intelligence is to trade what fire was to the cavemen.” This is how one industry player described the impact of algorithmic trading on the financial industry. There is hardly any aspect of our lives that is untouched by artificial intelligence, and the financial sector is no exception. An estimated 70%-80% of overall trading volume is generated through algorithmic trading in many developed financial markets, including the US stock market. However, this figure is estimated to be around 40% for emerging economies like India, and this is where we want our project to create an impact. Our research aims to integrate engineering skills with the financial sector.

Algorithmic trading, or black-box trading, is used to place trades quickly in order to generate profits. This is done by using a set of automated, pre-programmed trading instructions. ML algorithms have proved highly useful for optimizing the decision-making process of human traders, as they process data and predict the market picture with high accuracy. The trading algorithms scan the markets for qualifying trade setups, and once the right setup is encountered, they execute trades and manage them through an entirely automated process.

ML algorithms take in huge amounts of historical data and predict the future picture with accuracy. They help traders find localized patterns that are limited in time and space. These patterns change constantly and would otherwise require a lot of time and energy to be spotted by human traders. The limitation is that patterns vanish almost immediately because of intense competition in the market. Many traders in the same market use these machine learning algorithms for the same purpose, so the patterns identified by one trader are also available to the other traders in the market.

Our bot will attract investors of all types, young or old. It would also be beneficial for retail traders who do intraday trading. According to some sources, 75% of retail traders lose money with intraday trading. This project would benefit them as it makes predictions based on previous datasets. Automated trading would help corporate workers and other people busy with their daily schedules to focus on their work without thinking about their trades. It will also help beginners in the trading industry to understand and make trades quickly and conveniently. These trading bots make it easy for both novice and seasoned traders to achieve successful results, with less work, time, and loss than manual trading. Incorporating economic and financial knowledge with AI and machine learning is very profitable for future trading and boosts both efficiency and revenue.

Compared to human traders, algorithmic trading leads to faster trading speeds and improved accuracy. With the advent of high-frequency trading (HFT), a large number of trading orders are executed within fractions of a second, adding liquidity to markets. Placing bids manually after reading market trends would take a significant amount of time, whereas AI algorithms automate the process, recognize market movements with higher accuracy, enhance trading strategies, and modify portfolios accordingly. Support vector machines (SVMs), utilized in high-frequency trading, create a line of separation in

data and can identify the features which indicate an approaching variation in the bid and market pricing.

Algorithmic trading also eliminates human error. Human traders may be affected by intense market pressures, which may cloud their judgement and lead to poor market decisions. Algorithmic trading aims to reduce such errors caused by psychological and emotional factors.

2 RELATED WORK

A. Stock market prediction using natural language processing (2019)

[1] In this patent, information is extracted from online news feeds using natural language processing (NLP) techniques and then used to predict changes in stock prices or volatility. The algorithms are applied to a wide range of texts, including publications from online newspapers such as financial newsletters and the Wall Street Journal, as well as television transcripts, radio broadcasts, and annual reports. Their solution is made up of two parts: a text understanding component that fills in simple templates instantly and a statistical correlation component that analyses the relationship between this pattern and stock price gains or declines.

Similarities with our solution:
Both solutions are related to financial trading algorithms, namely the assessment of rapidly changing sources of information, combining natural language processing and user trading behaviour to predict fluctuations in stock price or volatility. In both, these predictions can be utilised to develop successful trading strategies.

Differences between the patent and our solution:
The patent uses natural language processing to extract information from news, parsing or pattern matching on words to identify natural language text describing activities or announcements of a particular publicly-traded company in order to fetch the dataset. In our solution, we used the Zipline Data Portal Interface to plot and chart the pricing data using Matplotlib and to analyse candlestick charts.

B. A kind of Stock Market Forecasting method merged based on sentiment analysis and HMM (2014)

[2] This patent combines sentiment analysis with an HMM, enhancing the accuracy of stock market forecasting by employing the emotional tendency information in economic and financial news webpages. It has great potential for application in domains like sentiment analysis, subject identification, stock market forecasting, and website content monitoring.

Similarities with our solution:
Both solutions include gathering information, text extraction, information pre-processing, technically analysing the stock market, and improving the accuracy of stock market prediction.

Differences between the patent and our solution:
The patent uses sentiment analysis and an HMM, whereas in our solution we have used the Zipline Data Portal Interface to plot and chart the pricing data using Matplotlib and to analyse candlestick charts.

C. Coordination of algorithms in algorithmic trading engine (2010)

[3] This patent describes a method for optimizing algorithmic trading by making the chores of initiating and running algorithms easier while also giving real-time feedback on the user's automated trade executions. Preferred implementations of the system overcome the limitations of recognised algorithmic trading products by
(1) allowing financial markets to use a simplified, instinctive graphical interface to click and drag complicated, multi-algorithm investment strategies,
(2) allowing users to track informational market impact costs in real-time, and
(3) automating the classification, management, and cancellation of algorithms based on user input.

Similarities with our solution:
The goal of both the patent and our solution is to provide real-time feedback to traders on both their order implementations and the market impact they have. Both solutions try to complement rather than replace the value a human brings to the trading process, broadening rather than narrowing his point of view on the industry and also how his orders affect it all through direct visual evidence of changes


in the market, the strategies his algorithms employ, and the degree of market effect these strategies cause.

Differences between the patent and our solution:
The patent uses a graphical interface to start a complex, multi-algorithm investment strategy by the drag-and-drop method. In our solution, we fetched the data using the Zipline Data Portal Interface to plot and chart the pricing data using Matplotlib and to analyse candlestick charts.

D. User-defined algorithm electronic trading (2021)

[4] This patent tries to provide an algorithm trading development tool that removes the problems associated with conventionally coded algorithms, such as syntax errors, ambiguous logic, and the necessity for a non-trader developer to construct the algorithm according to a trader's specifications. A market grid shows market data for a certain marketable item. A simulation indicative order input area generates feedback for assessing the operational characteristics of the algorithm as defined. Along with an auto hedging option, a scratch quantity is employed. If a quantity in a market at a counter order's market price falls underneath the stated scratch amount, the counter order's price level decreases.

Similarities with our solution:
Both the patent and our solution refer to an automated trading platform, with certain implementations related to a consumer online trading algorithm.

Differences between the patent and our solution:
The patent is more focused on a competitive environment such as computerised trading, where every second or fraction of a second counts, and on helping market participants acquire trading opportunities and compete effectively. Our solution is more focused on enabling a growing number of market players to be engaged at any given moment and on helping them predict fluctuations in stock price, thereby increasing the number of potential participants in the market.

3 FLOW DIAGRAM

The flow diagram provided here describes our algorithmic trading project: a solution to algorithmic trading strategies obtained by building a financial machine learning model with a proper understanding of financial terminology and methodology. The flow diagram gives a conceptual guide to our project and the basic steps which we are going to apply.

Here we are going to ensemble our model, cross-validate it for financial applications, and backtest it. We start with data collection by fetching the dataset for our design model. After the dataset collection, we will build the conventional buy-and-hold strategy. After data preprocessing, a machine learning model will be designed in which the Random Forests algorithm is implemented, and the RFI will be plugged into our bot.

We use random forests for a better result: bagging many decision trees sorts the required features according to their importance. We will carry out a regression analysis using regression trees and their split criterion. After the complete evaluation of our RFI, we will build a trading algorithm (for example a one-pass or linear algorithm, using scikit-learn), followed by implementing an exploit-the-correlation strategy.

We will then implement gradient boosting (GBoosting), which is an ML technique for regression as well as classification problems.

We will build the model in a stage-wise way by optimizing an arbitrary differentiable loss function using the Python language and then evaluate the model performance. At this step, we are going to introduce risk management, which is very important as it gives traders a signal on the financial indicators for driving trades profitably. In the advanced trading algorithm, we are going to introduce a scalping strategy, which will be very helpful to local investors because it makes them less prone to risks and is hence attractive. Many such advances benefit our project, from setting targets to spotting opportunities through proper observation of the trend. The positions we take are assumed to be traded in real time, which is intended to ensure profitability. The process we will follow for risk management is defining goals and measuring risk, followed by designing a system.

4 PROPOSED METHODOLOGY

We are using the concepts of ensemble learning and support vector machine to implement the machine learning algorithms.

We will start building our trading bot by exploring our dataset using Python libraries such as NumPy and pandas. This part draws on statistics and probability and will introduce the financial terminology as well as financial data structures, as these are essential in designing any sort of financial machine learning model, and will hence build a bot skeleton. Data preprocessing is followed by designing a machine learning model, diving deeper into ML, and training and evaluating our model.

To fetch the data, we install zipline into the packages of the Python interpreter. To retrieve the publicly available data set, we create an account on quandl.com. To reduce basic errors, we tried using Quantopian-quandl. We use the history window function to get a data frame containing the history window; it is available as a member function of the zipline data portal interface, which is used internally to answer questions about the data (df = data_port.get_history_window()). We initialize the data portal interface, which requires three mandatory parameters. The first parameter is asset_finder, a method reference used internally to resolve assets; here it is a member of the bundle object, which is initialized beforehand by calling the bundle loading method and passing the name of the quandl data set. Our second parameter is the trading calendar, used internally for minute and session scheduling. The remaining parameters are also taken from the bundle object.

A. Data extraction, preprocessing, and feature selection

To identify the combination of features and strategy parameters that gives the most accurate model, we use different segments of the dataset. To fetch the data set, we use quandl, and we also use it for data preprocessing. We will also try Quantopian-quandl for a more accurate result. To obtain an accurate model and evaluate the final feature space, we apply the Random Forest strategy, chosen for its iterative method, to all combinations of strategy data, giving a statistical outcome for the most accurate model. For feature selection, we use the filter method, which picks up the properties that are intrinsic to the features. This method is faster and less expensive than wrapper techniques, so it is cheaper to use. To measure the linear relationship between multiple variables and to predict the data, we use a correlation coefficient. The logic behind using it for feature selection is that a good feature is correlated with the target, while the selected variables should remain uncorrelated among themselves. Information gain techniques are used to calculate the entropy when transforming a dataset. We are also trying other techniques and will finalize the choice based on which values come closest to our profitable prediction values.

B. Ensemble learning

This works on the principle that multiple weak entities are more powerful than a single strong entity. If we use a single decision tree, the number of levels of the tree increases significantly with the increasing complexity of data. Increasing

the number of levels of the decision tree makes it more prone to overfitting.

We can relate the use of diverse decision trees to the diversity of financial portfolios. Just as it is always better to maintain a mixed portfolio across debt and equity funds to reduce risk, multiple and diverse decision trees lead to more efficient performance on unseen data. Ensemble learning aggregates multiple results, leading to more stability and robustness and to a reduction in noise. Another advantage of ensemble learning is that it can catch both linear and non-linear relationships in the data by using different models and then forming their ensemble.

C. Bagging using random forest

Bagging is a combination of two terms: bootstrap and aggregating. Bootstrap refers to creating random samples with replacement from the training data and then building a decision tree for each of the samples. Aggregating refers to combining the results of multiple models using techniques like averaging (for regression) and majority voting (for classification). We shall be using the average method since ours is a regression model for predicting stock prices.

Bagging is a great way to decrease variance. The reason is that each model takes in a different subset of the training data due to randomness. This avoids overfitting the data.

Random forest uses the above concept. In addition, it introduces randomness by selecting a different subset of features for every individual tree. This reduces the variance even further.

D. Missing values

Missing values can be handled by dropping the data points or by filling them with the median. We opted for dropping the data points with missing values using the dropna() function, as this gave better results. The reason is that the Zipline starting year of data availability is much before Tesla's foundation year, making the null values not useful.

E. Splitting criteria

Random forest needs a splitting criterion, which can be either the Gini index (used in the CART algorithm) or information entropy (used in the ID3 and C4.5 algorithms).

Both the Gini index and information entropy gave similar results, but we shall use the Gini index as it is faster. The reason is that entropy makes use of the logarithmic function, making it more computationally expensive. So, we shall use multiple CART trees in the random forest.

F. Calculation of time and space complexity

Let n = number of nodes and d = number of features.

Time complexity for training runtime = O(d·n²·log₂ n)

Reason- Training requires the values to be sorted at each node, which requires n·log₂ n time. This is multiplied by the number of nodes (n). This is done for every feature, so it is multiplied by the number of features (d).

Time complexity for inference runtime = O(log₂ n)

Reason- One path of the tree needs to be traversed, which is logarithmic.

Space complexity = O(n)

Reason- As there are n nodes, linear space is required, which makes the model lightweight.

G. Parameters of Random Forest Regression

The first 32 columns of the data were used as features and the next 8 columns were set as the target. Due to the large size of the dataset, we used 40% of the data as testing data by setting the test_size parameter of the train_test_split() function to 0.4. We started by setting the number of trees to 50. This gave an R² value of 0.75. This parameter was tweaked several times and the maximum value of R² was recorded as 0.9937 using 100 trees. There was no significant improvement in the R² value after increasing the number of trees beyond 100.
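The configuration described in section G can be sketched as follows, assuming scikit-learn and NumPy are installed. The data is synthetic (the original price dataset is not reproduced here), so the R² values will differ from those reported; only the 32-feature/8-target split, test_size=0.4, and the tree counts 50 and 100 come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the price data (assumption): 40 columns, of which
# the first 32 are features and the next 8 are targets, as in the text.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 32))
weights = rng.normal(size=(32, 8))
targets = features @ weights + 0.1 * rng.normal(size=(500, 8))

# Hold out 40% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    features, targets, test_size=0.4, random_state=0
)

# Compare the two tree counts mentioned in the text.
scores = {}
for n_trees in (50, 100):
    model = RandomForestRegressor(n_estimators=n_trees, random_state=0)
    model.fit(X_train, y_train)
    scores[n_trees] = r2_score(y_test, model.predict(X_test))
    print(f"{n_trees} trees: R^2 = {scores[n_trees]:.4f}")
```

RandomForestRegressor accepts a two-dimensional target directly, which is why all 8 target columns can be fitted by a single model in this sketch.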
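The two splitting criteria named in section E can be compared with a small plain-Python sketch. The function names and the label list are illustrative, not taken from the project code. Both measures are defined on class labels; for a regression target, tree implementations typically split on a variance-based criterion instead.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Relative frequency of each distinct label."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_index(labels):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


def entropy(labels):
    """Information entropy in bits; needs one logarithm per class."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


# Hypothetical node holding two "up" and two "down" samples.
node = ["up", "up", "down", "down"]
print(gini_index(node))  # 0.5
print(entropy(node))     # 1.0
```

The Gini index needs only multiplications and additions, while entropy evaluates a logarithm for each class, which is the cost difference the text refers to.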
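The bootstrap and aggregating steps from section C can be illustrated with a plain-Python sketch. To stay self-contained it uses one-split regression stumps on one-dimensional toy data instead of full decision trees; the helper names and the data are assumptions made for illustration only.

```python
import random
from statistics import mean


def fit_stump(points):
    """Fit a one-split regression stump on (x, y) pairs."""
    best = None
    xs = sorted({x for x, _ in points})
    for left_x, right_x in zip(xs, xs[1:]):
        threshold = (left_x + right_x) / 2
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Every x in the sample is identical: always predict the mean.
        overall = mean([y for _, y in points])
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def bagged_predict(points, x, n_models=25, seed=0):
    """Train each stump on a bootstrap sample and average the predictions."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) samples with replacement.
        sample = rng.choices(points, k=len(points))
        predictions.append(predict_stump(fit_stump(sample), x))
    # Aggregate: averaging, because this is a regression problem.
    return mean(predictions)


# Toy step-shaped data: y is 10 for x < 5 and 20 otherwise.
data = [(x, 10.0 if x < 5 else 20.0) for x in range(10)]
low, high = bagged_predict(data, 2), bagged_predict(data, 8)
print(low, high)
```

A random forest adds one more source of randomness on top of this: each tree also sees only a random subset of the features.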

H. Integrating Random Forest Regressor to trading strategy

We integrate the above random forest model into the trading bot using a Zipline simulation with a simple buy-and-hold strategy. In this strategy, stocks are bought and held for the long term, regardless of market fluctuations.

The trading strategy involves comparing the maximum of the future predicted values with the historical mean of the stock. If this maximum value is greater than the historical mean, this indicates that the stock value is increasing and that it would be a good time to buy more shares of this stock. In this case, the bot would automatically order 1000 shares of the stock. On the other hand, if the maximum of the future predicted values is less than the historical mean of the stock, this indicates that it is a good time to exit our position by selling the shares, as the stock value is falling. In this case, the bot would automatically sell 1000 shares of the stock.

I. Support Vector Regression for implementing Scalpers Trading Strategy

Scalping is a more advanced trading strategy than the one we used before. Scalping is used to execute trades at very high speeds. This strategy works on the principle of opening and closing positions rapidly, which limits the exposure to the market. A strict exit strategy is needed so that one huge loss does not wipe out multiple small profits.

Support vector regression is a supervised model that uses the same concept as support vector machines to plot the best fit line. This best fit line is the hyperplane containing the maximum number of data points.

The advantage of using support vector regression is that it is very robust to outliers. Other regression models try to minimize the error between the ground truth and predicted values, whereas SVR finds the best fit within a threshold value (the distance between the boundary line and the hyperplane). The generalization capability of this model is superior to others, and it also gives a high prediction accuracy.

J. Parameters of Support Vector Regression

We extract 20 timesteps from the data, out of which 15 are kept as lag timesteps which are used to predict 5 future timesteps. We pass 3 values to the ‘estimator__kernel’ parameter: linear, polynomial, and radial basis function (RBF) kernels. In this way, we are able to find out which kernel gives the best performance. We used the ‘best_estimator_’ attribute of the GridSearchCV object to find that the linear kernel gave the best performance compared to the other two kernels.

K. Risk management using CVaR (Conditional Value at Risk)

CVaR is used to quantify how much tail risk a portfolio contains. It is used for optimizing the portfolio for efficient management of risk. CVaR is calculated as the mean of the extreme losses in the left tail of the distribution obtained by concatenating the historical returns with the future returns. In our case, we choose this lowest percentile to be 5%.

This risk metric triggers an action whenever the returns for a particular trading minute fall below the value at risk. This action sends an exit signal to exit our trading position.

5 RESULT AND IMPLEMENTATION

A. Implementation details

The machine learning models were implemented in Jupyter notebooks and then combined under distributed version control using Git. Our system runs the latest version of macOS Monterey with 16GB RAM and a 14-core GPU. The 16-core neural engine helped in running the model efficiently and in fine-tuning the hyperparameters.

B. Hyperparameter analysis

Since our project is based on regression, we have used the random forest regressor imported from sklearn.ensemble. We tried our model with min_impurity_decrease set to 1, which gave an R² score of 0.88; after testing many values, we finally set it to 0, which gave an R² value of 0.9937.

SECTION 1 – MACHINE LEARNING MODEL

In the image below we can see that the regression score we are getting is 0.9937, which is a good score.
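The buy-and-hold decision rule from section H can be written as a small plain-Python function. This is a sketch of the rule as described, not the project's Zipline code: the function name and sample prices are hypothetical, and returning 0 when the two values are exactly equal is an assumption, since the text does not cover that case.

```python
from statistics import mean

ORDER_SIZE = 1000  # shares per order, as stated in the text


def buy_and_hold_order(historical_prices, predicted_prices, order_size=ORDER_SIZE):
    """Return the signed share count to order: positive buys, negative sells."""
    historical_mean = mean(historical_prices)
    predicted_peak = max(predicted_prices)
    if predicted_peak > historical_mean:
        return order_size       # price expected to rise: buy
    if predicted_peak < historical_mean:
        return -order_size      # price expected to fall: sell
    return 0                    # equality is not covered by the text (assumption)


print(buy_and_hold_order([100, 102, 98], [101, 105, 103]))  # 1000
print(buy_and_hold_order([100, 102, 98], [95, 97, 96]))     # -1000
```

In a backtest, the returned amount would be handed to the simulator's order call for the traded asset; that wiring is omitted here.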

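The windowing and kernel search from section J can be sketched as follows, assuming scikit-learn and NumPy are installed. The ‘estimator__’ prefix in the parameter names suggests that the SVR is wrapped in another estimator; MultiOutputRegressor is assumed here because five future timesteps are predicted at once, but the text does not name the wrapper. The sine series and the C values are placeholders.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

N_LAGS, N_FUTURE = 15, 5  # 20 timesteps per window, as in the text


def make_windows(series, n_lags=N_LAGS, n_future=N_FUTURE):
    """Slice a 1-D series into (lag window, future window) pairs."""
    width = n_lags + n_future
    rows = [series[i:i + width] for i in range(len(series) - width + 1)]
    windows = np.array(rows)
    return windows[:, :n_lags], windows[:, n_lags:]


# Synthetic, already-scaled series standing in for minute prices (assumption).
series = np.sin(np.linspace(0, 12 * np.pi, 200))
X, y = make_windows(series)

# One SVR per future timestep; the grid reaches the inner SVR through
# the "estimator__" prefix.
param_grid = {
    "estimator__kernel": ["linear", "poly", "rbf"],
    "estimator__C": [1, 5, 10],
}
clf = GridSearchCV(MultiOutputRegressor(SVR()), param_grid, cv=3)
clf.fit(X, y)
print(clf.best_params_)
```

After fitting, clf.best_estimator_ holds the refitted model with the winning kernel, and clf.best_params_ lists the selected values.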
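The CVaR rule from section K can be sketched in plain Python. The function names and the sample returns are hypothetical; the 5% tail, the concatenation of historical and predicted returns, and the exit condition follow the text. The cutoff uses a simple sorted-index estimate of the percentile, which is one of several common conventions.

```python
def value_at_risk(returns, tail=0.05):
    """Return the cutoff that bounds the worst `tail` share of returns."""
    ordered = sorted(returns)
    cutoff_index = max(int(len(ordered) * tail) - 1, 0)
    return ordered[cutoff_index]


def conditional_value_at_risk(returns, tail=0.05):
    """Mean of the returns at or below the value-at-risk cutoff."""
    cutoff = value_at_risk(returns, tail)
    worst = [r for r in returns if r <= cutoff]
    return sum(worst) / len(worst)


def should_exit(minute_return, historical_returns, predicted_returns, tail=0.05):
    """Exit signal: the current minute's return falls below the value at risk."""
    combined = list(historical_returns) + list(predicted_returns)
    return minute_return < value_at_risk(combined, tail)


# Hypothetical per-minute returns: 80 historical values and 20 predicted ones.
historical = [i / 1000 for i in range(-20, 60)]
predicted = [i / 1000 for i in range(0, 20)]

print(value_at_risk(historical + predicted))
print(conditional_value_at_risk(historical + predicted))
print(should_exit(-0.03, historical, predicted))
```

With 100 combined returns and a 5% tail, the cutoff is the fifth-worst return and CVaR is the average of the five worst.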

The graph below shows the price of Tesla stock over time.

Here we can see two colours, red and blue. The red colour shows the ground truth (the data in the dataset), whereas the blue colour shows the data that our model predicted.

The graph below shows the algorithm volatility versus the benchmark volatility. Here, our volatility is high, but since high risk goes with high profit, it is required.

SECTION 2 – TRADING ALGORITHM

This graph depicts the profit or loss we made through our model: when the bar is above 0 we experienced a profit, whereas when it is below 0 we experienced a loss.

In the figure below, the orange line indicates the benchmark return, whereas the blue line represents the return from our model, which in comparison is much higher.

The graph below depicts the final portfolio value of the Tesla stock; as we can see, the final portfolio value was around 24 million dollars.

Here the output is negative, which means that we have sold the stock for more than we purchased it for, indicating that we have made a profit.

And this is the final portfolio value for a year, which is around 17 million dollars.
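The comparison of model return against benchmark return in this section can be reproduced from per-period returns. The sketch below compounds simple returns; summing is the equivalent operation when the inputs are log returns. The two return series are made-up numbers, not the project's results.

```python
def cumulative_return(period_returns):
    """Compound a sequence of simple per-period returns."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical per-period returns for the model and for the benchmark.
model_returns = [0.10, -0.05, 0.20]
benchmark_returns = [0.02, 0.01, -0.01]

model_total = cumulative_return(model_returns)
benchmark_total = cumulative_return(benchmark_returns)
print(f"model: {model_total:.3%}, benchmark: {benchmark_total:.3%}")
```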


The graph below has two plots, one for price and the other for correlation.

SECTION 3 – ADVANCED TRADING ALGORITHM

The graph we see below is the Sharpe ratio evaluation metric, which is a financial metric often used by investors.
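For reference, the Sharpe ratio is the mean excess return divided by the standard deviation of the excess returns, usually annualized. The plain-Python sketch below assumes daily returns, 252 trading periods per year, and a zero risk-free rate by default; none of these settings are stated in the text, and the sample returns are hypothetical.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(period_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a sequence of per-period returns."""
    excess = [r - risk_free_rate for r in period_returns]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return mean(excess) / spread * math.sqrt(periods_per_year)


# Hypothetical daily returns.
daily_returns = [0.01, 0.02, -0.01, 0.03]
print(sharpe_ratio(daily_returns))
```

A negative value means the strategy earned less than the risk-free rate on average over the period.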

The graph below is the bar chart for Sharpe ratios; here most of our values are on the negative side.

The graph below shows that we have only entered trades without exiting them, which is a sign of error in the long run.

The graph below shows gross leverage.

SECTION 4 – MODEL AND STRATEGY EVALUATION

Now for backtesting, we are using conditional VaR and implementing that using SVM. For that, we have imported SVR from sklearn.svm. As a parameter here we have used ‘estimator__C’, which takes a range of values between 1 and 10, both inclusive. Then after grid searching,


using ‘clf.best_params_’ we got our best value for ‘estimator__C’ as 1, which gave us a good result.

The graph below also depicts the Sharpe ratio, but at the time of backtesting. As we can see, the Sharpe ratio was positive most of the time; it was negative only in the beginning.

The graph below shows the returns we obtained; as we can see, they are mostly in the positive direction.

The graph below depicts cumulative returns (defined as the sum of historical returns); as we can see, the return is very high.

Finally, the image below shows the percentage of returns, i.e., we made around 33 per cent profit.

B. Comparison

I.

A. Stock market prediction using natural language processing (2019)

[1] In this patent, information is extracted from online news feeds using neural networks, a natural language processing technique, and then used to predict changes in stock prices or volatility. In our model, we used random forests and support vector machines for the same purpose. Random forests and SVMs are comparatively inexpensive and do not require a graphics processor to complete training. A random forest can provide a better understanding of a decision tree with better performance. Neural networks, to be effective, require far more data than the average person has on hand, whereas support vector machines and random forests require far less data as input. For the sake of performance, a neural network also sacrifices the interpretability of the data to the point of making it meaningless. Therefore, we consider random forests and support vector machines a better pick than neural networks.

B. A kind of Stock Market Forecasting method merged based on sentiment analysis and HMM (2014)

[2] In this patent, they used Naïve Bayes, a method of sentiment analysis, together with an HMM for stock market forecasting. All attributes are assumed to be conditionally independent in Naïve Bayes. Since stock market prediction depends on various interdependent factors, the prediction in this case will be inadequate. We used a support vector machine to implement the scalper's trading strategy in our solution. Because it is not prone to mishaps, SVM can correlate with other parts within the data, allowing it to grasp

the dense characteristics in NLP, and thus gives results in sentiment analysis and machine translation, whereas Naïve Bayes' results are inconsistent. Naïve Bayes does not anticipate the inputs to be substantially connected, so it is more of a generic approach that only works when we want to categorise a tiny corpus of data with a relatively limited number of input attributes.

C. Squareoff - Algo Trading firm

[5] Squareoff is a website that provides automated trading bots based on quantitative trading strategies which automatically place trades in a trading account. They used K-nearest neighbours (KNN) to achieve the same. When using KNN, if the sample size is huge, there is a high cost of computation during runtime. Since KNN is a lazy model, we must load all the training data and calculate distances to all training samples. The value of the parameter K (number of nearest neighbours) and the type of distance to be utilized must also be determined. As we must compute the distance between each query instance and all training samples, the computation time is likewise very long. Therefore, the selection of K, as well as the metric (distance) to use in KNN, must be carefully calibrated. We used SVM in our model because outliers are handled better by SVM than by KNN. Since there are many characteristics and little training data, SVM outperforms KNN, making our solution more efficient.

II. Comparison Table concerning Algorithms used

S. No  Evaluation Metrics                  A                B            C       Algorithmic Trading Bot
1      Algorithm                           Neural Networks  Naïve Bayes  KNN     SVM and Random Forests
2      R² Value                            0.2567           0.36         0.9979  0.9937
3      Support Vector Regression profit %  12%              42%          27%     33%

6 CONCLUSION

We have successfully integrated the random forest regressor into the buy-and-hold strategy, which would help investors make long-term investments without worrying about market fluctuations. People who prefer a short-term investment can make use of our scalping trading strategy built using the support vector regressor. Upon backtesting, a high return of 33% was obtained, making our models fit for deployment. This was possible by implementing a strict exit policy for short-term investments, wherein the bot makes multiple small profits and exits the position before a huge loss can be inflicted.

7 REFERENCES

[1] https://patents.google.com/patent/US8285619B2/en

[2] https://patents.google.com/patent/CN103778215B/en?q=stock+market&oq=stock+market

[3] https://patents.google.com/patent/US8095455B2/en?q=algorithm+trading&oq=algorithm+trading

[4] https://patents.google.com/patent/JP2017117473A/en?q=algorithm+trading&oq=algorithm+trading

[5] https://squareoff.in/
