peerj-cs-08-915
ABSTRACT
This paper proposes a pattern-based stock trading system using ANN-based deep
learning and utilizing the results to analyze and forecast highly volatile stock price
patterns. Three highly volatile price patterns containing at least a record of the price
hitting the daily ceiling in the recent trading days are defined. The implications of
each pattern are briefly analyzed using chart examples. The training of the neural
network was conducted with stock data filtered in three patterns and trading signals
were generated using the prediction results of those neural networks. Using data from
the KOSPI and KOSDAQ markets, it was found that the proposed pattern-based
trading system can achieve better trading performance than domestic and
overseas stock indices. The significance of this study is the development of a stock
price prediction model that exceeds the market index to help overcome the continued
freezing of interest rates in Korea. Also, the results of this study can help investors who
fail to invest in stocks due to the information gap.
How to cite this article Oh J. 2022. Development of a stock trading system based on a neural network using highly volatile stock price
patterns. PeerJ Comput. Sci. 8:e915 http://doi.org/10.7717/peerj-cs.915
Most of the existing stock price prediction technical analyses have based the input
features on the moving average (MA) stock price, which can effectively express the recent
trends of price fluctuations. For example, the MACD (moving average convergence
divergence) utilizes the difference between the long-term and short-term moving averages
to represent the convergence and divergence of the moving average values.
O et al. (2006) performed pattern-defined predictions using patterns related to a
crossover, reversal into an uptrend, and reversal into a downtrend among 5-day, 10-day
and 20-day moving average lines.
However, all of the MA-based technical indicators, including the MACD, have a
‘time-lag’ limitation because buy and sell signals are mostly generated after price trends
have already been developed. This study will attempt to predict highly volatile stock
price patterns by introducing the concept of the ‘upper limit price’, which is defined
independently from the moving average. This study will also utilize Japanese candlestick
indicators and more short-term technical indicators to complement the time-lag problem.
In the Japanese candlestick indicator, a candlestick summarizes the intraday variations
of a stock price, expressing the differences between the opening price, highest, lowest, and
closing prices, through which the most recent price fluctuations can be summarized more
closely.
According to various empirical analyses of the Korean stock market, the Korean stock
market shows market inefficiency due to information asymmetry (Lee, 2007; Bark, 1991).
Although market inefficiency is lower than that of larger foreign stock markets, there is still
an issue of the information gap. Therefore, this paper suggests that special investment and
analysis information is needed to overcome market inefficiency. However, since technical analysis
indicators are price- and chart-based information that many people already know, a new
chart analysis technique is needed.
The efficient market hypothesis asserted by Fama (1965) is rejected if a strategy using specific
information exceeds the market rate of return. This study proposes a new ‘highly volatile
stock price pattern’ that does not yet exist in technical analysis. Using the pattern proposed
in this paper, it is possible to develop a predictive model that exceeds the market return.
The results of this study are in conflict with the efficient market hypothesis.
In summary, this study assumes market inefficiency in the Korean stock market and
provides new information that is expected to affect price fluctuations. The highly volatile
stock price pattern is defined by the relationship between the ‘upper limit’ and stock price
in the Korean stock market. Through fund simulation it was found that investors can
obtain efficient returns through a deep learning stock price prediction model using highly
volatile patterns.
It was also found that it was difficult to predict stock prices by only analyzing simple
charts such as moving averages, so a definition of a particular pattern of variation is needed.
This pattern can be found when there are stock prices that show an upper limit. This pattern
can also appear over various periods of time even when the chart shows the characteristics
of a random walk.
RELATED WORKS
Research related to stock price prediction has traditionally been conducted using ARIMA
(Benvenuto et al., 2019; Ariyo, Adewumi & Ayo, 2014), regression (Refenes, Zapranis &
Francis, 1994; Yang, Chan & King, 2002), and Bayesian (Pella & Masuda, 2001) models to
reflect the characteristics of time series data.
ARIMA is a statistical model widely used in the financial sector. However, it is used
exclusively for short-term predictions, which makes it difficult to confirm long-term
investment performance. Since stock price prediction is closely related to profits, investing
directly on the basis of a model verified only over the short term carries risk. In addition,
because the amount of stock data accumulated from the past is vast, a model that can handle
large amounts of data is needed. In this regard, one study that predicted the index using the
ARIMA model reported an accuracy of up to 38% (Devi, Sundar & Alli, 2013), which is too low
to carry out actual investments using this model.
The Bayesian model performs classification based on probability theory and was used to
predict stock prices in the past. With the recent development of artificial neural networks,
however, it is now widely used as a comparative model. Like the ARIMA model, the Bayesian
model is also considered unsuitable for mid- or long-term prediction. In related studies using
the Bayesian model, predictions were performed with up to 78% accuracy (Malagrino, Roman
& Monteiro, 2018). This was better than the ARIMA model, but lower than the neural network
models described below.
Above all, these existing predictive models often misinterpret information due to
underfitting and overfitting problems, so they often do not help much in decision-making
activities for stock price prediction. In addition, it has already been proven that neural
BACKGROUND
MA (Moving average)-based patterns
Before examining the highly volatile stock price pattern using the Japanese candlestick
indicator, which is the focus in this study, the meaning and limitations of the four MA-
based patterns presented in related studies are examined, taking the ‘divergence’ pattern
In the same way, a moving average can be calculated for the trading volume; the 5-day
volume moving average of stock s on trading day t is computed as follows:
$$VMA5_t^s = \frac{1}{5}\sum_{k=0}^{4} volume_{t-k}^s \tag{2}$$
where $close_t^s$ is the closing price of stock $s$ on trading day $t$. The slope of the line
connecting a moving average to another moving average is denoted as Grad and can be
calculated using the following Eq. (3):
$$Grad5_t^s = \frac{MA5_t^s - MA5_{t-1}^s}{MA5_t^s}. \tag{3}$$
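As an illustration of Eqs. (2) and (3), the following minimal sketch computes a trailing moving average and its slope. The function names `moving_average` and `grad` and the sample volumes are hypothetical and are not taken from the paper's source code; returning `None` for days without enough history is also an assumption.

```python
from typing import List, Optional


def moving_average(values: List[float], window: int) -> List[Optional[float]]:
    """Trailing mean over `window` days (Eq. 2 when window=5); None until enough history."""
    out: List[Optional[float]] = []
    for t in range(len(values)):
        if t + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[t - window + 1 : t + 1]) / window)
    return out


def grad(ma: List[Optional[float]]) -> List[Optional[float]]:
    """Eq. (3): (MA_t - MA_{t-1}) / MA_t, with None where the slope is undefined."""
    out: List[Optional[float]] = []
    for t in range(len(ma)):
        prev = ma[t - 1] if t > 0 else None
        cur = ma[t]
        if cur is None or prev is None or cur == 0:
            out.append(None)
        else:
            out.append((cur - prev) / cur)
    return out


# Made-up daily volumes for one stock, for illustration only.
volume = [100.0, 120.0, 110.0, 130.0, 140.0, 150.0]
vma5 = moving_average(volume, 5)
vgrad5 = grad(vma5)
```

The same two functions apply unchanged to closing prices, which would give MA5 and Grad5.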
Equation (4) defines the training target set Dbear that corresponds to the divergence
pattern. Divergence refers to when the short-term moving average is located relatively
below the longer-term moving averages, resulting from the continuation of the downward
trend of the stock price for a considerable period of time. Equation (4) represents that the
5-day moving average is smaller than the 10-day moving average and the 10-day moving
average is smaller than the 20-day moving average:
$$D_{bear} = \{(x_{s,t}, o_{s,t}) \mid MA5_t^s < MA10_t^s \;\&\&\; MA10_t^s < MA20_t^s,\; s \in \alpha,\; t \in \beta\} \tag{4}$$
where $x_{s,t}$ is the input feature vector, and $o_{s,t}$ is the target value representing the
price fluctuation after the occurrence of the pattern; α and β are the entire stock set and the
entire trading day set, respectively. Figure 1 is an example of the charts corresponding to
a divergence pattern. In Fig. 1, the sixteen trading days from A to B correspond to the
divergence pattern. In this case, relatively steep price rises were shown around point B; in
general, however, even if a rebound appears, the rise is not that large.
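The divergence condition described for Eq. (4) can be checked per trading day as in the sketch below. The helper names `trailing_mean` and `is_divergence` and the synthetic price series are assumptions made for illustration only.

```python
from typing import List


def trailing_mean(values: List[float], window: int, t: int) -> float:
    """Mean of the `window` values ending at index t (inclusive)."""
    if t + 1 < window:
        raise ValueError("not enough history for this window")
    return sum(values[t - window + 1 : t + 1]) / window


def is_divergence(close: List[float], t: int) -> bool:
    """True when MA5 < MA10 < MA20 on day t, the condition described for Eq. (4)."""
    ma5 = trailing_mean(close, 5, t)
    ma10 = trailing_mean(close, 10, t)
    ma20 = trailing_mean(close, 20, t)
    return ma5 < ma10 < ma20


# Synthetic series: 20 days of steady decline, and the same prices reversed.
falling = [float(p) for p in range(120, 100, -1)]
rising = falling[::-1]
```

On the falling series the short-term average sits below the longer-term ones, so day 19 qualifies; on the rising series it does not.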
Equation (5) defines the training target set DTU that corresponds to the reversal to
uptrend pattern. Reversal to uptrend means that one of the moving average lines reverses
from a downtrend to an uptrend; trading days A, B, and C in Fig. 2 show reversal to uptrend
patterns. Among these, A shows the case where the 5-day MA line reversed to an uptrend
due to the sharp rise of the stock price.
This case shows the weakness of MA as a prediction indicator because the price had
already risen before the trading signal was issued; the phenomenon of generating the signals
after the price movement has already occurred is called ‘time-lag’. This MA-based pattern
arises very frequently so it has the strength of using a large amount of training data, but
time lag decreases this pattern’s predictive ability.
$$D_{TU} = \{(x_{s,t}, o_{s,t}) \mid (Grad5_{t-1}^s < 0 \;\&\&\; Grad5_t^s > 0) \;||\; (Grad10_{t-1}^s < 0 \;\&\&\; Grad10_t^s > 0),$$
analyzed charts over the years. This pattern is actually mainly seen in mid- to low-priced
stocks in the Korean stock market. Investors have to look directly at a vast amount of data
to utilize this pattern for real investment. However, if a deep learning prediction model
is properly defined, it becomes easier to predict when the price will rise after a certain
pattern, and even to check whether the pattern can actually generate profits.
Adjustment pattern with one candlestick after single upper limit (p2)
An adjustment pattern with one candlestick after a single upper limit is when the upper
limit price condition is replaced by the single upper limit price; Fig. 4 shows a normal
example and a counter example of the case. Due to the relatively weaker rising intensity, the
ratio of normal examples is expected to be somewhat lower than for the p1 pattern.
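$$p2_t^s = \begin{cases} \text{true} & \text{if } Sanghan_{t-2}^s = \text{false and } Sanghan_{t-1}^s = \text{true and } \left(\dfrac{close_t^s - open_t^s}{close_t^s}\right) < 0.05 \\ \text{false} & \text{otherwise.} \end{cases} \tag{8}$$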
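A minimal sketch of the p2 test in Eq. (8) follows. It assumes that the Sanghan flag (whether a day hit the daily upper limit) has already been computed, since its definition is not reproduced in this excerpt; the function name `p2` and the sample prices are hypothetical.

```python
from typing import Sequence


def p2(sanghan: Sequence[bool], open_: Sequence[float],
       close: Sequence[float], t: int) -> bool:
    """Eq. (8): no upper limit at t-2, upper limit at t-1, body ratio below 0.05 at t."""
    if t < 2:
        return False  # not enough history to evaluate the pattern
    body_ratio = (close[t] - open_[t]) / close[t]
    return (not sanghan[t - 2]) and sanghan[t - 1] and body_ratio < 0.05


# Made-up three-day window: a single upper-limit day followed by a small candle.
sanghan = [False, True, False]
open_ = [1000.0, 1010.0, 1300.0]
close = [1005.0, 1300.0, 1290.0]
is_p2 = p2(sanghan, open_, close, 2)
```

With two consecutive upper-limit days, or a last-day body ratio of 0.05 or more, the function returns `False`.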
These examples of the three patterns examined above imply that price fluctuations after
the pattern occurs can vary depending on the slope of the moving average and the form
of the candlestick. As an example, a lower shadow (under tail) is attached to the last
candlestick in all the patterns shown in the three normal examples. This means that the stock
closed after buying pressure emerged and led a rebound in price, so it is more likely to be
bullish the next day. Since various factors act in combination on the direction of the stock
price after the appearance of the pattern, the neural network training described in the next
section is needed to utilize these patterns in the trading system.
EXPERIMENTS
Input features configuration for neural network
In order to train the neural networks for future price predictions for each pattern presented
in the previous sections, the input feature set that makes up the input vector $x_{s,t}$ was
built from the training data, and the target value corresponding to the desired output was
defined. Disparity, representing the distance between the MA and the current price, is denoted
as Disp, and the Disp from the 5-day MA line can be calculated using the following equation:
$$Disp5_t^s = \frac{close_t^s - MA5_{t-1}^s}{MA5_t^s}. \tag{10}$$
Apart from the moving average line, the input features relating to Japanese candlesticks
include: RC (rate of change) in the trading day price compared to the previous day, Body,
US (upper shadow), and LS (lower shadow). These are defined in Eqs. (11) through (14),
respectively:
$$RC_t^s = 100 \times \frac{close_t^s - close_{t-1}^s}{close_t^s} \tag{11}$$
$$\begin{aligned} x_{s,t} = (&RC_t^s, RC_{t-1}^s, RC_{t-2}^s, Body_t^s, Body_{t-1}^s, Body_{t-2}^s, US_t^s, LS_t^s, Grad5_t^s, Grad10_t^s, \\ &Grad20_t^s, Disp5_t^s, Disp10_t^s, Disp20_t^s, VGrad5_t^s, \\ &VGrad10_t^s, VGrad20_t^s, VDisp5_t^s, VDisp10_t^s, VDisp20_t^s) \end{aligned} \tag{15}$$
where VGrad is the slope of the volume moving average line and VDisp is the difference
between the volume moving average and the total volume; these two indicators can be
calculated by substituting the total volume for the closing price in the equations for
Grad and Disp. Each input feature should be normalized to a value between 0 and 1 before
use.
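The sketch below illustrates Eqs. (10) and (11) and the normalization step. The text only states that features are scaled to between 0 and 1, so the min-max scaling used here is an assumption, and the function names `disp`, `rc`, and `min_max` are hypothetical.

```python
from typing import List


def disp(close_t: float, ma_prev: float, ma_t: float) -> float:
    """Eq. (10) as printed: (close_t - MA_{t-1}) / MA_t."""
    return (close_t - ma_prev) / ma_t


def rc(close_t: float, close_prev: float) -> float:
    """Eq. (11): 100 * (close_t - close_{t-1}) / close_t."""
    return 100.0 * (close_t - close_prev) / close_t


def min_max(column: List[float]) -> List[float]:
    """Rescale one feature column to [0, 1]; min-max scaling is an assumption."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: map every value to 0
    return [(v - lo) / (hi - lo) for v in column]


# Made-up values for illustration only.
example_disp = disp(close_t=105.0, ma_prev=100.0, ma_t=100.0)
example_rc = rc(close_t=110.0, close_prev=100.0)
scaled = min_max([2.0, 4.0, 6.0])
```

In practice the minimum and maximum would be taken from the training data only, so that the same scaling can be reused on the test and fund simulation sets.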
EXPERIMENT RESULTS
Performance evaluation of prediction model
This section will present the experiment results of applying the proposed trading system to
the prices of 2,268 stocks listed on the KOSDAQ and KOSPI markets of the Korean Stock
Exchange. The entire dataset used in the experiment is divided into four subsets; the details
are shown in Table 1. Other related studies do not report returns measured through the
prediction model. In studies of stock price prediction, however, the rate of profit should be
measured alongside the accuracy of the prediction model. In this paper, a fund simulation
dataset is additionally configured for this purpose. Training, verification, and test data are
used exclusively for determining prediction models, and only data that were not used to
generate the prediction model were used for fund simulation. This separation serves
cross-validation and an accurate measurement of the rate of profit.
This study was aimed only at predicting the domestic stock market, so only data from
the KOSPI and KOSDAQ were used. The total number of KOSPI and KOSDAQ stocks
in Korea is about 2,436. Data were collected for KOSPI and KOSDAQ stocks from October 2017
to December 2020, and the total number of data points used for training, verification,
testing, and the additional fund simulations exceeded 2 million. Because this is already a
large amount of data, adding data from overseas stock markets would require other
analytical techniques and theories.
The experiments showed that as training progressed, loss decreased and accuracy
increased, as shown in Fig. 7. The loss was calculated using MSE, defined as follows:
$$MSE = \frac{1}{N}\sum_{i=1}^{N}(P_i - A_i)^2 \tag{16}$$
where N is the number of samples, P is the predicted value, and A is the real value. Mean
square error is the average of the squared differences between predicted and actual values (Namasudra,
Test datasets showed a slight increase in loss and a slight decrease in accuracy with each
epoch. However, the numerical results were still significantly better than those of the
training process. The evaluation of the prediction model is shown in Table 2. The model was
evaluated on two criteria: accuracy and F1 score. Accuracy is the percentage of correct
predictions among all predictions performed by the classification model. The F1 score is the
harmonic mean of precision and recall (Chakraborty et al., 2020). The equations for accuracy
and F1 score are as follows.
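$$\text{Accuracy} = 100 \times \frac{\text{True Positives} + \text{True Negatives}}{\text{True Positives} + \text{True Negatives} + \text{False Positives} + \text{False Negatives}} \tag{17}$$
$$F1\ \text{score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{18}$$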
where:
result is slightly lower or better compared to other classification models (Agrawal et al.,
2021; Ndichu, Kim & Ozawa, 2020) that are not stock price prediction models. However, for
stock price forecasts, this figure can be considered a good result given the high and
uncertain volatility.
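The evaluation measures used in this section (the MSE loss, accuracy in Eq. (17), and the F1 score in Eq. (18)) can be sketched as follows. Precision and recall use their standard definitions, TP / (TP + FP) and TP / (TP + FN); the function names and the confusion-matrix counts are made up for illustration and are not results from the paper.

```python
from typing import Sequence


def mse(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of squared differences between predicted and actual values."""
    if len(pred) != len(actual) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Eq. (17): percentage of correct predictions."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Eq. (18): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # avoids division by zero when there are no true positives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
acc = accuracy(tp=40, tn=50, fp=5, fn=5)
f1 = f1_score(tp=40, fp=5, fn=5)
loss = mse([1.0, 0.0], [0.0, 0.0])
```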
The predictors by pattern were constructed using the training data by pattern, and the
optimum trading policies were selected by performing the integrated multiple simulation
presented in Lee (2007), applying the ‘trading policy selection set’ to each predictor. Here,
the integrated multiple simulations refer to the technique to find the optimal trading
policies best suited to a given predictive neural network. For example, when a prediction
is performed on the fund simulation set, the stocks and dates expected to rise by more than
10% are derived. The stocks predicted to rise are those whose neural network output exceeds
a threshold of 0.5. The optimum trading policy was found to have a profit realization rate
of 20%, a stop loss rate of −12%, and a holding period of 19 days.
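To make the trading policy concrete, the sketch below applies the reported parameters (20% profit realization, −12% stop loss, 19-day holding period) and the 0.5 output threshold to a single position. It is not the integrated multiple simulation itself: evaluating exits on daily closing prices and selling at the close of the last holding day are assumptions, and the function names and prices are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def select_signals(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep the stocks whose network output exceeds the threshold."""
    return [name for name, score in scores.items() if score > threshold]


def simulate_trade(entry_price: float, closes: Sequence[float],
                   take_profit: float = 0.20, stop_loss: float = -0.12,
                   max_hold: int = 19) -> Tuple[float, int]:
    """Return (realized return, days held) for one position.

    Assumes exits are checked on each day's closing price and that the
    position is closed at the last available close within `max_hold` days.
    """
    if not closes or max_hold < 1:
        raise ValueError("need at least one closing price and max_hold >= 1")
    ret, day = 0.0, 0
    for day, price in enumerate(closes[:max_hold], start=1):
        ret = (price - entry_price) / entry_price
        if ret >= take_profit or ret <= stop_loss:
            break  # profit realization or stop loss triggered
    return ret, day


# Made-up examples for illustration only.
picked = select_signals({"A": 0.7, "B": 0.4})
win = simulate_trade(100.0, [105.0, 110.0, 121.0, 90.0])
loss_trade = simulate_trade(100.0, [95.0, 88.0])
timeout = simulate_trade(100.0, [101.0] * 30)
```

In the first example the position is sold on day 3 once the return passes 20%; in the last one neither limit is reached, so it is closed after 19 days.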
The results of this experiment were compared with similar studies using other filtering
algorithms to derive the results shown in Table 3. The highly volatile filtering algorithm
defines the pattern of fluctuations in stock prices using the concept of upper limits. The
remaining three algorithms used for comparison are ‘Resisted plunge filtering’, ‘Nosedive
filtering’, and ‘Rise stock filtering’. ‘Resisted plunge’ refers to the type in which an
ascending stock drops for a short period. ‘Nosedive’ literally means a slump; in this case,
it collects stocks that have shown a period of collapse. ‘Rise stock’ filtering represents a
long-term upward trend (Song & Lee, 2018).
Experiments were conducted using the same data and model structure. The experimental
results showed that Resisted plunge filtering achieved 72.39% accuracy, Nosedive filtering
achieved 75.11% accuracy, and Rise stock filtering achieved 64.93% accuracy. In contrast, the
highly volatile pattern filtering algorithm achieved 96.23% accuracy, which is higher than
that of the other filtering algorithms.
Additionally, to supplement the stock price prediction performance of the highly volatile
model proposed in this paper, the fund simulation results were compared with the Nikkei
225, NYSE, and NASDAQ, which are representative stock indices in Japan and the United
States. The results are shown together with the domestic stock index in Fig. 8, which plots
the percentage of returns from each asset. Even when the other indices fell, the highly
volatile stock price prediction model showed a steady upward curve, and it earned the
highest cumulative return.
The reasons that the trading system proposed in the paper achieved better trading
performance than the domestic and overseas stock indices are as follows: first, including the
microscopic price change processes of the most recent three days in the training input
features helped train the neural network; second, defining the scope of each
pattern with stricter constraints than the moving average patterns seems
to have contributed to the improvement of learning performance as well as the ultimate
trading performance.
CONCLUSIONS
This paper constructed a pattern-based stock trading system which learned data
corresponding to the three highly volatile stock price patterns and utilized that data
for trading. The highly volatile stock price pattern can be observed over a long period of
time and almost guarantees a short-term rise after the pattern occurs.
The significance of this study is the development of a stock price prediction model that
exceeds market indices to overcome the continued freezing of interest rates in Korea, Japan,
and the US. Also, the results of this study can help investors who fail to invest in stocks due
to the information gap. If special analysis techniques and indicators such as high volatility
patterns are proven to be effective through this research method, individual investors can
use these methods in the future. In addition, a number of other patterns of variation can
Funding
This work was funded by the Sungshin Women’s University Research Grant no. H20200069.
The funders had no role in study design, data collection and analysis, decision to publish,
or preparation of the manuscript.
Grant Disclosures
The following grant information was disclosed by the author:
Sungshin Women’s University Research: H20200069.
Competing Interests
The authors declare there are no competing interests.
Author Contributions
• Jangmin Oh conceived and designed the experiments, performed the experiments,
analyzed the data, performed the computation work, prepared figures and/or tables,
authored or reviewed drafts of the paper, and approved the final draft.
Data Availability
The following information was supplied regarding data availability:
The data and source codes are available at GitHub:
https://github.com/chfhdahwk/Hivol_Stock_Prediction.git.