2023.3.3
Int. J. Advance Soft Compu. Appl, Vol. 15, No. 3, November 2023
Print ISSN: 2710-1274, Online ISSN: 2074-8523
Copyright © Al-Zaytoonah University of Jordan (ZUJ)
Abstract:
Predicting stock prices is a complex and demanding problem in finance. The emergence
of artificial intelligence (AI) and machine learning (ML) methods has increased the
importance of stock price prediction for investors, traders, and financial experts.
This study presents a comparative evaluation of ML algorithms for stock price
prediction. We assess the performance of multiple algorithms, including Linear
Regression, Ridge Regression, Lasso Regression, Random Forest Regression, and
Gradient Boosting Regression, on a dataset of historical stock prices sourced from
Yahoo Finance. Our findings show that the Gaussian Process Regressor outperforms
the other algorithms, with an R-squared value of 1.00. We also discuss the role of
feature engineering and preprocessing techniques in improving the accuracy of
prediction models. The study offers insights into the use of AI in the financial
domain, with the potential to inform investment and trading strategies.
1. Introduction
The stock market is a central part of the economy and has a major influence on the
financial welfare of both individuals and organizations. It is a complex system affected
by many factors, including economic indicators, company performance, market sentiment,
and geopolitical events. Given the volatile nature of the stock market, investors and
traders have long sought ways to forecast stock prices accurately.
Historically, stock analysis relied on fundamental, technical, and quantitative approaches.
These methods have limitations, and accurate stock prediction remains a difficult
task. Yet, owing to the escalating data accessibility and the
2. Problem Statement
This research addresses the difficulty of achieving accurate stock price predictions by
conventional means. Stock prices are highly volatile and depend on many variables,
including economic indicators, news events, and investor sentiment, so forecasting them
is inherently complex. Consequently, traditional approaches such as regression analysis
frequently fail to provide reliable predictions. To address this problem, the primary goal
of this study is to assess and compare the effectiveness of various machine learning
algorithms for stock price prediction.
3. Objectives
The primary aim of this research is to evaluate and contrast the efficacy of distinct machine
learning algorithms in the task of forecasting stock prices. More specifically, the study
seeks to:
• Evaluate the accuracy and reliability of various machine learning algorithms in
predicting stock prices.
• Determine which machine learning algorithm(s) perform best in predicting stock
prices.
• Identify the most important features that influence stock prices and their relative
importance in the prediction models.
• Provide insights and recommendations to investors and financial analysts on the
best machine learning techniques to use in predicting stock prices.
• How can the accuracy of stock price prediction be improved by incorporating
ensemble methods or alternative techniques?
The research questions are structured to align with the study's overall objectives.
Each question focuses on a distinct aspect of the objectives and aims to deepen
understanding of the relationship between stock prices and the factors that influence
them. Answering these questions supports the study's objectives and contributes to the
field of AI-based stock price prediction.
5. Literature Review
5.1 Previous Studies
Forecasting stock prices has long been a prominent topic in finance and economics.
Conventional approaches, including fundamental and technical analysis, have been widely
used to predict stock prices. However, the rise of artificial intelligence (AI) has made
machine learning (ML) algorithms increasingly popular for stock price prediction.
Numerous studies have examined the use of ML algorithms for stock price prediction.
One study [1] compared the performance of several ML algorithms, namely Support
Vector Classifier (SVC), Decision Tree (DT), Random Forest (RF), Adaboost, XGBoost,
and Logistic Regression (LR). Logistic Regression outperformed the other algorithms in
both accuracy and efficiency. The accuracy scores achieved by the models were as
follows: Decision Tree (68.46), Random Forest (72.18), Adaboost (72.31), XGBoost
(71.67), SVC (73.17), and Logistic Regression (76.67).
Another study [2] proposed a deep learning model for stock price prediction that
combines a convolutional neural network (CNN) with a long short-term memory (LSTM)
network. This model achieved a notably higher accuracy (88.00) than traditional machine
learning models, underscoring the potential of deep learning for stock price prediction.
Similarly, the study in [3] introduced a set of machine learning models that combined
the wavelet transform with machine learning algorithms to predict stock prices. The
algorithms were Support Vector Machine (SVM), Random Forest, K-Nearest Neighbor
(KNN), Naive Bayes, and Softmax. The accuracies achieved by each model were: Support
Vector Machine (75.98), Random Forest (80.55), K-Nearest Neighbor (77.00), Naive
Bayes (70.80), and Softmax (64.74). The Random Forest model outperformed the others,
highlighting the importance of feature extraction in stock price prediction.
Predicting Stock Prices using Artificial … 44
However, the consensus is that there remains room for improvement in both accuracy
and efficiency. The central aim of this study is therefore to compare the performance of
various machine learning algorithms and identify the most accurate and efficient
algorithm for stock price prediction.
6. Methodology
Based on the research questions and objectives, the methodology for this study will involve
the following steps:
6.1 Data collection
In this step, historical stock price data for the Yahoo Finance companies was collected
from the Kaggle repository. The dataset is called “Time Series Forecasting with Yahoo
Stock Price” and consists of 1984 samples [29].
The data covers the period from 23 November 2015 to 20 November 2020.
The dataset includes six features containing information about the stock prices for a
given date. These columns are:
• High: The highest price the stock reached on that date.
• Low: The lowest price the stock reached on that date.
• Open: The price at which the stock started trading on that date.
• Close: The price at which the stock finished trading on that date.
• Volume: The total trading activity on that date.
• AdjClose: The closing price adjusted for corporate actions such as dividends, stock
splits, and new share issuance.
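The column definitions above imply a simple consistency check: on any date, High should be at least as large as both Open and Close, and Low should be no larger than either. The following is a minimal sketch of loading such a file and flagging rows that violate this, using only the Python standard library. The `Date` column name, the ISO date format, the helper names, and the sample values are assumptions made for illustration; they are not taken from the dataset.

```python
import csv
import io
from datetime import date

# Price columns as listed above; "Date" and its ISO format are assumptions.
PRICE_COLUMNS = ("High", "Low", "Open", "Close", "Volume", "AdjClose")


def read_price_rows(handle):
    """Parse CSV text into a list of dicts sorted by date."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in ("Date",) + PRICE_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for record in reader:
        row = {"Date": date.fromisoformat(record["Date"])}
        for name in PRICE_COLUMNS:
            row[name] = float(record[name])
        rows.append(row)
    # Time-series models need the observations in chronological order.
    rows.sort(key=lambda row: row["Date"])
    return rows


def find_inconsistent_rows(rows):
    """Return rows whose High/Low do not bound both Open and Close."""
    return [
        row for row in rows
        if row["High"] < max(row["Open"], row["Close"])
        or row["Low"] > min(row["Open"], row["Close"])
    ]


# Made-up sample values for illustration only; not taken from the dataset.
SAMPLE_CSV = """Date,High,Low,Open,Close,Volume,AdjClose
2015-11-24,10.9,10.1,10.2,10.8,1200,10.8
2015-11-23,10.5,9.8,10.0,10.2,1000,10.2
2015-11-25,10.7,10.6,10.8,10.65,900,10.65
"""

rows = read_price_rows(io.StringIO(SAMPLE_CSV))
bad = find_inconsistent_rows(rows)
print(len(rows), [row["Date"].isoformat() for row in bad])
```

In this made-up sample the third row is flagged because its High is below its Open. With a real file, `read_price_rows(open(path, newline=""))` would take the place of the `io.StringIO` wrapper.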
The final model will be used to forecast future stock prices for the selected
companies. The accuracy of these forecasts will be evaluated and the outcomes
analyzed in detail. Subsequently, the obtained results
47 B. Abunasser et al.
Table 2: Summary of the results obtained during the testing of the machine learning
algorithms
Machine Learning algorithm   R^2       Adjusted R^2   MSE      MAE      RMSE     Accuracy
LGBM Regressor                0.7750    0.7715        0.0008   0.0192   0.0284    0.7402
XGB Regressor                 0.8619    0.8597        0.0005   0.0169   0.0223    0.8132
CatBoost Regressor           -0.9883   -1.0195        0.0071   0.0755   0.0846    0.7931
Ridge                         0.9726    0.9721        0.0001   0.0084   0.0099    0.9909
Lasso                        -5.8202   -5.9271        0.0220   0.1380   0.1483   -4.6712
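The error columns of Table 2 can be computed directly from a model's predictions. The following is a minimal sketch in plain Python, not the authors' implementation. The function name `regression_metrics` and the sample values are hypothetical, and the adjusted R^2 uses the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1), which is assumed here rather than confirmed by the text. The Accuracy column is omitted because its definition is not given. A negative R^2, as in the CatBoost Regressor and Lasso rows, means the model's squared error on the test set is larger than that of always predicting the mean of the test targets.

```python
import math


def regression_metrics(actual, predicted, n_features):
    """Return R^2, adjusted R^2, MSE, MAE and RMSE for one model."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal length")
    if n - n_features - 1 <= 0:
        raise ValueError("need more samples than features + 1 for adjusted R^2")

    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")

    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    # Standard adjustment for the number of predictors (assumed formula).
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {
        "r2": r2,
        "adjusted_r2": adjusted_r2,
        "mse": mse,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(mse),
    }


# Hypothetical values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
good = regression_metrics(actual, [1.1, 1.9, 3.2, 3.8, 5.0], n_features=1)
# A constant prediction far from the mean gives a negative R^2.
poor = regression_metrics(actual, [10.0] * 5, n_features=1)
print(good)
print(poor["r2"])
```

Because RMSE is the square root of MSE, the two columns can be cross-checked against each other: for example, the LGBM Regressor row's MSE of 0.0008 corresponds to an RMSE of about 0.028.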
Fig.1: Comparisons between actual and predicted Yahoo prices in all machine learning
algorithms. Panels: LGBMRegressor, XGBRegressor, CatBoostRegressor, Ridge, Lasso,
LinearRegression, DecisionTreeRegressor, RandomForestRegressor, GaussianMixture,
GradientBoostingRegressor, LinearSVR, NuSVR.
9. Conclusion:
We present a comparative analysis of Machine Learning algorithms for predicting
Yahoo stock prices using Artificial Intelligence. The algorithms evaluated include Linear
Regression, Ridge Regression, Lasso Regression, Random Forest Regression, and
Gradient Boosting Regression. The evaluation was conducted on a dataset of historical
stock prices sourced from Yahoo Finance.
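A comparison of this kind can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions, not the pipeline used in the study: the data are synthetic (a random-walk closing price produced by the hypothetical helper `make_synthetic_prices`), the 80/20 chronological split and the default hyperparameters are assumptions, and only the five regressors named above are included.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def make_synthetic_prices(n_days=300, seed=0):
    """Hypothetical stand-in for real data: random-walk Close with Open/High/Low."""
    rng = np.random.default_rng(seed)
    close = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n_days))
    open_ = close + rng.normal(0.0, 0.5, n_days)
    high = np.maximum(open_, close) + np.abs(rng.normal(0.0, 0.3, n_days))
    low = np.minimum(open_, close) - np.abs(rng.normal(0.0, 0.3, n_days))
    return np.column_stack([open_, high, low]), close


def compare_regressors(features, target, test_fraction=0.2):
    """Fit each model on the earliest rows and score it on the latest rows."""
    split = int(len(target) * (1.0 - test_fraction))
    x_train, x_test = features[:split], features[split:]
    y_train, y_test = target[:split], target[split:]

    # Default hyperparameters throughout; these are assumptions, not tuned values.
    models = {
        "LinearRegression": LinearRegression(),
        "Ridge": Ridge(),
        "Lasso": Lasso(),
        "RandomForestRegressor": RandomForestRegressor(random_state=0),
        "GradientBoostingRegressor": GradientBoostingRegressor(random_state=0),
    }
    results = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        predicted = model.predict(x_test)
        mse = mean_squared_error(y_test, predicted)
        results[name] = {
            "r2": r2_score(y_test, predicted),
            "mse": mse,
            "mae": mean_absolute_error(y_test, predicted),
            "rmse": mse ** 0.5,
        }
    return results


features, target = make_synthetic_prices()
results = compare_regressors(features, target)
for name, scores in results.items():
    print(f"{name:28s} R^2={scores['r2']:.4f}  RMSE={scores['rmse']:.4f}")
```

The split is chronological rather than random so that the test period lies strictly after the training period, which avoids using future prices to predict past ones. Scores on this synthetic series say nothing about the results reported for the real data.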
Based on the results of our comparative study, we conclude that the Gaussian Process
Regressor outperformed the other algorithms. It achieved the highest R^2 value (1.000),
along with the lowest MSE (0.000001) and the lowest MAE (0.000002). These results
indicate that the Gaussian Process Regressor is a dependable and effective tool for stock
price prediction using artificial intelligence.
Conversely, our analysis showed that certain algorithms, including CatBoostRegressor
and GaussianMixture, performed poorly in our study. These algorithms should therefore
be used with caution for stock price prediction with machine learning.
In summary, our study supports the practicality of machine learning algorithms as tools
for stock price prediction. Nonetheless, the key consideration remains the careful
selection of the most suitable algorithm for the specific problem and dataset. Future
research could develop and validate alternative machine learning algorithms and
integrate additional features to improve the accuracy of stock price predictions.
References
1. B. S. Abunasser, M. R. J. AL-Hiealy, I. S. Zaqout, and S. S. Abu-Naser, "Breast
cancer detection and classification using deep learning Xception algorithm,"
International Journal of Advanced Computer Science and Applications, vol. 13,
no. 7, 2022.
2. B. S. Abunasser, S. M. Daud, I. Zaqout, and S. S. Abu-Naser, "Abunaser-A Novel
Data Augmentation Algorithm For Datasets With Numerical Features," Journal of
Theoretical and Applied Information Technology, vol. 101, no. 11, 2023.
3. B. S. Abunasser, S. M. Daud, I. Zaqout, and S. S. Abu-Naser, "Convolution Neural
Network for Breast Cancer Detection and Classification–Final Results," Journal of
Theoretical and Applied Information Technology, vol. 101, no. 1, pp. 315-329,
2023.
4. B. S. Abunasser, M. R. J. AL-Hiealy, A. M. Barhoom, A. R. Almasri, and S. S.
Abu-Naser, "Prediction of instructor performance using machine and deep learning
techniques," International Journal of Advanced Computer Science and
Applications, vol. 13, no. 7, 2022.
5. B. S. Abunasser, M. R. J. Al-Hiealy, I. S. Zaqout, and S. S. Abu-Naser,
"Convolution Neural Network for Breast Cancer Detection and Classification
Using Deep Learning," Asian Pacific Journal of Cancer Prevention, vol. 24, no. 2,
p. 531, 2023.
6. M. M. Alayoubi, Z. M. Arekat, M. J. Al Shobaki, and S. S. Abu-Naser, "The impact
of work stress on job performance among nursing staff in Al-Awda Hospital,"
Foundations of Management, vol. 14, no. 1, pp. 87-108, 2022.
7. Z. K. Alkayyali, S. A. B. Idris, and S. S. Abu-Naser, "A New Algorithm for Audio
Files Augmentation," Journal of Theoretical and Applied Information Technology,
vol. 101, no. 12, 2023.
8. Z. K. Alkayyali, S. A. B. Idris, and S. S. Abu-Naser, "A Systematic Literature
Review of Deep and Machine Learning Algorithms in Cardiovascular Diseases
Diagnosis," Journal of Theoretical and Applied Information Technology, vol. 101,
no. 4, pp. 1353-1365, 2023.
9. A. Almasri, T. Obaid, M. S. Abumandil, B. Eneizan, A. Y. Mahmoud, and S. S.
Abu-Naser, "Mining Educational Data to Improve Teachers’ Performance," in
International Conference on Information Systems and Intelligent Applications,
Cham: Springer International Publishing, 2022, pp. 243-255.
10. A. R. Almasri, N. A. Yahaya, and S. S. Abu-Naser, "Instructor Performance
Modeling For Predicting Student Satisfaction Using Machine Learning-
Preliminary Results," Journal of Theoretical and Applied Information Technology,
vol. 100, pp. 5481-96, 2022.
11. S. M. Arqawi, M. A. Abu Rumman, and E. A. Zitawi, "Predicting Employee
Attrition and Performance Using Deep Learning," Journal of Theoretical and
Applied Information Technology, vol. 100, 2022.
12. S. M. Arqawi, E. A. Zitawi, A. H. Rabaya, B. S. Abunasser, and S. S. Abu-Naser,
"Predicting university student retention using artificial intelligence," International
Journal of Advanced Computer Science and Applications, vol. 13, no. 9, 2022.
13. A. M. Barhoom, M. R. J. Al-Hiealy, and S. S. Abu-Naser, "Bone Abnormalities
Detection and Classification Using Deep Learning-Vgg16 Algorithm," Journal of
Theoretical and Applied Information Technology, vol. 100, no. 20, pp. 6173-6184,
2022.
14. A. M. Barhoom, M. R. J. Al-Hiealy, and S. S. Abu-Naser, "Deep Learning-
Xception Algorithm for Upper Bone Abnormalities Classification," Journal of
Theoretical and Applied Information Technology, vol. 100, no. 23, pp. 6986-6997,
2022.
15. B. Y. El-Habil and S. S. Abu-Naser, "Global climate prediction using deep
learning," Journal of Theoretical and Applied Information Technology, vol. 100, 2022.
16. K. P. Ferentinos, "Deep learning models for plant disease detection and diagnosis,"
Computers and Electronics in Agriculture, vol. 145, pp. 311-318, 2018.
17. D. Justus, J. Brennan, S. Bonner, and A. S. McGough, "Predicting the
computational cost of deep learning models," in 2018 IEEE International
Conference on Big Data (Big Data), 2018, pp. 3873-3882.
18. Y. J. Kim and J. H. Kim, "Stock price prediction using machine learning
algorithms," Journal of Physics: Conference Series, vol. 1238, no. 1, p. 012068,
2019.
19. I. Kumar, K. Dogra, C. Utreja, and P. Yadav, "A comparative study of supervised
machine learning algorithms for stock market trend prediction," in 2018 Second
International Conference on Inventive Communication and Computational
Technologies (ICICCT), 2018, pp. 1003-1007.
20. I. Kumar, K. Dogra, C. Utreja, and P. Yadav, "A Comparative Study of Supervised
Machine Learning Algorithms for Stock Market Trend Prediction," in Proceedings
of the International Conference on Inventive Communication and Computational
Technologies, ICICCT 2018, pp. 1003–1007, 2018. doi:
10.1109/ICICCT.2018.8473214.
21. M. Nabipour, P. Nayyeri, H. Jabani, S. Shahab, and A. Mosavi, "Predicting stock
market trends using machine learning and deep learning algorithms via continuous
and binary data; a comparative analysis," IEEE Access, vol. 8, pp. 150199-150212,
2020.
22. H. Li, Y. Shen, and Y. Zhu, "Stock price prediction using attention-based multi-
input LSTM," in Asian Conference on Machine Learning, 2018, pp. 454-469.
PMLR.
23. A. Saleh, R. Sukaik, and S. S. Abu-Naser, "Brain tumor classification using deep
learning," in 2020 International Conference on Assistive and Rehabilitation
Technologies (iCareTech), 2020, pp. 131-136. IEEE.
24. A. M. Taha, D. S. B. B. Ariffin, and S. S. Abu-Naser, "A Systematic Literature
Review of Deep and Machine Learning Algorithms in Brain Tumor and Meta-
Analysis," Journal of Theoretical and Applied Information Technology, vol. 101,
no. 1, pp. 21-36, 2023.
25. A. M. Taha, D. S. B. B. Ariffin, and S. S. Abu-Naser, "Investigating the Effects of
Data Augmentation Techniques on Brain Tumor Detection Accuracy," Journal of
Theoretical and Applied Information Technology, vol. 101, no. 11, 2023.
26. Q. M. Zarandah, S. M. Daud, and S. S. Abu-Naser, "A Systematic Literature
Review Of Machine and Deep Learning-Based Detection And Classification
Methods for Diseases Related To the Respiratory System," Journal of Theoretical
and Applied Information Technology, vol. 101, no. 4, pp. 1273-1296, 2023.
27. Q. M. Zarandah, S. M. Daud, and S. S. Abu-Naser, "Spectrogram Flipping: A New
Technique For Audio Augmentation," Journal of Theoretical and Applied
Information Technology, vol. 101, no. 11, 2023.
28. X. Zhang, Y. Zhang, S. Wang, Y. Yao, B. Fang, and S. Y. Philip, "Improving stock
market prediction via heterogeneous information fusion," Knowledge-Based
Systems, vol. 143, pp. 236-247, 2018.
29. www.kaggle.com, last visited 10/01/2023.