Prediction of Oil–Water Two-Phase Flow Patterns Based on Bayesian Optimisation of the XGBoost Algorithm
Figures
Figure 1. Flowchart of Bayesian optimisation of the XGBoost model.
Figure 2. XGBoost training flow chart.
Figure 3. Schematic of oil–water flow patterns (left) and the corresponding photographs (right).
Figure 4. Schematic of the experimental setup: 1. simulation well; 2. well inclination regulator; 3. oil–water mixer; 4, 5. position control valves; 6, 7. flow meters; 8. water pump; 9. oil pump; 10. water tank; 11. oil tank; 12. oil–water separation tank.
Figure 5. Confusion matrix of XGBoost predictions on the training set: (a) non-normalised data; (b) normalised data.
Figure 6. Confusion matrix of XGBoost predictions on the test set: (a) non-normalised data; (b) normalised data.
Figure 7. Scatter plot of XGBoost flow-pattern predictions for the training and test sets.
Figure 8. Confusion matrix of BO-XGBoost predictions on the training set: (a) non-normalised data; (b) normalised data.
Figure 9. Confusion matrix of BO-XGBoost predictions on the test set: (a) non-normalised data; (b) normalised data.
Figure 10. Scatter plot of BO-XGBoost flow-pattern predictions for the training and test sets.
Figure 11. XGBoost ROC curves.
Figure 12. BO-XGBoost ROC curves.
Figure 13. Flow-pattern prediction accuracy statistics.
Figure 14. Feature importance plot.
Figure 15. Global feature explanation plot.
Abstract
1. Introduction
2. Algorithm Principle
2.1. XGBoost Algorithm
2.2. Bayesian Optimisation Algorithm
- Initialise the model by randomly selecting several sets of hyperparameters as observation points.
- Use a probabilistic surrogate model to estimate the objective function.
- Use the acquisition function to determine the next observation point and substitute it into the objective function to obtain its observation value.
- Add the newly obtained observation to the historical dataset and update the probabilistic surrogate model, repeating until the evaluation budget is exhausted (a minimal code sketch of this loop is given below).
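To make the loop concrete, a minimal sketch in Python is shown below. It is illustrative only and not the authors' implementation: it assumes scikit-learn's GaussianProcessRegressor as the probabilistic surrogate, an expected-improvement acquisition evaluated on a random candidate pool, and a toy one-dimensional objective.

```python
# Minimal Bayesian-optimisation loop (illustrative sketch, not the authors' code).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):                              # toy objective f(x) to be maximised
    return -(x - 0.3) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))             # step 1: random initial observation points
y = np.array([objective(v[0]) for v in X])

for _ in range(20):                            # repeat steps 2-4 until the budget is spent
    gp = GaussianProcessRegressor().fit(X, y)  # step 2: Gaussian-process surrogate
    cand = rng.uniform(0, 1, size=(256, 1))    # random candidate pool
    mu, sigma = gp.predict(cand, return_std=True)
    imp = mu - y.max()                         # step 3: expected-improvement acquisition
    ei = imp * norm.cdf(imp / (sigma + 1e-9)) + sigma * norm.pdf(imp / (sigma + 1e-9))
    x_next = cand[np.argmax(ei)]
    y_next = objective(x_next[0])              # evaluate the true objective
    X = np.vstack([X, x_next])                 # step 4: add the observation to the dataset
    y = np.append(y, y_next)

print("best x:", X[np.argmax(y)].round(3), "best f(x):", round(y.max(), 4))
```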
3. Method Application
3.1. Data Preprocessing
3.2. Bayesian Optimisation XGBoost
- Define the objective function: The objective function was established as the mean accuracy of 10-fold cross-validation, with the maximum number of iterations for the Bayesian optimisation algorithm set to 200.
- Initial observation point selection: Within the predefined search ranges of the XGBoost model’s hyperparameters (such as n_estimators, learning_rate, gamma, max_depth, and subsample), several sets of hyperparameters were randomly selected as initial observation points. These points were used to train the model and obtain the initial distribution of the objective function and the initial observation set D.
- Gaussian process estimation: Based on the observation set D, a Gaussian process was employed as the probabilistic surrogate model to estimate the objective function (a Gaussian process assumes that any finite set of function values follows a joint Gaussian distribution, which allows the objective function to be modelled and predicted probabilistically).
- Acquisition function calculation: The acquisition function was used to select the next observation point and to compute its corresponding observation value, i.e., the model’s cross-validated prediction accuracy.
- Update the observation set: The new observation point was added to the historical observation set D, and the Gaussian process surrogate model was updated.
- Iteration judgment: If the maximum number of iterations had not been reached, the Gaussian process estimation, acquisition function calculation, and observation set update steps were repeated; once it had been reached, the optimal hyperparameter combination and the corresponding optimal objective value were output, and the model’s performance was evaluated on the testing set (an illustrative code sketch of this procedure follows the list).
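As an illustration of the procedure above (a sketch under stated assumptions, not the authors' released code), the snippet below couples the xgboost scikit-learn wrapper with the bayesian-optimization package; the search ranges mirror Table 3, the objective is the mean accuracy of 10-fold cross-validation, and a synthetic dataset stands in for the experimental data.

```python
# Hedged sketch of the BO-XGBoost workflow; the package choice, data, and wiring are
# assumptions for illustration, with search ranges taken from Table 3.
from bayes_opt import BayesianOptimization          # pip install bayesian-optimization
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Synthetic stand-in for the normalised experimental features and encoded flow patterns.
X_train, y_train = make_classification(n_samples=300, n_features=4, n_informative=3,
                                        n_redundant=0, n_classes=5,
                                        n_clusters_per_class=1, random_state=0)

def cv_accuracy(n_estimators, learning_rate, max_depth, subsample,
                colsample_bytree, gamma, alpha, min_child_weight):
    model = XGBClassifier(
        n_estimators=int(n_estimators), learning_rate=learning_rate,
        max_depth=int(max_depth), subsample=subsample,
        colsample_bytree=colsample_bytree, gamma=gamma,
        reg_alpha=alpha, min_child_weight=min_child_weight,
        eval_metric="mlogloss",
    )
    # Objective: mean accuracy of 10-fold cross-validation (step 1).
    return cross_val_score(model, X_train, y_train, cv=10, scoring="accuracy").mean()

bounds = {                                           # search ranges as listed in Table 3
    "n_estimators": (100, 500), "learning_rate": (0.01, 0.3),
    "max_depth": (3, 15), "subsample": (0.5, 1.0),
    "colsample_bytree": (0.5, 1.0), "gamma": (0, 5),
    "alpha": (0, 10), "min_child_weight": (0, 10),
}

optimiser = BayesianOptimization(f=cv_accuracy, pbounds=bounds, random_state=1)
optimiser.maximize(init_points=10, n_iter=200)       # steps 2-6, up to 200 iterations
print(optimiser.max)                                 # best hyperparameters and accuracy
```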
3.3. XGBoost
4. Experiment
4.1. Experimental Design
4.2. Prediction Results Analysis
4.3. Model Interpretability and Feature Analysis
4.4. Limitations of the Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Wu, Y.; Guo, H.; Song, H.; Deng, R. Fuzzy inference system application for oil-water flow patterns identification. Energy 2022, 239, 122359. [Google Scholar] [CrossRef]
- Ohnuki, A.; Akimoto, H. Experimental study on transition of flow pattern and phase distribution in upward air-water two-phase flow along a large vertical pipe. Int. J. Multiph. Flow 2000, 26, 367–386. [Google Scholar] [CrossRef]
- Xu, X.X. Study on oil-water two-phase flow in horizontal pipelines. J. Pet. Sci. Eng. 2007, 59, 43–58. [Google Scholar] [CrossRef]
- Bannwart, A.C.; Rodriguez, O.M.; Trevisan, F.E.; Vieira, F.F.; De Carvalho, C.H. Experimental investigation on liquid-liquid-gas flow: Flow patterns and pressure-gradient. J. Pet. Sci. Eng. 2009, 65, 1–13. [Google Scholar] [CrossRef]
- Sun, Y.; Guo, H.; Liang, H.; Li, A.; Zhang, Y.; Zhang, D. A Comparative Study of Oil-Water Two-Phase Flow Pattern Prediction Based on the GA-BP Neural Network and Random Forest Algorithm. Processes 2023, 11, 3155. [Google Scholar] [CrossRef]
- Wang, Y.; Cai, Z.; Yu, L. Prediction Model for Goodwill Impairment Based on Machine Learning. Account. Res. 2024, 3, 51–64. [Google Scholar]
- Zhang, Y.; Liu, R.; Chen, H. Financial Crisis Prediction Model Based on Particle Swarm Optimization and Kernel Extreme Learning Machine. Stat. Decis. 2019, 35, 67–71. [Google Scholar] [CrossRef]
- Zhang, X. Enterprise Financial Distress Prediction Method Based on Subspace Multi-Kernel Learning. Oper. Manag. 2021, 30, 184–191. [Google Scholar]
- Sukpancharoen, S.; Katongtung, T.; Rattanachoung, N.; Tippayawong, N. Unlocking the potential of transesterification catalysts for biodiesel production through machine learning approach. Bioresour. Technol. 2023, 378, 128961. [Google Scholar] [CrossRef] [PubMed]
- Şahin, S. Comparison of machine learning algorithms for predicting diesel/biodiesel/iso-pentanol blend engine performance and emissions. Heliyon 2023, 9, e21365. [Google Scholar] [CrossRef] [PubMed]
- Tang, Q.; Wang, T. Productivity Prediction of Fractured Horizontal Wells Based on XGBoost. China Petrochem. Stand. Qual. 2023, 43, 15–17. [Google Scholar]
- Zhao, R.; Yang, L.; Xu, X.; Ma, W.; Li, J. Lithology Identification Method and Research of Volcanic Rocks Based on XGBoost Algorithm. Adv. Geophys. 2024, 1–12. Available online: http://kns.cnki.net/kcms/detail/11.2982.P.20240611.1227.017.html (accessed on 20 June 2024).
- Wu, J.; Chen, S.; Chen, X.; Zhou, R. Model Selection and Hyperparameter Optimization Based on Reinforcement Learning. J. Univ. Electron. Sci. Technol. China 2020, 49, 255–261. [Google Scholar]
- Chai, D.; Xu, S.; Luo, C.; Lu, Y. Object Accurate Localization of Remote Sensing Image Based on Bayesian Optimization. Remote Sens. Technol. Appl. 2020, 35, 1377–1385. [Google Scholar]
- Guo, L.; Wang, Y. Research on Prediction of Stored Grain Temperature Based on XGBoost Optimization Algorithm. Cereals Oils 2022, 35, 78–82. [Google Scholar]
- Zhou, X.; Wang, R.; Dai, Y.; Zhang, J.; Sun, Y. Classified Early Warning of Coal Spontaneous Combustion Based on BO-XGBoost. Coal Eng. 2022, 54, 108–114. [Google Scholar]
- Chen, T.Q.; Guestrin, C. XGBoost: A scalable tree boosting system. arXiv 2016, arXiv:1603.02754. Available online: http://arxiv.org/abs/1603.02754.pdf (accessed on 20 June 2024).
- Mockus, J. Application of Bayesian approach to numerical methods of global and stochastic optimization. J. Global Optim. 1994, 4, 347–365. [Google Scholar] [CrossRef]
- Pelikan, M. Bayesian Optimization Algorithm: From Single Level to Hierarchy. Ph.D. Thesis, University of Illinois at Urbana-Champaign, Urbana, IL, USA, 2002. [Google Scholar]
- Cui, J.; Yang, B. Survey on Bayesian optimization methodology and applications. J. Softw. 2018, 29, 3068–3090. (In Chinese) [Google Scholar]
- Luo, Y.; Wang, C.; Ye, W. Interpretable prediction model of acute kidney injury based on XGBoost and SHAP. J. Electron. Inf. Technol. 2022, 44, 27–38. [Google Scholar]
Fluid | Density (g/cm³) | Viscosity (mPa·s) | Surface Tension (mN/m)
---|---|---|---
Oil | 0.826 | 2.92 | 30.00
Water | 0.988 | 1.16 | 72.00
Flow Pattern | Schematic Diagram | Coding
---|---|---
Bubbly flow | — | 0
Emulsion flow | — | 1
Frothy flow | — | 2
Wavy flow | — | 3
Stratified flow | — | 4
Parameter | Search Scope | Optimal Parameters | Parameter Meanings |
---|---|---|---|
colsample_bytree | [0.5, 1.0] | 0.71 | Feature random sampling ratio |
learning_rate | [0.01, 0.3] | 0.23 | Learning rate |
max_depth | [3, 15] | 12 | Maximum tree depth |
n_estimators | [100, 500] | 200 | Number of decision trees |
subsample | [0.5, 1.0] | 0.79 | Sample sampling ratio |
gamma | [0, 5] | 1.0 | Minimum loss reduction required for a node split |
alpha | [0, 10] | 3.56 | L1 regularisation coefficient |
min_child_weight | [0, 10] | 0.3 | Minimum sum of instance weights in a child node |
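For reference, an XGBoost classifier configured with the optimal values from the table would look roughly as follows. This is a sketch assuming the xgboost scikit-learn wrapper, in which the alpha of the table corresponds to the reg_alpha argument.

```python
from xgboost import XGBClassifier

# Classifier configured with the optimal hyperparameters listed in Table 3
# (illustrative; training data and preprocessing are assumed to be prepared elsewhere).
best_model = XGBClassifier(
    colsample_bytree=0.71,
    learning_rate=0.23,
    max_depth=12,
    n_estimators=200,
    subsample=0.79,
    gamma=1.0,
    reg_alpha=3.56,               # "alpha" in Table 3
    min_child_weight=0.3,
    objective="multi:softprob",   # five encoded flow-pattern classes
    eval_metric="mlogloss",
)
# best_model.fit(X_train, y_train) would then train on the normalised feature set.
```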
Algorithm Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
XGBoost | 0.750 | 0.788 | 0.791 | 0.784 |
BO-XGBoost | 0.938 | 0.967 | 0.971 | 0.966 |
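The scores in the table can be reproduced with scikit-learn; the sketch below assumes macro averaging over the five flow-pattern classes, which is why precision, recall, and F1 need not coincide with accuracy.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# y_test / y_pred are tiny stand-ins; in practice they are the encoded flow patterns
# of the test set and the corresponding model predictions.
y_test = [0, 1, 2, 2, 3, 4, 0, 1]
y_pred = [0, 1, 2, 1, 3, 4, 0, 1]

scores = {
    "Accuracy":  accuracy_score(y_test, y_pred),
    "Precision": precision_score(y_test, y_pred, average="macro", zero_division=0),
    "Recall":    recall_score(y_test, y_pred, average="macro", zero_division=0),
    "F1 Score":  f1_score(y_test, y_pred, average="macro", zero_division=0),
}
print(scores)
```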
Inclination (°) | Flow Rate (m³/d) | Water Cut (%) | Actual Flow Pattern | XGBoost Prediction | BO-XGBoost Prediction | Accuracy (%)
---|---|---|---|---|---|---|
0 | 100 | 20 | bubble flow | emulsion flow | bubble flow | 93.75% |
0 | 300 | 40 | bubble flow | bubble flow | bubble flow | |
0 | 300 | 60 | bubble flow | bubble flow | bubble flow | |
0 | 600 | 80 | emulsion flow | frothy flow | emulsion flow | |
60 | 100 | 20 | bubble flow | emulsion flow | bubble flow | |
60 | 300 | 40 | emulsion flow | emulsion flow | emulsion flow | |
60 | 600 | 60 | frothy flow | frothy flow | frothy flow | |
60 | 600 | 90 | frothy flow | frothy flow | frothy flow | |
85 | 100 | 20 | wavy flow | wavy flow | wavy flow | |
85 | 300 | 40 | bubble flow | bubble flow | bubble flow | |
85 | 300 | 80 | frothy flow | bubble flow | bubble flow | |
85 | 600 | 90 | frothy flow | frothy flow | frothy flow |
90 | 100 | 20 | stratified flow | stratified flow | stratified flow | |
90 | 300 | 40 | frothy flow | frothy flow | frothy flow | |
90 | 600 | 60 | frothy flow | frothy flow | frothy flow | |
90 | 600 | 90 | frothy flow | frothy flow | frothy flow |