Abstract
The simultaneous prediction of multiple numeric outputs is called multi-target regression (MTR), a task that has gained attention in recent decades. It is a challenging research topic in supervised learning because it poses additional difficulties beyond traditional single-target regression (STR), and many real-world problems involve predicting several targets at once. One of the most successful approaches to MTR, although not the only one, consists in transforming the problem into several STR problems, whose outputs are then combined to build the MTR output. In this paper, the Rotation Forest ensemble method, previously proposed for single-label classification and single-target regression, is adapted to MTR tasks and tested with several regressors and data sets. Our proposal rotates the input space in an efficient and novel fashion, avoiding the extra rotations forced by MTR problem decomposition. Four MTR approaches are used: Single-Target (ST), Stacked Single-Target (SST), Ensembles of Regressor Chains (ERC), and Multi-target Regression via Quantization (MRQ). To assess the benefits of the proposal, a thorough experimental study on 28 MTR data sets, supported by statistical tests, is carried out, concluding that Rotation Forest, adapted by means of these approaches, outperforms other popular ensembles such as Bagging and Random Forest.
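As a minimal illustration of the problem-transformation idea the abstract describes, the sketch below implements the Single-Target (ST) decomposition: one independent regressor is fit per target column and the per-target predictions are stacked back into an MTR output. The regressor choice and synthetic data are placeholder assumptions, not the paper's exact experimental setup.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def st_fit_predict(X_train, Y_train, X_test):
    """Single-target (ST) decomposition of MTR: fit one
    independent regressor per target column, then stack the
    per-target predictions into a multi-target output matrix."""
    preds = []
    for t in range(Y_train.shape[1]):
        model = DecisionTreeRegressor(random_state=0)
        model.fit(X_train, Y_train[:, t])
        preds.append(model.predict(X_test))
    return np.column_stack(preds)

# Toy data: 5 input features, 2 targets (both linear in the inputs).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] - X[:, 3]])

Y_hat = st_fit_predict(X[:80], Y[:80], X[80:])
print(Y_hat.shape)  # one prediction column per target
```

The SST and ERC approaches used in the paper extend this scheme by feeding predicted (or preceding) targets back in as extra input features.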
Change history
25 July 2021
A Correction to this paper has been published: https://doi.org/10.1007/s13042-021-01354-0
Notes
If the number of features is not a multiple of 3, the last group is completed with previously selected features.
There can be repetitions among these chains, especially if the number of targets is low.
Code available at https://github.com/hfawaz/cd-diagram.
Over the results of all the data sets, neither the average ranks nor the Bayesian tests show an advantage; for particular data sets, however, the results were improved with other RotF methods.
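The feature-grouping rule in the first note above (groups of three features, with the last group padded from previously selected features when the count is not a multiple of three) can be sketched as follows. The helper name `make_feature_groups` is hypothetical, not the authors' code.

```python
import random

def make_feature_groups(n_features, group_size=3, seed=0):
    """Partition feature indices into groups of `group_size`,
    as in Rotation Forest's rotation step; if n_features is not
    a multiple of group_size, the last group is completed with
    features already assigned to earlier groups."""
    rng = random.Random(seed)
    idx = list(range(n_features))
    rng.shuffle(idx)
    groups = [idx[i:i + group_size]
              for i in range(0, n_features, group_size)]
    shortfall = group_size - len(groups[-1])
    if shortfall:
        pool = [f for g in groups[:-1] for f in g]
        groups[-1].extend(rng.sample(pool, shortfall))
    return groups

print(make_feature_groups(8))  # three groups of 3; last one padded
```

Each group is then rotated (e.g. via PCA on a bootstrap sample) independently, which is what gives Rotation Forest its diversity.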
References
Abraham Z, Tan PN, Winkler J, Zhong S, Liszewska M, et al (2013) Position preserving multi-output prediction. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 320–335
Adıyeke E, Baydoğan MG (2020) The benefits of target relations: a comparison of multitask extensions and classifier chains. Pattern Recogn 107:107507
Aho T, Ženko B, Džeroski S, Elomaa T (2012) Multi-target regression with rule ensembles. J Mach Learn Res 13(Aug):2367–2407
Aho T, Ženko B, Džeroski S (2009) Rule ensembles for multi-target regression. In: 2009 Ninth IEEE International Conference on Data Mining, pp 21–30. IEEE
Alvarez MA, Rosasco L, Lawrence ND et al (2012) Kernels for vector-valued functions: a review. Found Trends Mach Learn 4(3):195–266
Appice A, Džeroski S (2007) Stepwise induction of multi-target model trees. In: European Conference on Machine Learning. Springer, pp 502–509
Argyriou A, Evgeniou T, Pontil M (2008) Convex multi-task feature learning. Mach Learn 73(3):243–272
Argyriou A, Evgeniou T, Pontil M (2006) Multi-task feature learning. In: Advances in neural information processing systems. pp 41–48
Ayerdi B, Graña M (2014) Hybrid extreme rotation forest. Neural Networks 52:33–42
Benavoli A, Corani G, Mangili F (2016) Should we really use post-hoc tests based on mean-ranks? J Mach Learn Res 17(1):152–161
Benavoli A, Corani G, Demšar J, Zaffalon M (2017) Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis. J Mach Learn Res 18(77):1–36. http://jmlr.org/papers/v18/16-305.html
Blaser R, Fryzlewicz P (2016) Random rotation ensembles. J Mach Learn Res 17(1):126–151
Borchani H, Varando G, Bielza C, Larrañaga P (2015) A survey on multi-output regression. Wiley Interdiscip Rev 5(5):216–233
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Breiman L, Friedman JH (1997) Predicting multivariate responses in multiple linear regression. J Roy Stat Soc 59(1):3–54
Breskvar M, Kocev D, Džeroski S (2018) Ensembles for multi-target regression with random output selections. Mach Learn 107(11):1673–1709
Cai Z, Zhu W (2018) Multi-label feature selection via feature manifold learning and sparsity regularization. Int J Mach Learn Cybernet 9(8):1321–1334. https://doi.org/10.1007/s13042-017-0647-y
Caruana R (1994) Learning many related tasks at the same time with backpropagation. In: Advances in neural information processing systems. pp 657–664
Collobert R, Weston J (2008) A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th international conference on Machine learning. pp 160–167. ACM
De’Ath G (2002) Multivariate regression trees: a new technique for modeling species-environment relationships. Ecology 83(4):1105–1117
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7(Jan):1–30
Dua D, Graff C (2019) UCI machine learning repository. http://archive.ics.uci.edu/ml
Džeroski S, Demšar D, Grbović J (2000) Predicting chemical parameters of river water quality from bioindicator data. Appl Intell 13(1):7–17
Garcia S, Herrera F (2008) An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J Mach Learn Res 9(12):2677–2694
García-Pedrajas N, Maudes-Raedo J, García-Osorio C, Rodríguez-Díez JJ (2012) Supervised subspace projections for constructing ensembles of classifiers. Inf Sci 193:1–21
Ghosn J, Bengio Y (1997) Multi-task learning for stock selection. In: Advances in neural information processing systems. pp 946–952
Godbole S, Sarawagi S (2004) Discriminative methods for multi-labeled classification. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 22–30
Goovaerts P et al (1997) Geostatistics for natural resources evaluation. Oxford University Press on Demand
Hatzikos EV, Tsoumakas G, Tzanis G, Bassiliades N, Vlahavas I (2008) An empirical study on sea water quality prediction. Knowl-Based Syst 21(6):471–478
Herrera F, Charte F, Rivera AJ, Del Jesus MJ (2016) Multilabel classification. In: Multilabel classification. Springer, pp 17–31
Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844
Ismail Fawaz H, Forestier G, Weber J, Idoumghar L, Muller PA (2019) Deep learning for time series classification: a review. Data Min Knowl Disc 33(4):917–963
Izenman AJ (1975) Reduced-rank regression for the multivariate linear model. J Multivar Anal 5(2):248–264
Jalali A, Sanghavi S, Ruan C, Ravikumar PK (2010) A dirty model for multi-task learning. Adv Neural Inf Process Syst 23:964–972
Jeong JY, Kang JS, Jun CH (2020) Regularization-based model tree for multi-output regression. Inf Sci 507:240–255
Juez-Gil M (2020) mjuez/baycomp_plotting. https://doi.org/10.5281/zenodo.4244542
Kaggle (2012) Kaggle competition: online product sales. https://www.kaggle.com/c/online-sales
Kaggle (2013) Kaggle competition: see click predict fix. https://www.kaggle.com/c/see-click-predict-fix
Karalič A, Bratko I (1997) First order regression. Mach Learn 26(2–3):147–176
Kocev D, Vens C, Struyf J, Džeroski S (2013) Tree ensembles for predicting structured outputs. Pattern Recogn 46(3):817–833
Kocev D, Vens C, Struyf J, Džeroski S (2007) Ensembles of multi-objective decision trees. In: European conference on machine learning. Springer, pp 624–631
Kordos M, Arnaiz-González Á, García-Osorio C (2019) Evolutionary prototype selection for multi-output regression. Neurocomputing 358:309–320
Kuncheva LI (2014) Combining pattern classifiers: methods and algorithms, 2nd edn. Wiley
Latorre Carmona P, Sotoca JM, Pla F (2012) Filter-type variable selection based on information measures for regression tasks. Entropy 14(2):323–343
Li H, Zhang W, Chen Y, Guo Y, Li GZ, Zhu X (2017) A novel multi-target regression framework for time-series prediction of drug efficacy. Sci Rep 7:40652
Mastelini SM, da Costa VGT, Santana EJ, Nakano FK, Guido RC, Cerri R, Barbon S (2019) Multi-output tree chaining: an interpretative modelling and lightweight multi-target approach. J Signal Process Syst 91(2):191–215
Melki G, Cano A, Kecman V, Ventura S (2017) Multi-target support vector regression via correlation regressor chains. Inf Sci 415:53–69
Mitrović T, Antanasijević D, Lazović S, Perić-Grujić A, Ristić M (2019) Virtual water quality monitoring at inactive monitoring sites using Monte Carlo optimized artificial neural networks: a case study of the Danube River (Serbia). Sci Total Environ 654:1000–1009
Nunes M, Gerding E, McGroarty F, Niranjan M (2019) A comparison of multitask and single task learning with artificial neural networks for yield curve forecasting. Expert Syst Appl 119:362–375
Obozinski G, Taskar B, Jordan MI (2010) Joint covariate selection and joint subspace selection for multiple classification problems. Stat Comput 20(2):231–252
Pardo C, Diez-Pastor JF, García-Osorio C, Rodríguez JJ (2013) Rotation forests for regression. Appl Math Comput 219(19):9914–9924
Petković M, Kocev D, Džeroski S (2020) Feature ranking for multi-target regression. Mach Learn 109(6):1179–1204
Pham BT, Bui DT, Prakash I, Dholakia M (2016) Rotation forest fuzzy rule-based classifier ensemble for spatial prediction of landslides using GIS. Nat Hazards 83(1):97–127
Polikar R (2006) Ensemble based systems in decision making. IEEE Circuits Syst Mag 6(3):21–45. https://doi.org/10.1109/MCAS.2006.1688199
Read J, Pfahringer B, Holmes G, Frank E (2011) Classifier chains for multi-label classification. Mach Learn 85(3):333
Reyes O, Fardoun HM, Ventura S (2018) An ensemble-based method for the selection of instances in the multi-target regression problem. Integr Comput-Aided Eng 25(4):305–320
Rodríguez JJ, Kuncheva LI, Alonso CJ (2006) Rotation forest: a new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell 28(10):1619–1630. https://doi.org/10.1109/TPAMI.2006.211
Sánchez-Fernández M, de Prado-Cumplido M, Arenas-García J, Pérez-Cruz F (2004) SVM multiregression for nonlinear channel estimation in multiple-input multiple-output systems. IEEE Trans Signal Process 52(8):2298–2307
Santana EJ, Geronimo BC, Mastelini SM, Carvalho RH, Barbin DF, Ida EI, Barbon S Jr (2018) Predicting poultry meat characteristics using an enhanced multi-target regression method. Biosyst Eng 171:193–204
Shim J, Kang S, Cho S (2020) Kernel rotation forests for classification. In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp). pp 406–409. IEEE
Spyromitros-Xioufis E, Tsoumakas G, Groves W, Vlahavas I (2016) Multi-target regression via input space expansion: treating targets as inputs. Mach Learn 104(1):55–98
Spyromitros-Xioufis E, Sechidis K, Vlahavas I (2020) Multi-target regression via output space quantization. In: 2020 International Joint Conference on Neural Networks (IJCNN). pp 1–9. IEEE
Stiglic G, Rodriguez JJ, Kokol P (2011) Rotation of random forests for genomic and proteomic classification problems. In: Software tools and algorithms for biological systems. Springer, pp 211–221
Struyf J, Džeroski S (2005) Constraint based induction of multi-objective regression trees. In: International workshop on knowledge discovery in inductive databases. Springer, pp 222–233
Triguero I, Basgalupp M, Cerri R, Schietgat L, Vens C (2016) Partitioning the target space in multi-output learning. In: Proceedings of the 25th Belgian-Dutch Machine Learning Conference (Benelearn)
Tsanas A, Xifara A (2012) Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy Build 49:560–567
Tsoumakas G, Spyromitros-Xioufis E, Vilcek J, Vlahavas I (2011) Mulan: a java library for multi-label learning. J Mach Learn Res 12:2411–2414
Tsoumakas G, Spyromitros-Xioufis E, Vrekou A, Vlahavas I (2014) Multi-target regression via random linear target combinations. In: Joint european conference on machine learning and knowledge discovery in databases. Springer, pp 225–240
Van Der Merwe A, Zidek J (1980) Multivariate regression analysis and canonical variates. Can J Stat 8(1):27–39
Vazquez E, Walter E (2003) Multi-output support vector regression. IFAC Proc Volumes 36(16):1783–1788
Wang L, You ZH, Xia SX, Chen X, Yan X, Zhou Y, Liu F (2018) An improved efficient rotation forest algorithm to predict the interactions among proteins. Soft Comput 22(10):3373–3381
Wang J, Chen Z, Sun K, Li H, Deng X (2019) Multi-target regression via target specific features. Knowl-Based Syst 170:70–78
Wilson DR, Martinez TR (2000) Reduction techniques for instance-based learning algorithms. Mach Learn 38(3):257–286
Wolpert DH (1992) Stacked generalization. Neural Networks 5(2):241–259
Xu S, An X, Qiao X, Zhu L, Li L (2013) Multi-output least-squares support vector regression machines. Pattern Recogn Lett 34(9):1078–1084
Yeh IC (2007) Modeling slump flow of concrete using second-order regressions and artificial neural networks. Cement Concr Compos 29(6):474–480
Zeng J, Liu Y, Leng B, Xiong Z, Cheung YM (2017) Dimensionality reduction in multiple ordinal regression. IEEE Trans Neural Networks Learn Syst 29(9):4088–4101
Zhang CX, Zhang JS (2008) Rotboost: a technique for combining rotation forest and adaboost. Pattern Recogn Lett 29(10):1524–1536
Zhang W, Liu X, Ding Y, Shi D (2012) Multi-output LS-SVR machine in extended feature space. In: 2012 IEEE International conference on computational intelligence for measurement systems and applications (CIMSA) proceedings. pp 130–134. IEEE
Zhen X, Yu M, He X, Li S (2017) Multi-target regression via robust low-rank learning. IEEE Trans Pattern Anal Mach Intell 40(2):497–504
Zhen X, Wang Z, Yu M, Li S (2015) Supervised descriptor learning for multi-output regression. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 1211–1218
Zhu X, Gao Z (2018) An efficient gradient-based model selection algorithm for multi-output least-squares support vector regression machines. Pattern Recogn Lett 111:16–22
Zolfagharnasab H, Bessa S, Oliveira S, Faria P, Teixeira J, Cardoso J, Oliveira H (2018) A regression model for predicting shape deformation after breast conserving surgery. Sensors 18(1):167
Acknowledgements
We thank Eleftherios Spyromitros-Xioufis and Esra Adıyeke for their help with the implementations [2, 63]. This work was supported by the Ministerio de Economía y Competitividad of the Spanish Government under project TIN2015-67534-P (MINECO-FEDER, UE), by the Junta de Castilla y León under project BU085P17 (JCyL/FEDER, UE) (both projects co-financed through European Union FEDER funds), and by the Consejería de Educación of the Junta de Castilla y León and the European Social Fund with the EDU/1100/2017 pre-doctoral grant. The authors gratefully acknowledge the support of the NVIDIA Corporation and its donation of the TITAN Xp GPUs used in this research.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised due to a change in figures.
Rights and permissions
About this article
Cite this article
Rodríguez, J.J., Juez-Gil, M., López-Nozal, C. et al. Rotation Forest for multi-target regression. Int. J. Mach. Learn. & Cyber. 13, 523–548 (2022). https://doi.org/10.1007/s13042-021-01329-1
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13042-021-01329-1