Abstract
The simultaneous prediction of multiple numeric outputs is called multi-target regression (MTR), a task that has gained attention in recent decades. It is a challenging research topic in supervised learning because it poses additional difficulties beyond traditional single-target regression (STR), and many real-world problems involve predicting several targets at once. One of the most successful approaches to MTR, although not the only one, consists in transforming the problem into several STR problems, whose outputs are then combined to build the MTR output. In this paper, the Rotation Forest ensemble method, previously proposed for single-label classification and single-target regression, is adapted to MTR tasks and tested with several regressors and data sets. Our proposal rotates the input space in an efficient and novel fashion, avoiding the extra rotations forced by MTR problem decomposition. Four MTR approaches are used: Single-Target (ST), Stacked Single-Target (SST), Ensembles of Regressor Chains (ERC), and Multi-target Regression via Quantization (MRQ). To assess the benefits of the proposal, a thorough experimental study on 28 MTR data sets, supported by statistical tests, is carried out, concluding that Rotation Forest, adapted by means of these approaches, outperforms other popular ensembles such as Bagging and Random Forest.
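As a minimal illustration of the problem-transformation idea the abstract describes, the sketch below implements the Single-Target (ST) decomposition: one independent regressor is fit per target column and the per-target predictions are stacked back into an MTR output. The regressor choice and synthetic data are placeholder assumptions, not the paper's exact experimental setup.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def st_fit_predict(X_train, Y_train, X_test):
    """Single-target (ST) decomposition of MTR: fit one
    independent regressor per target column, then stack the
    per-target predictions into a multi-target output matrix."""
    preds = []
    for t in range(Y_train.shape[1]):
        model = DecisionTreeRegressor(random_state=0)
        model.fit(X_train, Y_train[:, t])
        preds.append(model.predict(X_test))
    return np.column_stack(preds)

# Toy data: 5 input features, 2 targets (both linear in the inputs).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] - X[:, 3]])

Y_hat = st_fit_predict(X[:80], Y[:80], X[80:])
print(Y_hat.shape)  # one prediction column per target
```

The SST and ERC approaches used in the paper extend this scheme by feeding predicted (or preceding) targets back in as extra input features.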
Change history
25 July 2021
A Correction to this paper has been published: https://doi.org/10.1007/s13042-021-01354-0
Notes
If the number of features is not a multiple of 3, the last group is completed with previously selected features.
There can be repetitions among these chains, especially if the number of targets is low.
Code available at https://github.com/hfawaz/cd-diagram.
Over the results of all the data sets, neither the average ranks nor the Bayesian tests show an advantage; for particular data sets, however, the results were improved with other RotF methods.
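The feature-grouping rule in the first note above (groups of three features, with the last group padded from previously selected features when the count is not a multiple of three) can be sketched as follows. The helper name `make_feature_groups` is hypothetical, not the authors' code.

```python
import random

def make_feature_groups(n_features, group_size=3, seed=0):
    """Partition feature indices into groups of `group_size`,
    as in Rotation Forest's rotation step; if n_features is not
    a multiple of group_size, the last group is completed with
    features already assigned to earlier groups."""
    rng = random.Random(seed)
    idx = list(range(n_features))
    rng.shuffle(idx)
    groups = [idx[i:i + group_size]
              for i in range(0, n_features, group_size)]
    shortfall = group_size - len(groups[-1])
    if shortfall:
        pool = [f for g in groups[:-1] for f in g]
        groups[-1].extend(rng.sample(pool, shortfall))
    return groups

print(make_feature_groups(8))  # three groups of 3; last one padded
```

Each group is then rotated (e.g. via PCA on a bootstrap sample) independently, which is what gives Rotation Forest its diversity.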
References
Abraham Z, Tan PN, Winkler J, Zhong S, Liszewska M, et al (2013) Position preserving multi-output prediction. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 320–335
Adıyeke E, Baydoğan MG (2020) The benefits of target relations: a comparison of multitask extensions and classifier chains. Pattern Recogn 107:107507
Aho T, Ženko B, Džeroski S, Elomaa T (2012) Multi-target regression with rule ensembles. J Mach Learn Res 13(Aug):2367–2407
Aho T, Ženko B, Džeroski S (2009) Rule ensembles for multi-target regression. In: 2009 Ninth IEEE International Conference on Data Mining, pp 21–30. IEEE
Alvarez MA, Rosasco L, Lawrence ND et al (2012) Kernels for vector-valued functions: a review. Found Trends Mach Learn 4(3):195–266
Appice A, Džeroski S (2007) Stepwise induction of multi-target model trees. In: European Conference on Machine Learning. Springer, pp 502–509
Argyriou A, Evgeniou T, Pontil M (2008) Convex multi-task feature learning. Mach Learn 73(3):243–272
Argyriou A, Evgeniou T, Pontil M (2006) Multi-task feature learning. In: Advances in neural information processing systems. pp 41–48
Ayerdi B, Graña M (2014) Hybrid extreme rotation forest. Neural Networks 52:33–42
Benavoli A, Corani G, Mangili F (2016) Should we really use post-hoc tests based on mean-ranks? J Mach Learn Res 17(1):152–161
Benavoli A, Corani G, Demšar J, Zaffalon M (2017) Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis. J Mach Learn Res 18(77):1–36. http://jmlr.org/papers/v18/16-305.html
Blaser R, Fryzlewicz P (2016) Random rotation ensembles. J Mach Learn Res 17(1):126–151
Borchani H, Varando G, Bielza C, Larrañaga P (2015) A survey on multi-output regression. Wiley Interdiscip Rev 5(5):216–233
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Breiman L, Friedman JH (1997) Predicting multivariate responses in multiple linear regression. J Roy Stat Soc 59(1):3–54
Breskvar M, Kocev D, Džeroski S (2018) Ensembles for multi-target regression with random output selections. Mach Learn 107(11):1673–1709
Cai Z, Zhu W (2018) Multi-label feature selection via feature manifold learning and sparsity regularization. Int J Mach Learn Cybernet 9(8):1321–1334. https://doi.org/10.1007/s13042-017-0647-y
Caruana R (1994) Learning many related tasks at the same time with backpropagation. In: Advances in neural information processing systems. pp 657–664
Collobert R, Weston J (2008) A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th international conference on Machine learning. pp 160–167. ACM
De’Ath G (2002) Multivariate regression trees: a new technique for modeling species-environment relationships. Ecology 83(4):1105–1117
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7(Jan):1–30
Dua D, Graff C (2019) UCI machine learning repository. http://archive.ics.uci.edu/ml
Džeroski S, Demšar D, Grbović J (2000) Predicting chemical parameters of river water quality from bioindicator data. Appl Intell 13(1):7–17
Garcia S, Herrera F (2008) An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J Mach Learn Res 9(12):2677–2694
García-Pedrajas N, Maudes-Raedo J, García-Osorio C, Rodríguez-Díez JJ (2012) Supervised subspace projections for constructing ensembles of classifiers. Inf Sci 193:1–21
Ghosn J, Bengio Y (1997) Multi-task learning for stock selection. In: Advances in neural information processing systems. pp 946–952
Godbole S, Sarawagi S (2004) Discriminative methods for multi-labeled classification. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 22–30
Goovaerts P et al (1997) Geostatistics for natural resources evaluation. Oxford University Press on Demand
Hatzikos EV, Tsoumakas G, Tzanis G, Bassiliades N, Vlahavas I (2008) An empirical study on sea water quality prediction. Knowl-Based Syst 21(6):471–478
Herrera F, Charte F, Rivera AJ, Del Jesus MJ (2016) Multilabel classification. In: Multilabel classification. Springer, pp 17–31
Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844
Ismail Fawaz H, Forestier G, Weber J, Idoumghar L, Muller PA (2019) Deep learning for time series classification: a review. Data Min Knowl Disc 33(4):917–963
Izenman AJ (1975) Reduced-rank regression for the multivariate linear model. J Multivar Anal 5(2):248–264
Jalali A, Sanghavi S, Ruan C, Ravikumar PK (2010) A dirty model for multi-task learning. Adv Neural Inf Process Syst 23:964–972
Jeong JY, Kang JS, Jun CH (2020) Regularization-based model tree for multi-output regression. Inf Sci 507:240–255
Juez-Gil M (2020) mjuez/baycomp_plotting. https://doi.org/10.5281/zenodo.4244542
Kaggle (2012) Kaggle competition: online product sales. https://www.kaggle.com/c/online-sales
Kaggle (2013) Kaggle competition: see click predict fix. https://www.kaggle.com/c/see-click-predict-fix
Karalič A, Bratko I (1997) First order regression. Mach Learn 26(2–3):147–176
Kocev D, Vens C, Struyf J, Džeroski S (2013) Tree ensembles for predicting structured outputs. Pattern Recogn 46(3):817–833
Kocev D, Vens C, Struyf J, Džeroski S (2007) Ensembles of multi-objective decision trees. In: European conference on machine learning. Springer, pp 624–631
Kordos M, Arnaiz-González Á, García-Osorio C (2019) Evolutionary prototype selection for multi-output regression. Neurocomputing 358:309–320
Kuncheva LI (2014) Combining pattern classifiers: methods and algorithms, 2nd edn. Wiley
Latorre Carmona P, Sotoca JM, Pla F (2012) Filter-type variable selection based on information measures for regression tasks. Entropy 14(2):323–343
Li H, Zhang W, Chen Y, Guo Y, Li GZ, Zhu X (2017) A novel multi-target regression framework for time-series prediction of drug efficacy. Sci Rep 7:40652
Mastelini SM, da Costa VGT, Santana EJ, Nakano FK, Guido RC, Cerri R, Barbon S (2019) Multi-output tree chaining: an interpretative modelling and lightweight multi-target approach. J Signal Process Syst 91(2):191–215
Melki G, Cano A, Kecman V, Ventura S (2017) Multi-target support vector regression via correlation regressor chains. Inf Sci 415:53–69
Mitrović T, Antanasijević D, Lazović S, Perić-Grujić A, Ristić M (2019) Virtual water quality monitoring at inactive monitoring sites using Monte Carlo optimized artificial neural networks: a case study of the Danube River (Serbia). Sci Total Environ 654:1000–1009
Nunes M, Gerding E, McGroarty F, Niranjan M (2019) A comparison of multitask and single task learning with artificial neural networks for yield curve forecasting. Expert Syst Appl 119:362–375
Obozinski G, Taskar B, Jordan MI (2010) Joint covariate selection and joint subspace selection for multiple classification problems. Stat Comput 20(2):231–252
Pardo C, Diez-Pastor JF, García-Osorio C, Rodríguez JJ (2013) Rotation forests for regression. Appl Math Comput 219(19):9914–9924
Petković M, Kocev D, Džeroski S (2020) Feature ranking for multi-target regression. Mach Learn 109(6):1179–1204
Pham BT, Bui DT, Prakash I, Dholakia M (2016) Rotation forest fuzzy rule-based classifier ensemble for spatial prediction of landslides using GIS. Nat Hazards 83(1):97–127
Polikar R (2006) Ensemble based systems in decision making. IEEE Circuits Syst Mag 6(3):21–45. https://doi.org/10.1109/MCAS.2006.1688199
Read J, Pfahringer B, Holmes G, Frank E (2011) Classifier chains for multi-label classification. Mach Learn 85(3):333
Reyes O, Fardoun HM, Ventura S (2018) An ensemble-based method for the selection of instances in the multi-target regression problem. Integr Comput-Aided Eng 25(4):305–320
Rodríguez JJ, Kuncheva LI, Alonso CJ (2006) Rotation forest: a new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell 28(10):1619–1630. https://doi.org/10.1109/TPAMI.2006.211
Sánchez-Fernández M, de Prado-Cumplido M, Arenas-García J, Pérez-Cruz F (2004) SVM multiregression for nonlinear channel estimation in multiple-input multiple-output systems. IEEE Trans Signal Process 52(8):2298–2307
Santana EJ, Geronimo BC, Mastelini SM, Carvalho RH, Barbin DF, Ida EI, Barbon S Jr (2018) Predicting poultry meat characteristics using an enhanced multi-target regression method. Biosyst Eng 171:193–204
Shim J, Kang S, Cho S (2020) Kernel rotation forests for classification. In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp). pp 406–409. IEEE
Spyromitros-Xioufis E, Tsoumakas G, Groves W, Vlahavas I (2016) Multi-target regression via input space expansion: treating targets as inputs. Mach Learn 104(1):55–98
Spyromitros-Xioufis E, Sechidis K, Vlahavas I (2020) Multi-target regression via output space quantization. In: 2020 International Joint Conference on Neural Networks (IJCNN). pp 1–9. IEEE
Stiglic G, Rodriguez JJ, Kokol P (2011) Rotation of random forests for genomic and proteomic classification problems. In: Software tools and algorithms for biological systems. Springer, pp 211–221
Struyf J, Džeroski S (2005) Constraint based induction of multi-objective regression trees. In: International workshop on knowledge discovery in inductive databases. Springer, pp 222–233
Triguero I, Basgalupp M, Cerri R, Schietgat L, Vens C (2016) Partitioning the target space in multi-output learning. In: Proceedings of the 25th Belgian-Dutch Machine Learning Conference (Benelearn)
Tsanas A, Xifara A (2012) Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy Build 49:560–567
Tsoumakas G, Spyromitros-Xioufis E, Vilcek J, Vlahavas I (2011) Mulan: a java library for multi-label learning. J Mach Learn Res 12:2411–2414
Tsoumakas G, Spyromitros-Xioufis E, Vrekou A, Vlahavas I (2014) Multi-target regression via random linear target combinations. In: Joint european conference on machine learning and knowledge discovery in databases. Springer, pp 225–240
Van Der Merwe A, Zidek J (1980) Multivariate regression analysis and canonical variates. Can J Stat 8(1):27–39
Vazquez E, Walter E (2003) Multi-output support vector regression. IFAC Proc Volumes 36(16):1783–1788
Wang L, You ZH, Xia SX, Chen X, Yan X, Zhou Y, Liu F (2018) An improved efficient rotation forest algorithm to predict the interactions among proteins. Soft Comput 22(10):3373–3381
Wang J, Chen Z, Sun K, Li H, Deng X (2019) Multi-target regression via target specific features. Knowl-Based Syst 170:70–78
Wilson DR, Martinez TR (2000) Reduction techniques for instance-based learning algorithms. Mach Learn 38(3):257–286
Wolpert DH (1992) Stacked generalization. Neural Networks 5(2):241–259
Xu S, An X, Qiao X, Zhu L, Li L (2013) Multi-output least-squares support vector regression machines. Pattern Recogn Lett 34(9):1078–1084
Yeh IC (2007) Modeling slump flow of concrete using second-order regressions and artificial neural networks. Cement Concr Compos 29(6):474–480
Zeng J, Liu Y, Leng B, Xiong Z, Cheung YM (2017) Dimensionality reduction in multiple ordinal regression. IEEE Trans Neural Networks Learn Syst 29(9):4088–4101
Zhang CX, Zhang JS (2008) Rotboost: a technique for combining rotation forest and adaboost. Pattern Recogn Lett 29(10):1524–1536
Zhang W, Liu X, Ding Y, Shi D (2012) Multi-output LS-SVR machine in extended feature space. In: 2012 IEEE International conference on computational intelligence for measurement systems and applications (CIMSA) proceedings. pp 130–134. IEEE
Zhen X, Yu M, He X, Li S (2017) Multi-target regression via robust low-rank learning. IEEE Trans Pattern Anal Mach Intell 40(2):497–504
Zhen X, Wang Z, Yu M, Li S (2015) Supervised descriptor learning for multi-output regression. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 1211–1218
Zhu X, Gao Z (2018) An efficient gradient-based model selection algorithm for multi-output least-squares support vector regression machines. Pattern Recogn Lett 111:16–22
Zolfagharnasab H, Bessa S, Oliveira S, Faria P, Teixeira J, Cardoso J, Oliveira H (2018) A regression model for predicting shape deformation after breast conserving surgery. Sensors 18(1):167
Acknowledgements
We thank Eleftherios Spyromitros-Xioufis and Esra Adıyeke for their help with the implementations [2, 63]. This work was supported by the Ministerio de Economía y Competitividad of the Spanish Government under project TIN2015-67534-P (MINECO-FEDER, UE), by the Junta de Castilla y León under project BU085P17 (JCyL/FEDER, UE) (both projects co-financed through European Union FEDER funds), and by the Consejería de Educación of the Junta de Castilla y León and the European Social Fund with the EDU/1100/2017 pre-doctoral grant. The authors gratefully acknowledge the support of the NVIDIA Corporation and its donation of the TITAN Xp GPUs used in this research.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised due to a change in figures.
Rights and permissions
About this article
Cite this article
Rodríguez, J.J., Juez-Gil, M., López-Nozal, C. et al. Rotation Forest for multi-target regression. Int. J. Mach. Learn. & Cyber. 13, 523–548 (2022). https://doi.org/10.1007/s13042-021-01329-1
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13042-021-01329-1