Abstract
Large datasets often contain an immense number of features, many of which are redundant and therefore inessential for statistical analysis. Traditionally, a data preprocessing phase has been used to cope with this problem and take appropriate remedial measures; however, such preprocessing is typically fixed and stationary, suffering from a lack of transparency and a high susceptibility to input variations. This paper presents DynFS, a novel, fully automated, nature-inspired meta-heuristic wrapper-based feature selection framework that dynamically cuts the search space. The experiments show that DynFS outperforms a fixed feature selection framework with statistical significance, while offering a high level of robustness and stability.
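The core idea of a wrapper-based feature selection with a dynamically shrinking ("cut") search space can be illustrated with a minimal sketch. This is not the authors' DynFS implementation: the evolutionary operators, the cut schedule, and the toy fitness function below are all simplifying assumptions made for illustration. Candidate solutions are binary masks over the currently active features; every few generations, the features selected least often by the population are permanently removed, so every genotype shrinks.

```python
import random

def dynamic_wrapper_fs(n_features, fitness, generations=30, pop_size=20,
                       cut_every=10, cut_fraction=0.2, seed=0):
    """Hypothetical sketch: wrapper-based feature selection whose search
    space is dynamically cut.  `fitness` scores a full binary mask (a
    stand-in for a wrapped classifier's validation accuracy)."""
    rng = random.Random(seed)
    active = list(range(n_features))  # features still in the search space
    pop = [[rng.randint(0, 1) for _ in active] for _ in range(pop_size)]

    def full_mask(genotype):
        # Expand a genotype over `active` back to a mask over all features.
        mask = [0] * n_features
        for gene, feat in zip(genotype, active):
            mask[feat] = gene
        return mask

    best, best_fit = None, float("-inf")
    for gen in range(1, generations + 1):
        scored = sorted(pop, key=lambda g: fitness(full_mask(g)), reverse=True)
        top = scored[: pop_size // 2]
        f = fitness(full_mask(top[0]))
        if f > best_fit:
            best, best_fit = full_mask(top[0]), f

        # Refill the population: uniform crossover + one bit-flip mutation.
        pop = list(top)
        while len(pop) < pop_size:
            a, b = rng.sample(top, 2)
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child[rng.randrange(len(child))] ^= 1
            pop.append(child)

        # Dynamic cut: drop the fraction of active features selected least
        # often by the current population, shrinking every genotype.
        if gen % cut_every == 0 and len(active) > 2:
            usage = [sum(g[i] for g in pop) for i in range(len(active))]
            n_keep = max(2, int(len(active) * (1 - cut_fraction)))
            keep = sorted(sorted(range(len(active)),
                                 key=lambda i: usage[i],
                                 reverse=True)[:n_keep])
            active = [active[i] for i in keep]
            pop = [[g[i] for i in keep] for g in pop]

    return best, best_fit

# Toy fitness (an assumption, not a real classifier): features 0-4 are
# "informative"; larger subsets are penalized.
def toy_fitness(mask):
    return sum(mask[i] for i in range(5)) - 0.1 * sum(mask)

mask, score = dynamic_wrapper_fs(30, toy_fitness, seed=1)
```

In a real wrapper setting, `toy_fitness` would be replaced by cross-validated accuracy of a classifier trained on the selected columns; the cut step then narrows the space the optimizer must explore as the run progresses.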
Code availability
The source code is available from the corresponding author on reasonable request.
Data availability
The datasets are publicly available.
Notes
openml.org/d/299.
archive.ics.uci.edu/ml/datasets/Low+Resolution+Spectrometer.
archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits.
Funding
This work was supported by the Slovenian Research Agency (Research Core Funding nos. P2-0057, P5-0027).
Ethics declarations
Conflict of interest
The authors declare that they have no potential conflict of interest.
Cite this article
Fister, D., Fister, I. & Karakatič, S. DynFS: dynamic genotype cutting feature selection algorithm. J Ambient Intell Human Comput 14, 16477–16490 (2023). https://doi.org/10.1007/s12652-022-03872-3