Abstract
In some classification problems the feature space is heterogeneous: the best features on which to base the classification differ in different parts of the feature space. In other problems the classes can be divided into subsets such that distinguishing one subset of classes from another and classifying examples within such subsets require very different decision rules, involving different sets of features. In such heterogeneous problems, many modeling techniques (including decision trees, rules, and neural networks) evaluate the performance of alternative decision rules by averaging over the entire problem space, and are prone to generating a model that is suboptimal in any of the regions or subproblems. Better overall models can be obtained by splitting the problem appropriately and modeling each subproblem separately.
This paper presents a new measure to determine the degree of dissimilarity between the decision surfaces of two given problems, and suggests a way to search for a strategic splitting of the feature space that identifies regions with different characteristics. We illustrate the concept using a multiplexor problem.
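The multiplexor problem is a standard illustration of this kind of heterogeneity. The following sketch (not the paper's implementation; the encoding of the 6-bit multiplexer as a tuple `(a1, a0, d0, d1, d2, d3)` is an assumption made here for illustration) shows that each setting of the address bits defines a region of the feature space in which the class is determined by a different single data bit:

```python
# A minimal sketch of why a multiplexer task is "heterogeneous": in the
# 6-bit multiplexer, the two address bits (a1, a0) select which of the
# four data bits d0..d3 determines the class, so the relevant feature
# differs in each region of the feature space.
from itertools import product

def multiplexer(bits):
    """6-bit multiplexer: bits = (a1, a0, d0, d1, d2, d3)."""
    a1, a0, d0, d1, d2, d3 = bits
    address = 2 * a1 + a0          # which data line the address selects
    return (d0, d1, d2, d3)[address]

# Group examples by address (the "region" of the feature space) and check
# that within each region the class equals exactly one data bit.
for address in range(4):
    region = [b for b in product((0, 1), repeat=6)
              if 2 * b[0] + b[1] == address]
    relevant = [i for i in range(4)
                if all(multiplexer(b) == b[2 + i] for b in region)]
    print(f"address={address}: class is determined by d{relevant[0]}")
```

A learner that ranks features by averaging over all examples sees each data bit as only weakly predictive, whereas splitting on the address bits first makes each subproblem trivial.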
© 1997 Springer-Verlag
Cite this paper
Apte, C., Hong, S.J., Hosking, J.R.M., Lepre, J., Pednault, E.P.D., Rosen, B.K. (1997). Decomposition of heterogeneous classification problems. In: Liu, X., Cohen, P., Berthold, M. (eds) Advances in Intelligent Data Analysis Reasoning about Data. IDA 1997. Lecture Notes in Computer Science, vol 1280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052826
Print ISBN: 978-3-540-63346-4
Online ISBN: 978-3-540-69520-2