Abstract
This paper presents ways to use subgroup discovery to generate actionable knowledge for decision support. Actionable knowledge is explicit symbolic knowledge, typically presented in the form of rules, that allows the decision maker to recognize important relations and to take appropriate action, such as targeting a direct marketing campaign or planning a population screening campaign aimed at detecting individuals with high disease risk. Different subgroup discovery approaches are outlined, and their advantages over standard classification rule learning are discussed. Three case studies, one medical and two in marketing, are used to present the lessons learned in solving problems that require actionable knowledge generation for decision support.
Cite this article
Lavrač, N., Cestnik, B., Gamberger, D. et al. Decision Support Through Subgroup Discovery: Three Case Studies and the Lessons Learned. Machine Learning 57, 115–143 (2004). https://doi.org/10.1023/B:MACH.0000035474.48771.cd
DOI: https://doi.org/10.1023/B:MACH.0000035474.48771.cd