03 Data Mining Functionalities

Data mining functionalities are categorized into descriptive and predictive tasks, which involve characterizing data properties and making predictions, respectively. Key functionalities include concept/class description, frequent pattern mining, classification and prediction, cluster analysis, and outlier analysis. Each functionality serves specific purposes, such as summarizing data characteristics, predicting outcomes, and identifying anomalies.

Uploaded by

handahmadzai121234

Data Mining Functionalities

• Data mining functionalities specify the kind of patterns to be found in data mining tasks.
• In general, data mining tasks can be classified into two categories: descriptive and predictive.
• Descriptive mining tasks characterize the general properties of the data in the database.
• Predictive mining tasks perform inference on the current data in order to make predictions.
• Data mining functionalities, and the kinds of patterns they can discover, are:
  • Concept/Class Description: Characterization and Discrimination
  • Mining Frequent Patterns, Associations, and Correlations
  • Classification and Prediction
  • Cluster Analysis
  • Outlier Analysis
Concept/Class Description

• Data can be associated with classes or concepts.
  • Class: a collection of things sharing a common attribute. For example, classes of items for sale include computers and printers.
  • Concept: an abstract or general idea inferred or derived from specific instances. For example, concepts of customers include bigSpenders and budgetSpenders.
• Summarized, concise, and precise descriptions of individual classes and concepts are called class/concept descriptions.
• These descriptions can be derived via data characterization, data discrimination, or both.
• Data characterization is a summary of the general characteristics or features of a target class of data.
• The data corresponding to the user-specified class are typically collected by a database query.
• For example, to study the characteristics of software products whose sales increased by 10% in the last year, the data related to such products can be collected by executing an SQL query.
• Simple data summaries can be produced from statistical measures and plots.
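The query-then-summarize workflow described above might look roughly like the following sketch. The `products` table, its columns, and all sales figures are invented for illustration, and "increased by 10%" is read here as "increased by at least 10%", which is an assumption.

```python
import sqlite3
import statistics

# Hypothetical data: (name, category, sales last year, sales this year).
rows = [
    ("EditPro", "software", 1000.0, 1200.0),
    ("PhotoFix", "software", 2000.0, 2100.0),
    ("CalcSuite", "software", 500.0, 600.0),
    ("LaserJet", "printer", 800.0, 1000.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (name TEXT, category TEXT, prev_sales REAL, curr_sales REAL)"
)
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", rows)

# Collect the target class: software products whose sales grew by at least 10%.
target = conn.execute(
    """
    SELECT name, curr_sales
    FROM products
    WHERE category = 'software' AND curr_sales >= 1.10 * prev_sales
    ORDER BY name
    """
).fetchall()
conn.close()

# Summarize the target class with simple statistical measures.
names = [name for name, _ in target]
sales = [value for _, value in target]
summary = {
    "count": len(sales),
    "mean_sales": statistics.mean(sales),
    "max_sales": max(sales),
}
print(names, summary)
```

The SQL step only selects the target class; the characterization itself is the summary computed afterwards, which could equally be drawn as a chart.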
• The output of data characterization can be presented in various forms.
• Examples include pie charts, bar charts, curves, multidimensional data cubes, and multidimensional tables.
• Example:
  • A data mining system should be able to produce a description summarizing the characteristics of customers who spend more than $1,000 a year at AllElectronics.
  • The result could be a general profile of these customers, for example that they are 40–50 years old, employed, and have excellent credit ratings.
  • The system should allow users to drill down on any dimension, such as occupation, in order to view these customers according to their type of employment.
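As a rough illustration of this example, the sketch below builds a general profile of high-spending customers and then drills down on occupation. The customer records, field names, and the `most_common` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical customer records; every value here is invented for illustration.
customers = [
    {"age": 45, "occupation": "engineer", "credit": "excellent", "spend": 1500},
    {"age": 42, "occupation": "teacher", "credit": "excellent", "spend": 1200},
    {"age": 48, "occupation": "engineer", "credit": "excellent", "spend": 2500},
    {"age": 23, "occupation": "student", "credit": "fair", "spend": 300},
]

# Target class: customers who spend more than $1,000 a year.
big = [c for c in customers if c["spend"] > 1000]


def most_common(records, field):
    """Return the most frequent value of `field` among the records."""
    return Counter(r[field] for r in records).most_common(1)[0][0]


# General profile: age range and dominant credit rating of the target class.
profile = {
    "age_range": (min(c["age"] for c in big), max(c["age"] for c in big)),
    "credit": most_common(big, "credit"),
}

# Drill down on the occupation dimension.
by_occupation = Counter(c["occupation"] for c in big)
print(profile, dict(by_occupation))
```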
• Data discrimination is a comparison of the general features of target class data objects with the general features of objects from one or a set of contrasting classes.
• The target and contrasting classes can be specified by the user, and the corresponding data objects retrieved through database queries.
• For example, the user may want to compare the general features of software products whose sales increased by 10% in the last year with those whose sales decreased by at least 30% during the same period.
• Example of data discrimination:
  • A data mining system should be able to compare two groups of AllElectronics customers, such as those who shop for computer products regularly versus those who rarely shop for such products.
• The resulting description could be a general comparative profile of the two groups, for example:
  • 80% of the customers who frequently purchase computer products are between 20 and 40 years old and have a university education.
  • 60% of the customers who infrequently buy such products are either seniors or youths, and have no university degree.
• Drilling down on a dimension, such as occupation, or adding new dimensions, such as income level, may help in finding even more discriminative features between the two classes.
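A comparison of this kind can be sketched by measuring how often one feature combination occurs in the target class versus the contrasting class. The ten customer tuples and the `share` and `young_graduate` helpers below are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def share(records, predicate):
    """Fraction of records satisfying the predicate (0.0 for an empty class)."""
    if not records:
        return 0.0
    return sum(1 for r in records if predicate(r)) / len(records)


# Hypothetical customers: (age, has_university_degree, buys_computers_often).
customers = [
    (25, True, True), (34, True, True), (38, True, True), (29, True, True),
    (55, False, True),
    (70, False, False), (17, False, False), (68, False, False),
    (45, True, False), (30, True, False),
]

target = [c for c in customers if c[2]]        # frequent buyers
contrast = [c for c in customers if not c[2]]  # infrequent buyers


def young_graduate(c):
    """Feature under comparison: aged 20 to 40 with a university degree."""
    return 20 <= c[0] <= 40 and c[1]


target_share = share(target, young_graduate)
contrast_share = share(contrast, young_graduate)
print(f"target: {target_share:.0%}, contrast: {contrast_share:.0%}")
```

A feature is discriminative when the two shares differ widely; a feature with similar shares in both classes says little about what separates them.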
Mining Frequent Patterns

• Frequent patterns are patterns that occur frequently in data.
• A frequent itemset is a set of items that frequently appear together in a transactional data set, such as milk and bread.
• A frequently occurring subsequence, such as the pattern that customers tend to purchase first a PC, followed by a digital camera, and then a memory card, is a (frequent) sequential pattern.
• A substructure can refer to different structural forms, such as graphs, trees, or lattices, which may be combined with itemsets or subsequences.

• Mining frequent patterns leads to the discovery of interesting associations and correlations within data.
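To make the idea of a frequent itemset concrete, the sketch below counts the support of every one-item and two-item set in a handful of hypothetical transactions. It is a brute-force count rather than an optimized algorithm such as Apriori, and the 60% minimum support is an assumed threshold.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "bread", "eggs"},
    {"milk", "eggs"},
]
min_support = 0.6  # assumed threshold: itemset must occur in >= 60% of baskets

# Count every 1-item and 2-item subset of each transaction.
counts = Counter()
for basket in transactions:
    for size in (1, 2):
        for itemset in combinations(sorted(basket), size):
            counts[itemset] += 1

# Support = fraction of transactions containing the itemset.
n = len(transactions)
frequent = {
    itemset: count / n
    for itemset, count in counts.items()
    if count / n >= min_support
}
print(frequent)
```

With this data, milk and bread are each frequent on their own and also as a pair, while butter falls below the threshold.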

Classification and Prediction

• Classification is the process of finding a model that describes and distinguishes data classes or concepts.
• The model is then used to predict the class of objects whose class label is unknown.
• The derived model is based on the analysis of a set of training data.
• How is the derived model presented?
  • Classification (IF-THEN) rules
  • Decision trees
  • Mathematical formulae or neural networks
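The train-then-predict cycle can be illustrated with a deliberately tiny model: a single price threshold learned from labelled training data and then expressed as one IF-THEN rule. The data, the two class names, and the `learn_threshold` helper are assumptions made for this sketch.

```python
# Hypothetical training data: (price, class label).
training = [(5, "cheap"), (8, "cheap"), (12, "cheap"), (40, "premium"), (55, "premium")]


def learn_threshold(data):
    """Pick the price threshold that misclassifies the fewest training objects."""
    prices = sorted(price for price, _ in data)
    # Candidate thresholds: midpoints between consecutive training prices.
    candidates = [(a + b) / 2 for a, b in zip(prices, prices[1:])]

    def errors(t):
        # An object is misclassified when "price > t" disagrees with its label.
        return sum((label == "premium") != (price > t) for price, label in data)

    return min(candidates, key=errors)


threshold = learn_threshold(training)


def classify(price):
    """IF price > threshold THEN premium ELSE cheap."""
    return "premium" if price > threshold else "cheap"


print(threshold, classify(10), classify(60))
```

The learned rule is the model; `classify` applies it to objects whose class label is unknown.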

• Classification predicts categorical (discrete, unordered) labels.
• Prediction models continuous-valued functions.
• Prediction is used to predict missing or unavailable numerical data values rather than class labels.
• Regression analysis is a statistical methodology that is most often used for numeric prediction, although other methods exist as well.
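Numeric prediction with regression can be sketched with ordinary least squares on one input variable. The spend and revenue figures below are hypothetical and exactly linear so the fitted line is easy to check by hand; `fit_line` is an illustrative helper, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: advertising spend versus revenue.
spend = [1.0, 2.0, 3.0, 4.0]
revenue = [5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(spend, revenue)
predicted = slope * 5.0 + intercept  # numeric prediction for an unseen value
print(slope, intercept, predicted)
```

Unlike the classifier, the output here is a continuous value rather than a class label.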
• Example:
  • Classify a large set of items in the store, based on three kinds of responses to a sales campaign: good response, mild response, and no response.
  • Derive a model for each of these three classes based on the descriptive features of the items, such as price, brand, place made, type, and category.
  • [Figure: IF-THEN rules]
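The rules from the original slide are not available here, so the following is only a guess at what such a rule set could look like. The thresholds, brand values, and item records are all hypothetical; only the three response classes and the price and brand features come from the example.

```python
def classify_response(item):
    """Apply hand-written, purely hypothetical IF-THEN rules in order."""
    # IF price is low AND brand is well known THEN good response
    if item["price"] < 50 and item["brand"] == "well_known":
        return "good"
    # IF price is moderate THEN mild response
    if item["price"] < 200:
        return "mild"
    # Default rule: no response
    return "no"


# Hypothetical items described by two of the features named in the example.
items = [
    {"name": "usb_cable", "price": 10, "brand": "well_known"},
    {"name": "headphones", "price": 120, "brand": "generic"},
    {"name": "laptop", "price": 900, "brand": "well_known"},
]
labels = {item["name"]: classify_response(item) for item in items}
print(labels)
```

Rule order matters in this sketch: the first rule whose condition holds decides the class.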
• Example:
  • [Figure: decision tree]
  • Predict the amount of revenue that each item will generate during an upcoming sale at AllElectronics, based on previous sales data.
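The decision-tree form of a classification model, as named in this example, can be sketched as a nested structure in which internal nodes test a feature and leaves carry a class label. The tree below is hypothetical (the original figure is not available) and uses the three response classes with assumed price thresholds.

```python
# A hypothetical tree: internal nodes test one feature, leaves hold a class label.
tree = {
    "feature": "price",
    "threshold": 50,
    "below": {"leaf": "good"},
    "at_or_above": {
        "feature": "price",
        "threshold": 200,
        "below": {"leaf": "mild"},
        "at_or_above": {"leaf": "no"},
    },
}


def predict(node, item):
    """Walk from the root to a leaf, following one branch per test."""
    while "leaf" not in node:
        below = item[node["feature"]] < node["threshold"]
        node = node["below"] if below else node["at_or_above"]
    return node["leaf"]


results = [predict(tree, {"price": p}) for p in (10, 120, 900)]
print(results)
```

Each root-to-leaf path corresponds to one IF-THEN rule, which is why the two presentations are interchangeable.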
Cluster Analysis

• Unlike classification and prediction, which analyse class-labelled data objects, clustering analyses data objects without consulting a known class label.
• The objects are clustered or grouped based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity.
• Objects within a cluster have high similarity to one another, but are very dissimilar to objects in other clusters.
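One common way to group unlabelled objects is k-means, sketched below on one-dimensional data. The slides do not name a specific clustering algorithm, so k-means is a choice made for this sketch; the spend values and starting centers are hypothetical.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on numbers: assign to nearest center, then recompute means."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabelled data: yearly spend of six customers.
spend = [100, 120, 110, 900, 950, 1000]
centers, clusters = kmeans_1d(spend, centers=[100, 1000])
print(centers, clusters)
```

No class labels are supplied anywhere; the two groups emerge purely from how close the values are to each other.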
Outlier Analysis

• A database may contain data objects that do not comply with the general behavior or model of the data.
• These data objects are outliers. Most data mining methods discard outliers as noise or exceptions.
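A simple statistical way to flag such objects is a z-score test, sketched below. The slides do not prescribe a detection method, so the approach, the threshold of 2 standard deviations, and the purchase amounts are all assumptions.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical purchase amounts with one value far from the rest.
amounts = [10, 12, 11, 13, 12, 11, 100]
outliers = zscore_outliers(amounts)
print(outliers)
```

Whether a flagged value is discarded as noise or examined further depends on the application.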
