KDD Vs Data Mining
As mentioned above, Data Mining is only one step within the overall KDD process. Depending
on the goal of the application, Data Mining serves one of two major purposes: verification
or discovery. Verification tests the user's hypothesis about the data, while discovery
automatically finds interesting patterns. There are four major data mining tasks:
clustering, classification, regression, and association (summarization).
Clustering identifies groups of similar items in unlabeled data. Classification learns
rules that can be applied to new data. Regression finds functions that model the data
with minimal error. Association looks for relationships between variables. Next, a
specific data mining algorithm must be selected. Depending on the goal, algorithms such
as linear regression, logistic regression, decision trees, and Naïve Bayes can be chosen.
The algorithm then searches for patterns of interest in one or more representational
forms. Finally, the resulting models are evaluated on either predictive accuracy or
understandability.
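To make the regression task and the evaluation step concrete, here is a minimal sketch in
plain Python. It fits a straight line by ordinary least squares (the function with minimal
squared error on the training data) and then measures predictive accuracy as the mean
squared error on held-out points. The data values and the names fit_line and
mean_squared_error are hypothetical and chosen only for illustration; they do not come
from any particular library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_squared_error(model, xs, ys):
    """Average squared difference between predictions and observed values."""
    slope, intercept = model
    return mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data: the model is fitted on the training set only.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Held-out data, used only to judge predictive accuracy.
test_x = [6.0, 7.0]
test_y = [12.0, 14.1]

model = fit_line(train_x, train_y)
test_error = mean_squared_error(model, test_x, test_y)
print("slope, intercept:", model)
print("held-out mean squared error:", test_error)
```

A low error on the held-out points suggests the fitted function generalizes beyond the
data it was trained on. Understandability, the other evaluation criterion, is not
something this sketch measures; a two-parameter line is simply easy for a person to read.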
What is the difference between KDD and Data Mining?
Although the terms KDD and Data Mining are often used interchangeably, they refer to two
related yet slightly different concepts. KDD is the overall process of extracting
knowledge from data, while Data Mining is a step inside the KDD process that deals with
identifying patterns in data. In other words, Data Mining is only the application of a
specific algorithm based on the overall goal of the KDD process.
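The distinction can be sketched in code: KDD is the whole pipeline, and Data Mining is a
single function call inside it. The sketch below assumes the commonly cited KDD stages
(selection, preprocessing, transformation, data mining, interpretation), which are not
spelled out in this section. All function names and the shopping-basket records are
hypothetical. The mining step simply counts pairs of items that occur together, as a
simplified stand-in for an association algorithm rather than a full implementation of one.

```python
from collections import Counter
from itertools import combinations


def select(raw_records):
    """Selection: keep only the field relevant to the goal (the item lists)."""
    return [record["items"] for record in raw_records]


def preprocess(baskets):
    """Preprocessing: drop empty baskets and normalize item names."""
    return [[item.strip().lower() for item in b] for b in baskets if b]


def transform(baskets):
    """Transformation: represent each basket as a set of unique items."""
    return [frozenset(b) for b in baskets]


def mine(baskets, min_support):
    """Data Mining: find item pairs whose support meets the threshold."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


def interpret(patterns):
    """Interpretation: rank patterns by support so a person can review them."""
    return sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))


def kdd(raw_records, min_support=0.5):
    """The overall KDD process; mine() is only one of its steps."""
    data = transform(preprocess(select(raw_records)))
    return interpret(mine(data, min_support))


# Hypothetical raw records with inconsistent casing, stray spaces, and an empty basket.
raw = [
    {"id": 1, "items": ["Bread", "milk "]},
    {"id": 2, "items": ["bread", "Milk", "eggs"]},
    {"id": 3, "items": []},
    {"id": 4, "items": ["milk", "eggs"]},
    {"id": 5, "items": ["bread", "milk"]},
]

knowledge = kdd(raw)
for pair, support in knowledge:
    print(pair, support)
```

Swapping mine() for a different algorithm changes the Data Mining step but leaves the rest
of the KDD process untouched, which is the sense in which the two terms differ.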