CS2032 2 Marks & 16 Marks With Answers
1. Define data mining.
Data mining refers to extracting or “mining” knowledge from large amounts of data. It is
the process of discovering interesting knowledge from large amounts of data stored in
databases, data warehouses, or other information repositories.
2. Give some alternative terms for data mining.
• Knowledge mining
• Knowledge extraction
• Data/pattern analysis
• Data archaeology
• Data dredging
3. What is KDD?
KDD stands for Knowledge Discovery in Databases.
4. What are the steps involved in the KDD process?
• Data cleaning
• Data integration
• Data selection
• Data transformation
• Data mining
• Pattern evaluation
• Knowledge presentation
A knowledge base is domain knowledge that is used to guide the search or to evaluate the
interestingness of resulting patterns. Such knowledge can include concept hierarchies used
to organize attributes or attribute values into different levels of abstraction.
• Statistics
• Machine learning
• Decision tree
• Hidden Markov models
• Artificial intelligence
• Genetic algorithm
• Meta learning
• Testing hypothesis
• Correlation
• Regression
9. What is meta learning?
Meta learning is the concept of combining the predictions made by multiple data mining
models and analyzing those predictions to formulate a new and previously unknown
prediction.
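One simple way to combine the predictions of several models is a majority vote. The sketch below is only an illustration under that assumption: the three base models, the field names `income` and `age`, and the thresholds are all hypothetical and are not taken from the notes.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine one prediction per model into a single label by majority vote."""
    label, _ = Counter(predictions).most_common(1)[0]
    return label


def combine(models, record):
    """Ask every base model for a prediction, then vote on the final label."""
    return majority_vote([model(record) for model in models])


# Three hypothetical base models that label a customer record.
models = [
    lambda r: "buys" if r["income"] > 40000 else "skips",
    lambda r: "buys" if r["age"] < 35 else "skips",
    lambda r: "buys" if r["income"] > 60000 and r["age"] < 50 else "skips",
]

# Two of the three models vote "buys" for this record.
print(combine(models, {"income": 50000, "age": 30}))
```

Other combination schemes (weighted voting, or training a further model on the base predictions) follow the same pattern of collecting predictions first and analyzing them second.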
Fig: Architecture of a data mining system (GUI, pattern evaluation, database or data warehouse server, DB, DW)
10. Define genetic algorithm.
• A search algorithm.
• Enables us to locate an optimal binary string by processing an initial
random population of binary strings, performing operations such as
• Classification
• Regression
16. List out the advanced database systems.
• Extended-relational databases
• Object-oriented databases
• Deductive databases
• Spatial databases
• Temporal databases
• Multimedia databases
• Active databases
• Scientific databases
• Knowledge databases
17. Define cluster analysis.
Cluster analysis analyzes data objects without consulting a known class label. The class
labels are not present in the training data simply because they are not known to begin
with.
18. Classification of data mining systems.
• Based on the kinds of databases mined:
    o According to data model
        _ Relational mining system
        _ Transactional mining system
        _ Object-oriented mining system
    o According to types of data
        _ Spatial data mining system
        _ Time series data mining system
• Based on the kinds of knowledge mined:
    o According to functionality
        _ Characterization
        _ Discrimination
        _ Association
        _ Classification
        _ Clustering
        _ Outlier analysis
        _ Evolution analysis
    o According to levels of abstraction of the knowledge mined
• Based on the kinds of techniques utilized:
    o According to user interaction
        _ Autonomous systems
        _ Interactive exploratory systems
        _ Query-driven systems
    o According to methods of data analysis
        _ Database-oriented
        _ Data warehouse-oriented
        _ Machine learning
        _ Statistics
        _ Visualization
        _ Pattern recognition
        _ Neural networks
• Based on applications adopted:
    o Finance
    o Telecommunication
    o DNA
    o Stock markets
    o E-mail and so on
19. Describe challenges to data mining regarding data mining methodology and user
interaction issues.
• Mining different kinds of knowledge in databases
• Interactive mining of knowledge at multiple levels of abstraction
subjective, can be used to guide the discovery process.
under a unified schema at a single site in order to facilitate management decision-making.
Database consists of a collection of interrelated data.
a given data set.
support threshold and a minimum confidence threshold. Users or domain experts
can set such thresholds.
3. Explain association rules in mathematical notation.
Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of items.
Let $D$, the task-relevant data, be a set of database transactions, where each transaction
$T$ is a set of items such that $T \subseteq I$.
An association rule is an implication of the form $A \Rightarrow B$, where $A \subset I$,
$B \subset I$, and $A \cap B = \emptyset$. The rule $A \Rightarrow B$ holds in the transaction
set $D$ with support $s$, where $s$ is the percentage of transactions in $D$ that contain
$A \cup B$. The rule $A \Rightarrow B$ has confidence $c$ in the transaction set $D$ if $c$ is
the percentage of transactions in $D$ containing $A$ that also contain $B$.
4. Define support and confidence in association rule mining.
Support $s$ is the percentage of transactions in $D$ that contain $A \cup B$.
Confidence $c$ is the percentage of transactions in $D$ containing $A$ that also contain $B$.
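The two measures can be computed directly from a list of transactions. The following is a minimal sketch; the function names and the four sample transactions are made up for illustration, and both measures are returned as fractions rather than percentages.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Fraction of the transactions containing A that also contain B."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(transactions, {"bread", "milk"}))       # 2 of 4 transactions
print(confidence(transactions, {"bread"}, {"milk"}))  # 2 of the 3 bread transactions
```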
algorithm uses prior knowledge of frequent item set properties.
9. How are association rules generated from frequent item sets?
Association rules can be generated as follows:
For each frequent item set $l$, generate all nonempty subsets of $l$.
For every nonempty subset $s$ of $l$, output the rule “$s \Rightarrow (l - s)$” if
$$\frac{\text{support\_count}(l)}{\text{support\_count}(s)} \geq \text{min\_conf},$$
where min_conf is the minimum confidence threshold.
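The rule-generation step can be sketched in a few lines. This is an illustration only: `generate_rules` and the support counts in the example are hypothetical, and the frequent item sets are assumed to have been found already. Only proper nonempty subsets are used as antecedents, so the consequent is never empty.

```python
from itertools import combinations


def generate_rules(frequent_itemset, support_count, min_conf):
    """Return (s, l - s, confidence) for every rule that meets min_conf."""
    l = frozenset(frequent_itemset)
    rules = []
    for size in range(1, len(l)):                  # proper nonempty subsets of l
        for subset in combinations(sorted(l), size):
            s = frozenset(subset)
            conf = support_count[l] / support_count[s]
            if conf >= min_conf:
                rules.append((s, l - s, conf))
    return rules


# Hypothetical support counts for one frequent item set and its subsets.
support_count = {
    frozenset({"bread"}): 3,
    frozenset({"milk"}): 3,
    frozenset({"bread", "milk"}): 2,
}

for s, rest, conf in generate_rules({"bread", "milk"}, support_count, 0.6):
    print(sorted(s), "=>", sorted(rest), round(conf, 2))
```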
10. Give a few techniques to improve the efficiency of the Apriori algorithm.
• Hash-based technique
• Transaction reduction
• Partitioning
• Sampling
11. What factors degrade the performance of the Apriori candidate generation technique?
12. Describe the method of generating frequent item sets without candidate generation.
Frequent-pattern growth (or FP-growth) adopts a divide-and-conquer strategy.
Steps:
Compress the database representing frequent items into a frequent-pattern tree,
or FP-tree
agg_f, an iceberg query is of the form
SELECT R.a1, R.a2, ..., R.an, agg_f(R.b)
FROM relation R
GROUP BY R.a1, R.a2, ..., R.an
HAVING agg_f(R.b) >= threshold
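A concrete instance of this query form can be run with Python's built-in `sqlite3` module. The sketch below assumes a hypothetical `sales` table with made-up rows, uses SUM as the aggregate function, and groups by two attributes. Only the groups whose aggregate reaches the threshold (the "tip of the iceberg") are returned.

```python
import sqlite3

# In-memory database with a hypothetical sales relation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, region TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("pen", "north", 40),
        ("pen", "north", 70),
        ("pen", "south", 10),
        ("ink", "north", 5),
        ("ink", "south", 120),
    ],
)


def iceberg(conn, threshold):
    """Return only the (item, region) groups whose total quantity meets the threshold."""
    query = """
        SELECT item, region, SUM(qty) AS total
        FROM sales
        GROUP BY item, region
        HAVING SUM(qty) >= ?
        ORDER BY item, region
    """
    return conn.execute(query, (threshold,)).fetchall()


print(iceberg(conn, 100))
```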
14. Mention a few approaches to mining multilevel association rules.
• Uniform minimum support for all levels (or uniform support)
• Using reduced minimum support at lower levels (or reduced support)
• Level-by-level independent
• Level-cross filtering by single item
• Level-cross filtering by k-item set
15. What are multidimensional association rules?
Association rules that involve two or more dimensions or predicates.
• Interdimension association rule: a multidimensional association rule with no
repeated predicate or dimension.
• Hybrid-dimension association rule: a multidimensional association rule with
multiple occurrences of some predicates or dimensions.
• Dimension/level constraints
• Interestingness constraints
• Rule constraints.
attributes.
• The model is used for classification.
root node.
in the decision tree. Such a measure is referred to as an attribute selection measure
or a measure of the goodness of split.
20. Describe tree pruning methods.
When a decision tree is built, many of the branches will reflect anomalies in
the training data due to noise or outliers. Tree pruning methods address this
problem of overfitting the data.
Approaches:
• Prepruning
• Postpruning
1. Define clustering.
Clustering is the process of grouping physical or conceptual data objects into
clusters.
different objects into meaningful and descriptive objects.
3. What are the fields in which clustering techniques are used?
• Clustering is used in biology to derive plant and animal taxonomies.
• Clustering is used in business to help marketers discover distinct groups
of customers and characterize each group on the basis of purchasing patterns.
• Clustering is used to identify groups of automobile insurance policy holders.
• Clustering is used to identify groups of houses in a city on
the basis of house type, cost and geographical location.
• Clustering is used to classify documents on the web for information
discovery.
• Constraints on clustering
• Dealing with arbitrary shapes
• High dimensionality
• Scalability
5. What are the different types of data used for cluster analysis?
The different types of data used for cluster analysis are interval-scaled, binary,
nominal, ordinal and ratio-scaled data.
symmetric and asymmetric binary variables. Symmetric variables are those whose states
have the same value and weight. Asymmetric variables are those whose states do not
have the same value and weight.
8. Define nominal, ordinal and ratio-scaled variables.
A nominal variable is a generalization of the binary variable: it has more than two
states. For example, a nominal variable color may have four states: red, green, yellow,
or black. The total number of states of a nominal variable is N, and the states are
denoted by letters, symbols or integers.
An ordinal variable also has more than two states, but these states are ordered
in a meaningful sequence.
A ratio-scaled variable makes positive measurements on a nonlinear scale, such
as an exponential scale, following the formula
$Ae^{Bt}$ or $Ae^{-Bt}$,
where $A$ and $B$ are constants.
9. What do you mean by the partitioning method?
In the partitioning method, a partitioning algorithm arranges all the objects into
various partitions, where the total number of partitions is less than the total number of
objects. Each partition represents a cluster. The two types of partitioning method are
k-means and k-medoids.
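As an illustration of a partitioning method, here is a minimal k-means sketch for 2-D points. It makes simplifying assumptions that are not part of the notes: the first k points serve as the initial centroids, squared Euclidean distance is the dissimilarity measure, and the sample points are made up.

```python
def kmeans(points, k, iterations=100):
    """Partition 2-D points into k clusters; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]   # naive init: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
        if new_labels == labels:                # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels


points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)      # cluster index of each point
print(centroids)   # one mean point per cluster
```

k-medoids follows the same assign-and-update loop but represents each cluster by one of its actual objects instead of the mean.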
properly if any representative data set from the selected representative data sets does not
find the best k-medoids.
To overcome this drawback, a new algorithm, Clustering Large Applications based
13. What is CURE?
CURE stands for Clustering Using Representatives. Clustering algorithms
generally work well only on spherical clusters of similar size. CURE overcomes this
limitation and is more robust with respect to outliers.
14. Define the Chameleon method.
Chameleon is another hierarchical clustering method that uses dynamic modeling.
Chameleon was introduced to overcome the drawbacks of the CURE method. In this method two
clusters are merged if the interconnectivity between the two clusters is greater than the
interconnectivity between the objects within a cluster.
15. Define the density-based method.
The density-based method deals with arbitrarily shaped clusters. In the density-based
method, clusters are formed on the basis of the regions where the density of the objects is
high.
16. What is DBSCAN?
DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise.
DBSCAN is a density-based clustering method that converts regions of high object
density into clusters with arbitrary shapes and sizes. DBSCAN defines a cluster as a
maximal set of density-connected points.
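The idea of growing clusters out of dense regions can be sketched as follows. This is a simplified illustration rather than a reference implementation: the parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighborhood size, counting the point itself) and the sample points are assumptions, and neighbors are found by a brute-force scan.

```python
def dbscan(points, eps, min_pts):
    """Label each 2-D point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:          # not a core point (may become a border point later)
            labels[i] = NOISE
            continue
        cluster += 1                      # start a new cluster at core point i
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:        # border point reached from a core point
                labels[j] = cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:     # j is also a core point: keep expanding
                queue.extend(reach)
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (5, 5)]
print(dbscan(points, eps=1.5, min_pts=3))   # two dense groups and one noise point
```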
All the objects are quantized into a finite number of cells, and the collection of cells builds
the grid structure of objects. The clustering operations are performed on that grid
structure. This method is widely used because its processing time is very fast and that is
clustering method. In the STING method, all the objects are contained in rectangular cells;
these cells are kept at various levels of resolution, and these levels are arranged in a
hierarchical structure.
Model-based methods are used to optimize the fit between a given data set and a
mathematical model. These methods assume that the data are generated by
probability distributions. There are two basic approaches in this method:
1. Statistical approach
2. Neural network approach
21. What is the use of regression?
Regression can be used to solve classification problems, but it can also be used
for applications such as forecasting. Regression can be performed using many different
types of techniques; in essence, regression takes a set of data and fits the data to a
formula.
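Fitting data to a formula can be illustrated with the simplest case, a straight line fitted by least squares. The sketch below is only an example: the function name `fit_line` and the data are made up, and the sample values were chosen to lie exactly on y = 1 + 2x so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = c0 + c1 * x; returns (c0, c1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    c1 = sxy / sxx                 # slope
    c0 = mean_y - c1 * mean_x      # intercept
    return c0, c1


xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]              # hypothetical data on the line y = 1 + 2x
c0, c1 = fit_line(xs, ys)
print(c0, c1)                      # intercept and slope of the fitted line
print(c0 + c1 * 6)                 # forecast for an unseen x = 6
```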
22. What are the reasons for not using the linear regression model to estimate the
output data?
There are many reasons. One is that the data may not fit a linear model. It is also
possible that the data generally do represent a linear model, but the
linear model generated is poor because noise or outliers exist in the data.
Noise is erroneous data, and outliers are data values that are exceptions to the usual and
expected data.
23. What are the two approaches used by regression to perform classification?
Regression can be used to perform classification using the following approaches
1. Division: The data are divided into regions based on class.
Instead of fitting the data to a straight line, logistic regression uses a logistic curve.
The formula for the univariate logistic curve is
$$p = \frac{e^{(c_0 + c_1 x_1)}}{1 + e^{(c_0 + c_1 x_1)}}$$
The logistic curve gives a value between 0 and 1 so it can be interpreted as the
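The univariate logistic curve is easy to evaluate directly. In this sketch the coefficient values passed as `c0` and `c1` are arbitrary illustrative numbers, not fitted values, and the function is only suitable for moderate inputs because `math.exp` overflows for very large arguments.

```python
import math


def logistic(x1, c0, c1):
    """Evaluate p = e^(c0 + c1*x1) / (1 + e^(c0 + c1*x1))."""
    z = c0 + c1 * x1
    return math.exp(z) / (1 + math.exp(z))


# Hypothetical coefficients; the output always lies strictly between 0 and 1.
for x in (-5, 0, 5):
    print(x, logistic(x, c0=0.0, c1=1.0))
```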
Analysis may be viewed as finding patterns in the data and predicting future values.
27. What is smoothing?
Smoothing is an approach used to remove nonsystematic behavior
found in a time series. It usually takes the form of finding moving averages of attribute
values. It is used to filter out noise and outliers.
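A simple moving average can be sketched as below. The window size of 3 and the sample series (which contains one obvious spike) are arbitrary choices for illustration.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive values in the series."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


series = [10, 12, 11, 40, 12, 13, 11]   # hypothetical readings with one outlier (40)
print(moving_average(series, 3))        # the spike is spread out and damped
```

A larger window smooths more strongly but also blurs genuine short-term changes.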
28. Give the formula for Pearson's r.
One standard formula to measure correlation is the correlation coefficient r,
sometimes called Pearson's r. Given two time series, X and Y, with means $\bar{X}$ and $\bar{Y}$,
each with n elements, the formula for r is
$$r = \frac{\sum (x_i - \bar{X})(y_i - \bar{Y})}{\left(\sum (x_i - \bar{X})^2 \sum (y_i - \bar{Y})^2\right)^{1/2}}$$
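The formula translates into code term by term. This sketch assumes two series of equal length with nonzero variance; the sample values are made up so that the first pair is perfectly positively correlated and the second perfectly negatively correlated.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                    * sum((y - mean_y) ** 2 for y in ys))
    return num / den


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # series move together
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # series move in opposite directions
```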
29. What is autoregression?
Autoregression is a method of predicting a future time series value by looking at
previous values. Given a time series $X = (x_1, x_2, \ldots, x_n)$, a future value $x_{n+1}$ can be
found using
$$x_{n+1} = \xi + \varphi_n x_n + \varphi_{n-1} x_{n-1} + \cdots + \epsilon_{n+1}$$
Here $\epsilon_{n+1}$ represents a random error at time n+1. In addition, each element in the time
series can be viewed as a combination of a random error and a linear combination of
previous values.
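A one-step autoregressive prediction can be sketched as a weighted sum of the most recent values plus a constant. The sketch reads the leading term of the formula as a constant (`xi`) and the multipliers of the previous values as coefficients (`phis`); that reading, the coefficient values, and the sample series are assumptions for illustration. The random error term is replaced by its expected value of zero, since it cannot be known in advance.

```python
def ar_predict(series, xi, phis):
    """Predict the next value from the latest len(phis) values.

    phis[0] multiplies the latest value x_n, phis[1] multiplies x_(n-1), and so on.
    The random error term is taken as zero.
    """
    recent = series[::-1][:len(phis)]          # latest value first
    return xi + sum(phi * x for phi, x in zip(phis, recent))


series = [2.0, 4.0, 6.0, 8.0]                  # hypothetical time series
print(ar_predict(series, xi=1.0, phis=[0.5, 0.25]))   # 1 + 0.5*8 + 0.25*6
```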
organized under a unified schema at a single site to facilitate management decision
making.
(or)
A data warehouse is a subject-oriented, integrated, time-variant and nonvolatile
collection of data in support of management's decision-making process.
2. What are operational databases?
The large databases that organizations maintain and update through daily transactions are
called operational databases.
3. Define OLTP.
An on-line operational database system that is used for efficient retrieval, efficient
storage and management of large amounts of data is said to perform on-line
transaction processing (OLTP).
4. Define OLAP.
Data warehouse systems serve users (or) knowledge workers in the role of data
analysis and decision-making. Such systems can organize and present data in various
formats. These systems are known as on-line analytical processing (OLAP) systems.
Snowflake schema
Fact constellation schema
This model is used for the design of corporate data warehouses and department data
marts. This model contains a Star schema, Snowflake schema and Fact constellation
11. Define dimension table.
A dimension table is used for describing a dimension.
(e.g.) A dimension table for item may contain the attributes item_name, brand and type.
12. Define fact table.
A fact table contains the names of the facts (or) measures as well as keys to each of the
related dimension tables.
13. What is a lattice of cuboids?
In the data warehousing research literature, a data cube is also called a cuboid. For
different sets of dimensions, we can construct a lattice of cuboids, each showing the
data at a different level of summarization. The lattice of cuboids is also referred to as a data cube.
14. What is the apex cuboid?
The 0-D cuboid, which holds the highest level of summarization, is called the apex
cuboid. The apex cuboid is typically denoted by all.
_ A large central table (fact table) containing the bulk of the data with no
redundancy.
_ A set of smaller attendant tables (dimension tables), one for each
dimension.
The snowflake schema is a variant of the star schema model, where some
dimension tables are normalized, thereby further splitting the data into additional tables.
can be viewed as a collection of stars and hence it is known as galaxy schema (or) fact
constellation schema.
18. Point out the major difference between the star schema and the snowflake
schema.
The dimension tables of the snowflake schema model may be kept in normalized
form to reduce redundancies. Such tables are easy to maintain and save storage space.
and more joins will be needed to execute a query.
concepts to higher-level concepts.
21. Define total order.
If the attributes of a dimension form a concept hierarchy such as
“street < city < province_or_state < country”, the hierarchy is said to be a total order.
Country
Province or state
City
Street
Fig: Total order for location
22. Define partial order.
If the attributes of a dimension form a lattice such as
“day < {month < quarter; week} < year”, the hierarchy is said to be a partial order.
23. Define schema hierarchy.
A concept hierarchy that is a total (or) partial order among attributes in a database
dimension reduction.
26. What is the drill-down operation?
Drill-down is the reverse of the roll-up operation. It navigates from less detailed data
to more detailed data. Drill-down can be performed by stepping down a
The slice operation performs a selection on one dimension of the cube resulting in
a sub cube.
28.What is dice operation?
The dice operation defines a sub cube by performing a selection on two (or) more
dimensions.
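Slice and dice can both be viewed as selections on the cells of a cube. The sketch below stores a tiny cube as a dictionary keyed by (time, item, location); the dimension names, the sales figures and the helper `select` are all made up for illustration.

```python
# A tiny cube stored as {(time, item, location): sales}; all values are made up.
cube = {
    ("Q1", "phone", "Chennai"): 120,
    ("Q1", "laptop", "Chennai"): 80,
    ("Q1", "phone", "Delhi"): 95,
    ("Q2", "phone", "Chennai"): 130,
    ("Q2", "laptop", "Delhi"): 60,
}
DIMS = ("time", "item", "location")


def select(cube, **criteria):
    """Keep the cells whose coordinate lies in the allowed set for every named dimension."""
    def keep(cell):
        return all(cell[DIMS.index(d)] in allowed for d, allowed in criteria.items())
    return {cell: value for cell, value in cube.items() if keep(cell)}


# Slice: a selection on one dimension.
slice_q1 = select(cube, time={"Q1"})

# Dice: a selection on two or more dimensions.
dice = select(cube, time={"Q1", "Q2"}, item={"phone"}, location={"Chennai"})

print(slice_q1)
print(dice)
```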
30. List out the views in the design of a data warehouse.
_ Top-down view
_ Data source view
_ Data warehouse view
_ Business query view
31. What are the methods for developing large software systems?
_ Waterfall method
_ Spiral method
The waterfall method performs a structured and systematic analysis at each step
before proceeding to the next, which is like a waterfall falling from one step to the next.
33. How is the operation performed in the spiral method?
The spiral method involves the rapid generation of increasingly functional
systems, with short intervals between successive releases. This is considered a good
choice for data warehouse development, especially for data marts, because the
turnaround time is short, modifications can be done quickly and new designs and
_ Choose the measures that will populate each fact table record.
35. Define ROLAP.
36. Define MOLAP.
37. Define HOLAP.
The hybrid OLAP approach combines ROLAP and MOLAP technology,
benefiting from the greater scalability of ROLAP and the faster computation of
MOLAP; i.e., a HOLAP server may allow large volumes of detail data to be stored in a
relational database, while aggregations are kept in a separate MOLAP store.
well as summarized data and can range in size from a few gigabytes to hundreds of
gigabytes, terabytes (or) beyond.
A data mart is a database that contains a subset of the data present in a data warehouse.
Data marts are created to structure the data in a data warehouse according to issues such
as hardware platforms and access control strategies. We can divide a data warehouse into
data marts after the data warehouse has been created. Data marts are usually implemented
on low-cost departmental servers that are UNIX (or) Windows/NT based.
Dependent data marts are sourced directly from enterprise data warehouses.
Independent data marts are sourced from data captured from one (or) more operational
systems (or) external information providers, (or) from data generated locally within a
particular department (or) geographic area.
virtual warehouse is easy to build but requires excess capacity on operational database
servers.
42. Define indexing.
Indexing is a technique used for efficient data retrieval (or) for accessing
data in a faster manner. When a table grows in volume, the indexes also increase in size
_ B-Tree indexing
_ Bitmap indexing
_ Join indexing
44. Define metadata.
Metadata in a data warehouse are data about data,
i.e., metadata are the data that define warehouse objects. Metadata are created for the
data names and definitions of the given warehouse.
45. Define VLDB.
VLDB stands for Very Large Database. A database whose size is greater than 100 GB
is said to be a very large database.
• Public domain tools
• Research prototypes
Commercial tools include the following products, which are usually
associated with consulting activity by the same company:
1. “Intelligent Miner” from IBM
2. “SAS” System from SAS Institute
3. “Thought” from Right Information Systems, etc.
Public domain tools are largely freeware with just registration fees:
“Brute” from the University of Washington; “MC++” from Stanford University, Stanford,
California.
5. What is the difference between generic single-task tools and generic multi-task
tools?
Generic single-task tools generally use neural networks or decision trees.
They cover only the data mining part and require extensive pre-processing and
post-processing steps.
Generic multi-task tools offer modules for pre-processing and post-processing
steps and also offer a broad selection of several popular data mining
algorithms, such as clustering.
6. What are the areas in which data warehouses are used at present and in the future?
The potential subject areas in which data warehouses may be developed at
for this data and OLAP techniques can be applied for its analysis
7. What are the other areas for data warehousing and data mining?
• Agriculture
• Rural development
• Health
• Planning
• Education
• Commerce and trade
8. Specify some of the sectors in which data warehousing and data mining are used.
• Tourism
• Program implementation
• Revenue
• Economic affairs
• Audit and accounts
9. Describe the use of DBMiner.
Used to perform data mining functions, including characterization,
mining system for both OLAP and data mining in relational databases and
data warehouses.
DBMiner
GeoMiner
Multimedia miner
WeblogMiner
DNA analysis
Financial data analysis
Retail Industry
Telecommunication industry
Market analysis
Banking industry and Health care analysis.
database and corresponds to querying database knowledge including
deduction rules, integrity constraints, generalized rules, frequent patterns and
other regularities.
14. Differentiate direct query answering and intelligent query answering.
Direct query answering means that a query is answered by returning exactly what
is being asked.
Intelligent query answering consists of analyzing the intent of the query and
providing generalized, neighborhood, or associated information relevant to the
query.
15. Define visual data mining.
Visual data mining discovers implicit and useful knowledge from large data sets using
data and/or knowledge visualization techniques.
It is the integration of data visualization and data mining.
17. What are the factors involved in choosing a data mining system?
Data types
System issues
Data sources
Data mining functions and methodologies
rules. It also uses SQL-like syntax to mine databases.
data.
Useful in artificial intelligence and pattern matching.
Also known as text mining, knowledge discovery from text, or content
analysis.
20. What does web mining mean?
A technique to process information available on the web and search for useful data.
It is used to discover web pages, text documents, multimedia files, images, and other
types of resources on the web.
Used in several fields such as e-commerce, information filtering, fraud
detection, and education and research.
mining.
Used in medical diagnosis, stock markets, the animation industry, the airline
industry, traffic management systems, surveillance systems, etc.
UNIT-I
1. Explain the evolution of database technology.
_ Data collection and database creation
_ Database management systems
_ Advanced database systems
_ Data warehousing and data mining
_ Web-based database systems
_ New generation of integrated information systems
2. Explain the steps of knowledge discovery in databases.
_ Data cleaning
_ Data integration
_ Data selection
_ Data transformation
_ Data mining
_ Pattern evaluation
_ Knowledge presentation
3. Explain the architecture of a data mining system.
_ Database, data warehouse, or other information repository
_ Database or data warehouse server
_ Knowledge base
(Or)
Explain the taxonomy of data mining tasks.
_ Predictive modeling
    • Classification
    • Regression
    • Time series analysis
_ Descriptive modeling
    • Clustering
    • Summarization
    • Association rules
    • Sequence discovery
_ Hidden Markov models
_ Artificial neural networks
_ Genetic algorithms
_ Meta learning
UNIT-II
6. Explain the issues regarding classification and prediction.
_ Preparing the data for classification and prediction
    o Data cleaning
    o Relevance analysis
    o Data transformation
_ Comparing classification methods
    o Predictive accuracy
    o Speed
    o Robustness
    o Scalability
    o Interpretability
_ Pattern definition
_ Objective measures
_ Subjective measures
11. Explain how the efficiency of Apriori is improved.
_ Hash-based technique (hashing item set counts)
_ Transaction reduction (reducing the number of transactions
scanned in future iterations)
_ Partitioning (partitioning the data to find candidate item sets)
_ Sampling (mining on a subset of the given data)
_ Dynamic item set counting (adding candidate item sets at
different points during a scan)
12. Explain frequent item set mining without candidate generation.
_ Frequent-pattern growth (or) FP-growth
_ Frequent-pattern tree (or) FP-tree
_ Algorithm
13. Explain mining multi-dimensional Boolean association rules from transaction
databases.
_ Multi-dimensional (or) multilevel association rules
_ Approaches to mining multilevel association rules
    • Using uniform minimum support for all levels
    • Using reduced minimum support at lower levels
        o Level-by-level independent
_ Dimension/level constraints
_ Interestingness constraints
_ Rule constraints
UNIT-III
_ Hypothesis testing
_ Regression
_ Correlation
17. Explain Bayesian classification.
_ Bayes' theorem
_ Naïve Bayesian classification
_ Bayesian belief networks
_ Bayesian learning
18. Discuss the requirements of clustering in data mining.
_ Scalability
_ Ability to deal with different types of attributes
_ Discovery of clusters with arbitrary shape
_ Minimal requirements for domain knowledge to determine input parameters
_ Ability to deal with noisy data
_ Insensitivity to the order of input records
_ High dimensionality
_ Interpretability and usability
_ Interval-scaled variables
_ Binary variables
    o Symmetric binary variables
    o Asymmetric binary variables
_ Nominal variables
_ Ordinal variables
_ Ratio-scaled variables
_ Rules
_ Table
_ Crosstab
_ Pie chart
_ Bar chart
_ Decision tree
_ Data cube
_ Histogram
_ Quantile plots
_ q-q plots
UNIT-IV
22. Discuss the components of a data warehouse.
_ Subject-oriented
_ Integrated
_ Time-variant
_ Nonvolatile
23. List out the differences between OLTP and OLAP.
_ Users and system orientation
_ Data contents
_ Database design
_ View
_ Access patterns
24. Discuss the various schematic representations in the multidimensional model.
_ Star schema
_ Snowflake schema
_ Fact constellation schema
25. Explain the OLAP operations in the multidimensional model.
_ Roll-up
_ Drill-down
• Top-down view
• Data source view
• Data warehouse view
_ B-Tree indexing
_ Bit-map indexing
_ Join indexing
29. Write notes on the metadata repository.
_ Definition
_ Structure of the data warehouse
_ Business metadata
30. Write short notes on VLDB.
_ Definition
_ Challenges related to database technologies
_ Issues in VLDB
UNIT-V
31. Explain data mining applications for biomedical and DNA data analysis.
_ Semantic integration of heterogeneous, distributed genome databases
_ Similarity search and comparison among DNA sequences
_ Association analysis
_ Path analysis
_ Visualization tools and genetic data analysis
32. Explain data mining applications for financial data analysis.
_ Loan payment prediction and customer credit policy analysis
_ Classification and clustering of customers for targeted marketing
_ Detection of money laundering and other financial crimes
33. Explain data mining applications for retail industry.
_ Main applications
_ Current status
36. Explain how data mining is used in health care analysis.
_ Health care data mining and its aims
_ Health care data mining techniques
_ Segmenting patients into groups
_ Predicting medical diagnosis
_ Medical research
_ Hospital administration
_ Applications of data mining in health care
_ Conclusion
37. Explain how data mining is used in the banking industry.
_ Data collected by data mining in banking
_ Banking data mining tools
_ Mining customer data of the bank
_ Mining for prediction and forecasting
_ Mining for fraud detection
_ Mining for cross-selling bank services
_ Mining for identifying customer preferences
_ Applications of data mining in banking
_ Conclusion
38. Explain the types of data mining.
_ Audio data mining
_ Video data mining
_ Image data mining