Topic6 - Pattern Mining Advanced Methods

Uploaded by

oh nambu
Pattern Mining: Advanced Methods
Mining various kinds of patterns

• People may like to uncover more complex patterns

• Multilevel associations
• Involve concepts at different abstraction levels

• Multidimensional associations
• Involve more than one dimension or predicate (e.g., relating what a customer buys to the customer's age)

• Quantitative associations
• Involve numeric attributes (eg: age, salary)

• Rare patterns
• Suggest interesting, although rare, item combinations

• Negative patterns
• Show negative correlations between items
Mining multilevel associations

• Strong associations are discovered at high abstraction levels

• (eg: buying bread and milk)

• There may be a need to drill down to find novel patterns at more detailed levels

• (eg: buying what kind of bread and what kind of milk together)

• It is interesting to mine patterns at multiple abstraction levels


Mining multilevel association rules
• A concept hierarchy defines a

• Sequence of mappings from a set of low-level concepts to higher-level, more general concepts

• The example hierarchy has five levels (0 through 4)

• Concept hierarchies for nominal attributes are

• Specified by experts

• Generated from data, based on the analysis of


• Product specifications
• Attribute values
• Data distributions

• Concept hierarchies for numeric attributes are generated using discretization techniques
• It is difficult to find interesting patterns in primitive-level data

• Eg:

• Dell Studio XPS 16 Notebook

• Logitech VX Nano Cordless Laser Mouse

• Each occurs in a very small fraction of transactions

• It is difficult to find strong associations involving such specific items

• It is easier to find strong associations between generalized abstractions such as

• Dell Notebook

• Cordless Mouse
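
The effect of generalization on support can be sketched in a few lines. The hierarchy, the "other model" placeholder items, and the four transactions below are made up for illustration; only the two product names and their generalized forms come from the slide.

```python
# Hypothetical concept hierarchy: primitive item -> generalized ancestor.
PARENT = {
    "Dell Studio XPS 16 Notebook": "Dell Notebook",
    "Dell Notebook (other model)": "Dell Notebook",
    "Logitech VX Nano Cordless Laser Mouse": "Cordless Mouse",
    "Cordless Mouse (other model)": "Cordless Mouse",
}

# Made-up transactions over primitive-level items.
TRANSACTIONS = [
    {"Dell Studio XPS 16 Notebook", "Logitech VX Nano Cordless Laser Mouse"},
    {"Dell Notebook (other model)", "Logitech VX Nano Cordless Laser Mouse"},
    {"Dell Studio XPS 16 Notebook", "Cordless Mouse (other model)"},
    {"Dell Notebook (other model)"},
]


def generalize(transaction):
    """Replace each item by its ancestor (items without one are kept)."""
    return {PARENT.get(item, item) for item in transaction}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


primitive = support(
    {"Dell Studio XPS 16 Notebook", "Logitech VX Nano Cordless Laser Mouse"},
    TRANSACTIONS,
)
generalized = support(
    {"Dell Notebook", "Cordless Mouse"},
    [generalize(t) for t in TRANSACTIONS],
)
print(primitive, generalized)
```

On this toy data the primitive-level pair appears in one transaction out of four, while the generalized pair appears in three, which is why strong associations are easier to find at the higher level.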
Multiple-level/multilevel association rules

• Association rules generated from multiple abstraction levels

• Can be mined using concept hierarchies under support-confidence framework

• In general, a top-down strategy can be employed

• Counts are accumulated at each concept level

• Starting at level 1

• Working downward to more specific concept levels

• Until no more frequent itemsets can be found
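
The top-down strategy above can be sketched for single items as follows. This is a simplified illustration, not a full multilevel Apriori: it counts only 1-itemsets, uses a hypothetical two-level hierarchy and made-up transactions, and assumes one uniform count threshold (`MIN_COUNT`) for both levels.

```python
from collections import Counter

# Hypothetical two-level hierarchy: level-2 item -> level-1 ancestor.
PARENT = {
    "wheat bread": "bread",
    "white bread": "bread",
    "skim milk": "milk",
    "2% milk": "milk",
    "AA battery": "battery",
}

# Made-up transactions over level-2 (most specific) items.
TRANSACTIONS = [
    {"wheat bread", "skim milk"},
    {"wheat bread", "2% milk"},
    {"white bread", "skim milk"},
    {"wheat bread", "skim milk", "AA battery"},
    {"2% milk"},
]

MIN_COUNT = 2  # assumed uniform threshold, expressed as a transaction count


def frequent_items(transactions, min_count, candidates=None):
    """Count single items and keep those that meet min_count.

    If `candidates` is given, only those items are considered.
    """
    counts = Counter(item for t in transactions for item in t)
    return {
        item: n
        for item, n in counts.items()
        if n >= min_count and (candidates is None or item in candidates)
    }


# Level 1: generalize every transaction, then count.
level1_tx = [{PARENT[item] for item in t} for t in TRANSACTIONS]
level1 = frequent_items(level1_tx, MIN_COUNT)

# Level 2: examine only descendants of frequent level-1 items.
candidates = {child for child, parent in PARENT.items() if parent in level1}
level2 = frequent_items(TRANSACTIONS, MIN_COUNT, candidates)

print(level1)
print(level2)
```

Because "battery" is infrequent at level 1, its descendant "AA battery" is never examined at level 2.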


Multiple-level/multilevel association rules

• Using same/uniform support for all levels

• Using reduced minimum support at lower levels

• Using item-group based minimum support


• Using uniform minimum support for all levels

• Same minimum support threshold is used at each level

• Users specify only one minimum support threshold

• An Apriori-like optimization technique can be adopted

• An ancestor is a superset of its descendants

• Search procedure is simplified

• Search avoids examining itemsets containing any item whose ancestors do not have minimum support


• Drawbacks

• Items at lower abstraction levels are unlikely to occur as frequently as items at higher abstraction levels

• If the minimum support threshold is too high

• Could miss some meaningful associations occurring at low abstraction levels

• If the threshold is too low

• May generate many uninteresting associations occurring at high abstraction levels
• Using reduced minimum support at lower levels

• Each abstraction level has its own minimum support threshold

• The deeper the abstraction level,

• the smaller the corresponding threshold

• For mining multilevel patterns with reduced support

• The support threshold should be smallest at the lowest abstraction level

• For the final pattern/rule extraction,

• the thresholds associated with the corresponding items should be enforced so that only interesting associations are output
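
A small comparison may help show how reduced support differs from uniform support. The support values and thresholds below are made-up illustrative numbers, and `frequent` is a hypothetical helper, not part of any library.

```python
# Made-up support values (fraction of transactions) per abstraction level.
SUPPORT = {
    1: {"milk": 0.10},
    2: {"2% milk": 0.06, "skim milk": 0.04},
}


def frequent(support_by_level, min_sup_by_level):
    """Return the items whose support meets their own level's threshold."""
    return {
        item
        for level, items in support_by_level.items()
        for item, sup in items.items()
        if sup >= min_sup_by_level[level]
    }


# Uniform support: the same 5% threshold at both levels.
uniform = frequent(SUPPORT, {1: 0.05, 2: 0.05})

# Reduced support: the deeper level gets a smaller threshold (3%).
reduced = frequent(SUPPORT, {1: 0.05, 2: 0.03})

print(sorted(uniform))
print(sorted(reduced))
```

With the uniform threshold, "skim milk" is lost at level 2; lowering the level-2 threshold keeps it.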
• Using item or group-based minimum support

• It is sometimes desirable to set up user-specific, item-based, or group-based minimum support thresholds when mining multilevel rules

• Eg:
• A user could set minimum support thresholds based on product price or on items of interest

• For example, a particularly low threshold for cameras priced over $1000
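
One way to picture group-based thresholds is a lookup table with a default. The sketch below assumes that an itemset mixing groups is checked against the lowest threshold among its items, which is one common convention and not something the slide specifies; the threshold values and item names other than the camera group are hypothetical.

```python
DEFAULT_MIN_SUP = 0.05  # assumed default threshold

# Hypothetical per-group override: a low threshold for a group of interest.
GROUP_MIN_SUP = {"camera over $1000": 0.01}


def min_support_for(itemset):
    """Lowest threshold among the items' groups (an assumed convention)."""
    return min(GROUP_MIN_SUP.get(item, DEFAULT_MIN_SUP) for item in itemset)


def is_frequent(itemset, support):
    """Check an itemset's support against its group-based threshold."""
    return support >= min_support_for(itemset)


print(is_frequent({"camera over $1000", "tripod"}, 0.02))
print(is_frequent({"tripod", "memory card"}, 0.02))
```

The same 2% support passes for the itemset containing the expensive camera and fails for the itemset that falls back to the default threshold.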


• Side effect of multilevel association rules

• Generation of many redundant rules across multiple abstraction levels


Mining Multidimensional associations

• Multidimensional associations

• Association rules containing multiple predicates

• Example: a rule containing three predicates

• Age

• Occupation

• Buys

• Each of which occurs only once in the rule (no repeated predicates)
• Interdimensional association rules

• Multidimensional association rules with no repeated predicates

• Hybrid-dimensional association rules

• Multidimensional association rules with repeated predicates

• Contain multiple occurrences of some predicates
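
The distinction can be checked mechanically by counting how often each predicate name appears in a rule. In this sketch a rule is represented as two lists of `(predicate, value)` pairs; that representation, the `classify` helper, and the example values are assumptions made for illustration.

```python
from collections import Counter


def classify(antecedent, consequent):
    """Classify a rule by its predicate names.

    Each side is a list of (predicate, value) pairs.
    """
    counts = Counter(name for name, _ in antecedent + consequent)
    if len(counts) < 2:
        return "single-dimensional"
    if any(n > 1 for n in counts.values()):
        return "hybrid-dimensional"
    return "interdimensional"


# age AND occupation => buys: three distinct predicates, none repeated.
rule_a = classify(
    [("age", "20...29"), ("occupation", "student")],
    [("buys", "laptop")],
)

# age AND buys => buys: the predicate "buys" is repeated.
rule_b = classify(
    [("age", "20...29"), ("buys", "laptop")],
    [("buys", "printer")],
)

print(rule_a, rule_b)
```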


• Database attributes can be nominal/quantitative

• Nominal

• ‘names of things’

• Have finite number of possible values

• No ordering among the values

• Eg:

• Occupation

• Brand

• Color
• Quantitative

• Numeric

• Have implicit ordering among values

• Eg:

• Age

• Income

• Price
• Treatment of quantitative attributes

• Quantitative attributes are discretized using predefined concept hierarchies

• Occurs before mining

• Replace the original numeric values with intervals

• 0...20K, 21K...40K, and so on

• Discretization is static and predetermined

• Discretized numeric attributes are then treated as nominal attributes

• This is referred to as

• Mining multidimensional association rules using static discretization of quantitative attributes
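
Static discretization amounts to a fixed lookup from a number to an interval label, decided before mining starts. The sketch below assumes values given in whole thousands (K) and extends the slide's two intervals with two more hypothetical ones.

```python
from bisect import bisect_left

# Predefined interval upper bounds, in thousands (K).
UPPER_BOUNDS = [20, 40, 60]

# One label per interval; the last two labels are hypothetical extensions.
LABELS = ["0...20K", "21K...40K", "41K...60K", "over 60K"]


def discretize(value_k):
    """Map a numeric value (in whole thousands) to its predefined interval."""
    return LABELS[bisect_left(UPPER_BOUNDS, value_k)]


values_k = [12, 20, 21, 75]
labels = [discretize(v) for v in values_k]
print(labels)
```

After this step each label can be handled like any other nominal value, so an ordinary frequent itemset algorithm applies unchanged.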
• Treatment of quantitative attributes

• Quantitative attributes are discretized or clustered into ‘bins’ based on data distribution

• Discretization is dynamic

• Treats numeric attribute values as quantities

• This is referred to as

• Dynamic quantitative association rules
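
By contrast, dynamic discretization derives the bins from the data itself. The sketch below uses equal-frequency binning as one simple example of a distribution-based method; the slide does not name a specific technique, and the ages are made up.

```python
def equal_frequency_bins(values, k):
    """Split sorted values into up to k bins holding roughly equal counts.

    Returns one (low, high) pair per non-empty bin.
    """
    ordered = sorted(values)
    n = len(ordered)
    bins = []
    for i in range(k):
        chunk = ordered[i * n // k:(i + 1) * n // k]
        if chunk:
            bins.append((chunk[0], chunk[-1]))
    return bins


ages = [23, 25, 31, 35, 36, 41, 58, 62, 64]  # made-up data
bins = equal_frequency_bins(ages, 3)
print(bins)
```

The bin boundaries here follow where the data actually falls, so a different data set would yield different intervals.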


• Multidimensional association rules

• Search for frequent predicate sets

• A k-predicate set

• Contains k conjunctive predicates

• Eg:

• Set of predicates {age, occupation, buys} is a 3-predicate set
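
Counting frequent k-predicate sets can be sketched with a brute-force enumeration. This is for illustration only and is not an Apriori-style search; the records, the values, and the `frequent_predicate_sets` helper are assumptions. Each record holds one value per attribute, so every predicate set found has no repeated predicates.

```python
from collections import Counter
from itertools import combinations

# Made-up relational records: one value per attribute (dimension).
RECORDS = [
    {"age": "20...29", "occupation": "student", "buys": "laptop"},
    {"age": "20...29", "occupation": "student", "buys": "laptop"},
    {"age": "30...39", "occupation": "engineer", "buys": "laptop"},
    {"age": "20...29", "occupation": "student", "buys": "printer"},
]


def frequent_predicate_sets(records, k, min_count):
    """Count every k-predicate set and keep those meeting min_count.

    A predicate is an (attribute, value) pair; a k-predicate set is a
    tuple of k such pairs, sorted by attribute name.
    """
    counts = Counter()
    for record in records:
        predicates = sorted(record.items())
        for combo in combinations(predicates, k):
            counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_count}


result = frequent_predicate_sets(RECORDS, 3, 2)
print(result)
```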
