[go: up one dir, main page]

0% found this document useful (0 votes)
723 views3 pages

Compact Representation of Frequent Itemsets

The document discusses two approaches for representing frequent itemsets: maximal frequent itemsets and closed frequent itemsets. Maximal frequent itemsets provide a compact representation where all frequent itemsets are subsets. However, their support counts do not provide information about subsets. Closed frequent itemsets are frequent itemsets that are closed and have support above a minsup threshold.

Uploaded by

srinu vas
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
723 views3 pages

Compact Representation of Frequent Itemsets

The document discusses two approaches for representing frequent itemsets: maximal frequent itemsets and closed frequent itemsets. Maximal frequent itemsets provide a compact representation where all frequent itemsets are subsets. However, their support counts do not provide information about subsets. Closed frequent itemsets are frequent itemsets that are closed and have support above a minsup threshold.

Uploaded by

srinu vas
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 3

Compact Representation of Frequent Itemsets :

The number of frequent itemsets generated by the Apriori algorithm can often be very large, so
it is beneficial to identify a small representative set from which every frequent itemset can be
derived. There are two types of approaches we have. One such approach is using maximal
frequent itemsets. And another one is closed frequent item set.
A maximal frequent itemset is a frequent itemset for which none of its immediate supersets
are frequent. To illustrate this concept, consider the example given below:

The support counts are shown on the top left of each node. Assume support count threshold
= 50%, that is, each item must occur in 2 or more transactions. Based on that threshold, the
frequent itemsets are a, b, c, d, ab, ac, and ad (shaded nodes).
Out of these 7 frequent itemsets, 3 are identified as maximal frequent (having red
outline):
 ab: Immediate supersets abc and abd are infrequent.
 ac: Immediate supersets abc and acd are infrequent.
 ad: Immediate supersets abd and bcd are infrequent.
The remaining 4 frequent nodes (a, b, c, and d) cannot be maximal frequent because they all
have at least 1 immediate superset that is frequent.
Advantage: Maximal frequent itemsets provide a compact representation of all the frequent
itemsets for a particular dataset. In the above example, all frequent itemsets are subsets of the
maximal frequent itemsets, since we can obtain sets a, b, c, and d by enumerating subsets of ab,
ac, and ad (including the maximal frequent itemsets themselves).
Disadvantage: The support count of maximal frequent itemsets does not provide any
information about the support count of their subsets. This means that an additional traversal of
data is needed to determine the support count for non-maximal frequent itemsets, which may
be undesirable in certain cases.

Closed Frequent Item set :

It is a frequent itemset that is both closed and its support is greater than or equal to minsup. An
itemset is closed in a data set if there exists no superset that has the same support count as this
original itemset.
In the above example Closed item sets are a, ab, ac, ad, abc, abd.

Closed frequency item set is{ a, ab, ac, ad, abc, abd }

You might also like