Kumaraguru College of Technology, Coimbatore
Department of Computer Science and Engineering
U18CSI6203-DATA WAREHOUSING AND DATA MINING
Academic Year: 2023 – 2024
Association Rule Mining [CO4, K3]
• Implement Python code that applies the FP-Growth algorithm to mine association rules
from the given dataset. Use suitable packages.
• Use the attached Market_Basket_Optimisation dataset.
Code:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth, association_rules
# Load the dataset
df = pd.read_csv('Market_Basket_Optimisation.csv', header=None)
print(df.shape)
print(df.head())
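# Convert the dataset into a list of transactions.
# Each row is one basket; NaN cells pad the shorter baskets and are dropped.
transactions = []
for i in range(len(df)):
    # Strip stray whitespace so the same item is not counted under two names
    transactions.append([str(item).strip() for item in df.iloc[i].dropna()])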
# Use TransactionEncoder to transform the transaction data
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df_te = pd.DataFrame(te_ary, columns=te.columns_)
# Apply the FP-Growth algorithm to find frequent itemsets
frequent_itemsets = fpgrowth(df_te, min_support=0.05, use_colnames=True)
# Generate association rules
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.2)
# Display the frequent itemsets and association rules
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules)
print("\nAssociation Rules sorted by confidence (descending):")
sorted_rules = rules.sort_values(by='confidence', ascending=False)
print(sorted_rules)
Output:
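After sorting by confidence, the output shows that {Spaghetti}->{Mineral Water} has the
highest confidence among the generated rules: of all the rules found, a transaction that
contains Spaghetti is the most likely to also contain Mineral Water.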
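
To make the meaning of confidence concrete, the sketch below computes support, confidence and lift by hand for a single rule. The five transactions are a made-up toy example and are not taken from Market_Basket_Optimisation.csv, and the helper names `support` and `rule_metrics` are hypothetical, introduced only for illustration. The definitions used are the standard ones: confidence(A->B) = support(A and B) / support(A), and lift(A->B) = confidence(A->B) / support(B).

```python
# Toy transactions (hypothetical, NOT from Market_Basket_Optimisation.csv)
transactions = [
    {"spaghetti", "mineral water"},
    {"spaghetti", "mineral water", "eggs"},
    {"spaghetti", "eggs"},
    {"mineral water"},
    {"eggs", "chocolate"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent), transactions)
    confidence = joint / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return joint, confidence, lift


sup, conf, lift = rule_metrics({"spaghetti"}, {"mineral water"}, transactions)
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
# prints: support=0.40 confidence=0.67 lift=1.11
```

Confidence depends on the direction of the rule and tends to be high whenever the consequent is a very common item, so it is usually worth reading it alongside lift: a lift close to 1 suggests the two items co-occur about as often as expected by chance, while a lift well above 1 suggests a genuine association.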