Syllabus

The document outlines a course on Mining Massive Datasets, focusing on data mining concepts, algorithms, and applications. It covers topics such as recommendation systems, social networks, data stream mining, and contextual-bandit approaches. The course aims to equip students with the skills to analyze and model real-world data mining problems effectively.

Uploaded by

aburoobhastudy

20AIEL708 MINING MASSIVE DATASETS    L T P C
                                     3 0 0 3

Course Objectives:

To understand the basic concepts, principles, and techniques of data mining
To learn the classical data mining algorithms
To systematically analyze real-world data mining problems
To model data mining problems and to evaluate, visualize, and communicate statistical models

UNIT I INTRODUCTION    9

Data Mining – Modeling, Statistical Limits on Data Mining, Importance of Words in
Documents, Hash Functions, Indexes. MapReduce and the New Software Stack –
Distributed File Systems, MapReduce, Algorithms Using MapReduce, Extensions to
MapReduce, The Communication Cost Model, Complexity Theory for MapReduce.

UNIT II RECOMMENDATION SYSTEM    9

Finding Similar Items – Applications of Set Similarity, Shingling of Documents,
Similarity-Preserving Summaries of Sets, Locality-Sensitive Hashing for Documents,
Distance Measures, Theory of Locality-Sensitive Functions, LSH Families for Other
Distance Measures, Applications of LSH, Methods for High Degrees of Similarity.

UNIT III SOCIAL NETWORKS    9

Mining Data Streams – The Stream Data Model, Sampling Data in a Stream, Filtering
Streams, Counting Distinct Elements in a Stream, Estimating Moments, Counting Ones
in a Window, Decaying Windows. Link Analysis – PageRank, Efficient Computation of
PageRank, Topic-Sensitive PageRank, Link Spam, Hubs and Authorities.

UNIT IV MINING DATA STREAMS    9

Frequent Itemsets – Market-Basket Model, A-Priori Algorithm, Handling Larger Datasets
in Main Memory, Limited-Pass Algorithms, Counting Frequent Items in a Stream.
Clustering – Introduction, Hierarchical Clustering, K-Means Clustering, CURE
Algorithm, Clustering in Non-Euclidean Spaces, Clustering for Streams and Parallelism.
Advertising on the Web – Issues in Online Algorithms, The Matching Problem, The
Adwords Problem, Adwords Implementation.

UNIT V CONTEXTUAL-BANDIT APPROACH    9

Recommendation Systems – Content-Based Recommendations: Item Profiles, Discovering
Features of Documents, Obtaining Item Features from Tags, Representing Item
Profiles, User Profiles, Recommending Items to Users Based on Content, Classification
Algorithms. A Contextual-Bandit Approach to Personalized News Article
Recommendation.

TEXTBOOK

1. Jure Leskovec, Anand Rajaraman, Jeffrey D. Ullman, "Mining of Massive Datasets", 2019.

COURSE OUTCOMES:

1. Understand the fundamental concepts and statistical limits of data mining and
the MapReduce programming model. (K2)
2. Understand similarity detection techniques to identify similar items in large
datasets. (K2)
3. Apply appropriate algorithms to perform link analysis and data stream mining in
large-scale networks. (K3)
4. Analyze clustering methods and frequent itemset mining techniques for handling
large-scale and streaming datasets. (K4)
5. Understand content-based recommendation systems and contextual-bandit
approaches for personalization. (K2)
6. Analyze the efficiency, scalability, and computational cost of different applications
and platforms. (K4)

CO-PO, PSO MAPPING:

      PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2
CO1   3    -    3    1    -    -    -    -    2    1     3     -     3     2
CO2   3    -    2    3    -    -    1    -    -    -     2     -     2     1
CO3   3    -    3    3    -    -    -    -    -    -     3     -     3     1
CO4   3    -    3    3    -    -    -    -    -    -     3     -     2     2
CO5   3    -    3    3    -    -    -    -    -    -     3     -     3     2
CO6   3    -    3    3    -    -    -    -    -    -     3     -     3     2
