17 2017 Lecture1-2 INT312
FUNDAMENTALS
Lecture #0
Course details
LTP 004 [Four Practicals/week] [BYOD]
CA Category: A0304
Course Orientation: RESEARCH, SOFTWARE SKILL
Weightages: ATT: 5, CA: 25, MTT: 20, ETT: 50
Course details
TEXT BOOKS
No Textbook for this course.
REFERENCE BOOKS
1. BIG DATA by ANIL MAHESHWARI, MCGRAW HILL EDUCATION
2. BIG DATA AND ANALYTICS by SEEMA ACHARYA, SUBHASHINI CHELLAPPAN, WILEY
3. UNDERSTANDING BIG DATA: ANALYTICS FOR ENTERPRISE CLASS HADOOP AND
STREAMING DATA by PAUL C ZIKOPOULOS, CHRIS EATON, IBM,
MC GRAW HILL
4. ORACLE BIG DATA HANDBOOK by TOM PLUNKETT, BRIAN MACDONALD, BRUCE
NELSON, MARK HORNICK, HELEN SUN, KHADER MOHIUDDIN, DEBRA HARDING,
GOKULA MISHRA, ROBERT STACKOWIAK, KEIT, MC GRAW HILL
5. PROFESSIONAL HADOOP SOLUTIONS by BORIS LUBLINSKY, KEVIN T. SMITH, ALEXEY
YAKUBOVICH, WILEY
Course Objectives
Recognize the need for, and the importance of, the fundamental concepts and
principles of Big Data
Course Prerequisite
Prerequisite:
Java Programming / C++
Database basics
Volume (Scale)
Data Volume
44x increase from 2009 to 2020
From 0.8 zettabytes (ZB) to 35 ZB
The volume of collected and generated data is increasing exponentially
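The growth figures above can be checked with a few lines of arithmetic. The sketch below assumes constant compound growth between 2009 and 2020, which the slide does not state; the function names are hypothetical and used only for illustration.

```python
def projected_volume(start_zb: float, factor: float) -> float:
    """Volume after applying an overall growth factor."""
    return start_zb * factor


def implied_annual_rate(factor: float, years: int) -> float:
    """Yearly growth rate that compounds to `factor` over `years`.

    Assumption: growth is the same every year (not stated on the slide).
    """
    return factor ** (1 / years) - 1


START_ZB = 0.8          # 2009 volume from the slide, in zettabytes
FACTOR = 44             # overall increase from the slide
YEARS = 2020 - 2009     # 11 years

end_zb = projected_volume(START_ZB, FACTOR)
rate = implied_annual_rate(FACTOR, YEARS)

print(f"Projected 2020 volume: {end_zb:.1f} ZB")
print(f"Implied annual growth: {rate:.0%}")
```

Under that assumption, 0.8 ZB x 44 gives about 35 ZB, and the implied growth works out to roughly 41% per year.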
12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
30 billion RFID tags today (1.3B in 2005)
4.6 billion camera phones worldwide
100s of millions of GPS-enabled devices sold annually
2+ billion people on the Web by end 2011
76 million smart meters in 2009, 200M by 2014
CERN's Large Hadron Collider (LHC) generates 15 PB a year
Maximilien Brice, CERN
Variety (Complexity)
Relational Data (Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
Social Network, Semantic Web (RDF)
Streaming Data
You can only scan the data once
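The single-scan constraint means a streaming computation has to keep a small running summary instead of storing the data and rereading it. As a minimal sketch, the code below uses Welford's one-pass method to track count, mean, and population variance. The choice of algorithm and the `sensor_readings` stream are illustrative assumptions, not part of the lecture.

```python
from typing import Iterable, Iterator, Tuple


def running_stats(stream: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, population variance) from one pass over the stream."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


def sensor_readings() -> Iterator[float]:
    """Hypothetical stream: a generator can be consumed only once."""
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield value


count, mean, variance = running_stats(sensor_readings())
print(count, mean, variance)
```

Each value is seen once and then discarded, so memory use stays constant however long the stream runs. For the sample values the mean is 5 and the population variance is 4, up to floating-point rounding.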
[Figure: sources of data about a customer: Social Media, Banking/Finance, Our Known History, Gaming, Entertain, Purchase]
Velocity (Speed)
Real-time/Fast Data
Mobile devices
(tracking all objects all the time)
Old Model: A few companies generate data; all others consume it
New Model: All of us generate data, and all of us consume it