
4.ProgrammingModelsForBigData Altintas FINAL

The document discusses programming models for big data, emphasizing their importance in supporting operations like data splitting, fast access, and distributed computations. It outlines key requirements such as fault tolerance, scalability, and optimization for specific data types. The MapReduce model is highlighted as a significant abstraction for handling large data volumes and ensuring fault tolerance while enabling scalability.

Uploaded by

EL MALKI Nabil
Copyright: © All Rights Reserved

Programming Models for Big Data
After this video you will be able to:
• Explain the requirements of programming models for big data and why you should care about them
• Tell your friends how you can scale the speed of pasta sauce generation in your kitchen by applying big data programming models
Data-parallel scalability

[Figure: racks connected by a network. Data partitions 1 to 5 are spread across the compute nodes of the racks. A second panel shows the same layout with a "?" next to the data partitions.]
Programming Model = abstractions
Runtime Libraries Programming Languages

[Figure: data partitions 1 to 5 placed on the compute nodes of a rack.]
Programming Model for Big Data

Programmability on top of Distributed File Systems
Requirements for Big Data Programming Models
1. Support Big Data Operations

Split volumes of data

Access data fast

Distribute computations to nodes
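
The first requirement (split the data, then send the computation to where the partitions live) can be sketched in a few lines of Python. This is only an illustration, not the slides' own method: a thread pool stands in for the cluster's compute nodes, and the names `split_into_partitions`, `partial_sum` and `distributed_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Split a list into num_partitions contiguous chunks of near-equal size."""
    size, extra = divmod(len(records), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        # The first `extra` partitions each take one additional record.
        end = start + size + (1 if i < extra else 0)
        partitions.append(records[start:end])
        start = end
    return partitions


def partial_sum(partition):
    """Work done independently on one partition (stand-in for a node's task)."""
    return sum(partition)


def distributed_sum(records, num_workers=5):
    """Sum records by computing one partial result per partition in parallel."""
    partitions = split_into_partitions(records, num_workers)
    # Assumption: worker threads play the role of compute nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_sum, partitions))
    return sum(partials)


if __name__ == "__main__":
    print(split_into_partitions(list(range(7)), 3))
    print(distributed_sum(list(range(1, 101))))
```

Each partial result depends only on its own partition, which is what lets the work be spread over nodes without coordination until the final combine step.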


2. Handle Fault Tolerance

Replicate data partitions

Recover files when needed
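
A minimal sketch of the replication idea, assuming a simple round-robin placement policy (real distributed file systems use more elaborate, rack-aware policies). The node names and the functions `place_replicas` and `read_partition` are hypothetical.

```python
def place_replicas(num_partitions, nodes, replication=2):
    """Map each partition id to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        p: [nodes[(p + r) % len(nodes)] for r in range(replication)]
        for p in range(num_partitions)
    }


def read_partition(partition_id, placement, failed_nodes):
    """Return the first surviving node that holds a copy of the partition."""
    for node in placement[partition_id]:
        if node not in failed_nodes:
            return node
    raise RuntimeError(f"all replicas of partition {partition_id} are lost")


if __name__ == "__main__":
    placement = place_replicas(5, ["n1", "n2", "n3"], replication=2)
    print(placement)
    # Node n1 has failed: partition 0 is still readable from its second copy.
    print(read_partition(0, placement, failed_nodes={"n1"}))
```

With a replication factor of 2, any single node failure leaves every partition readable; data is lost only when all nodes holding a given partition fail together.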


3. Enable Adding More Racks

[Figure: data partitions 1 to 5 on compute nodes, with a second rack added next to the first; partition 3 also appears on the added nodes.]
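
To illustrate why adding racks helps, the sketch below reassigns the same five partitions over a larger set of nodes and compares the heaviest per-node load. The round-robin policy, the node names, and the function `assign_partitions` are assumptions made for illustration, not something the slides specify.

```python
def assign_partitions(partition_ids, nodes):
    """Assign partitions to nodes round-robin; returns {node: [partition ids]}."""
    assignment = {node: [] for node in nodes}
    for i, pid in enumerate(partition_ids):
        assignment[nodes[i % len(nodes)]].append(pid)
    return assignment


def max_load(assignment):
    """Largest number of partitions any single node has to process."""
    return max(len(pids) for pids in assignment.values())


if __name__ == "__main__":
    partitions = [1, 2, 3, 4, 5]
    one_rack = ["rack1-a", "rack1-b"]
    two_racks = one_rack + ["rack2-a", "rack2-b"]  # scale out: add a rack

    before = assign_partitions(partitions, one_rack)
    after = assign_partitions(partitions, two_racks)
    print(before, max_load(before))
    print(after, max_load(after))
```

The program itself does not change when the rack is added; only the list of nodes does, which is the property a scalable programming model is meant to provide.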
4. Optimized for specific data types

Document, Table, Graph, Key-value, Stream, Multimedia
Natural model for independent parallel tasks over multiple resources!
Coming over for dinner in half an hour…
Helpers!
MapReduce

A programming model for Big Data

Many implementations
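
As a rough illustration of the model, here is a single-process word count written in the MapReduce style: a map step emits key-value pairs, a shuffle step groups values by key, and a reduce step combines each group. Word count is a common teaching example and is not taken from these slides; the function names are hypothetical, and a real implementation would run the map and reduce tasks on many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big models for big data"]))
```

Because each call to `map_phase` sees only one line and each call to `reduce_phase` sees only one key, both steps can be run as independent parallel tasks.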
Programming Model = abstractions
Runtime Libraries Programming Languages

MapReduce

Support large data volumes

Provide fault tolerance

Enable scale out
