
Intro To Data Science - Week 10 - LAQ's

Apache Cassandra is an open-source, distributed NoSQL database management system known for its high availability, scalability, and performance, making it suitable for applications requiring continuous uptime. It features a decentralized architecture, write optimization, and tunable consistency, allowing it to handle large volumes of data effectively. Common use cases include real-time analytics, IoT data storage, and social media platforms, though it has limitations in complex queries and requires careful management.

Uploaded by

keerthana5958v


Cassandra

Apache Cassandra is an open-source, distributed NoSQL database management system
designed to handle large amounts of data across many commodity servers without any
single point of failure. It is known for its high availability, scalability, and performance,
making it well suited to applications that require continuous uptime and must handle huge
volumes of data.

Key Features of Cassandra:

1. Decentralized Architecture:

Cassandra follows a peer-to-peer architecture where all nodes are equal, meaning there is
no master-slave relationship.

Every node in the cluster can handle both read and write requests, ensuring that no single
point of failure exists.

2. Scalability:

Cassandra is designed to scale horizontally by adding more nodes to the cluster. This
allows the database to handle increasing amounts of data and traffic without performance
degradation.

It supports linear scalability, meaning throughput increases roughly in proportion to the
number of nodes added.

3. High Availability:

Cassandra favors availability and partition tolerance (AP in the CAP theorem) over strict
consistency, offering eventual consistency by default. This keeps Cassandra available even
in the event of node failures.

It replicates data across multiple nodes, and if one node goes down, the data is still
available on another replica.

4. Data Model:

Cassandra uses a wide-column store data model, which organizes data into tables with
rows and columns. Each row is identified by a primary key. Tables were called column
families in older versions, and the term is still common.

Rows in the same table need not populate every column, and new columns can be added to a
table later, offering flexible schema management.

5. Write Optimization:

Cassandra is optimized for write-heavy workloads. It uses a log-structured storage system,
meaning each write is first appended to a commit log and stored in memory. Periodically, data
is flushed to disk in the form of SSTables (Sorted String Tables).

This approach provides high throughput for writes, especially in systems with heavy insert
or update operations.

6. Replication and Fault Tolerance:

Cassandra allows configurable replication strategies. Data can be replicated across
multiple data centers for fault tolerance and disaster recovery.

The replication factor (how many copies of data are kept) can be set per keyspace (a
collection of tables).

7. Tunable Consistency:

Cassandra provides tunable consistency levels, allowing the user to trade consistency
against latency based on the use case. The level is chosen per read or write operation,
for example:

ONE: Only one replica needs to acknowledge the operation.

QUORUM: A majority of replicas need to acknowledge the operation.

ALL: All replicas must acknowledge the operation.

This flexibility lets applications decide the tradeoff between speed and consistency.
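
The tradeoff between these levels can be made concrete with a little arithmetic. The sketch below is only an illustration, not Cassandra code: the function names `required_acks` and `reads_see_latest_write` are made up for this example. It computes how many replicas must respond at each level and applies the common rule of thumb that reads are guaranteed to overlap the latest write when the read count plus the write count exceeds the replication factor.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Replica responses needed at a consistency level (hypothetical helper)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # strict majority of replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level: {level}")


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
        overlap = reads_see_latest_write(write_level, read_level, 3)
        print(write_level, read_level, overlap)
```

With a replication factor of 3, writing and reading at QUORUM needs two replicas each, so the two sets always share at least one replica, whereas ONE/ONE is faster but may return stale data.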

8. CQL (Cassandra Query Language):

Cassandra uses CQL, which is similar to SQL but designed for the NoSQL model. CQL is
used to interact with the database, define schemas, insert data, and query data.

9. Secondary Indexes:

Cassandra supports secondary indexes on columns, though they should be used
cautiously due to performance trade-offs. Secondary indexes allow queries to be
performed on columns that are not part of the primary key.

10. Compaction:

Over time, data in Cassandra undergoes compaction, which is the process of merging
SSTables and removing deleted or obsolete data. This helps manage disk space and
optimize read performance.
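
As a rough sketch of the idea (not Cassandra's actual implementation), compaction can be pictured as merging several sorted tables, keeping only the newest version of each key and discarding entries marked as deleted. The `compact` function, the tuple layout, and the use of `None` as a deletion marker are all assumptions made for this example. Real Cassandra also keeps deletion markers (tombstones) for a grace period before purging them, which this sketch ignores.

```python
TOMBSTONE = None  # assumption for this sketch: a None value marks a deleted key


def compact(sstables):
    """Merge toy SSTables into one.

    Each SSTable is a list of (key, timestamp, value) tuples. The newest
    timestamp wins for every key, and keys whose newest value is a
    tombstone are dropped from the output.
    """
    latest = {}
    for table in sstables:
        for key, timestamp, value in table:
            if key not in latest or timestamp > latest[key][0]:
                latest[key] = (timestamp, value)
    return [
        (key, timestamp, value)
        for key, (timestamp, value) in sorted(latest.items())
        if value is not TOMBSTONE
    ]


if __name__ == "__main__":
    older = [("a", 1, "x"), ("b", 1, "y")]
    newer = [("a", 2, "z"), ("b", 3, TOMBSTONE), ("c", 2, "w")]
    print(compact([older, newer]))
```

Here key `a` keeps its newer value, key `b` disappears because its latest entry is a deletion, and key `c` is carried over unchanged.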

How Cassandra Works:

1. Write Process:

When a write request is received, it is first written to the commit log for durability. The data
is then stored in memory (in a structure called Memtable) and is eventually flushed to disk
as SSTables.

Data is replicated according to the replication factor defined, ensuring high availability.
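
The write path on a single node can be sketched as follows. This is a toy model under stated assumptions, not Cassandra's storage engine: the class `ToyNode`, its `flush_threshold` parameter, and the choice to clear the commit log after a flush are all hypothetical simplifications, and replication to other nodes is left out entirely.

```python
class ToyNode:
    """Toy single-node write path: commit log, then Memtable, then SSTable."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only record kept for durability
        self.memtable = {}     # in-memory view of recent writes
        self.sstables = []     # immutable sorted tables, newest last
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log the write first
        self.memtable[key] = value            # 2. then apply it in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                      # 3. flush once the Memtable is full

    def flush(self):
        if not self.memtable:
            return
        # An SSTable is written sorted by key and never modified afterwards.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable first
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None


if __name__ == "__main__":
    node = ToyNode(flush_threshold=2)
    node.write("k1", "a")
    node.write("k2", "b")   # reaches the threshold and triggers a flush
    node.write("k1", "c")   # newer value lives in the Memtable
    print(node.read("k1"), node.read("k2"), node.sstables)
```

Because every write is an append to the log plus an in-memory update, no disk seek is needed on the write path, which is the intuition behind the high write throughput described above.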

2. Read Process:

When a read request is received, Cassandra first checks the Memtable, then the
SSTables on disk. Bloom filters help it quickly skip SSTables that cannot contain the
requested data. Results from the Memtable and SSTables are merged so that the newest
value wins, and when several replicas are queried their responses are reconciled so that
the most recent data is returned.
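
To show why a Bloom filter lets a read skip an SSTable, here is a minimal sketch. It is not Cassandra's implementation: the class name, the default sizes, and the use of SHA-256 as the hash are assumptions chosen so the example runs with the standard library only. The key property is that a negative answer is always correct, while a positive answer only means "possibly present".

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch (hypothetical sizes and hash choice)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> position) & 1 for position in self._positions(key))


if __name__ == "__main__":
    sstable_filter = BloomFilter()
    sstable_filter.add("user:1")
    print(sstable_filter.might_contain("user:1"))
```

A read would consult the filter kept for each SSTable and open the file only when `might_contain` returns True, so most SSTables that lack the key are never touched.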

3. Data Replication:

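Cassandra uses the Gossip protocol to exchange state information between nodes and monitor
the health of the cluster. Gossip does not carry the data itself: writes are sent to the
replicas that own the partition, and mechanisms such as hinted handoff, read repair, and
anti-entropy repair bring replicas that missed an update back in sync.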

4. Partitioning:

Data is partitioned across nodes using a partition key. Cassandra uses a consistent
hashing algorithm to determine which node stores a given piece of data. This spreads
data evenly across the cluster and avoids “hot spots,” provided the partition key has
enough distinct values.
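
The partitioning step can be sketched with a small consistent-hashing ring. This is an illustration under assumptions rather than Cassandra's partitioner: the `Ring` class, the `token` helper, the `vnodes` parameter, and the use of SHA-256 in place of Cassandra's own hash function are all choices made for this example. Each node owns several positions on the ring, and the replicas for a key are the first distinct nodes found walking clockwise from the key's token.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (SHA-256 is a stand-in hash)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class Ring:
    """Toy consistent-hashing ring with several virtual positions per node."""

    def __init__(self, nodes, vnodes=8):
        self._ring = sorted(
            (token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [t for t, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, partition_key, replication_factor=3):
        """Return the distinct nodes that store this partition key."""
        if replication_factor < 1:
            raise ValueError("replication_factor must be at least 1")
        wanted = min(replication_factor, self._node_count)
        start = bisect.bisect_right(self._tokens, token(partition_key))
        owners = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == wanted:
                    break
        return owners


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"])
    print(ring.replicas("user:42", replication_factor=3))
```

Adding or removing a node only changes ownership of the ring segments next to that node's positions, which is why consistent hashing limits how much data has to move when the cluster is resized.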

Use Cases:

Real-Time Analytics: Due to its fast write and read capabilities, Cassandra is ideal for
applications that require real-time analytics on massive datasets.

IoT and Time-Series Data: Cassandra is commonly used to store time-series data, as it is
capable of handling high write throughput and large volumes of data from IoT devices.

Social Media and Messaging: Cassandra is well-suited for systems that need to handle
high velocity and volume of messages, such as social media platforms.

Advantages:

High availability and fault tolerance.

Scalable architecture that handles large amounts of data.

Optimized for write-heavy workloads.

Flexible data model that allows dynamic schema changes.

Disadvantages:

Limited support for complex queries: there are no JOINs and only limited aggregations,
making it less suited for traditional relational use cases.

Tunable consistency can lead to challenges in ensuring data consistency across large
clusters.

Requires careful configuration and management for optimal performance, especially as
the dataset grows.

In summary, Apache Cassandra is a highly scalable, distributed database system that
excels in scenarios requiring high availability, fault tolerance, and large-scale data
processing. It is often chosen for applications that need to handle massive amounts of
data with low-latency reads and writes.
