227C4A Data Science
Year: II Semester: IV
Data Science Essentials 227C4A
Credits: 5 Lecture Hours: 4 per week
Units Contents
Unit I: Exploring different types of data and their characteristics - Techniques for assessing
data quality and reliability - Exploring methods for sourcing and gathering data -
Understanding the concept, sources, and characteristics of Big Data.
Unit II: Building on Python basics with advanced concepts such as list
comprehensions, lambda functions, decorators, and iterators - Understanding the usage
and creation of Python modules and packages for efficient data analysis - Techniques
for handling errors and debugging Python code in the context of data science
applications - Exploring methods for reading and writing data in various file
formats.
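As an illustration of the Unit II topics, here is a minimal sketch that combines a list comprehension, a lambda function, a decorator, an iterator, and basic error handling. The names `logged`, `safe_float`, and `Countdown` and the sample values are hypothetical, chosen only for this example; they are not taken from the prescribed textbooks.

```python
from functools import wraps


def logged(func):
    """Decorator (hypothetical): record every call's argument and result."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls.append((args, result))
        return result
    wrapper.calls = []
    return wrapper


@logged
def safe_float(text):
    """Convert text to float; return None instead of raising on bad input."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


class Countdown:
    """Iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


raw = ["3.5", "oops", "10", None]                # sample input, made up
parsed = [safe_float(item) for item in raw]      # list comprehension
valid = [x for x in parsed if x is not None]     # filter out failed conversions
square = lambda x: x * x                         # lambda function
squares = [square(x) for x in valid]
countdown = list(Countdown(3))                   # consume the iterator
```

Catching only `TypeError` and `ValueError` keeps unrelated bugs visible, which tends to make debugging easier than a bare `except`.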
Unit III: Utilizing lists for managing CSV file data efficiently, including nested lists for
structured data representation - Leveraging tuples for immutable storage of CSV data
and handling specific data segments effectively - Exploring array implementations to
perform numerical computations and analysis on CSV data - Advanced dictionary
techniques for storing, indexing, and transforming CSV data efficiently - Using sets to
maintain data integrity, handle duplicates, and perform unique value operations when
working with CSV files - Techniques for parsing CSV data, manipulating strings, and
using regular expressions for pattern matching during data extraction and cleaning.
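The Unit III topics can be sketched with the standard library alone. The example below is a rough illustration, assuming a small in-memory CSV with made-up columns (`name`, `email`, `score`) and a deliberately simple email pattern; real data would normally come from a file and need stricter validation.

```python
import csv
import io
import re
from array import array

# Hypothetical sample data; row 3 duplicates row 1, row 2 has a bad email.
CSV_TEXT = """name,email,score
Asha,asha@example.com,82
Ravi,ravi@example,75
Asha,asha@example.com,82
Meena,meena@example.org,91
"""

reader = csv.reader(io.StringIO(CSV_TEXT))
header = next(reader)                 # first row holds the column names
rows = [row for row in reader]        # nested list: one inner list per row

# Tuples are immutable and hashable, so rows can be stored in a set.
records = [tuple(row) for row in rows]
unique_records = sorted(set(records))  # the set drops the duplicate row

# Dictionary comprehension: index scores by name for fast lookup.
by_name = {name: int(score) for name, _email, score in unique_records}

# Typed array for compact numeric storage and simple computations.
scores = array("i", by_name.values())
mean_score = sum(scores) / len(scores)

# Regular expression: a simplified email check used for data cleaning.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")
valid_emails = [email for _name, email, _score in unique_records
                if EMAIL_RE.match(email)]
```

Converting each row to a tuple before building the set is the key step: lists are unhashable, so they cannot be set members.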
Unit IV: Leveraging mathematical and statistical functions for data analysis - ndarrays and
Array Operations: Understanding NumPy's powerful array operations - Pandas
Features: Introduction to Series and DataFrames for efficient data manipulation - Data
Manipulation with Pandas: Techniques for creating, transforming, and manipulating
DataFrames.
UNIVERSITY OF MADRAS
B.Sc. DEGREE PROGRAMME IN COMPUTER SCIENCE WITH
DATA SCIENCE
SYLLABUS WITH EFFECT FROM 2023-2024
Unit V: Matplotlib for Plotting: Creating different types of plots and customizing their
attributes - Seaborn for Statistical Visualization: Utilizing Seaborn for advanced statistical
plotting - Interactive Visualization: Introduction to Plotly and Dash for interactive data
visualization.
Learning Resources:
TEXT BOOKS:
Joel Grus, “Data Science from Scratch”, O’Reilly, 2015
Mark Lutz, “Programming Python”, O’Reilly, 2010
REFERENCES:
Wes McKinney, “Python for Data Analysis”, O’Reilly, 2012
Shai Vaingast, “Beginning Python Visualization”, Apress, 2014
WEB REFERENCES:
NPTEL online course: Data Science for Engineers - https://nptel.ac.in/courses/106106179/