Module 1: Core Python for Data Science (Weeks 1-2)
1.1 Python Fundamentals for Data Analysis
Essential Data Structures:
- Lists: Mutable, ordered sequences. Example: example = [1, 2, 3]
- Tuples: Immutable sequences. Example: example = (1, 2, 3)
- Dictionaries: Key-value pairs. Example: example = {'name': 'Komal', 'age': 24}
- Sets: Unordered collections of unique elements. Example: example = {1, 2, 3}
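The four structures above differ mainly in mutability and uniqueness. A minimal sketch of those differences follows; the variable names and sample values are illustrative assumptions, not part of the course material.

```python
# Illustrative values only; names such as `scores` and `tags` are hypothetical.
scores = [88, 92, 75]        # list: mutable and ordered
scores.append(60)            # grows in place

point = (3, 4)               # tuple: immutable
try:
    point[0] = 10            # item assignment is not supported on tuples
except TypeError:
    tuple_is_immutable = True

person = {'name': 'Komal', 'age': 24}
person['age'] = 25           # dict: values are looked up and updated by key

tags = {1, 2, 2, 3}          # set: the duplicate 2 is stored only once
```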
List/Dict Comprehensions:
squares = [x**2 for x in range(5)]
even_dict = {x: x%2 == 0 for x in range(5)}
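Comprehensions can also filter items and build sets. The sketch below extends the two examples above; the names and the word list are made up for illustration.

```python
# List comprehension with a filter: squares of even numbers only.
even_squares = [x**2 for x in range(10) if x % 2 == 0]

# Dict comprehension with a conditional expression as the value.
parity = {x: 'even' if x % 2 == 0 else 'odd' for x in range(4)}

# Set comprehension: duplicate results collapse automatically.
unique_lengths = {len(word) for word in ['numpy', 'pandas', 'scipy']}
```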
Generators:
gen = (x**2 for x in range(1000000))
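A generator computes each value only when it is requested. The following sketch compares a generator with an equivalent list; `squares_up_to` is a hypothetical helper written for this example, and exact byte counts depend on the Python build.

```python
import sys

def squares_up_to(n):
    """Yield the squares of 0..n-1 one at a time."""
    for x in range(n):
        yield x**2

gen = squares_up_to(1_000_000)
first_three = [next(gen) for _ in range(3)]   # pulls only three values

# The list holds every element; the generator object stores only its state.
as_list = [x**2 for x in range(1_000_000)]
gen_expr = (x**2 for x in range(1_000_000))
list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(gen_expr)

# Generators can feed aggregations directly without an intermediate list.
total = sum(x**2 for x in range(10))
```

A generator can be consumed only once; create a new one to iterate again.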
Classes and Objects:
class Person:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return f"Person({self.name})"
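To show how the class is used, here is a small sketch that creates an instance and prints it. The `__repr__` method is an addition for illustration and is not part of the original class.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Used by print() and str()
        return f"Person({self.name})"

    def __repr__(self):
        # Illustrative addition: used by the REPL and inside containers
        return f"Person(name={self.name!r})"

p = Person("Komal")
as_text = str(p)
as_repr = repr(p)
```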
Functional Programming:
lambda: anonymous, single-expression functions
map(), filter(), reduce(): transform, select, and aggregate items; often chained in pipelines
from functools import reduce
result = reduce(lambda x, y: x+y, [1, 2, 3, 4])
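The three tools can be chained into one pipeline. This is a sketch with made-up input values; the comprehension at the end is shown as a common alternative, not as the required style.

```python
from functools import reduce

values = [1, 2, 3, 4, 5, 6]

evens = filter(lambda x: x % 2 == 0, values)        # keep 2, 4, 6
squared = map(lambda x: x**2, evens)                # 4, 16, 36
total = reduce(lambda acc, x: acc + x, squared, 0)  # fold into one number

# Equivalent generator expression, often considered easier to read.
same_total = sum(x**2 for x in values if x % 2 == 0)
```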
Error Handling:
try:
    risky_code()
except ValueError:
    print("Handled ValueError")
finally:
    print("Cleanup done")
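The same pattern works with custom exceptions. In this sketch `InvalidAgeError`, `parse_age`, and `safe_parse_age` are hypothetical names invented for the example; the `events` list stands in for real handling so the control flow is visible.

```python
class InvalidAgeError(ValueError):
    """Hypothetical custom exception for out-of-range ages."""

def parse_age(text):
    age = int(text)                      # raises ValueError for non-numeric text
    if age < 0:
        raise InvalidAgeError(f"age must be non-negative, got {age}")
    return age

events = []

def safe_parse_age(text):
    try:
        return parse_age(text)
    except InvalidAgeError as exc:       # most specific handler first
        events.append(f"invalid: {exc}")
    except ValueError:
        events.append("not a number")
    finally:
        events.append("done")            # runs on success and on failure
    return None

results = [safe_parse_age(t) for t in ["24", "-1", "abc"]]
```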
Logging:
import logging
logging.basicConfig(level=logging.INFO)
logging.info("Logging an event")
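Larger programs usually log through a named logger with an explicit level and format. A minimal sketch, assuming a logger called "pipeline" (a made-up name) and an in-memory stream so the output can be inspected:

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("pipeline")   # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False                 # keep records out of the root logger

logger.debug("row-level detail")         # below INFO, so it is dropped
logger.info("loaded %d rows", 100)
logger.warning("missing values found")

captured = stream.getvalue()
```

In a real script the handler would typically write to the console or a file instead of a StringIO buffer.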
1.2 Scientific Computing & Data Manipulation
NumPy:
import numpy as np
a = np.array([1, 2, 3])
a + 10 # [11 12 13]
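The addition above is an element-wise (vectorized) operation. The sketch below shows a few related operations on small made-up arrays: broadcasting a row across a matrix, an axis reduction, and boolean masking.

```python
import numpy as np

a = np.array([1, 2, 3])
shifted = a + 10                     # the scalar is applied to every element

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row_added = matrix + a               # `a` is broadcast across each row
col_means = matrix.mean(axis=0)      # mean of each column

mask = a > 1                         # boolean array
selected = a[mask]                   # keeps elements where mask is True
```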
Pandas:
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df.groupby('A').sum()
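Grouping is easier to see when a key repeats. A small sketch with hypothetical `team` and `points` columns:

```python
import pandas as pd

# Hypothetical data: team 'x' appears twice, so its rows are combined.
scores = pd.DataFrame({
    'team': ['x', 'x', 'y'],
    'points': [1, 2, 5],
})

totals = scores.groupby('team')['points'].sum()
summary = scores.groupby('team')['points'].agg(['sum', 'mean'])
```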
Data Cleaning:
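df.dropna()   # returns a copy without rows that contain missing values
df.fillna(0)  # returns a copy with missing values replaced by 0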
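A short cleaning sketch on a made-up DataFrame with missing values; the column contents and fill values are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    'A': [1.0, np.nan, 3.0],
    'B': ['4', '5', None],       # numbers stored as text, one value missing
})

dropped = raw.dropna()                        # keeps only fully populated rows
filled = raw.fillna({'A': 0.0, 'B': '0'})     # per-column fill values
filled['B'] = filled['B'].astype(int)         # convert text to integers
```

Both `dropna()` and `fillna()` return new objects here, so `raw` is left unchanged.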
1.3 Data Storage & Retrieval
SQL:
SELECT name, salary FROM employees WHERE salary > 50000;
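The query can be tried from Python with the standard-library sqlite3 module. This is a sketch against an in-memory database; the table contents are invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("Alice", 72000), ("Bob", 48000), ("Chen", 51000)],   # sample rows
)

# The ? placeholder passes the threshold safely as a parameter.
rows = conn.execute(
    "SELECT name, salary FROM employees WHERE salary > ? ORDER BY salary DESC",
    (50000,),
).fetchall()
conn.close()
```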
NoSQL (MongoDB):
{
    "name": "Alice",
    "skills": ["Python", "SQL"]
}
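Documents like the one above map naturally to Python dictionaries. The sketch below uses plain Python and the json module rather than a MongoDB driver, so no server is assumed; the second document and its `team` field are hypothetical.

```python
import json

# Documents in one collection do not need identical fields.
documents = [
    {"name": "Alice", "skills": ["Python", "SQL"]},
    {"name": "Bob", "skills": ["Excel"], "team": "Finance"},   # hypothetical
]

# Plain-Python stand-in for "find documents whose skills include Python".
python_users = [d["name"] for d in documents if "Python" in d["skills"]]

# Nested lists survive a round trip through JSON text.
encoded = json.dumps(documents[0])
decoded = json.loads(encoded)
```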
Interview Questions & Answers - Module 1
1. Difference between a list and a tuple?
Lists are mutable; tuples are immutable, so a tuple of hashable items can be used as a dictionary key.
2. What is a generator?
A generator produces values lazily, one at a time (via yield or a generator expression), so the full sequence never has to sit in memory.
3. Explain map(), filter(), reduce().
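map() applies a function to every item, filter() keeps the items for which a function returns true, and reduce() (from functools) folds a sequence into a single value.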
4. What are comprehensions?
Concise syntax for building lists, dictionaries, or sets from an iterable, with an optional filter condition.
5. Error handling in Python?
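Use try-except-finally blocks: except handles specific exceptions, and finally runs whether or not one occurred. You can also raise custom exceptions by subclassing Exception.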
6. Why use logging?
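To record events with severity levels (DEBUG, INFO, WARNING, ERROR, CRITICAL), which makes applications easier to monitor and debug than print() calls.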
7. Why is NumPy faster?
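NumPy stores elements of one type in contiguous memory and runs vectorized operations in compiled C loops, avoiding per-element Python interpreter overhead.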
8. Common Pandas cleaning methods?
dropna(), fillna(), astype()
9. SQL vs NoSQL?
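SQL databases store data in tables with a defined schema and are queried with SQL; NoSQL databases such as MongoDB use flexible schemas and suit semi-structured or unstructured data.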
10. What are window functions?
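SQL functions such as RANK() or ROW_NUMBER() that compute a value for each row over a related set of rows (the window) defined by an OVER clause, without collapsing rows the way GROUP BY does.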
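A window function can be tried with sqlite3 as well. This sketch assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the `dept` column and all rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Alice", "eng", 90000), ("Bob", "eng", 70000),
     ("Chen", "ops", 60000), ("Dina", "ops", 60000)],   # sample rows
)

# Rank salaries within each department; every input row stays in the output.
rows = conn.execute("""
    SELECT name, dept, salary,
           RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dept_rank
    FROM employees
    ORDER BY dept, dept_rank, name
""").fetchall()
conn.close()
```

Tied salaries receive the same rank with RANK(); ROW_NUMBER() would instead number them consecutively.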