Python DataScience Theory and Codes

Uploaded by

gobikaa.om

Data Science Python Theory + Code Notes

8 Mark Questions and Answers

1. How to handle missing values?

- Mean Imputation: Replace missing values with the mean of the column.

- Dropping Rows: Remove rows that contain any missing value.
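The two strategies above can be sketched with pandas as follows. The DataFrame and its column names (`age`, `city`) are made up for illustration and are not part of the original notes.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each column has one missing entry
df = pd.DataFrame({"age": [20.0, np.nan, 40.0], "city": ["A", "B", None]})

# Mean imputation: fill the gap in 'age' with the mean of the observed values
imputed = df.copy()
imputed["age"] = imputed["age"].fillna(imputed["age"].mean())

# Dropping rows: keep only rows with no missing value in any column
dropped = df.dropna()

print(imputed)
print(dropped)
```

Mean imputation keeps every row but only makes sense for numeric columns; dropping rows is simpler but discards data, here two of the three rows.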

2. Python code for log transformation and z-score standardization:

import numpy as np

from sklearn.preprocessing import StandardScaler

data = np.array([1, 10, 100, 1000])

log_data = np.log(data)

scaler = StandardScaler()

standardized = scaler.fit_transform(log_data.reshape(-1, 1))
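As a cross-check, the same standardization can be written by hand with NumPy alone. This is a sketch that assumes the population standard deviation (ddof=0), which is what StandardScaler uses; the variable `z` is introduced here for illustration.

```python
import numpy as np

data = np.array([1, 10, 100, 1000])
log_data = np.log(data)  # natural log compresses the wide range of values

# z-score: subtract the mean, then divide by the population standard deviation
z = (log_data - log_data.mean()) / log_data.std()

print(z)
```

The result has mean 0 and standard deviation 1. Because changing the log base only rescales the values, np.log10 would give the same z-scores.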

3. Code for 2x2 subplot:

import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 2)

axs[0, 0].plot([1, 2], [3, 4])

axs[0, 1].bar([1, 2], [3, 4])

axs[1, 0].scatter([1, 2], [3, 4])

axs[1, 1].hist([1, 2, 2, 3])

plt.tight_layout()

plt.show()

4. Code for year vs sales (line) and year vs products (bar):

import matplotlib.pyplot as plt

year = [2020, 2021, 2022]

sales = [200, 250, 300]

products = [20, 30, 25]

plt.plot(year, sales, label='Sales')

plt.bar(year, products, alpha=0.5, label='Products')

plt.legend()

plt.show()

16 Mark Questions and Answers

1. 3D Plot in Python:

from mpl_toolkits.mplot3d import Axes3D  # only needed on Matplotlib older than 3.2 to register the '3d' projection

import matplotlib.pyplot as plt

import numpy as np

fig = plt.figure()

ax = fig.add_subplot(111, projection='3d')

x = np.linspace(-5, 5, 100)

y = np.linspace(-5, 5, 100)

X, Y = np.meshgrid(x, y)

Z = np.sin(np.sqrt(X**2 + Y**2))

ax.plot_surface(X, Y, Z, cmap='viridis')

plt.show()

2. Data cleaning & filtering code:

import pandas as pd

df = pd.DataFrame({'Name': ['Nina', ' Alex ', 'Nate', 'Sam'], 'Division': ['north', 'east', 'south', 'west']})

df['Name'] = df['Name'].str.strip()

starts_with_N = df[df['Name'].str.startswith('N')]

df['Division'] = df['Division'].str.upper()

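# Outlier removal using IQR ('some_column' is a placeholder for a numeric column; the DataFrame above has none, so substitute a real one)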

Q1 = df['some_column'].quantile(0.25)

Q3 = df['some_column'].quantile(0.75)

IQR = Q3 - Q1

df = df[(df['some_column'] >= Q1 - 1.5 * IQR) & (df['some_column'] <= Q3 + 1.5 * IQR)]
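Since the DataFrame above has no numeric column, here is a small self-contained sketch of the same IQR rule. The `value` column and its numbers are hypothetical, chosen so that 100 is the obvious outlier.

```python
import pandas as pd

# Hypothetical numeric column
df = pd.DataFrame({"value": [10, 12, 11, 13, 12, 100]})

q1 = df["value"].quantile(0.25)
q3 = df["value"].quantile(0.75)
iqr = q3 - q1

# Keep rows within 1.5 * IQR of the quartiles (bounds inclusive)
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
filtered = df[df["value"].between(lower, upper)]

print(filtered)
```

The multiplier 1.5 is the conventional choice; a larger value makes the filter more tolerant of extreme points.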

Blackboard Questions Code

1. y = x^2 from -10 to 10:

import matplotlib.pyplot as plt

x = list(range(-10, 11))

y = [i**2 for i in x]

plt.plot(x, y)

plt.title('y = x^2')

plt.grid()

plt.show()

2. Bar chart of subjects and scores:

import matplotlib.pyplot as plt

subjects = ['Math', 'English', 'History', 'Science']

scores = [90, 75, 88, 92]

plt.bar(subjects, scores)

plt.title('Scores by Subject')

plt.show()

3. Sine and Cosine curves with legend:

import matplotlib.pyplot as plt

import numpy as np

x = np.linspace(0, 2*np.pi, 100)

plt.plot(x, np.sin(x), label='Sine')

plt.plot(x, np.cos(x), label='Cosine')

plt.legend()

plt.grid()

plt.show()

4. Seaborn pairplot with Iris:

import matplotlib.pyplot as plt

import seaborn as sns

df = sns.load_dataset('iris')

sns.pairplot(df, hue='species')

plt.show()

5. Random scatter plot with numpy:

import matplotlib.pyplot as plt

import numpy as np

x = np.random.rand(50)

y = np.random.rand(50)

plt.scatter(x, y)

plt.title('Random Scatter Plot')

plt.show()

Basic Pandas Theory

- Series: 1D labeled array (like a column).

- DataFrame: 2D labeled data (like an Excel sheet).

- Read CSV: pd.read_csv('file.csv')

- Head/Tail: df.head(), df.tail()

- Selection: df['column'] (one column), df.iloc[0] (first row, by position), df.loc[0, 'col'] (single value, by row label and column name)

- Missing Values: df.dropna(), df.fillna(), df.isnull()

- Mean Imputation: df['col'] = df['col'].fillna(df['col'].mean())

- Grouping and Aggregation: df.groupby('col').mean(numeric_only=True), df['col'].sum()

- Text Ops: df['Name'].str.startswith('N'), df['Name'].str.strip()

- Outlier Removal: IQR method using quantile()

- Uppercase Transformation: df['Division'] = df['Division'].str.upper()

- Merge: pd.merge(df1, df2, on='col')

- Concatenate: pd.concat([df1, df2])
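A few of the operations listed above can be combined in one short sketch. The tables df1 and df2, their columns (`id`, `region`, `sales`) and their values are hypothetical and serve only to show how the calls fit together.

```python
import pandas as pd

# Hypothetical tables sharing an 'id' key
df1 = pd.DataFrame({"id": [1, 2, 3], "region": ["north", "south", "north"]})
df2 = pd.DataFrame({"id": [1, 2, 3], "sales": [100, 150, 200]})

# Merge: join columns from both tables on the shared key (inner join by default)
merged = pd.merge(df1, df2, on="id")

# Grouping: total sales per region
totals = merged.groupby("region")["sales"].sum()

# Concatenate: stack rows of two frames; ignore_index renumbers the rows
stacked = pd.concat([df1, df1], ignore_index=True)

print(totals)
print(stacked.shape)
```

Merge combines tables side by side using a key, while concat stacks them on top of each other, so the two are not interchangeable.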
