Practical 1 and 2-1

The document contains code snippets and output related to data analysis tasks using NumPy and Pandas. The tasks involve computing statistics of arrays, manipulating multi-dimensional arrays, handling missing values in data frames, sorting and filtering data frames. Correlation and covariance are also calculated between columns of a data frame.

Uploaded by

SURAJ BISWAS


31/01/2024

Practical → 1st (a)

Q1 Write programs in Python using NumPy library to do the following:
A. Compute the mean, standard deviation, and variance of a two-dimensional
random integer array along the second axis.
CODE:
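import numpy as np

# Two-dimensional array of random integers (shape and value range chosen for illustration)
x = np.random.randint(0, 100, size=(3, 4))
print("\nOriginal array:")
print(x)

# axis=1 is the second axis, so each statistic is computed once per row
r1 = np.mean(x, axis=1)
print("\nMean:", r1)
r1 = np.std(x, axis=1)
print("\nStd:", r1)
r1 = np.var(x, axis=1)
print("\nVariance:", r1)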
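
The statistics along the second axis can be cross-checked against their definitions. The following is a minimal sketch, assuming a 3 x 4 array of random integers in [0, 100); the shape, value range, and seed are illustrative choices and are not taken from the practical.

```python
import numpy as np

rng = np.random.default_rng(0)           # seed chosen only to make the run repeatable
x = rng.integers(0, 100, size=(3, 4))    # illustrative shape and value range

# axis=1 is the second axis: one result per row
row_mean = x.mean(axis=1)
row_var = x.var(axis=1)
row_std = x.std(axis=1)

# Cross-check against the definitions (population variance, ddof=0)
centered = x - x.mean(axis=1, keepdims=True)
manual_var = (centered ** 2).mean(axis=1)
manual_std = np.sqrt(manual_var)

print("Mean:", row_mean)
print("Variance:", row_var, "manual:", manual_var)
print("Std:", row_std, "manual:", manual_std)
```

`keepdims=True` keeps the row means as a column so that the subtraction broadcasts row by row.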

OUTPUT:
01/02/2024

Practical → 1st (b)

Q1 Write programs in Python using NumPy library to do the following:
B. Create a 2-dimensional array of size m x n integer elements, also print the shape,
type and data type of the array and then reshape it into an n x m array, where n
and m are user inputs given at the run time
CODE:
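import numpy as np

# m and n are read at run time, as the question requires
m = int(input("Enter number of rows (m): "))
n = int(input("Enter number of columns (n): "))
# Fill the m x n array with random integers (value range chosen for illustration)
x = np.random.randint(0, 100, size=(m, n))
print("First Array : ")
print(x)
print("Type of Array")
print(type(x))
print("Shape of Array")
print(x.shape)
print("Data type of Array")
print(x.dtype)
# Reshape the m x n array into an n x m array
reshaped2 = np.reshape(x, (n, m))
print("Second Reshaped Array : ")
print(reshaped2)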

OUTPUT:
21/02/2024

Practical → 1st (c)

Q1 Write programs in Python using NumPy library to do the following:
C. Test whether the elements of a given 1D array are zero, non-zero, and NaN.
Record the indices of these elements in three separate arrays.
CODE:
import numpy as np
arr = np.array([0, 1, 2, 0, np.nan, 5, 0])
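zero_indices = np.where(arr == 0)[0]
# NaN != 0 evaluates to True, so NaN is excluded explicitly from the non-zero group
non_zero_indices = np.where((arr != 0) & ~np.isnan(arr))[0]
nan_indices = np.where(np.isnan(arr))[0]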
print("Zero indices:", zero_indices)
print("Non-zero indices:", non_zero_indices)
print("NaN indices:", nan_indices)
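
NaN compares unequal to every value, including itself, so a plain `arr != 0` test also reports NaN positions. The sketch below builds three mutually exclusive boolean masks for the same array; treating NaN as a category separate from "non-zero" is an interpretation of the question, not something the question states.

```python
import numpy as np

arr = np.array([0, 1, 2, 0, np.nan, 5, 0])

is_nan = np.isnan(arr)
is_zero = arr == 0                   # NaN == 0 is False
is_non_zero = (arr != 0) & ~is_nan   # NaN != 0 is True, so NaN is removed explicitly

zero_idx = np.flatnonzero(is_zero)
non_zero_idx = np.flatnonzero(is_non_zero)
nan_idx = np.flatnonzero(is_nan)

print("Zero indices:", zero_idx)
print("Non-zero indices:", non_zero_idx)
print("NaN indices:", nan_idx)
```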

OUTPUT:
21/02/2024

Practical → 1st (d)

Q1 Write programs in Python using NumPy library to do the following:
D. Create three random arrays of the same size: Array1, Array2, and
Array3. Subtract Array 2 from Array3 and store in Array4. Create
another array Array5 having two times the values in Array1. Find
Covariance and Correlation of Array1 with Array4 and Array5
respectively
CODE:
import numpy as np

# Create three random arrays of the same size

size = 100
Array1 = np.random.rand(size)
Array2 = np.random.rand(size)
Array3 = np.random.rand(size)

# Subtract Array2 from Array3 and store in Array4

Array4 = Array3 - Array2

# Create Array5 having two times the values in Array1

Array5 = 2 * Array1

# Find Covariance of Array1 with Array4

covariance_1_4 = np.cov(Array1, Array4)[0][1]

# Find Correlation of Array1 with Array5

correlation_1_5 = np.corrcoef(Array1, Array5)[0][1]

print("Covariance of Array1 with Array4:", covariance_1_4)

print("Correlation of Array1 with Array5:", correlation_1_5)
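
The question can also be read as asking for both measures for both pairs. A minimal sketch under that reading follows; the seed and the lower-case variable names are illustrative and not part of the practical.

```python
import numpy as np

rng = np.random.default_rng(42)   # illustrative seed
size = 100
array1 = rng.random(size)
array2 = rng.random(size)
array3 = rng.random(size)
array4 = array3 - array2
array5 = 2 * array1

results = {}
for name, other in (("Array4", array4), ("Array5", array5)):
    cov = np.cov(array1, other)[0, 1]         # off-diagonal entry of the 2x2 covariance matrix
    corr = np.corrcoef(array1, other)[0, 1]   # off-diagonal entry of the 2x2 correlation matrix
    results[name] = (cov, corr)
    print(f"Array1 vs {name}: covariance={cov:.4f}, correlation={corr:.4f}")
```

Because Array5 is an exact multiple of Array1, their correlation is 1 and their covariance is twice the sample variance of Array1.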

OUTPUT:
21/02/2024

Practical → 1st (e)

Q1 Write programs in Python using NumPy library to do the following:
E. Create two random arrays of the same size 10: Array1, and Array2. Find the sum
of the first half of both the arrays and the product of the second half of both
arrays.
CODE:
import numpy as np

# Create two random arrays of the same size 10

size = 10
Array1 = np.random.rand(size)
Array2 = np.random.rand(size)

# Find the sum of the first half of both arrays

sum_first_half_Array1 = np.sum(Array1[:size//2])
sum_first_half_Array2 = np.sum(Array2[:size//2])

# Find the product of the second half of both arrays

product_second_half_Array1 = np.prod(Array1[size//2:])
product_second_half_Array2 = np.prod(Array2[size//2:])

print("Sum of the first half of Array1:", sum_first_half_Array1)

print("Sum of the first half of Array2:", sum_first_half_Array2)
print("Product of the second half of Array1:", product_second_half_Array1)
print("Product of the second half of Array2:", product_second_half_Array2)

OUTPUT:
28/02/2024

Practical → 2nd (a)

Q2 Do the following using PANDAS Series:
A. Create a series with 5 elements. Display the series sorted on index and also sorted
on values separately.
CODE:
import pandas as pd

# Create a series with 5 elements

series = pd.Series([10, 5, 8, 2, 7], index=['e', 'a', 'd', 'c', 'b'])

# Display the series sorted on index

sorted_by_index = series.sort_index()

# Display the series sorted on values

sorted_by_values = series.sort_values()

print("Series sorted on index:")

print(sorted_by_index)

print("\nSeries sorted on values:")

print(sorted_by_values)

OUTPUT:
28/02/2024

Practical → 2nd (b)

Q2 Do the following using PANDAS Series:
B. Create a series with N elements with some duplicate values. Find the minimum
and maximum ranks assigned to the values using the ‘first’ and ‘max’ methods.
CODE:
import pandas as pd
series = pd.Series([2, 4, 6, 2, 8, 6, 3, 7, 4, 5])
min_ranks_first = series.rank(method='first')
max_ranks_max = series.rank(method='max')

print("Series:")
print(series)
print("\nMinimum ranks (using 'first' method):")
print(min_ranks_first)
print("\nMaximum ranks (using 'max' method):")
print(max_ranks_max)
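
The rank methods differ only in how they treat ties. A small sketch on a shorter series (values chosen for illustration) puts 'first', 'min' and 'max' side by side: 'first' breaks ties by order of appearance, while 'min' and 'max' give every tied value the lowest or highest rank of its group.

```python
import pandas as pd

s = pd.Series([2, 4, 6, 2, 8, 6])

ranks = pd.DataFrame({
    "value": s,
    "first": s.rank(method="first"),  # ties ranked in order of appearance
    "min": s.rank(method="min"),      # ties share the lowest rank of the group
    "max": s.rank(method="max"),      # ties share the highest rank of the group
})
print(ranks)
```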

OUTPUT:
28/02/2024

Practical → 2nd (c)

Q2 Do the following using PANDAS Series:
C. Display the index value of the minimum and maximum element of a
Series.
CODE:
import pandas as pd

# Create a sample series

series = pd.Series([10, 5, 8, 2, 7])

# Find index value of the minimum element

min_index = series.idxmin()

# Find index value of the maximum element

max_index = series.idxmax()

print("Index value of the minimum element:", min_index)

print("Index value of the maximum element:", max_index)

OUTPUT:
06/03/2024

Practical → 3rd (a)

Q3 Create a data frame having at least 3 columns and 50 rows to store numeric data
generated using a random function. Replace 10% of the values by null values whose
index positions are generated using random function. Do the following:
a. Identify and count missing values in a data frame.
CODE:
import pandas as pd
import numpy as np
data = np.random.randn(50, 3)
# Replace 10% of all values (15 of 150) with NaN at randomly chosen positions
null_positions = np.random.choice(data.size, size=int(0.1 * data.size), replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
missing_values_count = df.isnull().sum()
print("Missing values count:")
print(missing_values_count)

OUTPUT:
06/03/2024

Practical → 3rd (b)

b. Drop the column having more than 5 null values.
CODE:
import pandas as pd
import numpy as np
data = np.random.randn(50, 3)
# Replace 10% of all values (15 of 150) with NaN at randomly chosen positions
null_positions = np.random.choice(data.size, size=int(0.1 * data.size), replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
df=df.dropna(axis=1, thresh=45)
print(df)
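
The `thresh` argument of `dropna` counts the non-null values a column must have to be kept, so with 50 rows `thresh=45` keeps columns with at most 5 nulls. The sketch below checks this on a small deterministic frame; the column names and the `max_nulls` variable are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": np.arange(50, dtype=float),
    "B": np.arange(50, dtype=float),
    "C": np.arange(50, dtype=float),
})
df.loc[:4, "A"] = np.nan   # rows 0..4 -> 5 nulls, column is kept
df.loc[:5, "B"] = np.nan   # rows 0..5 -> 6 nulls, column is dropped

max_nulls = 5
kept_thresh = df.dropna(axis=1, thresh=len(df) - max_nulls)
# Equivalent explicit form: count nulls per column and filter
kept_mask = df.loc[:, df.isnull().sum() <= max_nulls]

print(kept_thresh.columns.tolist())
print(kept_mask.columns.tolist())
```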

OUTPUT:
06/03/2024

Practical → 3rd (c)

c. Identify the row label having maximum of the sum of all values in a row and drop that
row.
CODE:
import pandas as pd
import numpy as np
data = np.random.randn(50, 3)
# Replace 10% of all values (15 of 150) with NaN at randomly chosen positions
null_positions = np.random.choice(data.size, size=int(0.1 * data.size), replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
row_sums = df.sum(axis=1)
max_row_label = row_sums.idxmax()
df = df.drop(index=max_row_label)
print(df)

OUTPUT:
06/03/2024

Practical → 3rd (d)

d. Sort the data frame on the basis of the first column.
CODE:
import pandas as pd
import numpy as np
data = np.random.randn(50, 3)
# Replace 10% of all values (15 of 150) with NaN at randomly chosen positions
null_positions = np.random.choice(data.size, size=int(0.1 * data.size), replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
df_sorted = df.sort_values(by='Column1')
print(df_sorted)

OUTPUT:
06/03/2024

Practical → 3rd (e)

e. Remove all duplicates from the first column.
CODE:
import pandas as pd
import numpy as np
data = np.random.randn(50, 3)
# Replace 10% of all values (15 of 150) with NaN at randomly chosen positions
null_positions = np.random.choice(data.size, size=int(0.1 * data.size), replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
df_unique=df.drop_duplicates(subset=['Column1'])
print(df_unique)

OUTPUT:
06/03/2024

Practical → 3rd (f)

f. Find the correlation between first and second column and covariance between second
and third column.
CODE:
import pandas as pd
import numpy as np
data = np.random.randn(50, 3)
# Replace 10% of all values (15 of 150) with NaN at randomly chosen positions
null_positions = np.random.choice(data.size, size=int(0.1 * data.size), replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
# Calculate correlation between first and second column
correlation_first_second = df['Column1'].corr(df['Column2'])

# Calculate covariance between second and third column

covariance_second_third = df['Column2'].cov(df['Column3'])

print("Correlation between first and second column:", correlation_first_second)

print("Covariance between second and third column:", covariance_second_third)
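
`Series.corr` and `Series.cov` skip rows where either value of the pair is missing. The following sketch, with an illustrative seed and hand-picked NaN positions, confirms this by comparing against NumPy on rows with both values present.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)   # illustrative seed
df = pd.DataFrame(rng.standard_normal((50, 3)),
                  columns=["Column1", "Column2", "Column3"])
df.loc[[3, 10], "Column1"] = np.nan
df.loc[[10, 20], "Column2"] = np.nan

corr_pandas = df["Column1"].corr(df["Column2"])
cov_pandas = df["Column2"].cov(df["Column3"])

# Manual equivalents: drop rows where either column of the pair is missing
pair12 = df[["Column1", "Column2"]].dropna()
pair23 = df[["Column2", "Column3"]].dropna()
corr_manual = np.corrcoef(pair12["Column1"], pair12["Column2"])[0, 1]
cov_manual = np.cov(pair23["Column2"], pair23["Column3"])[0, 1]

print("Correlation:", corr_pandas, corr_manual)
print("Covariance:", cov_pandas, cov_manual)
```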

OUTPUT:
06/03/2024

Practical → 3rd (g)

g. Discretize the second column and create 5 bins.
CODE:
import pandas as pd
import numpy as np
data = np.random.randn(50, 3)
# Replace 10% of all values (15 of 150) with NaN at randomly chosen positions
null_positions = np.random.choice(data.size, size=int(0.1 * data.size), replace=False)
data.flat[null_positions] = np.nan
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
df['Column2_bins'] = pd.cut(df['Column2'], bins=5)
print(df)
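
With an integer `bins` argument, `pd.cut` splits the observed range of the column into that many equal-width intervals, and missing values stay missing. A short sketch that counts the members of each bin is given below; the seed and the NaN positions are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)   # illustrative seed
col = pd.Series(rng.standard_normal(50), name="Column2")
col.iloc[[0, 7]] = np.nan        # two missing values for demonstration

binned = pd.cut(col, bins=5)     # five equal-width intervals over the observed range
counts = binned.value_counts(sort=False)

print(counts)
print("Values left unbinned (NaN):", binned.isna().sum())
```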

OUTPUT:
13/03/2024

Practical → 6th
Q Consider the following data frame containing a family name, gender of the family
member and her/his monthly income in each record.

CODE:
import pandas as pd

# Creating the DataFrame

data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}

df = pd.DataFrame(data)
print(df)
OUTPUT:
13/03/2024

Practical → 6th (a)

Q Write a program in Python using Pandas to perform the following:
a. Calculate and display familywise gross monthly income.
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
family_income = df.groupby('Name')['MonthlyIncome'].sum()
print("Familywise gross monthly income:")
print(family_income)
print()
OUTPUT:
13/03/2024

Practical → 6th (b)

Q Write a program in Python using Pandas to perform the following:
b. Calculate and display the member with the highest monthly income.
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
highest_income_member = df.loc[df['MonthlyIncome'].idxmax()]
print("Member with the highest monthly income:")
print(highest_income_member)
print()
OUTPUT:
13/03/2024

Practical → 6th (c)

Q Write a program in Python using Pandas to perform the following:
c. Calculate and display monthly income of all members with income greater than Rs.
60000.00.
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
high_income_members = df[df['MonthlyIncome'] > 60000.00]
print("Monthly income of members with income greater than Rs. 60000.00:")
print(high_income_members[['Name', 'MonthlyIncome']])
print()
OUTPUT:
13/03/2024

Practical → 6th (d)

Q Write a program in Python using Pandas to perform the following:
d. Calculate and display the average monthly income of the female members
CODE:
import pandas as pd
data = {
'Name': ['Shah', 'Vats', 'Vats', 'Kumar', 'Vats', 'Kumar', 'Shah', 'Shah', 'Kumar',
'Vats'],
'Gender': ['Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Female',
'Female', 'Male'],
'MonthlyIncome': [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
103000.00, 55000.00, 112400.00, 81030.00, 71900.00]
}
df = pd.DataFrame(data)
female_avg_income = df[df['Gender'] == 'Female']['MonthlyIncome'].mean()
print("Average monthly income of female members:", female_avg_income)
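
The same figure can be obtained for every gender at once with `groupby`. A minimal sketch on the same data (the `avg_by_gender` name is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Shah", "Vats", "Vats", "Kumar", "Vats",
             "Kumar", "Shah", "Shah", "Kumar", "Vats"],
    "Gender": ["Male", "Male", "Female", "Female", "Female",
               "Male", "Male", "Female", "Female", "Male"],
    "MonthlyIncome": [114000.00, 65000.00, 43150.00, 69500.00, 155000.00,
                      103000.00, 55000.00, 112400.00, 81030.00, 71900.00],
})

# One mean per gender; the result is a Series indexed by gender
avg_by_gender = df.groupby("Gender")["MonthlyIncome"].mean()
print(avg_by_gender)
```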

OUTPUT:
21/03/2024

Practical → 7th (a)

Q7. Using the Titanic dataset, do the following:
A. Find the total number of passengers with age less than 30.

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# a. Total number of passengers with age less than 30
passengers_under_30 = titanic_df[titanic_df['Age'] < 30]
total_passengers_under_30 = passengers_under_30.shape[0]
print("Total number of passengers with age less than 30:",
total_passengers_under_30)
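
If the Age column contains missing entries, the comparison `Age < 30` is False for them, so passengers of unknown age are left out of the count. The sketch below shows this on a tiny made-up frame; the values are hypothetical and are not taken from the Titanic file.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the Titanic data; the ages are made up for illustration
toy = pd.DataFrame({"Age": [22.0, 38.0, np.nan, 29.0, 30.0, np.nan, 4.0]})

under_30 = int((toy["Age"] < 30).sum())     # NaN < 30 is False, so NaN rows are not counted
unknown_age = int(toy["Age"].isna().sum())  # reported separately

print("Passengers with age less than 30:", under_30)
print("Passengers with unknown age:", unknown_age)
```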

OUTPUT:
21/03/2024

Practical → 7th (b)

Q7. Using the Titanic dataset, do the following:
B. Find total fare paid by passengers of first class.

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# b. Total fare paid by passengers of first class
first_class_fare = titanic_df[titanic_df['Pclass'] == 1]['Fare'].sum()
print("Total fare paid by passengers of first class:", first_class_fare)

OUTPUT:
21/03/2024

Practical → 7th (c)

Q7. Using the Titanic dataset, do the following:
C. Compare number of survivors of each passenger class

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# c. Number of survivors of each passenger class
survivors_by_class = titanic_df.groupby('Pclass')['Survived'].sum()
print("Number of survivors of each passenger class:")
print(survivors_by_class)

OUTPUT:
21/03/2024

Practical → 7th (d)

Q7. Using the Titanic dataset, do the following:
D. Compute descriptive statistics for any numeric attribute genderwise

CODE:
import pandas as pd
titanic_df = pd.read_csv("C:/Users/DELL/Downloads/titanic.csv")
# d. Descriptive statistics for age attribute genderwise
descriptive_stats_genderwise = titanic_df.groupby('Sex')['Age'].describe()
print("Descriptive statistics for age attribute genderwise:")
print(descriptive_stats_genderwise)

OUTPUT:
17/04/2024

Practical → 4th
Q4. Consider two Excel files containing the attendance of two workshops. Each file has three
fields, ‘Name’, ‘Date’, and ‘Duration’ (in minutes), where names are unique within a file. Note
that duration may take one of three values (30, 40, 50) only. Import the data into two
data frames.
CODE:
import pandas as pd

# Create dummy data for workshop1

workshop1_data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Date': ['2024-04-01', '2024-04-02', '2024-04-03', '2024-04-04'],
'Duration': [30, 40, 50, 30]
}

# Create dummy data for workshop2

workshop2_data = {
'Name': ['Alice', 'Eve', 'Charlie', 'Frank'],
'Date': ['2024-04-03', '2024-04-04', '2024-04-05', '2024-04-06'],
'Duration': [30, 40, 50, 30]
}

# Create data frames from the dummy data

df1 = pd.DataFrame(workshop1_data)
df2 = pd.DataFrame(workshop2_data)

# Display the first few rows of each data frame to verify the data
print("Data Frame 1:")
print(df1)

print("\nData Frame 2:")

print(df2)
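
The question states two constraints on each file: names are unique and durations are 30, 40 or 50. These can be checked once the frames exist. The sketch below uses a hypothetical helper, `validate_attendance`, on in-memory frames; with real Excel files the frames would typically come from `pd.read_excel`, but the file names are not given in the practical.

```python
import pandas as pd

ALLOWED_DURATIONS = {30, 40, 50}

def validate_attendance(df):
    """Return a list of problems found in an attendance frame (empty if valid)."""
    problems = []
    missing = {"Name", "Date", "Duration"} - set(df.columns)
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
        return problems
    if df["Name"].duplicated().any():
        problems.append("duplicate names")
    invalid = ~df["Duration"].isin(ALLOWED_DURATIONS)
    if invalid.any():
        values = sorted(df.loc[invalid, "Duration"].unique().tolist())
        problems.append(f"invalid durations: {values}")
    return problems

good_df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Date": ["2024-04-01", "2024-04-02", "2024-04-03", "2024-04-04"],
    "Duration": [30, 40, 50, 30],
})
# Deliberately broken frame: a repeated name and a duration outside the allowed set
bad_df = pd.DataFrame({
    "Name": ["Alice", "Alice"],
    "Date": ["2024-04-01", "2024-04-02"],
    "Duration": [30, 45],
})

print(validate_attendance(good_df))
print(validate_attendance(bad_df))
```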

OUTPUT:
Q. Import the data into two data frames and do the following:
a. Perform a merging of the two data frames to find the names of students
who had attended both workshops.
b. Find the names of all students who have attended a single workshop only.
c. Merge two data frames row-wise and find the total number of records in
the data frame.
d. Merge two data frames row-wise and use two columns viz. names and
dates as multi-row indexes. Generate descriptive statistics for this
hierarchical data frame.
CODE:
# a. Perform merging of the two data frames to find the names of students who had attended both workshops.
attended_both = pd.merge(df1, df2, how='inner', on='Name')
print("\nNames of students who attended both workshops:")
print(attended_both['Name'].unique())

# b. Find names of all students who have attended a single workshop only.
attended_either = pd.merge(df1, df2, how='outer', on='Name', indicator=True)
attended_single = attended_either[attended_either['_merge'].isin(['left_only', 'right_only'])]
print("\nNames of students who attended a single workshop only:")
print(attended_single['Name'].unique())

# c. Merge two data frames row-wise and find the total number of records in the data frame.
merged_df = pd.concat([df1, df2], ignore_index=True)
print("\nTotal number of records in the merged data frame:", len(merged_df))

# d. Merge two data frames row-wise and use two columns viz. names and dates as multi-row indexes.
# Generate descriptive statistics for this hierarchical data frame.
merged_df_multi_index = pd.concat([df1.set_index(['Name', 'Date']), df2.set_index(['Name',
'Date'])], axis=0)
print("\nDescriptive statistics for the hierarchical data frame:")
print(merged_df_multi_index.describe())
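
Because names are unique within each file, parts (a) and (b) can be cross-checked with plain set operations on the Name columns. A minimal sketch using the same dummy names follows; it is offered as a sanity check, not as a replacement for the merge-based solution.

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie", "David"]})
df2 = pd.DataFrame({"Name": ["Alice", "Eve", "Charlie", "Frank"]})

names1, names2 = set(df1["Name"]), set(df2["Name"])
both = sorted(names1 & names2)     # intersection: attended both workshops
single = sorted(names1 ^ names2)   # symmetric difference: attended exactly one

print("Attended both workshops:", both)
print("Attended a single workshop only:", single)
```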
OUTPUT:
24/04/2024

Practical → 5th (a)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
a. Plot bar chart to show the frequency of each class label in the data.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# a. Plot bar chart to show the frequency of each class label in the data.
plt.figure(figsize=(8, 6))
sns.countplot(x='species', data=iris_df)
plt.title('Frequency of each class label')
plt.xlabel('Species')
plt.ylabel('Frequency')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (b)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
b. Draw a scatter plot for Petal width vs sepal width and fit a regression line.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# b. Draw a scatter plot for Petal width vs sepal width and fit a regression line
plt.figure(figsize=(8, 6))
sns.regplot(x='petal_width', y='sepal_width', data=iris_df)
plt.title('Petal width vs Sepal width')
plt.xlabel('Petal width (cm)')
plt.ylabel('Sepal width (cm)')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (c)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
c. Plot density distribution for feature petal length.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# c. Plot density distribution for feature petal length.
plt.figure(figsize=(8, 6))
# fill=True replaces the deprecated shade=True argument
sns.kdeplot(iris_df['petal_length'], fill=True)
plt.title('Density distribution of Petal length')
plt.xlabel('Petal length (cm)')
plt.ylabel('Density')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (d)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
d. Use a pair plot to show pairwise bivariate distribution in the Iris Dataset.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# d. Use a pair plot to show pairwise bivariate distribution in the Iris Dataset.
# pairplot creates its own figure, so no separate plt.figure() call is needed
sns.pairplot(iris_df, hue='species')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (e)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
e. Draw heatmap for the four numeric attributes.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# Drop the 'species' column
numeric_df = iris_df.drop(columns='species')
# Draw heatmap for the four numeric attributes
plt.figure(figsize=(8, 6))
sns.heatmap(numeric_df.corr(), annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Heatmap for numeric attributes')
plt.show()

OUTPUT:
24/04/2024

Practical → 5th (f)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
f. Compute mean, mode, median, standard deviation, confidence interval and
standard error for each feature
CODE:
import pandas as pd
import seaborn as sns
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# Keep only the four numeric features
numeric_df = iris_df.select_dtypes(include=['number'])
# Compute standard error of the mean for each feature
standard_error = numeric_df.sem()
# Half-width of the 95% confidence interval (normal approximation, z = 1.96)
ci_half_width = 1.96 * standard_error
# Combine statistics into one table indexed by feature name
feature_stats = pd.DataFrame({
    'mean': numeric_df.mean(),
    'mode': numeric_df.mode().iloc[0],  # smallest mode when several values tie
    'median': numeric_df.median(),
    'std': numeric_df.std(),
    'standard error': standard_error,
    '95% CI (low)': numeric_df.mean() - ci_half_width,
    '95% CI (high)': numeric_df.mean() + ci_half_width,
})
print("\nFeature statistics:")
print(feature_stats)

OUTPUT:
24/04/2024

Practical → 5th (g)

Q5. Using Iris data, plot the following with proper legend and axis labels: (Download
IRIS data from: https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn
datasets)
g. Compute correlation coefficients between each pair of features and plot heatmap.
CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Iris dataset
iris_df = sns.load_dataset('iris')
# Exclude non-numeric column 'species'
numeric_columns = iris_df.select_dtypes(include=[float, int]).columns
iris_numeric_df = iris_df[numeric_columns]
# Compute correlation coefficients between each pair of features and plot heatmap
correlation_matrix = iris_numeric_df.corr()
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation heatmap')
plt.show()
OUTPUT:
