
Data_Science_Assignment_1_Answers

The document provides an overview of essential Python libraries for data science, specifically NumPy, SciPy, Pandas, and Matplotlib. It highlights the functionalities and applications of these libraries, including array manipulation, optimization, data handling, and visualization. Additionally, it includes code examples demonstrating how to use these libraries for various data science tasks.

Uploaded by

malisenrichard80
Copyright
© All Rights Reserved

Assignment 1: Python Libraries for Data Science

NumPy

1. What is NumPy, and why is it essential for numerical computations in Python?

NumPy (Numerical Python) is a powerful Python library used for numerical computing. It provides:

- A high-performance multidimensional array object (ndarray)

- Mathematical functions to operate on arrays

- Tools for integrating C/C++ and Fortran code

Importance:

- It supports vectorized operations, which run in compiled code and avoid slow Python-level loops.

- It forms the foundation for many other libraries like Pandas, SciPy, and scikit-learn.
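As a minimal sketch of what "vectorized" means here, the snippet below squares a small array twice, once with a Python loop and once with a single array expression. The variable names and sample values are illustrative assumptions, not part of the original answer.

```python
import numpy as np

values = np.arange(1, 6)  # array([1, 2, 3, 4, 5]), illustrative sample data

# Loop version: square each element one at a time in Python.
squares_loop = np.array([v ** 2 for v in values])

# Vectorized version: one expression applied to the whole array at once.
squares_vec = values ** 2

print(squares_vec)                                # [ 1  4  9 16 25]
print(np.array_equal(squares_loop, squares_vec))  # True
```

Both versions give the same result; on large arrays the vectorized form is typically much faster because the loop runs inside NumPy rather than in the interpreter.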

2. Create a 3x3 NumPy array filled with random integers between 1 and 10.

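import numpy as np

# Random integers from 1 to 10; the upper bound (11) is exclusive
array = np.random.randint(1, 11, (3, 3))

# Sum, mean, and standard deviation over all nine elements
array.sum(), array.mean(), array.std()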

3. Reshape the array into a 1x9 vector.

vector = array.reshape(1, 9)
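The sketch below shows the shape that reshape(1, 9) produces and contrasts it with a flat 1-D array. It uses a fixed 3x3 array built with np.arange as a stand-in for the random one, which is an assumption made so the output is reproducible.

```python
import numpy as np

# Fixed 3x3 stand-in for the random array (assumption, for reproducibility).
array = np.arange(1, 10).reshape(3, 3)

row_vector = array.reshape(1, 9)  # still 2-D: one row, nine columns
flat = array.reshape(-1)          # 1-D: nine elements, no row axis

print(row_vector.shape)  # (1, 9)
print(flat.shape)        # (9,)
```

Note that reshape(1, 9) keeps two dimensions; reshape(-1) is the usual choice when a true 1-D array is wanted.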

SciPy

1. Discuss the main modules of SciPy and their applications.

SciPy modules include:

- scipy.integrate: Integration routines

- scipy.optimize: Optimization algorithms


- scipy.linalg: Linear algebra operations

- scipy.signal: Signal processing

- scipy.stats: Statistical functions

- scipy.fft: Fast Fourier Transforms
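To make two of the modules listed above concrete, here is a small sketch that uses scipy.integrate.quad and scipy.linalg.solve. The integrand and the linear system are arbitrary examples chosen for illustration, not taken from the assignment.

```python
import numpy as np
from scipy import integrate, linalg

# scipy.integrate: numerically integrate x^2 from 0 to 3 (exact value is 9).
area, abs_error = integrate.quad(lambda x: x ** 2, 0, 3)

# scipy.linalg: solve the linear system A @ x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
solution = linalg.solve(A, b)

print(area)      # approximately 9.0
print(solution)  # approximately [0.8 1.4]
```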

2. Use scipy.optimize to find the minimum of f(x) = x² + 5x + 6.

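from scipy.optimize import minimize

# Objective function f(x) = x^2 + 5x + 6
f = lambda x: x**2 + 5*x + 6

# Start the search at x0 = 0; the minimizer is returned in result.x
result = minimize(f, x0=0)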
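The call above returns a result object rather than a bare number. The sketch below shows one way to read it, and checks it against the analytic minimum: setting the derivative 2x + 5 to zero gives x = -2.5, where f(x) = -0.25. Writing f as a named function that returns a float is a stylistic assumption; the lambda form would work as well.

```python
from scipy.optimize import minimize

def f(x):
    # minimize passes x as a length-1 array; return a plain float.
    return float(x[0] ** 2 + 5 * x[0] + 6)

result = minimize(f, x0=[0.0])

print(result.success)  # whether the optimizer reports convergence
print(result.x[0])     # close to -2.5
print(result.fun)      # close to -0.25
```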

3. Plot the function using Matplotlib.

import numpy as np
import matplotlib.pyplot as plt

# Reuses f from the previous answer
x = np.linspace(-10, 5, 100)
y = f(x)

plt.plot(x, y)
plt.show()

Pandas

1. What are the two primary data structures in Pandas?

- Series: a one-dimensional labeled array.

- DataFrame: a two-dimensional labeled table whose columns can hold different data types.
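A short sketch of how the two structures relate, using made-up names and ages: a Series carries an index alongside its values, and selecting one column of a DataFrame gives back a Series.

```python
import pandas as pd

# A Series: values plus an index of labels (sample data is illustrative).
ages = pd.Series([25, 30], index=["A", "B"], name="Age")

# A DataFrame: several labeled columns sharing one row index.
df = pd.DataFrame({"Name": ["A", "B"], "Age": [25, 30]})

print(ages["B"])  # 30
print(df.shape)   # (2, 2)

# Selecting a single column returns a Series.
age_column = df["Age"]
print(type(age_column).__name__)  # Series
```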

2. Load a CSV (or create DataFrame) with Name, Age, Salary.

import pandas as pd

df = pd.DataFrame({'Name':['A','B'],'Age':[25,30],'Salary':[60000,45000]})

3. Filter rows where Salary > 50000.

df[df['Salary'] > 50000]


4. Group the data by Age and calculate the average salary.

df.groupby('Age')['Salary'].mean()
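In the two-row DataFrame above every age is unique, so the grouped mean simply repeats each salary. The sketch below uses a slightly larger, hypothetical table in which two people share an age, so the averaging is visible.

```python
import pandas as pd

# Hypothetical data: two rows share Age 25.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Age": [25, 25, 30],
    "Salary": [60000, 40000, 45000],
})

# One mean salary per distinct age; the result is a Series indexed by Age.
avg_salary = df.groupby("Age")["Salary"].mean()

print(avg_salary.loc[25])  # 50000.0
print(avg_salary.loc[30])  # 45000.0
```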

5. Handle missing values.

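df.fillna(value)  # replace missing entries; value is a placeholder, e.g. 0 or a column mean

df.dropna()  # or drop every row that contains a missing entry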
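The sample DataFrame above has no missing entries, so the sketch below builds a hypothetical one with gaps and applies both approaches. The fill choices (column mean for Age, 0 for Salary) are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing Age and one missing Salary.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Age": [25, np.nan, 40],
    "Salary": [60000, 45000, np.nan],
})

# Count missing entries per column.
print(df.isna().sum())

# Fill each column with its own replacement value.
filled = df.fillna({"Age": df["Age"].mean(), "Salary": 0})

# Alternatively, keep only the rows with no missing entries.
dropped = df.dropna()

print(filled)
print(dropped)
```

Both fillna and dropna return new DataFrames and leave df unchanged unless the result is assigned back.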

Matplotlib

1. Key features of Matplotlib

- Line, bar, pie charts

- Labels, titles, legends

- Subplots and gridlines

2. Generate a line plot for population growth.

years = np.arange(2015, 2025)  # 2015 through 2024 (10 values)

population = [10000, 11000, ..., 30000]  # abbreviated here; needs one value per year

plt.plot(years, population)
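Because the population list above is abbreviated, it cannot be plotted as written. The following is a runnable sketch that substitutes evenly spaced placeholder values between 10000 and 30000; these numbers are an assumption and are not the original data.

```python
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(2015, 2025)  # 2015 through 2024, 10 values

# Placeholder values (assumption): evenly spaced, one per year.
population = np.linspace(10000, 30000, len(years))

fig, ax = plt.subplots()
ax.plot(years, population, marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Population")
ax.set_title("Population growth (illustrative data)")
ax.grid(True)
```

Call plt.show() afterwards to display the figure, or fig.savefig with a file name to write it to disk.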

3. Create a figure with bar and pie chart.

fig, axs = plt.subplots(1, 2)

axs[0].bar(...)

axs[1].pie(...)
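The arguments to bar and pie are elided above. As one possible way to fill them in, the sketch below draws both charts from the same small, hypothetical set of categories and counts.

```python
import matplotlib.pyplot as plt

# Hypothetical data used only to make the example runnable.
categories = ["A", "B", "C"]
counts = [5, 3, 2]

# One row, two columns: bar chart on the left, pie chart on the right.
fig, axs = plt.subplots(1, 2, figsize=(8, 4))

axs[0].bar(categories, counts)
axs[0].set_title("Bar chart")

axs[1].pie(counts, labels=categories, autopct="%1.0f%%")
axs[1].set_title("Pie chart")

fig.tight_layout()
```

As with the line plot, plt.show() displays the figure.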
