Pandas Questions

Pandas Questions Assignment

Uploaded by

mail2sharma.kriti

Pandas Questions

Question 1 ) What is Pandas, and what is its primary purpose in Python data analysis?
Answer )

Pandas :
Pandas is an open-source Python package that is widely used for data science, data analysis,
and machine learning tasks.

It is built on top of another package named NumPy, which provides support for multi-dimensional
arrays. As one of the most popular data wrangling packages, Pandas works well with many other
data science modules in the Python ecosystem. It is not part of the Python standard library, but
it is bundled with many scientific Python distributions, including commercial vendor distributions
like ActiveState’s ActivePython.

Purpose :
Pandas makes it simple to do many of the time-consuming, repetitive tasks associated with working
with data, including:

* Data cleansing
* Filling in missing data
* Data normalization
* Merges and joins
* Data visualization
* Statistical analysis
* Data inspection
* Loading and saving data
* And much more

Role of Pandas in Data Analysis :

Data analysis is the technique of collecting, transforming, and organizing data to make
predictions and informed, data-driven decisions. It also helps to find possible solutions for a
business problem.

Pandas plays an important role in several steps of data analysis. They are:

* Prepare or collect data
* Clean and process
* Analyze
* Share
* Act or report

Question 2 ) How do you install Pandas in your Python environment?
Answer )

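Each Pandas release supports a specific range of Python versions: older releases ran on Python 3.6
to 3.8, while newer releases require a more recent Python. Check the installation notes for the
Pandas version you plan to install.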

* Install Pandas using pip :

Step 1 : Launch Command Prompt

Step 2 : Run the Command

pip install pandas

This starts the pip installation. After the necessary files have been downloaded and installed,
Pandas is ready to use on your computer.
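
One way to confirm that the installation worked is to import the package and print its version. This is a minimal sketch; the variable name installed_version is only illustrative.

```python
import pandas as pd

# If the import succeeds, Pandas is installed in the active environment.
installed_version = pd.__version__
print("pandas version:", installed_version)
```

If the import fails with ModuleNotFoundError, pip probably installed the package into a different Python environment than the one running the script.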
Question 3 ) What are the two primary data structures in Pandas, and how do they differ?
Answer )

Pandas provides two essential data structures: Series and DataFrame .

1 . Series :
A Pandas Series is a one-dimensional, array-like object that can hold data of any
type (integer, float, string, etc.). It is labelled, meaning each element is associated with an
identifier called an index label. Index labels do not have to be unique.

Series are a fundamental data structure in Pandas and are commonly used for data
manipulation and analysis tasks. They can be created from lists, arrays, dictionaries, and
existing Series objects.

Creating a Series from a list, from a dictionary, and with a custom index:

import pandas as pd

# Initializing a Series from a list

data = [1, 2, 3, 4, 5]
series_from_list = pd.Series(data)
print(series_from_list)

# Initializing a Series from a dictionary

data = {'a': 1, 'b': 2, 'c': 3}
series_from_dict = pd.Series(data)
print(series_from_dict)

# Initializing a Series with custom index

data = [1, 2, 3, 4, 5]
index = ['a', 'b', 'c', 'd', 'e']
series_custom_index = pd.Series(data, index=index)
print(series_custom_index)

2 . DataFrame :
A Pandas DataFrame is a two-dimensional, tabular data structure with rows and columns.

A DataFrame has three main components: the data, which is stored in rows and columns;
the row labels, which form the index; and the column labels, which name each column.

Indexing:
DataFrame provides flexible indexing options, allowing access to rows,
columns, or individual elements based on labels or integer positions.

# Initializing a DataFrame from a dictionary
data = {'Name': ['John', 'Alice', 'Bob'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)
print(df)

# Initializing a DataFrame from a list of lists

data = [['John', 25, 'New York'],
        ['Alice', 30, 'Los Angeles'],
        ['Bob', 35, 'Chicago']]
columns = ['Name', 'Age', 'City']
df = pd.DataFrame(data, columns=columns)
print(df)

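Series                                        DataFrame

One-dimensional.                              Two-dimensional.

Holds a single dtype (mixed values are        Each column can have its own dtype.
stored as object dtype).

Size-immutable (length is fixed once          Size-mutable (columns can be added
created).                                     or removed).

Element-wise computations.                    Column-wise computations.

Fewer methods, since there is only one        More methods, including operations
column of values.                             across columns.

Operations align on index labels.             Operations align on both index and
                                              column labels.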
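
The differences in the table can be seen in a short sketch. The data and variable names here are made up for illustration; the point is that arithmetic between two Series lines up values by index label, while a DataFrame keeps a separate dtype per column.

```python
import pandas as pd

# Two Series whose index labels only partly overlap.
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Addition aligns on labels; a label found in only one Series gives NaN.
total = s1 + s2
print(total)

# A DataFrame can hold a different dtype in each column.
df = pd.DataFrame({'Name': ['John', 'Alice'], 'Age': [25, 30]})
print(df.dtypes)

# Selecting a single column of a DataFrame returns a Series.
ages = df['Age']
print(type(ages))
```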

Question 4 ) How can you read a CSV file into a Pandas DataFrame? Provide an example.
Answer )

CSV files (comma-separated values files) :

We can read a CSV file using pandas. For example:

import pandas as pd
df = pd.read_csv('data.csv')
print(df)

1. First, we import the pandas library.

2. We use the read_csv() function to read the CSV file and store it in a DataFrame.
3. We print the DataFrame.
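
The same steps can be tried without a file on disk. In this sketch an in-memory string stands in for data.csv, and the sample rows are made up. The usecols and index_col parameters are optional read_csv arguments, shown here only as common variations.

```python
import io
import pandas as pd

# In-memory stand-in for a file such as 'data.csv' (sample data is made up).
csv_text = "Name,Age,City\nJohn,25,New York\nAlice,30,Los Angeles\nBob,35,Chicago\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())

# Load only selected columns and use one of them as the row index.
df_subset = pd.read_csv(io.StringIO(csv_text), usecols=['Name', 'Age'], index_col='Name')
print(df_subset)
```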

Question 5 ) What is the difference between the loc[] and iloc[] methods when selecting data
from a Pandas DataFrame?

Answer )

LOC :
loc is a label-based indexer. It takes up to two arguments inside square brackets: the first
selects rows by label and the second selects columns by label. A colon (:) selects all rows or
all columns. loc also accepts a boolean expression (mask) to filter rows. Label slices include
both endpoints.

1. Syntax :
DataFrame.loc[row_labels, column_labels]

2. Selecting A Subset Of Rows And Columns :

subset = df.loc[df['Department'] == 'Marketing', ['Name', 'Salary']]

Here are a few advantages of loc:

* It supports label-based indexing, which is easy to read and understand.
* It can be used with boolean arrays to filter rows.
* It can be used on both single-level and multi-level indexes.

ILOC :
iloc is an integer-position-based indexer. Rows and columns are selected by their integer
position (starting at 0) rather than by label. It works on both Series and DataFrame objects.

iloc is a valuable tool for selecting rows and columns by integer position, and it can also access
individual values in a DataFrame. It accepts a boolean array or list as a mask, but not a boolean
Series, so boolean filtering is usually done with loc. Integer slices exclude the end position.
We have to follow the syntax below:

Syntax :
df.iloc[row_position, column_position]

Many selection operations can be performed with iloc, such as picking single values, whole rows,
whole columns, or slices of either.

loc                                           iloc

Select rows and columns by labels             Select rows and columns by integer positions
Slicing with labels (end label included)      Slicing with integer positions (end excluded)
Accepts boolean arrays and boolean Series     Accepts boolean arrays, but not boolean Series
Label-based indexing                          Position-based indexing
Syntax: DataFrame.loc[row_labels,             Syntax: DataFrame.iloc[row_position,
column_labels]                                column_position]
Example:                                      Example:
df.loc[2, 'Salary']                           df.iloc[2, 2]
df.loc[df['Department'] == 'Marketing',       df.iloc[1:3, :2]
['Name', 'Salary']]
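
The comparison can be checked with a small sketch. The employee data is made up, and the index uses string labels (e1 to e4, an assumption for this example) so that labels and positions are clearly different things.

```python
import pandas as pd

# Made-up sample data with string index labels.
df = pd.DataFrame(
    {'Name': ['John', 'Alice', 'Bob', 'Meera'],
     'Department': ['Sales', 'Marketing', 'Marketing', 'Sales'],
     'Salary': [50000, 60000, 55000, 65000]},
    index=['e1', 'e2', 'e3', 'e4'])

# Label-based: the row labelled 'e3', column 'Salary'.
by_label = df.loc['e3', 'Salary']

# Position-based: third row, third column (the same cell).
by_position = df.iloc[2, 2]

# Label slices include the end label; position slices exclude the end position.
label_slice = df.loc['e2':'e3', ['Name', 'Salary']]
position_slice = df.iloc[1:3, :2]

# Boolean filtering with loc takes the boolean Series directly.
marketing = df.loc[df['Department'] == 'Marketing', ['Name', 'Salary']]

# iloc needs a plain boolean array, so the Series is converted first.
mask = (df['Department'] == 'Marketing').to_numpy()
marketing_by_position = df.iloc[mask, [0, 2]]
```

Both slices return two rows, even though the label slice names its last row and the position slice stops one before its end value.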
Question 6 ) How do you handle missing data (NaN or None) in a Pandas DataFrame?
Answer )
In Pandas, missing values are represented by None or NaN. They can occur because data was not
collected or because entries are incomplete.

To identify missing values, Pandas provides two useful functions:
isnull() and notnull()

1 ) isnull() : returns a DataFrame of Boolean values, where True represents missing data (NaN or None).
missing_values = df.isnull()

2 ) notnull() : returns a DataFrame of Boolean values, where True indicates non-missing data.
present_values = df.notnull()

Handling Missing Values :

The fillna() and replace() functions are commonly used to fill NaN values. Both return a new
object and leave the original DataFrame unchanged unless the result is assigned back.

(1) fillna() : The fillna() function is used to replace missing values (NaN) with a
specified value. For example, you can fill missing values with 0.
Syntax : df.fillna(0)

(2) replace() : Use replace() to replace NaN values with a specific value.

Syntax : df.replace(to_replace=np.nan, value=-99)

( Replaces NaN with -99; np refers to NumPy, imported with "import numpy as np" )
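
The functions above can be combined as in the following sketch. The data is made up, and dropna() is an addition not covered in the answer: it removes rows with missing values instead of filling them.

```python
import numpy as np
import pandas as pd

# Made-up sample with a missing entry in each column.
df = pd.DataFrame({'Age': [25, np.nan, 35], 'City': ['New York', None, 'Chicago']})

# Count missing values per column.
missing_per_column = df.isnull().sum()

# Fill the numeric gap with the column mean and the text gap with a placeholder.
filled = df.fillna({'Age': df['Age'].mean(), 'City': 'Unknown'})

# replace() can target NaN explicitly.
replaced_age = df['Age'].replace(to_replace=np.nan, value=-99)

# dropna() removes every row that contains a missing value.
complete_rows = df.dropna()
```

Whether to fill or drop depends on the analysis: filling keeps the row count, while dropping avoids introducing invented values.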

Question 7 ) What is the function of the groupby() method in Pandas, and how is it typically
used in data analysis?
Answer )

Pandas groupby splits the records of your data set into groups so that you can analyze the data
group by group. When you call .groupby() on a categorical column of a DataFrame, it returns a
GroupBy object, on which you can call aggregation methods such as sum(), count(), or mean() to
summarize each group.

In the real world, you'll usually work with large amounts of data and need to do similar operations
over different groups of data. groupby() handles these cases in a single call instead of a manual
loop over the groups, which makes it a must-know function in data analysis.

For example, we can get the number of groups, the size of each group, and aggregates per group:

1. ) Number of Groups :
To find out how many groups the data is divided into, use the nunique() function on the
grouping column. It returns the number of unique values in that column, and the data
is divided into that many groups. The ngroups attribute of a GroupBy object gives the
same number.
Eg : df.Product_Category.nunique()

2. ) Group Sizes :
The number of rows in each group of a GroupBy object can be easily obtained
using the function .size().
Eg: df.groupby("Product_Category").size()

3. ) Aggregate Multiple Columns :

Applying an aggregate function to columns in each group is one of the most widely used
practices. After grouping the data by Product_Category, suppose you want to see the
average unit price and quantity in each product category:

#Create a groupby object

df_group = df.groupby("Product_Category")

#Select only required columns

df_columns = df_group[["UnitPrice(USD)","Quantity"]]

#Apply aggregate function

df_columns.mean()
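
The snippets above assume an existing DataFrame named df. A self-contained sketch with made-up order data, using the same column names, might look like this. The named aggregation at the end (total_quantity, max_price) is an illustrative addition.

```python
import pandas as pd

# Made-up order data; the column names follow the snippets above.
df = pd.DataFrame({
    'Product_Category': ['Books', 'Books', 'Toys', 'Toys', 'Toys'],
    'UnitPrice(USD)': [10.0, 20.0, 5.0, 7.0, 9.0],
    'Quantity': [1, 3, 2, 4, 6],
})

grouped = df.groupby('Product_Category')

# Number of groups and number of rows in each group.
number_of_groups = grouped.ngroups
group_sizes = grouped.size()

# Mean of the selected columns within each group.
group_means = grouped[['UnitPrice(USD)', 'Quantity']].mean()

# A different aggregation per column, using named aggregation.
summary = grouped.agg(
    total_quantity=('Quantity', 'sum'),
    max_price=('UnitPrice(USD)', 'max'),
)
print(summary)
```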

Question 8 ) How can you merge or join two Pandas DataFrames based on a common column or
Key?
Answer )

Merge :
The merge function in Pandas is used to combine two DataFrames based on a common column or
index. By default it performs an inner join, so only keys present in both DataFrames are kept.

import pandas as pd

# Creating DataFrame 1
df1 = pd.DataFrame({
'Name': ['Raju', 'Rani', 'Geeta', 'Sita', 'Sohit'],
'Marks': [80, 90, 75, 88, 59]
})

# Creating DataFrame 2
df2 = pd.DataFrame({
'Name': ['Raju', 'Divya', 'Geeta', 'Sita'],
'Grade': ['A', 'A', 'B', 'A'],
'Rank': [3, 1, 4, 2],
'Gender': ['Male', 'Female', 'Female', 'Female']
})

# Display DataFrames
print("DataFrame 1:")
print(df1)
print("\nDataFrame 2:")
print(df2)

# Merging the two DataFrames on 'Name' (inner join: only Raju, Geeta and Sita appear in both)
df_merged = df1.merge(df2[['Name', 'Grade', 'Rank']], on='Name')
print("\nMerged DataFrame:")
print(df_merged)
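
Other join types are selected with the how parameter of merge. The following sketch uses shortened versions of the two DataFrames; the variable names left_joined and outer_joined are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Raju', 'Rani', 'Geeta'], 'Marks': [80, 90, 75]})
df2 = pd.DataFrame({'Name': ['Raju', 'Divya', 'Geeta'], 'Grade': ['A', 'A', 'B']})

# Left join: keep every row of df1; Grade is missing where df2 has no match.
left_joined = pd.merge(df1, df2, on='Name', how='left')
print(left_joined)

# Outer join: keep names from both sides. indicator=True adds a '_merge'
# column that records which DataFrame each row came from.
outer_joined = pd.merge(df1, df2, on='Name', how='outer', indicator=True)
print(outer_joined)
```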

Question 9 ) Explain the concept of pivot tables in Pandas and how they can be created
Answer )

The pivot_table() function takes a DataFrame and parameters that describe the shape you want the
data to take, and returns the summarized data as a pivot table.

Pivot tables in pandas are a very effective way to analyze and summarize data.

How to Create a Pandas Pivot Table

A pandas pivot table has three main elements:

Index : This specifies the row-level grouping.

Columns : This specifies the column-level grouping.
Values : These are the numerical values you are looking to summarize.

The values in each cell are combined with an aggregation function (aggfunc), which defaults to
the mean.

* Pivot tables can be multi-level. We can use multiple indexes and column-level
groupings to create more powerful summaries of a data set.

Eg )
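
As a minimal sketch of the three elements, the following builds a pivot table from made-up sales data. The DataFrame, its column names, and the choice of sum as the aggregation are all assumptions for illustration.

```python
import pandas as pd

# Made-up sales records.
sales = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'South'],
    'Product': ['Pen', 'Book', 'Pen', 'Pen', 'Book'],
    'Amount': [100, 250, 80, 120, 300],
})

# One row per Region, one column per Product, each cell holding the summed Amount.
pivot = pd.pivot_table(
    sales,
    index='Region',
    columns='Product',
    values='Amount',
    aggfunc='sum',
)
print(pivot)
```

Passing a list to index or columns, for example index=['Region', 'Product'], produces the multi-level form mentioned above.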

Question 10 )What is the purpose of the apply() function in Pandas, and when might you use it in
data transformation?
Answer )

The apply() method is one of the most common tools in data preprocessing. It applies a
function to each element of a pandas Series, or to each row or column of a pandas
DataFrame.

Series.apply() lets the user pass a function and apply it to every single value of the
Series. It is useful for transforming or labelling data according to custom conditions, which is
why it is widely used in data science and machine learning. For simple arithmetic, built-in
vectorized operations (for example s + 5) are usually faster than apply().

Syntax :
s.apply(func, convert_dtype=True, args=())

func: The function to apply to each value of the Series.
convert_dtype: Try to find a better dtype for the results (deprecated since pandas 2.1).
args=(): Additional positional arguments passed to func after the Series value.
Return Type: A pandas Series (or a DataFrame, if func returns a Series).

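EG :
import pandas as pd

# Read a single-column CSV file and squeeze the resulting DataFrame into a Series.
# (The squeeze=True argument of read_csv was removed in pandas 2.0.)
s = pd.read_csv("stock.csv").squeeze("columns")

# adding 5 to each value

new = s.apply(lambda num : num + 5)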
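
A sketch that does not depend on stock.csv is given here, covering both the Series and the DataFrame forms of apply(). The data, the label_price helper, and the threshold of 100 are hypothetical and only serve the illustration.

```python
import pandas as pd

prices = pd.Series([95, 120, 150], name='Price')

def label_price(value, threshold):
    """Return 'high' when value reaches the threshold, otherwise 'low'."""
    return 'high' if value >= threshold else 'low'

# Extra positional arguments are forwarded to the function through args.
labels = prices.apply(label_price, args=(100,))

df = pd.DataFrame({'UnitPrice': [10.0, 20.0], 'Quantity': [3, 2]})

# axis=1 passes each row to the function.
df['Total'] = df.apply(lambda row: row['UnitPrice'] * row['Quantity'], axis=1)

# The default axis=0 passes each column to the function.
column_max = df[['UnitPrice', 'Quantity']].apply(lambda col: col.max())
```

The row-wise total could also be written as df['UnitPrice'] * df['Quantity'], which is the vectorized form and generally the faster choice when it is available.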
