UNIT II Notes (1)
Introduction to Pandas
Before using Pandas, make sure it is installed and working:

✅ 1. Check Python Version
python --version

✅ 2. Install Pandas
pip install pandas

✅ 3. Verify Installation
import pandas as pd
print(pd.__version__)

You can also run a small test script. Save the following as test_pandas.py (the DataFrame contents are just a placeholder):

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
print(df)

Run it:
python test_pandas.py
Pandas is one of the most popular libraries in Python for data manipulation and analysis. It
provides efficient and easy-to-use data structures for handling and analyzing structured data. The
primary data structures in Pandas are Series and DataFrame, each serving a unique purpose in
working with data.
1. Pandas Series
A Series is a one-dimensional labeled array that can hold any data type (integers, strings, floats,
etc.). It's similar to a list or an array but with additional functionality provided by Pandas, such as
labels (indices) that allow easy access to the data.
Creating a Series:

import pandas as pd

s = pd.Series([1, 2, 3, 4])
print(s)

Output:
0    1
1    2
2    3
3    4
dtype: int64

You can also supply custom index labels:

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)

Output:
a    10
b    20
c    30
d    40
dtype: int64
2. Pandas DataFrame
A DataFrame is a two-dimensional labeled data structure with rows and columns, similar to a table in a database or a spreadsheet. Each column can hold a different data type.

Creating a DataFrame:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)
print(df)

Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston

You can also build the same DataFrame from a list of lists by naming the columns explicitly:

data = [
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'Los Angeles'],
    ['Charlie', 35, 'Chicago'],
    ['David', 40, 'Houston']
]
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
print(df)

Output: (same table as above)
Accessing Columns:
print(df['Name'])
Output:
0 Alice
1 Bob
2 Charlie
3 David
Name: Name, dtype: object
Accessing Rows (using .iloc[] or .loc[]):
print(df.iloc[1]) # Accessing the second row (index 1)
Output:
Name Bob
Age 30
City Los Angeles
Name: 1, dtype: object
print(df.loc[1]) # Accessing the second row (with label 1)
Output:
Name Bob
Age 30
City Los Angeles
Name: 1, dtype: object
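The two outputs above look identical because the default index is 0, 1, 2, …. With a custom index the difference between positional and label-based access becomes visible. A small sketch (the index labels are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35]},
                  index=['x', 'y', 'z'])

# .iloc[] is positional: the row at integer position 1
print(df.iloc[1])

# .loc[] is label-based: the row whose index label is 'y'
print(df.loc['y'])

# With these labels, df.loc[1] would raise a KeyError
```

Both calls return the same row here, but for different reasons: position 1 happens to carry the label 'y'.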
Conclusion:
- Series is a simple, one-dimensional labeled array, useful for handling a single column of data.
- DataFrame is a powerful, two-dimensional structure that can handle multiple columns and rows, and is better suited for working with tabular data.

These data structures, Series and DataFrame, are the foundation of data manipulation and analysis with Pandas, allowing you to efficiently work with real-world datasets.
Pandas makes it very easy to read data from different file formats and save it back to various
formats. Whether your data is stored in a CSV, Excel, JSON, or other formats, Pandas provides
efficient methods to import, manipulate, and export data.
1. Reading Data from Files

1.1 Reading CSV Files
CSV (Comma Separated Values) is one of the most common formats for storing tabular data. Pandas provides the read_csv() function to load data from CSV files.

Example (the file name is illustrative):
import pandas as pd

df = pd.read_csv('data.csv')
print(df.head())   # show the first five rows
1.2 Reading Excel Files
Excel files are another common format for data storage. Pandas provides the read_excel() function for reading Excel files. You will need the openpyxl or xlrd library installed to handle Excel files.

Example (the file and sheet names are illustrative):
import pandas as pd

df = pd.read_excel('data.xlsx', sheet_name='Sheet1')
print(df.head())
1.3 Reading JSON Files
JSON (JavaScript Object Notation) is a lightweight data interchange format. You can use the read_json() function to load data from a JSON file.

Example (the file name is illustrative):
import pandas as pd

df = pd.read_json('data.json')
print(df.head())
1.4 Reading from SQL Databases
Pandas can read data directly from SQL databases using the read_sql() function. You'll need a connection to your database, and Pandas will execute a query and return the result as a DataFrame.

Example (using SQLite; the database and table names are illustrative):
import pandas as pd
import sqlite3

conn = sqlite3.connect('example.db')
df = pd.read_sql('SELECT * FROM users', conn)
print(df)
conn.close()
2. Writing Data to Files
Pandas also makes it easy to save your DataFrame to different file formats, such as CSV, Excel, JSON, and more.
2.1 Writing to CSV Files
To export a DataFrame to a CSV file, you can use the to_csv() method.

Example (the output file name is illustrative):
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df.to_csv('output.csv', index=False)   # index=False omits the row index
2.2 Writing to Excel Files
To export a DataFrame to an Excel file, you can use the to_excel() method. You need the openpyxl library installed for .xlsx files.

Example:
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df.to_excel('output.xlsx', sheet_name='Sheet1', index=False)
2.3 Writing to JSON Files
To save a DataFrame to a JSON file, you can use the to_json() method.

Example:
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df.to_json('output.json', orient='records', lines=False)

Two useful parameters:
- orient: determines the format of the JSON data (options include 'split', 'records', 'index', 'columns').
- lines: whether to write each record on a separate line (default is False).
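A quick sketch of how orient changes the result, using to_json() on a tiny in-memory DataFrame (data is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# 'records': a list of row objects
print(df.to_json(orient='records'))
# [{"Name":"Alice","Age":25},{"Name":"Bob","Age":30}]

# 'columns' (the default): one object per column, keyed by row index
print(df.to_json(orient='columns'))
# {"Name":{"0":"Alice","1":"Bob"},"Age":{"0":25,"1":30}}

# lines=True writes one record per line (requires orient='records')
print(df.to_json(orient='records', lines=True))
```

The same orient values apply when writing to a file instead of returning a string.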
2.4 Writing to SQL Databases
You can write data from a DataFrame to an SQL database using the to_sql() method. It requires a connection object to the database.

Example (using SQLite; the database and table names are illustrative):
import pandas as pd
import sqlite3

conn = sqlite3.connect('example.db')
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df.to_sql('users', conn, if_exists='replace', index=False)
conn.close()
Conclusion:
Pandas provides a versatile set of functions to read data from a variety of file formats (CSV,
Excel, JSON, SQL) and export data back to these formats. This ability to handle different data
sources seamlessly is one of the key strengths of Pandas in data analysis and manipulation tasks.
Data cleaning is an essential step in the data analysis process. Raw data often contains missing or
duplicate values, as well as other inconsistencies that can skew analysis. Pandas provides a
variety of functions to handle these issues, enabling effective data cleaning and preparation.
1. Handling Missing Data
Missing data can occur for various reasons (e.g., values not recorded, data entry errors). Pandas provides several methods to handle missing values (NaN), allowing you to either fill them with certain values or drop them entirely.
1.1 Detecting Missing Data
You can detect missing data in a DataFrame using the isnull() or notnull() methods. For example, with one missing Age value (the data is illustrative):

import pandas as pd
import numpy as np

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, np.nan, 35, 40]}
df = pd.DataFrame(data)
print(df.isnull())

Output:
    Name    Age
0  False  False
1  False   True
2  False  False
3  False  False
1.2 Dropping Missing Data
You can drop rows or columns with missing values using the dropna() method.
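A minimal sketch (the missing value is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, np.nan, 35]})

# Drop rows containing any missing value
print(df.dropna())        # Bob's row is removed

# Drop columns containing any missing value instead
print(df.dropna(axis=1))  # the Age column is removed
```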
1.3 Filling Missing Data
You can fill missing data with a specific value using the fillna() method.
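For example (the fill values are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, np.nan, 35]})

# Fill every missing value with a constant
print(df.fillna(0))

# Or fill per column, e.g. with that column's mean
print(df.fillna({'Age': df['Age'].mean()}))  # Bob's Age becomes 30.0
```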
2. Removing Duplicates
Duplicate data can arise from data entry errors or merging data from multiple sources. Pandas
offers a simple way to identify and remove duplicates from your DataFrame.
2.1 Identifying Duplicates
You can detect duplicate rows using the duplicated() method, which returns a boolean Series indicating whether each row is a duplicate of an earlier one.

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Alice', 'David'],
        'Age': [25, 30, 25, 40]}
df = pd.DataFrame(data)
print(df.duplicated())

Output:
0    False
1    False
2     True
3    False
dtype: bool
2.2 Removing Duplicate Rows
You can remove duplicates from your DataFrame using the drop_duplicates() method.
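For example (rows are illustrative; by default the first occurrence is kept):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Alice'],
                   'Age': [25, 30, 25]})

# Remove fully duplicated rows, keeping the first occurrence
df_unique = df.drop_duplicates()
print(df_unique)   # Alice and Bob remain

# Deduplicate on a subset of columns only
df_by_name = df.drop_duplicates(subset=['Name'])
```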
3. Data Filtering
3.1 Filtering by Condition
Data filtering allows you to select rows from a DataFrame based on certain conditions.
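The most common form is boolean indexing: a condition on a column produces a boolean Series, which selects the matching rows. A sketch with illustrative data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40]})

# Rows where Age is greater than 28
print(df[df['Age'] > 28])

# Combine conditions with & (and) / | (or); wrap each in parentheses
print(df[(df['Age'] > 28) & (df['Name'] != 'David')])
```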
3.2 Filtering by String Pattern
Pandas provides string functions to filter rows based on string patterns (such as contains(), startswith(), and endswith()), accessed through the .str accessor. Using the Name/Age/City DataFrame from earlier:

# Rows whose City contains 'New'
df_filtered = df[df['City'].str.contains('New')]
print(df_filtered)

# Rows whose Name starts with 'A'
df_filtered = df[df['Name'].str.startswith('A')]
print(df_filtered)
Task                   Function
Detect Missing Data    isnull(), notnull()
Drop Missing Data      dropna()
Fill Missing Data      fillna()
Detect Duplicates      duplicated()
Remove Duplicates      drop_duplicates()
Filter by Condition    Boolean indexing (df[condition])
Filter by String       str.contains(), str.startswith()
Conclusion:
Data cleaning in Pandas is a crucial step in preparing data for analysis. By handling missing data,
removing duplicates, and applying filters, you can ensure that your dataset is accurate,
consistent, and ready for further analysis. Pandas provides efficient and flexible tools for each of
these tasks, making data cleaning fast and straightforward.
Pandas is powerful when it comes to manipulating data. Whether you need to sort your data,
group it based on certain columns, merge data from different sources, or concatenate multiple
datasets, Pandas offers various methods that make these tasks easy and efficient.
1. Sorting Data
2. Indexing Data
3. Grouping Data
4. Merging DataFrames
5. Concatenating DataFrames
1. Sorting Data
Sorting data is essential to analyze patterns or prepare data for visualizations or reporting.
You can sort a DataFrame by its index using the sort_index() method.

import pandas as pd

# Sample DataFrame with an unordered index
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40]}
df = pd.DataFrame(data, index=['a', 'd', 'c', 'b'])

# Sorting by index
df_sorted = df.sort_index(ascending=True)
print(df_sorted)
Output:
Name Age
a Alice 25
b David 40
c Charlie 35
d Bob 30
You can sort by one or more columns using the sort_values() method.

df_sorted = df.sort_values(by='Age')
print(df_sorted)

Output:
      Name  Age
a    Alice   25
d      Bob   30
c  Charlie   35
b    David   40

To sort in descending order, pass ascending=False:

df_sorted = df.sort_values(by='Age', ascending=False)
2. Indexing Data
Indexing refers to the ability to access rows or columns based on their labels or positions.
You can access columns directly as attributes or by using the column name in square brackets (bracket notation works for any column name, including names containing spaces).

print(df.Name)
print(df['Name'])
You can set a column as the index of the DataFrame using the set_index() method. For a DataFrame with Name, Age, and City columns:

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40],
                   'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']})

df_indexed = df.set_index('Name')
print(df_indexed)

Output:
         Age         City
Name
Alice     25     New York
Bob       30  Los Angeles
Charlie   35      Chicago
David     40      Houston

To restore the default integer index, use reset_index():

df_reset = df_indexed.reset_index()
print(df_reset)
3. Grouping Data
Grouping is useful for performing aggregation operations like sum, mean, count, etc., based on
some criteria.
You can use the groupby() method to group data based on one or more columns. For example, with a DataFrame of five people (two of them aged 25):

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Age': [25, 30, 35, 40, 25],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Chicago']}
df = pd.DataFrame(data)

# Mean Age within each Age group
print(df.groupby('Age')['Age'].mean())

Output:
Age
25    25.0
30    30.0
35    35.0
40    40.0
Name: Age, dtype: float64

You can also group by multiple columns and count the rows in each group with size():

print(df.groupby(['Age', 'City']).size())

Output:
Age  City
25   Chicago        1
     New York       1
30   Los Angeles    1
35   Chicago        1
40   Houston        1
dtype: int64
4. Merging DataFrames
Merging DataFrames is a common operation when you want to combine data from multiple
sources. Pandas provides the merge() function for this purpose, similar to SQL joins.
You can merge DataFrames using the merge() function by specifying a common column (key).

import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                    'Age': [25, 30, 35]})
df2 = pd.DataFrame({'Name': ['Alice', 'Bob', 'David'],
                    'City': ['New York', 'Los Angeles', 'Houston']})

# Inner join on the shared 'Name' column
merged = pd.merge(df1, df2, on='Name')
print(merged)

Output:
    Name  Age         City
0  Alice   25     New York
1    Bob   30  Los Angeles
You can merge DataFrames with different column names using left_on and right_on.
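A sketch, assuming the key column is called Name in one frame and EmployeeName in the other (the column names and data are illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'EmployeeName': ['Alice', 'Bob'],
                    'City': ['New York', 'Los Angeles']})

# Join on differently named key columns
merged = pd.merge(df1, df2, left_on='Name', right_on='EmployeeName')
print(merged)   # both key columns are kept in the result
```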
You can specify the type of join using the how parameter: 'inner' (the default), 'left', 'right', or 'outer', analogous to SQL joins.
5. Concatenating DataFrames
You can concatenate multiple DataFrames along a particular axis (either row-wise or column-wise) using the concat() function.

df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'Name': ['Charlie', 'David'], 'Age': [35, 40]})

# Row-wise concatenation (axis=0, the default)
result = pd.concat([df1, df2], ignore_index=True)
print(result)

Output:
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
3    David   40

For column-wise concatenation, pass axis=1:

result = pd.concat([df1, df2], axis=1)
print(result)

Output:
    Name  Age     Name  Age
0  Alice   25  Charlie   35
1    Bob   30    David   40
Operation                  Function
Sorting                    sort_index(), sort_values()
Indexing                   loc[], iloc[], set_index()
Grouping                   groupby(), agg(), size()
Merging DataFrames         merge()
Concatenating DataFrames   concat()
Conclusion:
Pandas provides a variety of powerful tools to manipulate and transform data. Sorting, indexing,
grouping, merging, and concatenating DataFrames are common tasks that allow you to clean,
organize, and analyze data more effectively. With these tools, you can perform complex data
operations in just a few lines of code, making Pandas an essential library for data manipulation in
Python.
In Pandas, working with dates and times is made easy with the datetime functionality, which
includes converting strings to datetime objects, extracting components like year, month, day,
etc., and performing operations on date and time data.
1. Converting Strings to Datetime
To convert strings to datetime objects, you can use the pd.to_datetime() function. It will automatically recognize most common date and time formats.

import pandas as pd

date_time = pd.to_datetime('2025-03-11 14:30:00')
print(date_time)

Output:
2025-03-11 14:30:00
2. Extracting Date and Time Components
Once you have a datetime object, you can easily extract individual components like the year,
month, day, etc.
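For a single Timestamp these components are plain attributes. A quick sketch (the date is illustrative):

```python
import pandas as pd

ts = pd.to_datetime('2025-03-11 14:30:00')

print(ts.year)        # 2025
print(ts.month)       # 3
print(ts.day)         # 11
print(ts.hour)        # 14
print(ts.minute)      # 30
print(ts.day_name())  # 'Tuesday'
```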
3. Datetime Columns in a DataFrame
When working with a DataFrame or Series that contains date and time data, you can apply the same functions across the whole column. For example (the dates are illustrative):

df = pd.DataFrame({'date': ['2025-01-15', '2025-02-20', '2025-03-11']})

# Convert to datetime
df['date'] = pd.to_datetime(df['date'])
print(df)

# Component access on a column goes through the .dt accessor
print(df['date'].dt.year)
4. Adding and Subtracting Time
You can use the Timedelta class to add or subtract time from a datetime object. For example, adding one week to a date:
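A minimal sketch (the date is illustrative):

```python
import pandas as pd

date = pd.to_datetime('2025-03-11')

# Add one week
next_week = date + pd.Timedelta(weeks=1)
print(next_week)   # 2025-03-18 00:00:00

# Subtract three days
earlier = date - pd.Timedelta(days=3)
print(earlier)     # 2025-03-08 00:00:00
```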
5. Differences Between Dates
You can perform date and time arithmetic to find the difference between two datetime objects. The result will be a Timedelta object.

date1 = pd.to_datetime('2025-03-01')
date2 = pd.to_datetime('2025-03-11')

# Find the difference
difference = date2 - date1
print(difference)   # Output: 10 days 00:00:00
6. Time Zones
You can also handle time zones by using the tz_localize() and tz_convert() functions.
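A sketch: tz_localize() attaches a zone to a naive timestamp, and tz_convert() translates between zones (the timestamp and zone names are illustrative):

```python
import pandas as pd

ts = pd.to_datetime('2025-03-11 14:30:00')

# Attach a time zone to the naive timestamp
ts_utc = ts.tz_localize('UTC')
print(ts_utc)   # 2025-03-11 14:30:00+00:00

# Convert to another time zone
ts_ny = ts_utc.tz_convert('America/New_York')
print(ts_ny)    # 2025-03-11 10:30:00-04:00 (EDT on this date)
```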
7. Formatting Dates/Times
To format a datetime object as a string, use the strftime() method. This allows you to
customize the output format.
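For example (the format codes follow Python's standard strftime directives):

```python
import pandas as pd

ts = pd.to_datetime('2025-03-11 14:30:00')

print(ts.strftime('%Y-%m-%d'))        # '2025-03-11'
print(ts.strftime('%d/%m/%Y %H:%M'))  # '11/03/2025 14:30'
print(ts.strftime('%B %d, %Y'))       # 'March 11, 2025'
```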
8. Handling Missing Dates (NaT)
If a date or time value is missing, Pandas uses NaT (Not a Time) to represent it, similar to how NaN works for numerical values.
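For example, a missing entry in a date column becomes NaT after conversion:

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(['2025-03-11', None, '2025-03-15']))
print(dates)

# NaT behaves like NaN: it is detected by isnull()/isna()
print(dates.isnull())   # False, True, False
```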
This covers the basic and more advanced operations you will typically need when working with dates and times in Pandas.