Pandas (Ziad)

Pandas is a Python library used for data manipulation and analysis. It provides data structures like Series and DataFrames. A DataFrame is a two-dimensional data structure with labeled rows and columns for working with structured data. Pandas allows loading data from various file formats into DataFrames for analysis and manipulation. Common operations on DataFrames include viewing data, getting summary information, handling duplicates, and transforming the data.

Uploaded by

kkr.nitpy

Introduction to Pandas’ DataFrame
A Library Used for Data Manipulation and Analysis, Built on Powerful Data Structures

By
Dr. Ziad Al-Sharif
Pandas First Steps: install and import
• Pandas is an easy package to install. Open up your terminal program (shell or cmd)
and install it using either of the following commands:
$ conda install pandas
OR
$ pip install pandas

• For Jupyter notebook users, you can run this cell: !pip install pandas
• The ! at the beginning runs the rest of the line as a shell command, as if it were typed in a terminal.

• pandas is conventionally imported under the shorter alias pd, since it is used so often:

import pandas as pd

Installation: https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html
pandas: Data Table Representation
Core components of pandas: Series & DataFrames
• The two primary components of pandas are the Series and the DataFrame.
• A Series is essentially a column, and
• a DataFrame is a two-dimensional table made up of a collection of Series.
• DataFrames and Series are quite similar in that many operations that you can do with one you can do with the other, such as filling in null values and calculating the mean.
• A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.

• Features of DataFrame
• Columns can potentially be of different types
• Size-mutable
• Labeled axes (rows and columns)
• Can perform arithmetic operations on rows and columns
Types of Data Structure in Pandas
Data Structure   Dimensions   Description
Series           1            1D labeled, homogeneous array with immutable size.
DataFrame        2            General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3            General 3D labeled, size-mutable array.

• Series & DataFrame

• Series is a one-dimensional array (1D Array) like structure with homogeneous data.
• DataFrame is a two-dimensional array (2D Array) with heterogeneous data.
• Panel
• Panel is a three-dimensional data structure (3D Array) with heterogeneous data.
• It is hard to draw a panel graphically, but it can be pictured as a container of DataFrames.
• Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so it is only available in older releases.
pandas.DataFrame
pandas.DataFrame(data, index, columns, dtype, copy)

• data: The input data. It can take various forms, such as an ndarray, Series, map, list, dict, constant, or another DataFrame.
• index: Row labels for the resulting frame. Optional; if no index is passed, the default is the integer range 0 to n-1 (equivalent to np.arange(n)).
• columns: Column labels for the resulting frame. Optional; when the data carries no column labels and none are passed, the default is the integer range 0 to n-1 (equivalent to np.arange(n)).
• dtype: Data type to force on the data. Only a single dtype is allowed; if omitted, the types are inferred.
• copy: Boolean flag that controls whether the input data is copied. In older pandas releases the default is False; in recent releases the default depends on the type of the input.

• Create DataFrame
• A pandas DataFrame can be created using various inputs like −
• Lists
• dict
• Series
• Numpy ndarrays
• Another DataFrame
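The constructor parameters above can be sketched with a small NumPy array. The array contents and the labels r1..r3, x, and y are made up for illustration; the point is only to show the default integer labels next to explicit index, columns, and dtype arguments.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 array of integers used as input data
values = np.array([[1, 2], [3, 4], [5, 6]])

# No index/columns passed: both axes get default integer labels
default_df = pd.DataFrame(values)

# Explicit row labels, column labels, and a forced dtype
df = pd.DataFrame(values,
                  index=['r1', 'r2', 'r3'],
                  columns=['x', 'y'],
                  dtype=float)

print(default_df)
print(df)
```

The same index, columns, and dtype arguments apply when the data is a list or dict instead of an ndarray.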
Creating a DataFrame from scratch
• There are many ways to create a DataFrame from scratch, but a great option is to just use a
simple dict. But first you must import pandas.
import pandas as pd
• Let's say we have a fruit stand that sells apples and oranges. We want to have a column for each
fruit and a row for each customer purchase. To organize this as a dictionary for pandas we could
do something like:
data = { 'apples':[3, 2, 0, 1] , 'oranges':[0, 3, 7, 2] }

• And then pass it to the pandas DataFrame constructor:

df = pd.DataFrame(data)
How did that work?
• Each (key, value) item in data corresponds to a column in the resulting DataFrame.
• The Index of this DataFrame was given to us on creation as the numbers 0-3, but we could also
create our own when we initialize the DataFrame.
• E.g. if you want to have customer names as the index:

df = pd.DataFrame(data, index=['Ahmad', 'Ali', 'Rashed', 'Hamza'])

• So now we could locate a customer's order by using their name:

df.loc['Ali']
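Putting the fruit-stand steps together, a minimal runnable sketch looks like this. It reuses the data and customer names from the slides; the extra .loc calls (single cell, list of labels) are added only to show other common forms of label-based lookup.

```python
import pandas as pd

data = {'apples': [3, 2, 0, 1], 'oranges': [0, 3, 7, 2]}
df = pd.DataFrame(data, index=['Ahmad', 'Ali', 'Rashed', 'Hamza'])

# One row label -> the row comes back as a Series
ali_order = df.loc['Ali']
print(ali_order)

# Row label plus column label -> a single value
print(df.loc['Ali', 'oranges'])

# A list of row labels -> a smaller DataFrame
print(df.loc[['Ali', 'Hamza']])
```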
pandas.DataFrame.from_dict
pandas.DataFrame.from_dict(data, orient='columns', dtype=None, columns=None)

• data : dict
• Of the form {field:array-like} or {field:dict}.

• orient : {‘columns’, ‘index’}, default ‘columns’

• The “orientation” of the data.
• If the keys of the passed dict should be the columns of the resulting DataFrame, pass ‘columns’ (default).
• Otherwise if the keys should be rows, pass ‘index’.

• dtype : dtype, default None

• Data type to force, otherwise infer.

• columns : list, default None

• Column labels to use when orient='index'. Raises a ValueError if used with orient='columns'.

https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.DataFrame.from_dict.html
pandas’ orient keyword
data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
pd.DataFrame.from_dict(data)

data = {'row_1': [3, 2, 1, 0], 'row_2': ['a', 'b', 'c', 'd']}
pd.DataFrame.from_dict(data, orient='index')

data = {'row_1': [3, 2, 1, 0], 'row_2': ['a', 'b', 'c', 'd']}
pd.DataFrame.from_dict(data, orient='index', columns=['A', 'B', 'C', 'D'])
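A runnable sketch of the three calls above, with the resulting shapes checked, may make the effect of orient easier to see. The variable names by_cols, by_rows, and labeled are illustrative only. The last part exercises the documented ValueError when columns is combined with orient='columns'.

```python
import pandas as pd

data = {'k1': [3, 2, 1, 0], 'k2': ['a', 'b', 'c', 'd']}

# Default orient='columns': each dict key becomes a column
by_cols = pd.DataFrame.from_dict(data)
print(by_cols.shape)    # (4, 2)

# orient='index': each dict key becomes a row
by_rows = pd.DataFrame.from_dict(data, orient='index')
print(by_rows.shape)    # (2, 4)

# orient='index' also allows naming the resulting columns
labeled = pd.DataFrame.from_dict(data, orient='index',
                                 columns=['A', 'B', 'C', 'D'])
print(labeled)

# Passing columns together with orient='columns' is rejected
try:
    pd.DataFrame.from_dict(data, orient='columns', columns=['A', 'B'])
    raised = False
except ValueError:
    raised = True
print(raised)
```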
Loading a DataFrame from files
Reading data from a CSV file
Reading data from CSVs
• With CSV files, all you need is a single line to load in the data:

df = pd.read_csv('dataset.csv')

• A CSV file does not store a separate index the way a DataFrame does, so we designate which column should become the index with index_col when reading:

df = pd.read_csv('dataset.csv', index_col=0)

• Note: here we're setting the index to be column zero.
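The effect of index_col can be sketched without a file on disk by reading from an in-memory buffer. The CSV text below is hypothetical; with a real file, the filename would replace the io.StringIO object.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for dataset.csv
csv_text = "customer,apples,oranges\nAhmad,3,0\nAli,2,3\n"

# Without index_col, pandas adds a default integer index
default_df = pd.read_csv(io.StringIO(csv_text))
print(default_df)

# With index_col=0, the first column becomes the index
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(indexed_df)
```

index_col also accepts a column name, so index_col='customer' would select the same column here.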

Reading data from JSON
• A JSON file stores data in a structure that closely resembles a (possibly nested) Python dict, and pandas can read it just as easily:

df = pd.read_json('dataset.json')

• Notice that this time the index came through correctly, because the nested structure of the JSON can carry the row labels.
• pandas tries to work out how to build a DataFrame by analyzing the structure of the JSON, and sometimes it gets it wrong.
• Often you will need to set the orient keyword argument to match the structure of the JSON.
Examples #1 to #3: Reading data from JSON
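How the orient argument changes the result of read_json can be sketched with a small in-memory document. The JSON text is hypothetical, and it is wrapped in io.StringIO so that it is treated as file content rather than as a path.

```python
import io
import pandas as pd

# Hypothetical JSON in which the outer keys are customer names
json_text = ('{"Ahmad": {"apples": 3, "oranges": 0}, '
             '"Ali": {"apples": 2, "oranges": 3}}')

# Default orient for a DataFrame is 'columns':
# the outer keys become column labels
as_columns = pd.read_json(io.StringIO(json_text))
print(as_columns)

# orient='index': the outer keys become row labels instead
as_rows = pd.read_json(io.StringIO(json_text), orient='index')
print(as_rows)
```

If the frame comes out transposed relative to what was expected, switching orient is usually the first thing to try.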
Converting back to a CSV or JSON
• So after extensive work on cleaning your data, you’re now ready to save it as a file of your choice.
Similar to the ways we read in data, pandas provides intuitive commands to save it:

df.to_csv('new_dataset.csv')
df.to_json('new_dataset.json')
df.to_sql('new_dataset', con)

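• When we save JSON and CSV files, all we have to pass to those methods is the desired filename with the appropriate file extension.
• For to_sql, the first argument is the table name, and con must be an existing database connection (for example a SQLAlchemy engine or a sqlite3 connection).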
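A small round trip shows the three writers together. This is only a sketch: the DataFrame is the fruit-stand example cut down to two rows, the files go to a temporary directory, and an in-memory sqlite3 database stands in for con.

```python
import os
import sqlite3
import tempfile
import pandas as pd

df = pd.DataFrame({'apples': [3, 2], 'oranges': [0, 3]},
                  index=['Ahmad', 'Ali'])

# Write CSV and JSON into a temporary directory, then read them back
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, 'new_dataset.csv')
    json_path = os.path.join(tmp, 'new_dataset.json')
    df.to_csv(csv_path)
    df.to_json(json_path)
    from_csv = pd.read_csv(csv_path, index_col=0)
    from_json = pd.read_json(json_path)

# Write to a table in an in-memory SQLite database
con = sqlite3.connect(':memory:')
df.to_sql('new_dataset', con)
from_sql = pd.read_sql('SELECT * FROM new_dataset', con)
con.close()

print(from_csv)
print(from_json)
print(from_sql)
```

In the SQL result the unnamed index is stored as an ordinary column called index, which is why it shows up next to apples and oranges.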

Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html
Most important
DataFrame operations
• DataFrames possess hundreds of methods and other operations that are crucial to any analysis.
• As a beginner, you should know the operations that:
• perform simple transformations of your data, and those
• that provide fundamental statistical analysis of your data.
Loading dataset
• We're loading this dataset from a CSV and designating the movie titles to be our
index.

movies_df = pd.read_csv("movies.csv", index_col="title")

https://grouplens.org/datasets/movielens/
Viewing your data
• The first thing to do when opening a new dataset is print out a few rows to keep as a visual
reference. We accomplish this with .head():
movies_df.head()

• .head() outputs the first five rows of your DataFrame by default, but we could also pass a number
as well: movies_df.head(10) would output the top ten rows, for example.

• To see the last five rows use .tail(), which also accepts a number. In this case we print the bottom two rows:
movies_df.tail(2)
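Since movies.csv may not be at hand, the behaviour of .head() and .tail() can be tried on a small stand-in. The eight-row frame below is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for movies_df: eight rows numbered 0..7
df = pd.DataFrame({'n': range(8)})

first_five = df.head()     # default: first five rows
first_three = df.head(3)   # first three rows
last_two = df.tail(2)      # last two rows

print(first_five)
print(first_three)
print(last_two)
```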
Getting info about your data
• .info() should be one of the very first commands you run after loading your data.
• .info() provides the essential details about your dataset, such as the number of rows and columns, the number of non-null values, what type of data is in each column, and how much memory your DataFrame is using.
movies_df.info()

• .shape is an attribute, not a method, so it is used without parentheses. It returns a (rows, columns) tuple.
movies_df.shape
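A minimal sketch of .info() and .shape on a hypothetical three-row frame with one missing value follows. The buf argument is used here only so that the report can be captured as a string; calling df.info() with no arguments prints it directly.

```python
import io
import pandas as pd

# Hypothetical frame with one missing rating
df = pd.DataFrame({'title': ['A', 'B', 'C'],
                   'rating': [7.5, None, 8.1]})

# Capture the .info() report in a string buffer
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)

# .shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
print(n_rows, n_cols)
```

The non-null count for rating is lower than the number of rows, which is a quick way to spot columns with missing values.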
Handling duplicates
• This dataset does not have duplicate rows, but it is always important to verify that you aren't aggregating duplicate rows.
• To demonstrate, let's double up our movies DataFrame by concatenating it with itself.
• pd.concat() returns a new DataFrame without affecting the original. We capture this copy in temp_df so we aren't working with the real data. (Older tutorials use movies_df.append(movies_df); DataFrame.append() was removed in pandas 2.0.)
• Calling .shape quickly proves that the number of rows has doubled.

temp_df = pd.concat([movies_df, movies_df])
temp_df.shape

Now we can try dropping duplicates:

temp_df = temp_df.drop_duplicates()
temp_df.shape
Handling duplicates
• Just like pd.concat(), the drop_duplicates() method also returns a copy of your DataFrame, but this time with duplicates removed. Calling .shape confirms we're back to the 1000 rows of our original dataset.
• It's a little verbose to keep assigning DataFrames to the same variable like in this example. For this reason, pandas has the inplace keyword argument on many of its methods. Using inplace=True will modify the DataFrame object in place:

temp_df.drop_duplicates(inplace=True)

• Another important argument for drop_duplicates() is keep, which has three possible options:
• 'first': (default) Drop duplicates except for the first occurrence.
• 'last': Drop duplicates except for the last occurrence.
• False: Drop every row that has a duplicate, keeping none of the copies.
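The three keep options are easiest to compare on a tiny frame. The data below is hypothetical: rows 0 and 2 are identical, row 1 is unique.

```python
import pandas as pd

# Rows 0 and 2 are duplicates of each other
df = pd.DataFrame({'title': ['A', 'B', 'A'],
                   'year': [2001, 2002, 2001]})

keep_first = df.drop_duplicates()              # keeps rows 0 and 1
keep_last = df.drop_duplicates(keep='last')    # keeps rows 1 and 2
keep_none = df.drop_duplicates(keep=False)     # keeps only row 1

print(keep_first)
print(keep_last)
print(keep_none)
```

Note that the original row labels are preserved, so the index of the result shows which occurrences survived.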

https://www.learndatasci.com/tutorials/python-pandas-tutorial-complete-introduction-for-beginners/
Understanding your variables
• Using .describe() on an entire DataFrame gives a summary of the distribution of its numeric columns:
movies_df.describe()

• .describe() can also be used on a categorical variable to get the count of rows, unique count of categories, top category, and freq of top category:
movies_df['genre'].describe()

• This tells us that the genre column has 207 unique values and that the top value is Action/Adventure/Sci-Fi, which shows up 50 times (freq).
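Both forms of .describe() can be tried on a small hypothetical frame with one text column and one numeric column. The values are made up so that the summary is easy to verify by hand.

```python
import pandas as pd

# Hypothetical data: 'Action' appears twice, the other genres once
df = pd.DataFrame({'genre': ['Action', 'Drama', 'Action', 'Comedy'],
                   'rating': [7.0, 8.0, 6.0, 9.0]})

# On the whole frame: statistics for the numeric column(s) only
numeric_summary = df.describe()
print(numeric_summary)

# On a text column: count, unique, top, freq
genre_summary = df['genre'].describe()
print(genre_summary)
```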
More Examples

import pandas as pd
data = [1,2,3,10,20,30]
df = pd.DataFrame(data)
print(df)

import pandas as pd
data = {'Name' : ['AA', 'BB'], 'Age': [30,45]}
df = pd.DataFrame(data)
print(df)
More Examples

import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)

Output:
   a   b     c
0  1   2   NaN
1  5  10  20.0

import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
print(df)

Output:
        a   b     c
first   1   2   NaN
second  5  10  20.0
More Examples
E.g. This shows how to create a DataFrame from a list of dictionaries, with explicit row labels and column labels.

import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]

# Two column labels, both matching dictionary keys
df1 = pd.DataFrame(data,index=['first','second'],columns=['a','b'])

# Two column labels, one of which ('b1') is not a dictionary key,
# so that column is filled with NaN
df2 = pd.DataFrame(data,index=['first','second'],columns=['a','b1'])

print(df1)
print('...........')
print(df2)

Output:
        a   b
first   1   2
second  5  10
...........
        a  b1
first   1 NaN
second  5 NaN
More Examples:
Create a DataFrame from Dict of Series
import pandas as pd
d = {'one' : pd.Series([1, 2, 3] , index=['a', 'b', 'c']),
'two' : pd.Series([1,2, 3, 4], index=['a', 'b', 'c', 'd'])
}
df = pd.DataFrame(d)
print(df)

one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
More Examples: Column Addition
import pandas as pd
d = {'one':pd.Series([1,2,3], index=['a','b','c']),
     'two':pd.Series([1,2,3,4], index=['a','b','c','d'])
    }
df = pd.DataFrame(d)

# Adding a new column to an existing DataFrame object
# with column label by passing new series
print("Adding a new column by passing as Series:")
df['three'] = pd.Series([10,20,30],index=['a','b','c'])
print(df)

# Adding a new column computed from existing columns
print("Adding a column using existing columns in the DataFrame:")
df['four'] = df['one']+df['three']
print(df)

Output:
Adding a new column by passing as Series:
   one  two  three
a  1.0    1   10.0
b  2.0    2   20.0
c  3.0    3   30.0
d  NaN    4    NaN
Adding a column using existing columns in the DataFrame:
   one  two  three  four
a  1.0    1   10.0  11.0
b  2.0    2   20.0  22.0
c  3.0    3   30.0  33.0
d  NaN    4    NaN   NaN
More Examples: Column Deletion
# Using the previous DataFrame, we will delete a column
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
     'three' : pd.Series([10,20,30], index=['a','b','c'])
    }
df = pd.DataFrame(d)
print("Our dataframe is:")
print(df)

# using the del statement
print("Deleting the first column using DEL function:")
del df['one']
print(df)

# using the pop method
print("Deleting another column using POP function:")
df.pop('two')
print(df)

Output:
Our dataframe is:
   one  two  three
a  1.0    1   10.0
b  2.0    2   20.0
c  3.0    3   30.0
d  NaN    4    NaN
Deleting the first column using DEL function:
   two  three
a    1   10.0
b    2   20.0
c    3   30.0
d    4    NaN
Deleting another column using POP function:
   three
a   10.0
b   20.0
c   30.0
d    NaN
More Examples: Slicing in DataFrames
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c','d'])
}
df = pd.DataFrame(d)
print(df[2:4])

one two
c 3.0 3
d NaN 4
More Examples: Addition of rows
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c','d'])
    }
df = pd.DataFrame(d)
print(df)

df2 = pd.DataFrame([[5,6], [7,8]], columns = ['a', 'b'])
# DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job
df = pd.concat([df, df2])
print(df)

Output:
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4
   one  two    a    b
a  1.0  1.0  NaN  NaN
b  2.0  2.0  NaN  NaN
c  3.0  3.0  NaN  NaN
d  NaN  4.0  NaN  NaN
0  NaN  NaN  5.0  6.0
1  NaN  NaN  7.0  8.0
More Examples: Deletion of rows
import pandas as pd
d = {'one':pd.Series([1, 2, 3], index=['a','b','c']),
     'two':pd.Series([1, 2, 3, 4], index=['a','b','c','d'])
    }
df = pd.DataFrame(d)
print(df)

df2 = pd.DataFrame([[5,6], [7,8]], columns = ['a', 'b'])
# DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job
df = pd.concat([df, df2])
print(df)

# Drop the row whose index label is 0
df = df.drop(0)
print(df)

Output:
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4
   one  two    a    b
a  1.0  1.0  NaN  NaN
b  2.0  2.0  NaN  NaN
c  3.0  3.0  NaN  NaN
d  NaN  4.0  NaN  NaN
0  NaN  NaN  5.0  6.0
1  NaN  NaN  7.0  8.0
   one  two    a    b
a  1.0  1.0  NaN  NaN
b  2.0  2.0  NaN  NaN
c  3.0  3.0  NaN  NaN
d  NaN  4.0  NaN  NaN
1  NaN  NaN  7.0  8.0
More Examples: Reindexing
• The pandas DataFrame.reindex_like() method returns an object whose indices match those of another object.
• Any non-matching indexes are filled with NaN values.

import pandas as pd
# Creating the first dataframe
df1 = pd.DataFrame({"A":[1, 5, 3, 4, 2],
                    "B":[3, 2, 4, 3, 4],
                    "C":[2, 2, 7, 3, 4],
                    "D":[4, 3, 6, 12, 7]},
                    index =["A1", "A2", "A3", "A4", "A5"])

# Creating the second dataframe
df2 = pd.DataFrame({"A":[10, 11, 7, 8, 5],
                    "B":[21, 5, 32, 4, 6],
                    "C":[11, 21, 23, 7, 9],
                    "D":[1, 5, 3, 8, 6]},
                    index =["A1", "A3", "A4", "A7", "A8"])

# Print both dataframes
print(df1)
print(df2)

# Reindex df1 so that its row and column labels match those of df2
df1.reindex_like(df2)
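A smaller version of the same idea makes the NaN filling easy to check by hand. The two frames below are cut-down, hypothetical variants of the ones above. The values in the result come from df1; only the labels come from df2.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 5, 3], "B": [3, 2, 4]},
                   index=["A1", "A2", "A3"])
df2 = pd.DataFrame({"A": [10, 11, 7], "B": [21, 5, 32]},
                   index=["A1", "A3", "A7"])

# Rows A1 and A3 exist in df1 and keep df1's values;
# row A7 does not exist in df1, so it is filled with NaN;
# row A2 is dropped because df2 has no such label
result = df1.reindex_like(df2)
print(result)
```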
More Examples:
Concatenating Objects (Data Frames)

import pandas as pd
df1 = pd.DataFrame({'Name':['A','B'], 'SSN':[10,20], 'marks':[90, 95] })
df2 = pd.DataFrame({'Name':['B','C'], 'SSN':[25,30], 'marks':[80, 97] })
df3 = pd.concat([df1, df2])
df3
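One detail worth seeing in the concatenation above is what happens to the row labels. The sketch below reuses the two frames from the slide and adds ignore_index=True as an optional variation, which renumbers the rows.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['A', 'B'], 'SSN': [10, 20], 'marks': [90, 95]})
df2 = pd.DataFrame({'Name': ['B', 'C'], 'SSN': [25, 30], 'marks': [80, 97]})

# Default: the original row labels are kept, so 0 and 1 appear twice
df3 = pd.concat([df1, df2])
print(df3)

# ignore_index=True: rows are relabeled 0..n-1
df4 = pd.concat([df1, df2], ignore_index=True)
print(df4)
```

Repeated labels matter later: df3.loc[0] returns two rows, whereas df4.loc[0] returns one.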
References
• pandas documentation
• https://pandas.pydata.org/pandas-docs/stable/index.html
• pandas: Input/output
• https://pandas.pydata.org/pandas-docs/stable/reference/io.html
• pandas: DataFrame
• https://pandas.pydata.org/pandas-docs/stable/reference/frame.html
• pandas: Series
• https://pandas.pydata.org/pandas-docs/stable/reference/series.html
• pandas: Plotting
• https://pandas.pydata.org/pandas-docs/stable/reference/plotting.html
