
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY, CHENNAI.
21CSS101J – Programming for Problem Solving
Unit 5

LEARNING RESOURCES

TEXT BOOKS
1. Python Data Science Handbook, Jake VanderPlas, O'Reilly, 2017. [Chapters 2 & 3]
2. Python for Beginners, Timothy C. Needham, 2019. [Chapters 1 to 4]
3. https://www.tutorialspoint.com/python/index.htm
4. https://www.w3schools.com/python/
UNIT V (TOPICS COVERED)
Creating NumPy Array - NumPy Indexing - NumPy Array Attributes - Slicing using NumPy - Descriptive Statistics in NumPy: Percentile - Variance in NumPy - Introduction to Pandas - Creating Series Objects, DataFrame Objects - Simple Operations with DataFrames - Querying from DataFrames - Applying Functions to DataFrames - Comparison between NumPy and Pandas - Speed Testing between NumPy and Pandas - Other Python Libraries
UNIT 5: NumPy (Numerical Python)
NumPy
• Stands for Numerical Python.
• Is the fundamental package required for high-performance computing and data analysis.
• NumPy is important for numerical computations in Python because it is designed for efficiency on large arrays of data.
• It provides:
  - ndarray, for creating multidimensional arrays.
  - Internal storage of data in a contiguous block of memory, independent of other built-in Python objects, using much less memory than built-in Python sequences.
  - Standard math functions for fast operations on entire arrays of data without having to write loops.
• NumPy arrays are important because they enable you to express batch operations on data without writing any for loops. We call this vectorization.
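For instance, a minimal sketch of vectorization (the array values here are just for illustration):

import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])   # small example array
doubled = data * 2                       # the scalar is applied to every element, no loop
total = (data ** 2).sum()                # element-wise square, then a reduction
print(doubled)                           # [2. 4. 6. 8.]
print(total)                             # 30.0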
NumPy ndarray vs list
One of the key features of NumPy is its N-dimensional array
object, or ndarray, which is a fast, flexible container for
large datasets in Python.
Whenever you see “array,” “NumPy array,” or “ndarray” in the
text, with few exceptions they all refer to the same thing: the
ndarray object.
NumPy-based algorithms are generally 10 to 100 times faster
(or more) than their pure Python counterparts and use
significantly less memory.
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))
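A rough timing sketch of the claim above; the exact ratio depends on the machine, but the vectorized multiply is typically well over an order of magnitude faster than the pure Python loop:

import time
import numpy as np

my_arr = np.arange(1000000)
my_list = list(range(1000000))

start = time.time()
for _ in range(10):
    my_arr2 = my_arr * 2                    # vectorized multiply
print("ndarray:", time.time() - start)

start = time.time()
for _ in range(10):
    my_list2 = [x * 2 for x in my_list]     # pure Python loop
print("list:   ", time.time() - start)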
ndarray
ndarray is used for storage of homogeneous data
Every array must have a shape and a dtype
Supports convenient slicing, indexing and efficient vectorized
computation
1-D Arrays
An array whose elements are scalars is a 1-D array.

2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.
These are often used to represent matrices or 2nd-order tensors.

3-D Arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
These are often used to represent a 3rd-order tensor.

NumPy arrays provide the ndim attribute, which returns an integer that tells us how many dimensions the array has.
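A minimal example of ndim on arrays of different dimensions:

import numpy as np

a = np.array([1, 2, 3])                               # 1-D
b = np.array([[1, 2, 3], [4, 5, 6]])                  # 2-D
c = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])    # 3-D
print(a.ndim, b.ndim, c.ndim)                         # 1 2 3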

Higher Dimensional Arrays


An array can have any number of dimensions.
When the array is created, you can define the number of dimensions by using the
ndmin argument.
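For example, a small sketch using the ndmin argument:

import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)       # force at least 5 dimensions
print(arr)                                  # [[[[[1 2 3 4]]]]]
print('number of dimensions:', arr.ndim)    # 5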
Access Array Elements
Array indexing is the same as accessing an array element.
You can access an array element by referring to its index number.
The indexes in NumPy arrays start with 0, meaning that the first element has
index 0, and the second has index 1 etc.

Access 2-D Arrays
Think of 2-D arrays like a table with rows and columns, where the first index selects the row (the dimension) and the second index selects the column.
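A short sketch of indexing 1-D and 2-D arrays (values chosen for illustration):

import numpy as np

arr1 = np.array([1, 2, 3, 4])
print(arr1[0])        # 1  -> first element

arr2 = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr2[0, 1])     # 2  -> row 0, column 1
print(arr2[1, 4])     # 10 -> row 1, column 4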
Access 3-D Arrays
To access elements of a 3-D array, use comma-separated indexes, one for each dimension.

Negative Indexing
Use negative indexing to access an array from the end.
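A short sketch of 3-D access and negative indexing (example values only):

import numpy as np

arr3 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr3[0, 1, 2])   # 6  -> first 2-D block, second row, third element

arr2 = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr2[1, -1])     # 10 -> last element of the second row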
Slicing arrays
• Slicing in Python means taking elements from one given index to another given index.
• We pass a slice instead of an index, like this: [start:end].
• We can also define the step, like this: [start:end:step].
• If we don't pass start, it is considered 0.
• If we don't pass end, it is considered the length of the array in that dimension.
• If we don't pass step, it is considered 1.

Slicing a 2-D array

From both elements, return index 2:

Converting Data Type on Existing Arrays
The astype() function creates a copy of the array, and allows you to specify the data type as
a parameter.
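For example, a small astype() sketch:

import numpy as np

arr = np.array([1.1, 2.7, 3.5])
newarr = arr.astype(int)    # returns a copy with the requested dtype (truncates)
print(newarr)               # [1 2 3]
print(newarr.dtype)         # int64 (platform dependent)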

The Difference Between Copy and View


• The main difference between a copy and a view of an array is that the
copy is a new array, and the view is just a view of the original array.

• The copy owns the data and any changes made to the copy will not
affect original array, and any changes made to the original array will not
affect the copy.

• The view does not own the data and any changes made to the view will
affect the original array, and any changes made to the original array will
affect the view.
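A short sketch that shows the difference in behaviour:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
c = arr.copy()   # the copy owns its own data
v = arr.view()   # the view shares data with the original

arr[0] = 42
print(c)   # [1 2 3 4 5]      -> copy is unaffected
print(v)   # [42  2  3  4  5] -> view reflects the change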
Joining NumPy Arrays
We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If axis is not explicitly passed, it is taken as 0.

Splitting NumPy Arrays
For splitting arrays we use array_split(); we pass it the array we want to split and the number of splits.
If the array has fewer elements than required, it will adjust from the end accordingly.
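A minimal sketch of concatenate() and array_split():

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.concatenate((a, b)))    # [1 2 3 4 5 6]  (axis defaults to 0)

parts = np.array_split(np.array([1, 2, 3, 4, 5]), 3)
print(parts)   # [array([1, 2]), array([3, 4]), array([5])] -> last part adjusted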
Searching Arrays
You can search an array for a certain value, and return the indexes that get a
match. To search an array, use the where() method.
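For example:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])
idx = np.where(arr == 4)    # indexes where the value is 4
print(idx)                  # (array([3, 5, 6]),)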

Sorting
Operations between arrays and scalars
Array creation functions
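The three topics above are only named on the slide; a brief sketch of each:

import numpy as np

# Sorting: np.sort() returns a sorted copy of the array
print(np.sort(np.array([3, 1, 2])))     # [1 2 3]

# Operations between arrays and scalars apply element-wise
arr = np.array([1.0, 2.0, 3.0])
print(arr * 10)                         # [10. 20. 30.]
print(1 / arr)                          # [1.  0.5  0.33333333] (approximately)

# Common array creation functions
print(np.zeros(3))                      # [0. 0. 0.]
print(np.ones((2, 2)))                  # [[1. 1.] [1. 1.]]
print(np.arange(0, 10, 2))              # [0 2 4 6 8]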
NumPy Indexing
Contents of an ndarray object can be accessed and modified by indexing or slicing, just like Python's built-in container objects.
Items in an ndarray object follow a zero-based index. Three types of indexing methods are available: field access, basic slicing, and advanced indexing.
Basic slicing is an extension of Python's basic concept of slicing to n dimensions. A Python slice object is constructed by giving start, stop, and step parameters to the built-in slice function. This slice object is passed to the array to extract a part of the array.
Example:

import numpy as np
a = np.arange(10)
s = slice(2, 7, 2)
print(a[s])

# Equivalent, using slice notation directly:
import numpy as np
a = np.arange(10)
b = a[2:7:2]
print(b)

Output:
[2 4 6]

The ndarray object is prepared by the arange() function. Then a slice object is defined with start, stop, and step values of 2, 7, and 2 respectively. When this slice object is passed to the ndarray, the part of it starting at index 2 up to (but not including) 7, with a step of 2, is sliced.
If a : is inserted in front of an index, all items from that index onwards will be extracted.
Descriptive Statistics in NumPy
Descriptive statistics allow us to summarise data sets quickly with just a couple of numbers, and are in general easy to explain to others.

Descriptive statistics fall into two general categories:
1) Measures of central tendency, which describe a 'typical' or common value (e.g. mean, median, and mode); and
2) Measures of spread, which describe how far apart values are (e.g. percentiles, variance, and standard deviation).
Percentile:
The numpy.percentile() function is used to compute the nth percentile of the given data (array elements) along the specified axis.

Syntax: numpy.percentile(arr, n, axis=None, out=None)

Parameters:
arr : input array.
n : percentile value.
axis : axis along which we want to calculate the percentile value. If it is None, arr is flattened and the percentile is computed over all axes. axis = 0 means working along the columns and axis = 1 means working along the rows.
out : a different array in which to place the result. The array must have the same dimensions as the expected output.

Return: nth percentile of the array (a scalar value if axis is None) or an array with percentile values along the specified axis.
Example:

# Python program illustrating
# the numpy.percentile() method

import numpy as np

# 1-D array
arr = [20, 2, 7, 1, 34]
print("arr : ", arr)

print("50th percentile of arr : ", np.percentile(arr, 50))
print("25th percentile of arr : ", np.percentile(arr, 25))
print("75th percentile of arr : ", np.percentile(arr, 75))

Output:
arr :  [20, 2, 7, 1, 34]
50th percentile of arr :  7.0
25th percentile of arr :  2.0
75th percentile of arr :  20.0

(Sorted, the data is 1 2 7 20 34, so the 50th percentile is the middle value, 7.)
Example:
NumPy has two related functions, percentile and quantile. The percentile function uses q in the range [0, 100], e.g. for the 90th percentile use 90, whereas the quantile function uses q in the range [0, 1], so the equivalent q would be 0.9. They can be used interchangeably.

p25 = np.percentile(data_sample_even, q=25, interpolation='linear')
p75 = np.percentile(data_sample_even, q=75, interpolation='linear')
iqr = p75 - p25
Variance in NumPy
Variance is the average of the squared differences between each value and the mean. The mathematical formula for variance is:

    variance = (1/N) * Σ (x_i - mean)^2

where N is the total number of elements or the frequency of the distribution.

Calculate the variance by using the numpy.var() function.


Syntax:
numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)

Parameters:
a: Array containing the data whose variance is to be computed.
axis: Axis or axes along which the variance is computed.
dtype: Type to use in computing the variance.
out: Alternate output array in which to place the result.
ddof: Delta Degrees of Freedom.
keepdims: If this is set to True, the axes which are reduced are left in the result as dimensions with size one.
Example:

# Python program to get the variance of a list

# Importing the NumPy module
import numpy as np

# Taking a list of elements
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Calculating variance using var()
print(np.var(data))

Output:
4.0
Introduction to Pandas -
What is Pandas?
• Pandas is a Python library used for working with data sets.
• It has functions for analyzing, cleaning, exploring, and
manipulating data.
• The name "Pandas" has a reference to both "Panel Data", and
"Python Data Analysis" and was created by Wes McKinney in
2008.
Why Use Pandas?
• Pandas allows us to analyze big data and make conclusions
based on statistical theories.
• Pandas can clean messy data sets, and make them readable
and relevant.
• Relevant data is very important in data science.
Installation of pandas:
C:\Users\Your Name>pip install pandas

Once Pandas is installed, import it in your applications by adding the import keyword:

import pandas

or, with the commonly used alias:

import pandas as pd

Creating Series Objects


What is a Series?
• A Pandas Series is like a column in a table.
• It is a one-dimensional array holding data of any type.
• Example
• Create a simple Pandas Series from a list:

import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
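Running this prints the values with their default integer labels:

0    1
1    7
2    2
dtype: int64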
Labels
• If nothing else is specified, the values are labeled with their
index number. First value has index 0, second value has index 1
etc.
• This label can be used to access a specified value.
Create Labels
With the index argument, you can name your own labels.
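For example, a small sketch using the index argument:

import pandas as pd

a = [1, 7, 2]
myvar = pd.Series(a, index=["x", "y", "z"])
print(myvar)
print(myvar["y"])   # 7, accessed by its label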
Key/Value Objects as Series
You can also use a key/value object, like a dictionary, when
creating a Series. The keys of the dictionary become the labels.

Example
Create a simple Pandas Series from a dictionary:

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories)

print(myvar)
To select only some of the items in the dictionary, use the index
argument and specify only the items you want to include in the
Series.

Example
Create a Series using only data from "day1" and "day2":

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories, index = ["day1", "day2"])

print(myvar)
Pandas DataFrame
• A DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).
• Data is aligned in a tabular fashion in rows and columns.
• A Pandas DataFrame consists of three principal components: the data, the rows, and the columns.
DataFrame Objects
Data sets in Pandas are usually multi-dimensional tables, called DataFrames.
A Series is like a column; a DataFrame is the whole table.

Example
Create a DataFrame from two Series:

import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
myvar = pd.DataFrame(data)
print(myvar)
What is a DataFrame?
A Pandas DataFrame is a 2 dimensional data structure, like a 2
dimensional array, or a table with rows and columns.
Example
Create a simple Pandas DataFrame:

import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Locate Row
As you can see from the result above, the DataFrame is like a table with rows and columns.
Pandas uses the loc attribute to return one or more specified row(s).

Example
Return row 0:

#refer to the row index:


print(df.loc[0])
Return row 0 and 1:
#use a list of indexes:
print(df.loc[[0, 1]])
Named Indexes
With the index argument, you can name your own indexes.

Example
Add a list of names to give each row a name:

import pandas as pd

data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}

df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)
Locate Named Indexes
Use the named index in the loc attribute to return the specified
row(s).

Example
Return "day2":

#refer to the named index:


print(df.loc["day2"])
Read CSV Files

• A simple way to store big data sets is to use CSV (Comma Separated Values) files.

• CSV files contain plain text and are a well-known format that can be read by everyone, including Pandas.

• In our examples we will be using a CSV file called 'data.csv'.


Load Files Into a DataFrame
If your data sets are stored in a file, Pandas can load them into a
DataFrame.

Example
Load a comma separated file (CSV file) into a DataFrame:

import pandas as pd

df = pd.read_csv('data.csv')

print(df)
max_rows
• The number of rows returned is defined in the Pandas option settings.
• You can check your system's maximum rows with the pd.options.display.max_rows statement.
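For example (the default limit is commonly 60, but it can be changed):

import pandas as pd

print(pd.options.display.max_rows)   # check the current setting
pd.options.display.max_rows = 9999   # allow print(df) to show more rows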
Read JSON
• Big data sets are often stored, or extracted as JSON (JavaScript Object Notation).
• JSON is plain text, but has the format of an object, and is well known in the world of
programming, including Pandas.
• In our examples we will be using a JSON file called 'data.json'.
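A minimal sketch, assuming a local file called 'data.json' exists:

import pandas as pd

df = pd.read_json('data.json')
print(df.to_string())   # to_string() prints the entire DataFrame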
Simple Operations with DataFrames
Basic operations which can be performed on a Pandas DataFrame:

Creating a DataFrame
Dealing with Rows and Columns
Indexing and Selecting Data
Working with Missing Data
Iterating over rows and columns
Create a Pandas DataFrame from Lists
DataFrame can be created using a single list or a list of lists.

# import pandas as pd
import pandas as pd

# list of strings
lst = ['Geeks', 'For', 'Geeks', 'is',
'portal', 'for', 'Geeks']

# Calling DataFrame constructor on list


df = pd.DataFrame(lst)
print(df)
Dealing with Rows and Columns
A Data frame is a two-dimensional data structure, i.e., data is aligned
in a tabular fashion in rows and columns. We can perform basic
operations on rows/columns like selecting, deleting, adding, and
renaming.

Column Selection: In order to select a column in a Pandas DataFrame, we can access the columns by calling them by their column names.
# Import pandas package
import pandas as pd

# Define a dictionary containing employee data


data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}

# Convert the dictionary into DataFrame


df = pd.DataFrame(data)

# select two columns


print(df[['Name', 'Qualification']])
Column Addition
Dropping Columns
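The two headings above are sketched below on a small, made-up employee DataFrame:

import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Age': [27, 24, 22, 32]})

# Column Addition: assign a list (or Series) to a new column label
df['Address'] = ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj']

# Dropping Columns: drop() with axis=1 removes columns
df = df.drop(['Age'], axis=1)
print(df)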
Row Selection: Pandas provides a unique method to retrieve rows from a DataFrame. The DataFrame.loc[] method is used to retrieve rows from a Pandas DataFrame by label. Rows can also be selected by passing an integer location to the iloc[] function.

# importing pandas package


import pandas as pd
# making data frame from csv file
data = pd.read_csv("nba.csv",
index_col ="Name")

# retrieving row by loc method


first = data.loc["Avery Bradley"]
second = data.loc["R.J. Hunter"]
print(first, "\n\n\n", second)
Adding New Row
Dropping Row
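The two headings above are sketched below with a small, made-up DataFrame (pd.concat is used for adding a row, since DataFrame.append has been removed in recent pandas versions):

import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'], 'Age': [27, 24]})

# Adding a new row: concatenate a one-row DataFrame
new_row = pd.DataFrame({'Name': ['Gaurav'], 'Age': [22]})
df = pd.concat([df, new_row], ignore_index=True)

# Dropping a row by its index label
df = df.drop(0)   # removes the row labelled 0
print(df)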
Click and refer to more problems in pandas:
https://www.geeksforgeeks.org/dealing-with-rows-and-columns-in-pandas-dataframe/?ref=lbp
Working with Missing Data
Checking for missing values using isnull() and notnull():
In order to check missing values in a Pandas DataFrame, we use the functions isnull() and notnull(). Both functions help in checking whether a value is NaN or not. These functions can also be used on a Pandas Series in order to find null values in a series.

# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np

# dictionary of lists (np.nan marks the missing values)
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}

# creating a DataFrame from the dictionary
df = pd.DataFrame(scores)

# using the isnull() function
print(df.isnull())
Querying from Data Frames

The query() method allows you to query the DataFrame.
The query() method takes a query expression as a string parameter, which has to evaluate to either True or False.
It returns the DataFrame rows where the result is True according to the query expression.

Syntax
dataframe.query(expr, inplace)
Example:
Return the rows where age is over 35:

import pandas as pd

data = {
"name": ["Sally", "Mary", "John"],
"age": [50, 40, 30]
}

df = pd.DataFrame(data)

print(df.query('age > 35'))


Querying a CSV file stored on Google Drive from Colab:

Step 1:
from google.colab import drive
drive.mount('/content/drive')

Step 2:
import pandas as pd
path = "/content/drive/MyDrive/CT2.csv"
df = pd.read_csv(path)
print(df.query('mark > 30'))
Applying Functions to DataFrames
The apply() function is used to apply a function along an axis of the DataFrame.
Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
By default (result_type=None), the final return type is inferred from the return type of the applied function. Otherwise, it depends on the result_type argument.

Syntax:
dataframe.apply(func, axis, raw, result_type, args, kwds)
Returns: Series or DataFrame


Result of applying func along the given axis of the DataFrame.
Example:
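A minimal apply() sketch along both axes (values chosen for illustration):

import pandas as pd
import numpy as np

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

print(df.apply(np.sqrt))           # element-wise, applied to each column Series
print(df.apply(np.sum, axis=0))    # sum of each column -> A 12, B 27
print(df.apply(np.sum, axis=1))    # sum of each row    -> 13 13 13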
Comparison between Numpy and Pandas
Speed Testing between Numpy and Pandas

For Data Scientists, Pandas and NumPy are both essential tools in Python.
NumPy runs vector and matrix operations very efficiently, while Pandas provides the R-like data frames allowing intuitive tabular data analysis.
A consensus is that NumPy is more optimized for arithmetic computations.
Ref: https://towardsdatascience.com/speed-testing-pandas-vs-numpy-ffbf80070ee7
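A rough timing sketch of the comparison (the exact numbers depend on the machine and the data size):

import time
import numpy as np
import pandas as pd

values = np.random.rand(1000000)
series = pd.Series(values)

start = time.time()
for _ in range(100):
    values * 2                 # NumPy vectorized multiply
print("NumPy :", time.time() - start)

start = time.time()
for _ in range(100):
    series * 2                 # same operation through a Pandas Series
print("Pandas:", time.time() - start)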
Other Python Libraries

A Python library is a collection of related modules. It contains bundles of code that can be used repeatedly in different programs.
It makes Python programming simpler and more convenient for the programmer, as we don't need to write the same code again and again for different programs.
Python libraries play a very vital role in fields such as Machine Learning, Data Science, Data Visualization, etc.
1.TensorFlow: This library was developed by Google in collaboration
with the Brain Team. It is an open-source library used for high-level
computations. It is also used in machine learning and deep learning
algorithms. It contains a large number of tensor operations. Researchers
also use this Python library to solve complex computations in Mathematics
and Physics.

2. Matplotlib: This library is responsible for plotting numerical data, which is why it is used in data analysis. It is also an open-source library and plots high-quality figures like pie charts, histograms, scatterplots, graphs, etc.

3. Pandas: Pandas is an important library for data scientists. It is an open-source machine learning library that provides flexible high-level data structures and a variety of analysis tools. It eases data analysis, data manipulation, and cleaning of data. Pandas supports operations like sorting, re-indexing, iteration, concatenation, conversion of data, visualizations, aggregations, etc.
4. Numpy: The name "Numpy" stands for "Numerical Python". It is a commonly used library. It is a popular machine learning library that supports large matrices and multi-dimensional data. It consists of in-built mathematical functions for easy computations. Even libraries like TensorFlow use Numpy internally to perform several operations on tensors. The array interface is one of the key features of this library.

5. SciPy: The name "SciPy" stands for "Scientific Python". It is an open-source library used for high-level scientific computations. This library is built as an extension of Numpy and works with Numpy to handle complex computations. While Numpy provides the array data and basic operations such as sorting and indexing, the higher-level numerical routines live in SciPy. It is also widely used by application developers and engineers.
6. Scrapy: It is an open-source library that is used for extracting data
from websites. It provides very fast web crawling and high-level
screen scraping. It can also be used for data mining and automated
testing of data.

7. Scikit-learn: It is a famous Python library for working with complex data. Scikit-learn is an open-source library that supports machine learning. It supports various supervised and unsupervised algorithms like linear regression, classification, clustering, etc. This library works in association with Numpy and SciPy.

8. PyGame: This library provides an easy interface to the Simple DirectMedia Layer (SDL) platform-independent graphics, audio, and input libraries. It is used for developing video games using computer graphics and audio libraries along with the Python programming language.
9. PyTorch: PyTorch is a widely used machine learning library that optimizes tensor computations. It has rich APIs to perform tensor computations with strong GPU acceleration. It also helps to solve application issues related to neural networks.

10. PyBrain: The name "PyBrain" stands for Python-Based Reinforcement Learning, Artificial Intelligence, and Neural Networks library. It is an open-source library built for beginners in the field of Machine Learning. It provides fast and easy-to-use algorithms for machine learning tasks. It is flexible and easy to understand, which is why it is really helpful for developers who are new to research fields.
