Pandas Py

The document demonstrates the use of pandas and numpy libraries in Python for data manipulation and analysis. It includes creating DataFrames, reading from and writing to CSV files, and performing basic operations like describing data, indexing, and modifying values. Additionally, it showcases handling of large datasets and provides examples of generating random data.

Uploaded by

vinaysikarwar199


In [1]: import numpy as np

import pandas as pd

In [2]: dict1 = {
"name":['harry', 'rohan','skillf','shubh'],
"marks":[92,34,24,17],
"city":['rampur','kolkata','barelly','antarctica']
}

In [3]: df = pd.DataFrame(dict1)

In [4]: df

Out[4]: name marks city

0 harry 92 rampur

1 rohan 34 kolkata

2 skillf 24 barelly

3 shubh 17 antarctica

In [5]: df.to_csv('friends.csv')

In [6]: df.to_csv('friends_index_false.csv', index=False)

In [7]: # With millions of rows, inspect only the first or last few with head() and tail()

In [8]: df.head(2)

Out[8]: name marks city

0 harry 92 rampur

1 rohan 34 kolkata

In [9]: df.tail(2)

Out[9]: name marks city

2 skillf 24 barelly

3 shubh 17 antarctica

In [10]: df.describe()

Out[10]: marks

count 4.00000

mean 41.75000

std 34.21866

min 17.00000

25% 22.25000

50% 29.00000

75% 48.50000

max 92.00000

In [11]: vinay = pd.read_csv('vinay.csv') # to read data

In [12]: vinay

Out[12]: Unnamed: 0.1 Unnamed: 0 train no. speed city

0 0 0 1521644 50 rampur

1 1 1 24165 34 kolkata

2 2 2 54876 24 barelly

3 3 3 5157 17 antarctica

In [13]: vinay['speed'][0] = 50

C:\Users\vinay\AppData\Local\Temp\ipykernel_12824\473427975.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  vinay['speed'][0] = 50
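
The warning is triggered by chained indexing (`vinay['speed'][0]`), where the second step may write to a temporary object. A minimal sketch of the single-step alternative with `.loc` follows; the `trains` frame and its values are made-up stand-ins, not the contents of vinay.csv.

```python
import pandas as pd

# Stand-in data (hypothetical); the notebook reads its frame from vinay.csv.
trains = pd.DataFrame({"speed": [92, 34, 24, 17],
                       "city": ["rampur", "kolkata", "barelly", "antarctica"]})

# One indexing call: row label 0, column label 'speed'.
trains.loc[0, "speed"] = 50

first_speed = trains.loc[0, "speed"]
print(first_speed)
```

Because row and column are given in one `.loc` call, pandas knows the assignment targets the original frame.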

In [14]: vinay

Out[14]: Unnamed: 0.1 Unnamed: 0 train no. speed city

0 0 0 1521644 50 rampur

1 1 1 24165 34 kolkata

2 2 2 54876 24 barelly

3 3 3 5157 17 antarctica

In [15]: vinay.to_csv('vinay.csv')
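
The `Unnamed: 0` and `Unnamed: 0.1` columns seen in Out[12] are what typically appears when a frame is written with its index and read back without naming an index column; each round trip adds another. The sketch below illustrates this with in-memory buffers instead of files (the `io.StringIO` buffers and the small frame are assumptions made for the example).

```python
import io
import pandas as pd

df = pd.DataFrame({"speed": [50, 34], "city": ["rampur", "kolkata"]})

# Default to_csv writes the index as an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reread_default = pd.read_csv(with_index)

# index=False leaves the index out, so the columns round-trip unchanged.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reread_clean = pd.read_csv(without_index)

print(list(reread_default.columns))
print(list(reread_clean.columns))
```

Passing `index_col=0` to `pd.read_csv` is the other common way to avoid the extra column when the file already contains a saved index.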

In [16]: vinay.index = ['first','second','third','fourth']

In [17]: vinay

Out[17]: Unnamed: 0.1 Unnamed: 0 train no. speed city

first 0 0 1521644 50 rampur

second 1 1 24165 34 kolkata

third 2 2 54876 24 barelly

fourth 3 3 5157 17 antarctica

In [18]: ser = pd.Series(np.random.rand(34))

In [19]: type(ser)

Out[19]: pandas.core.series.Series

In [20]: newdf = pd.DataFrame(np.random.rand(334,5), index=np.arange(334))

In [21]: newdf.head()

Out[21]: 0 1 2 3 4

0 0.192439 0.483302 0.182232 0.109495 0.346556

1 0.072344 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [22]: type(newdf)

Out[22]: pandas.core.frame.DataFrame

In [23]: newdf.describe()

Out[23]: 0 1 2 3 4

count 334.000000 334.000000 334.000000 334.000000 334.000000

mean 0.511220 0.502170 0.515231 0.514727 0.501599

std 0.289863 0.280761 0.293557 0.272481 0.291661

min 0.009534 0.000230 0.009467 0.004386 0.002222

25% 0.275510 0.284415 0.254842 0.301877 0.229439

50% 0.526640 0.513685 0.526909 0.508590 0.516764

75% 0.767359 0.739335 0.781234 0.742986 0.753503

max 0.997394 0.996811 0.999285 0.998439 0.999803

In [24]: newdf.dtypes

Out[24]: 0    float64
         1    float64
         2    float64
         3    float64
         4    float64
         dtype: object

In [25]: newdf[0][1] = 'vinay'

C:\Users\vinay\AppData\Local\Temp\ipykernel_12824\4287450646.py:1: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'vinay' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
  newdf[0][1] = 'vinay'
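
The FutureWarning asks for an explicit cast before a string is stored in a float64 column. One way to do that, sketched on a small hypothetical frame rather than the notebook's `newdf`, is to convert the column to `object` first and then assign with `.loc`:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.zeros((3, 2)))   # both columns start as float64

# Cast the column to object first, then store the string with .loc.
frame[0] = frame[0].astype(object)
frame.loc[1, 0] = "vinay"

print(frame.dtypes)
```

Only column 0 changes dtype; column 1 stays float64.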

In [26]: newdf.head()

Out[26]: 0 1 2 3 4

0 0.192439 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [27]: newdf.index

Out[27]: Index([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,
                ...
                324, 325, 326, 327, 328, 329, 330, 331, 332, 333],
               dtype='int32', length=334)

In [28]: newdf.columns

Out[28]: RangeIndex(start=0, stop=5, step=1)

In [29]: newdf.to_numpy()

Out[29]: array([[0.19243897629678863, 0.4833016054951558, 0.18223248119149482,
                 0.10949487441382522, 0.346555717762674],
                ['vinay', 0.358511057712223, 0.8361359540599419,
                 0.38920071958360003, 0.6622558371512339],
                [0.3511260190014004, 0.4535179465768121, 0.5329625751629071,
                 0.8060513324243946, 0.8801421656725747],
                ...,
                [0.2833121022519195, 0.8041833005905062, 0.30184328883816447,
                 0.33450997341497823, 0.09415712001759435],
                [0.6543592257723887, 0.5571194761629852, 0.24589863402724477,
                 0.9873811670345046, 0.7192368401412679],
                [0.6643166221995344, 0.725229517706132, 0.19252707794502544,
                 0.38162343584405134, 0.4854364965153011]], dtype=object)

In [30]: newdf[0][0]= 0.3

In [31]: newdf.head()

Out[31]: 0 1 2 3 4

0 0.3 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [32]: newdf.T

Out[32]: 0 1 2 3 4 5 6 7 8 9 ...

0 0.3 vinay 0.351126 0.808912 0.121119 0.541671 0.810778 0.013301 0.970215 0.834933 ... 0.443

1 0.483302 0.358511 0.453518 0.194086 0.840377 0.332581 0.49378 0.546343 0.357016 0.844727 ... 0.215

2 0.182232 0.836136 0.532963 0.2441 0.933503 0.743576 0.173255 0.78586 0.456049 0.842426 ... 0.821

3 0.109495 0.389201 0.806051 0.224745 0.33241 0.498823 0.027296 0.580119 0.22295 0.937127 ... 0.761

4 0.346556 0.662256 0.880142 0.603455 0.57951 0.498658 0.963489 0.033478 0.524955 0.784691 ... 0.611

5 rows × 334 columns

In [33]: newdf.head()

Out[33]: 0 1 2 3 4

0 0.3 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [34]: newdf.sort_index(axis=0, ascending=False)

Out[34]: 0 1 2 3 4

333 0.664317 0.725230 0.192527 0.381623 0.485436

332 0.654359 0.557119 0.245899 0.987381 0.719237

331 0.283312 0.804183 0.301843 0.334510 0.094157

330 0.168163 0.853079 0.751411 0.833227 0.176438

329 0.759106 0.047294 0.450999 0.568085 0.224133

... ... ... ... ... ...

4 0.121119 0.840377 0.933503 0.332410 0.579510

3 0.808912 0.194086 0.244100 0.224745 0.603455

2 0.351126 0.453518 0.532963 0.806051 0.880142

1 vinay 0.358511 0.836136 0.389201 0.662256

0 0.3 0.483302 0.182232 0.109495 0.346556

334 rows × 5 columns

In [35]: newdf.head()

Out[35]: 0 1 2 3 4

0 0.3 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [36]: type(newdf[0])

Out[36]: pandas.core.series.Series

In [37]: newdf2 = newdf # newdf2 is a second name for the same DataFrame; nothing is copied

In [38]: newdf2[0][0]= 5498

In [39]: newdf

Out[39]: 0 1 2 3 4

0 5498 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133

330 0.168163 0.853079 0.751411 0.833227 0.176438

331 0.283312 0.804183 0.301843 0.334510 0.094157

332 0.654359 0.557119 0.245899 0.987381 0.719237

333 0.664317 0.725230 0.192527 0.381623 0.485436

334 rows × 5 columns

In [40]: # To get an independent copy, use .copy()

In [41]: newdf2 = newdf.copy()

In [42]: newdf2[0][0] = 2

C:\Users\vinay\AppData\Local\Temp\ipykernel_12824\2252306501.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  newdf2[0][0] = 2

In [43]: newdf

Out[43]: 0 1 2 3 4

0 5498 0.483302 0.182232 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133

330 0.168163 0.853079 0.751411 0.833227 0.176438

331 0.283312 0.804183 0.301843 0.334510 0.094157

332 0.654359 0.557119 0.245899 0.987381 0.719237

333 0.664317 0.725230 0.192527 0.381623 0.485436

334 rows × 5 columns
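
Cells In [37] to In [43] contrast plain assignment with `.copy()`: the write through `newdf2` in In [38] shows up in `newdf`, while the write after `.copy()` in In [42] does not. A small self-contained sketch of the same distinction, using hypothetical names and values:

```python
import pandas as pd

original = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

alias = original                 # same object, second name
independent = original.copy()    # new object with its own data

alias.loc[0, "a"] = 5498.0       # visible through `original`
independent.loc[0, "a"] = 2.0    # not visible through `original`

print(alias is original, independent is original)
print(original.loc[0, "a"])
```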

In [44]: newdf.loc[0,2] = 654

In [45]: newdf.head(3)

Out[45]: 0 1 2 3 4

0 5498 0.483302 654.000000 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

In [46]: newdf.columns = list('ABCDE')

In [47]: newdf.head()

Out[47]: A B C D E

0 5498 0.483302 654.000000 0.109495 0.346556

1 vinay 0.358511 0.836136 0.389201 0.662256

2 0.351126 0.453518 0.532963 0.806051 0.880142

3 0.808912 0.194086 0.244100 0.224745 0.603455

4 0.121119 0.840377 0.933503 0.332410 0.579510

In [48]: newdf.loc[0,0] = 654

newdf

Out[48]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 vinay 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

334 rows × 6 columns
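
Because the columns were renamed to A through E in In [46], the label 0 no longer exists, so `newdf.loc[0,0] = 654` adds a sixth column named 0 (filled with NaN elsewhere) instead of overwriting column A. The sketch below reproduces that behaviour on a small hypothetical frame:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.ones((3, 2)), columns=list("AB"))

# 'A' exists, so this overwrites one cell.
frame.loc[0, "A"] = 654

# 0 is not a column label here, so .loc adds a column named 0.
frame.loc[0, 0] = 654

print(frame.columns.tolist())
```

Checking `frame.columns` before assigning is a simple way to catch this kind of accidental enlargement.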

In [49]: newdf.loc[1,'A'] = 654541

newdf

Out[49]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

334 rows × 6 columns

In [50]: newdf = newdf.drop(1, axis=1)

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[50], line 1
----> 1 newdf = newdf.drop(1, axis=1)

File ~\anaconda3\Lib\site-packages\pandas\core\frame.py:5347, in DataFrame.drop(self, labels, axis, index, columns, level, inplace, errors)
5199 def drop(
5200 self,
5201 labels: IndexLabel | None = None,
(...)
5208 errors: IgnoreRaise = "raise",
5209 ) -> DataFrame | None:
5210 """
5211 Drop specified labels from rows or columns.
5212
(...)
5345 weight 1.0 0.8
5346 """
-> 5347 return super().drop(
5348 labels=labels,
5349 axis=axis,
5350 index=index,
5351 columns=columns,
5352 level=level,
5353 inplace=inplace,
5354 errors=errors,
5355 )

File ~\anaconda3\Lib\site-packages\pandas\core\generic.py:4711, in NDFrame.drop(self, labels, axis, index, columns, level, inplace, errors)
4709 for axis, labels in axes.items():
4710 if labels is not None:
-> 4711 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
4713 if inplace:
4714 self._update_inplace(obj)

File ~\anaconda3\Lib\site-packages\pandas\core\generic.py:4753, in NDFrame._drop_axis(self, labels, axis, level, errors, only_slice)
4751 new_axis = axis.drop(labels, level=level, errors=errors)
4752 else:
-> 4753 new_axis = axis.drop(labels, errors=errors)
4754 indexer = axis.get_indexer(new_axis)
4756 # Case for non-unique axis
4757 else:

File ~\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:6992, in Index.drop(self, labels, errors)
6990 if mask.any():
6991 if errors != "ignore":
-> 6992 raise KeyError(f"{labels[mask].tolist()} not found in axis")
6993 indexer = indexer[~mask]
6994 return self.delete(indexer)

KeyError: '[1] not found in axis'
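
The KeyError arises because the column labels are now A to E plus 0, and 1 is not among them. A hedged sketch of two ways to handle this, on a small hypothetical frame: drop a label that does exist, or pass `errors="ignore"` so missing labels are skipped.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.ones((2, 3)), columns=["A", "B", 0])

# Dropping a label that exists works; 1 is not a column label here.
without_zero = frame.drop(columns=[0])

# errors="ignore" skips missing labels instead of raising KeyError.
unchanged = frame.drop(columns=[1], errors="ignore")

print(without_zero.columns.tolist())
print(unchanged.columns.tolist())
```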

In [51]: newdf.head()

Out[51]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

In [52]: newdf.loc[[1,2],['C','D']]

Out[52]: C D

1 0.836136 0.389201

2 0.532963 0.806051

In [53]: newdf.head()

Out[53]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

In [54]: newdf.loc[[1,2],:]

Out[54]: A B C D E 0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

In [55]: newdf.loc[:,['C','D']]

Out[55]: C D

0 654.000000 0.109495

1 0.836136 0.389201

2 0.532963 0.806051

3 0.244100 0.224745

4 0.933503 0.332410

... ... ...

329 0.450999 0.568085

330 0.751411 0.833227

331 0.301843 0.334510

332 0.245899 0.987381

333 0.192527 0.381623

334 rows × 2 columns

In [56]: newdf.loc[(newdf['A']<0.3)]

Out[56]: A B C D E 0

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

7 0.013301 0.546343 0.785860 0.580119 0.033478 NaN

10 0.135702 0.660754 0.382900 0.996195 0.144280 NaN

11 0.119561 0.370444 0.343563 0.792946 0.889031 NaN

17 0.264975 0.796818 0.150061 0.508361 0.895146 NaN

... ... ... ... ... ... ...

322 0.158642 0.768554 0.455983 0.236494 0.321771 NaN

323 0.024951 0.461243 0.380886 0.816249 0.067329 NaN

327 0.2804 0.002557 0.094892 0.759649 0.311843 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

92 rows × 6 columns

In [57]: newdf.loc[(newdf['A']<0.3) & newdf['C']>0.1]

Out[57]: A B C D E 0

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

7 0.013301 0.546343 0.785860 0.580119 0.033478 NaN

10 0.135702 0.660754 0.382900 0.996195 0.144280 NaN

11 0.119561 0.370444 0.343563 0.792946 0.889031 NaN

17 0.264975 0.796818 0.150061 0.508361 0.895146 NaN

... ... ... ... ... ... ...

322 0.158642 0.768554 0.455983 0.236494 0.321771 NaN

323 0.024951 0.461243 0.380886 0.816249 0.067329 NaN

327 0.2804 0.002557 0.094892 0.759649 0.311843 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

92 rows × 6 columns
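
Out[57] has the same 92 rows as Out[56], including row 327 whose C value (0.094892) is below 0.1, so the second condition did not filter anything. The likely cause is operator precedence: `&` binds more tightly than `>`, so the expression in In [57] is not evaluated as two separate comparisons. A minimal sketch of the parenthesized form, on made-up data:

```python
import pandas as pd

frame = pd.DataFrame({"A": [0.1, 0.2, 0.9], "C": [0.05, 0.5, 0.7]})

# Parenthesize each comparison before combining with &.
mask = (frame["A"] < 0.3) & (frame["C"] > 0.1)
selected = frame.loc[mask]

print(selected.index.tolist())
```

Only the row that satisfies both conditions is kept.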

In [58]: newdf.head(2)

Out[58]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

In [59]: newdf.iloc[0,4]

Out[59]: 0.346555717762674

In [60]: newdf.iloc[[0,5],[1,2]]

Out[60]: B C

0 0.483302 654.000000

5 0.332581 0.743576
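
`.loc` selects by label and `.iloc` by integer position. The two agree while the index is 0, 1, 2, ... but diverge once rows are dropped. A short illustration with a hypothetical frame whose index has gaps:

```python
import pandas as pd

frame = pd.DataFrame({"B": [10, 20, 30], "C": [1, 2, 3]}, index=[0, 2, 5])

by_label = frame.loc[2, "C"]      # the row labelled 2 (second row)
by_position = frame.iloc[2, 1]    # third row, second column

print(by_label, by_position)
```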

In [61]: newdf.head(3)

Out[61]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

In [62]: newdf.drop([0])

Out[62]: A B C D E 0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

5 0.541671 0.332581 0.743576 0.498823 0.498658 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

333 rows × 6 columns

In [63]: newdf.head(2)

Out[63]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

In [64]: newdf.iloc[0,4]

Out[64]: 0.346555717762674

In [65]: newdf.iloc[[0,1],[1,2]]

Out[65]: B C

0 0.483302 654.000000

1 0.358511 0.836136

In [66]: newdf.head(3)

Out[66]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

In [67]: newdf.drop([0])

Out[67]: A B C D E 0

1 654541 0.358511 0.836136 0.389201 0.662256 NaN

2 0.351126 0.453518 0.532963 0.806051 0.880142 NaN

3 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

4 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

5 0.541671 0.332581 0.743576 0.498823 0.498658 NaN

... ... ... ... ... ... ...

329 0.759106 0.047294 0.450999 0.568085 0.224133 NaN

330 0.168163 0.853079 0.751411 0.833227 0.176438 NaN

331 0.283312 0.804183 0.301843 0.334510 0.094157 NaN

332 0.654359 0.557119 0.245899 0.987381 0.719237 NaN

333 0.664317 0.725230 0.192527 0.381623 0.485436 NaN

333 rows × 6 columns

In [69]: newdf.drop(['A','C'],axis=1) # newdf is not affected

Out[69]: B D E 0

0 0.483302 0.109495 0.346556 654.0

1 0.358511 0.389201 0.662256 NaN

2 0.453518 0.806051 0.880142 NaN

3 0.194086 0.224745 0.603455 NaN

4 0.840377 0.332410 0.579510 NaN

... ... ... ... ...

329 0.047294 0.568085 0.224133 NaN

330 0.853079 0.833227 0.176438 NaN

331 0.804183 0.334510 0.094157 NaN

332 0.557119 0.987381 0.719237 NaN

333 0.725230 0.381623 0.485436 NaN

334 rows × 4 columns

In [74]: newdf.drop([1,5], axis=0, inplace=True) # inplace=True removes the rows from newdf itself

# and returns None, so there is nothing to assign back

In [75]: newdf.head(3)

Out[75]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

2 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

3 0.121119 0.840377 0.933503 0.332410 0.579510 NaN

In [76]: newdf.reset_index(drop=True, inplace=True)

In [77]: newdf.head(3)

Out[77]: A B C D E 0

0 5498 0.483302 654.000000 0.109495 0.346556 654.0

1 0.808912 0.194086 0.244100 0.224745 0.603455 NaN

2 0.121119 0.840377 0.933503 0.332410 0.579510 NaN
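
The effect of `inplace=True` and of `reset_index(drop=True)` can be checked on a small example. The sketch below uses a hypothetical four-row frame and shows that the in-place call returns `None` and that the index is renumbered afterwards:

```python
import pandas as pd

frame = pd.DataFrame({"A": [1, 2, 3, 4]})

result = frame.drop([1], axis=0, inplace=True)   # modifies frame, returns None
frame.reset_index(drop=True, inplace=True)       # renumber rows 0..n-1

print(result)
print(frame.index.tolist())
```

Without `drop=True`, `reset_index` would keep the old labels as a new column.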

In [78]: newdf.loc[:, ['B']]= 5

In [80]: newdf.head()

Out[80]: A B C D E 0

0 5498 5.0 654.000000 0.109495 0.346556 654.0

1 0.808912 5.0 0.244100 0.224745 0.603455 NaN

2 0.121119 5.0 0.933503 0.332410 0.579510 NaN

3 0.810778 5.0 0.173255 0.027296 0.963489 NaN

4 0.970215 5.0 0.456049 0.222950 0.524955 NaN


NUMPY
In [81]: import numpy as np

In [91]: myarr = np.array([[14,6,32,7]], np.int8)

myarr

# The second argument sets the dtype, i.e. how many bits each integer uses (np.int8, np.int32, ...)

Out[91]: array([[14, 6, 32, 7]], dtype=int8)

In [92]: myarr.shape

Out[92]: (1, 4)

In [93]: myarr.dtype

Out[93]: dtype('int8')

In [94]: myarr[0,1]

Out[94]: 6

In [95]: myarr[0,1] =45

myarr

Out[95]: array([[14, 45, 32, 7]], dtype=int8)
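
Choosing `np.int8` fixes both the storage per element and the range of values the array can hold. As a quick check, `np.iinfo` reports the limits of an integer dtype; the array below is a one-dimensional variant of `myarr` used only for illustration.

```python
import numpy as np

limits = np.iinfo(np.int8)
print(limits.min, limits.max)

small = np.array([14, 6, 32, 7], dtype=np.int8)
print(small.itemsize)   # bytes per element
```

Values outside this range do not fit in an int8 array, so a wider dtype such as `np.int32` is the safer choice when the data may grow.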

Array creation: conversion from other Python structures

In [96]: listarry = np.array([[1,2,3],[8,6,4],[2,6,7]])

In [97]: listarry

Out[97]: array([[1, 2, 3],
                [8, 6, 4],
                [2, 6, 7]])

In [99]: listarry.shape

Out[99]: (3, 3)

In [100]: listarry.size

Out[100]: 9

In [102]: zeros = np.zeros((2,5))

In [103]: zeros

Out[103]: array([[0., 0., 0., 0., 0.],
                 [0., 0., 0., 0., 0.]])

In [105]: rng = np.arange(15)

rng

Out[105]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

In [109]: ispace = np.linspace(1,5,9)

ispace

Out[109]: array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])

In [112]: emp = np.empty((4,6))

emp

Out[112]: array([[6.23042070e-307, 1.86918699e-306, 1.69121096e-306,
                  1.33511562e-306, 7.56587585e-307, 1.12503450e-311],
                 [0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
                  0.00000000e+000, 0.00000000e+000, 0.00000000e+000],
                 [0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
                  nan, 0.00000000e+000, 0.00000000e+000],
                 [0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
                  0.00000000e+000, 8.34451715e-308, 2.22507386e-306]])

In [114]: emp_like = np.empty_like(ispace)

emp_like

Out[114]: array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])
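
`np.empty` and `np.empty_like` allocate memory without initializing it, so the values printed above are whatever happened to be in that memory. That `emp_like` shows the same numbers as `ispace` is incidental and should not be relied on. A minimal sketch of the safe pattern, which only depends on the shape and dtype and fills the array before reading it:

```python
import numpy as np

template = np.linspace(1, 5, 9)

scratch = np.empty_like(template)   # contents are arbitrary until written
scratch[:] = 0.0                    # fill before reading

zeros = np.zeros_like(template)     # allocated and initialized in one step

print(scratch.shape, scratch.dtype)
```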

In [116]: ide = np.identity(45)

ide

Out[116]: array([[1., 0., 0., ..., 0., 0., 0.],
                 [0., 1., 0., ..., 0., 0., 0.],
                 [0., 0., 1., ..., 0., 0., 0.],
                 ...,
                 [0., 0., 0., ..., 1., 0., 0.],
                 [0., 0., 0., ..., 0., 1., 0.],
                 [0., 0., 0., ..., 0., 0., 1.]])

In [117]: ide.shape

Out[117]: (45, 45)

In [119]: arr = np.arange(99)

arr

Out[119]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
                 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
                 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
                 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
                 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98])

In [120]: arr.reshape(3,33)

Out[120]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
                  16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
                  32],
                 [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
                  49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64,
                  65],
                 [66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81,
                  82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
                  98]])

In [121]: arr.reshape(3,31)

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[121], line 1
----> 1 arr.reshape(3,31)

ValueError: cannot reshape array of size 99 into shape (3,31)
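
`reshape` only succeeds when the product of the new dimensions equals the array's size: 3 × 33 = 99 works, 3 × 31 = 93 does not. One dimension may be given as -1 so that NumPy infers it. A short sketch:

```python
import numpy as np

arr = np.arange(99)

rows_of_33 = arr.reshape(3, -1)   # -1 lets NumPy infer 33
print(rows_of_33.shape)

try:
    arr.reshape(3, 31)            # 3 * 31 = 93, not 99
except ValueError as err:
    print(err)
```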

In [122]: arr.ravel()

Out[122]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
                 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
                 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
                 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
                 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98])

In [123]: x = [[1,2,3],[4,5,6],[7,1,0]]

In [126]: ar = np.array(x)
ar

Out[126]: array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 1, 0]])

In [127]: ar.sum(axis=0)

Out[127]: array([12, 8, 9])

In [128]: ar.sum(axis=1)

Out[128]: array([ 6, 15, 8])

In [130]: ar.T

Out[130]: array([[1, 4, 7],
                 [2, 5, 1],
                 [3, 6, 0]])

In [131]: ar.flat

Out[131]: <numpy.flatiter at 0x21230427be0>

In [132]: for item in ar.flat:
              print(item)

1
2
3
4
5
6
7
1
0

In [134]: ar.ndim # No. of dimensions

Out[134]: 2

In [135]: ar.size

Out[135]: 9

In [136]: ar.nbytes

Out[136]: 36

In [137]: one = np.array([1,3,4,634,2])

In [140]: one.argmax() # Returns index

Out[140]: 3

In [142]: one.argmin()

Out[142]: 0

In [143]: one.argsort()

Out[143]: array([0, 4, 1, 2, 3], dtype=int64)

In [144]: ar

Out[144]: array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 1, 0]])

In [146]: ar.argmin()

Out[146]: 8
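
Without an `axis` argument, `argmin` returns a position in the flattened array, which is why a 3 × 3 array gives 8. If the row and column are wanted instead, `np.unravel_index` converts the flat index back using the array's shape. A small sketch with the same values as `ar`:

```python
import numpy as np

ar = np.array([[1, 2, 3], [4, 5, 6], [7, 1, 0]])

flat_index = ar.argmin()                          # index into the flattened array
row, col = np.unravel_index(flat_index, ar.shape)

print(flat_index, row, col)
```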

In [147]: ar.argmax(axis=0)

Out[147]: array([2, 1, 1], dtype=int64)

In [148]: ar.argmax(axis=1)

Out[148]: array([2, 2, 0], dtype=int64)

In [149]: ar.argsort(axis=0)

Out[149]: array([[0, 2, 2],
                 [1, 0, 0],
                 [2, 1, 1]], dtype=int64)

In [150]: ar.ravel()

Out[150]: array([1, 2, 3, 4, 5, 6, 7, 1, 0])

In [151]: ar.reshape((9,1))

Out[151]: array([[1],
                 [2],
                 [3],
                 [4],
                 [5],
                 [6],
                 [7],
                 [1],
                 [0]])

In [152]: ar

Out[152]: array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 1, 0]])

In [157]: ar2 = np.array([[1,2,1],[8,5,12],[4,0,6]])

ar2

Out[157]: array([[ 1, 2, 1],
                 [ 8, 5, 12],
                 [ 4, 0, 6]])

In [156]: ar + ar2

Out[156]: array([[ 2, 4, 4],
                 [12, 10, 18],
                 [11, 1, 6]])

In [158]: ar * ar2

Out[158]: array([[ 1, 4, 3],
                 [32, 25, 72],
                 [28, 0, 0]])

In [159]: np.sqrt(ar)

Out[159]: array([[1. , 1.41421356, 1.73205081],
                 [2. , 2.23606798, 2.44948974],
                 [2.64575131, 1. , 0. ]])

In [160]: ar.sum()

Out[160]: 29

In [161]: ar.max()

Out[161]: 7

In [162]: ar.min()

Out[162]: 0

In [163]: ar

Out[163]: array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 1, 0]])

In [164]: np.where(ar>5)

Out[164]: (array([1, 2], dtype=int64), array([2, 0], dtype=int64))
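
With only a condition, `np.where` returns one index array per axis: here the row indices `[1, 2]` and the column indices `[2, 0]`. Read together they give the positions (1, 2) and (2, 0). The sketch below pairs them up and uses them to fetch the matching values:

```python
import numpy as np

ar = np.array([[1, 2, 3], [4, 5, 6], [7, 1, 0]])

rows, cols = np.where(ar > 5)                     # one index array per axis
pairs = list(zip(rows.tolist(), cols.tolist()))   # (row, col) coordinates
values = ar[rows, cols]                           # the matching elements

print(pairs, values)
```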

In [165]: np.count_nonzero(ar)

Out[165]: 8

In [166]: np.nonzero(ar)

Out[166]: (array([0, 0, 0, 1, 1, 1, 2, 2], dtype=int64),
           array([0, 1, 2, 0, 1, 2, 0, 1], dtype=int64))

In [167]: ar[1,2] = 0

In [168]: np.nonzero(ar)

Out[168]: (array([0, 0, 0, 1, 1, 2, 2], dtype=int64),
           array([0, 1, 2, 0, 1, 0, 1], dtype=int64))

In [169]: import sys

In [170]: py_ar = [0,4,55,2]

In [171]: np_ar = np.array(py_ar)

In [172]: sys.getsizeof(1)*len(py_ar)

Out[172]: 112

In [174]: np_ar.itemsize * np_ar.size

Out[174]: 16

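The two results above show that NumPy saves space: the array stores its four elements in 16 bytes, while the four Python integer objects in the list take 112 bytes.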
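
A slightly more general version of this comparison is sketched below. The exact byte counts depend on the Python build and on the platform's default integer dtype, so the figures printed may differ from those above; the variable names are chosen for the example.

```python
import sys
import numpy as np

py_list = [0, 4, 55, 2]
np_array = np.array(py_list)

# Rough lower bound for the list: one int object per element
# (the list's own pointer storage is not counted here).
list_int_bytes = sum(sys.getsizeof(n) for n in py_list)

# Raw element storage of the array; nbytes equals itemsize * size.
array_bytes = np_array.nbytes

print(list_int_bytes, array_bytes)
```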
