
Unit 2

Uploaded by

Mehak Mehta


For most data analysis applications, the main areas of functionality I'll focus on are:

• Fast vectorized array operations for data munging and cleaning, subsetting and
  filtering, transformation, and any other kinds of computations
• Common array algorithms like sorting, unique, and set operations
• Efficient descriptive statistics and aggregating/summarizing data
• Data alignment and relational data manipulations for merging and joining together
  heterogeneous data sets
• Expressing conditional logic as array expressions instead of loops with if-elif-else
  branches
• Group-wise data manipulations (aggregation, transformation, function application).
  Much more on this in Chapter 5
While NumPy provides the computational foundation for these operations, you will
likely want to use pandas as your basis for most kinds of data analysis (especially for
structured or tabular data) as it provides a rich, high-level interface making most
common data tasks very concise and simple. pandas also provides some more
domain-specific functionality like time series manipulation, which is not present in NumPy.

In this chapter and throughout the book, I use the standard NumPy
convention of always using import numpy as np. You are, of course,
welcome to put from numpy import * in your code to avoid having to
write np., but I would caution you against making a habit of this.
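
One concrete reason for that caution: several NumPy functions share names with Python built-ins but take different arguments, so a star import silently changes what a call means. The sketch below avoids the star import itself and simply compares the two functions side by side; the variable names are mine, chosen for illustration.

```python
import builtins
import numpy as np

values = range(5)

# Built-in sum: the second positional argument is the start value.
builtin_total = builtins.sum(values, -1)   # 0 + 1 + 2 + 3 + 4, plus a start of -1

# NumPy sum: the second positional argument is the axis to reduce over.
numpy_total = np.sum(values, -1)           # sum along the last axis

print(builtin_total, numpy_total)
```

After from numpy import *, the bare name sum would refer to the NumPy version, so the same line of code would quietly give a different answer.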

The NumPy ndarray: A Multidimensional Array Object

One of the key features of NumPy is its N-dimensional array object, or ndarray, which
is a fast, flexible container for large data sets in Python. Arrays enable you to perform
mathematical operations on whole blocks of data using similar syntax to the equivalent
operations between scalar elements:
In [8]: data
Out[8]:
array([[ 0.9526, -0.246 , -0.8856],
       [ 0.5639,  0.2379,  0.9104]])

In [9]: data * 10
Out[9]:
array([[ 9.5256, -2.4601, -8.8565],
       [ 5.6385,  2.3794,  9.104 ]])

In [10]: data + data
Out[10]:
array([[ 1.9051, -0.492 , -1.7713],
       [ 1.1277,  0.4759,  1.8208]])
An ndarray is a generic multidimensional container for homogeneous data; that is, all
of the elements must be the same type. Every array has a shape, a tuple indicating the
size of each dimension, and a dtype, an object describing the data type of the array:
In [11]: data.shape
Out[11]: (2, 3)

80 | Chapter 4: NumPy Basics: Arrays and Vectorized Computation

www.it-ebooks. info
In [12]: data.dtype
Out[12]: dtype('float64')
This chapter will introduce you to the basics of using NumPy arrays, and should be
sufficient for following along with the rest of the book. While it's not necessary to have
a deep understanding of NumPy for many data analytical applications, becoming
proficient in array-oriented programming and thinking is a key step along the way to
becoming a scientific Python guru.

Whenever you see "array", "NumPy array", or "ndarray" in the text,

with few exceptions they all refer to the same thing: the ndarray object.

Creating ndarrays
The easiest way to create an array is to use the array function. This accepts any
sequence-like object (including other arrays) and produces a new NumPy array
containing the passed data. For example, a list is a good candidate for conversion:
In [13]: data1 = [6, 7.5, 8, 0, 1]
In [14]: arr1 = np.array(data1)

In [15]: arr1
Out[15]: array([ 6. ,  7.5,  8. ,  0. ,  1. ])

Nested sequences, like a list of equal-length lists, will be converted into a
multidimensional array:
In [16]: data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]

In [17]: arr2 = np.array(data2)

In [18]: arr2
Out[18]:
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
In [19]: arr2.ndim
Out[19]: 2
In [20]: arr2.shape
Out[20]: (2, 4)
Unless explicitly specified (more on this later), np.array tries to infer a good data type
for the array that it creates. The data type is stored in a special dtype object; for example,
in the above two examples we have:
In [21]: arr1.dtype
Out[21]: dtype('float64')

In [22]: arr2.dtype
Out[22]: dtype('int64')
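
To make the inference rule a little more concrete, here is a minimal sketch with three small arrays of my own choosing. It assumes nothing beyond np.array itself: integers alone give an integer dtype (the exact width can depend on the platform), a single float promotes the whole array to float64, and an explicit dtype argument overrides the inference.

```python
import numpy as np

ints = np.array([1, 2, 3])                      # all Python ints -> an integer dtype
mixed = np.array([1, 2.5, 3])                   # one float promotes everything to float64
forced = np.array([1, 2, 3], dtype=np.float32)  # explicit dtype overrides inference

print(ints.dtype, mixed.dtype, forced.dtype)
```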
In addition to np.array, there are a number of other functions for creating new arrays.
As examples, zeros and ones create arrays of 0's or 1's, respectively, with a given length
or shape. empty creates an array without initializing its values to any particular value.
To create a higher dimensional array with these methods, pass a tuple for the shape:
In [23]: np.zeros(10)
Out[23]: array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])

In [24]: np.zeros((3, 6))
Out[24]:
array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

In [25]: np.empty((2, 3, 2))
Out[25]:
array([[[  4.94065646e-324,   4.94065646e-324],
        [  3.87491056e-297,               ...],
        [  4.94065646e-324,               ...]],

       [[  1.90723115e+083,   5.73293533e-053],
        [ -2.33568637e+124,  -6.70608105e-012],
        [  4.42786966e+160,   1.27100354e+025]]])

It's not safe to assume that np.empty will return an array of all zeros. In
many cases, as previously shown, it will return uninitialized garbage
values.

arange is an array-valued version of the built-in Python range function:

In [26]: np.arange(15)
Out [26]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
See Table 4-1 for a short list of standard array creation functions. Since NumPy is
focused on numerical computing, the data type, if not specified, will in many cases be
float64 (floating point).
Table 4-1. Array creation functions

Function            Description
array               Convert input data (list, tuple, array, or other sequence type) to an ndarray either by
                    inferring a dtype or explicitly specifying a dtype. Copies the input data by default.
asarray             Convert input to ndarray, but do not copy if the input is already an ndarray
arange              Like the built-in range but returns an ndarray instead of a list.
ones, ones_like     Produce an array of all 1's with the given shape and dtype. ones_like takes another
                    array and produces a ones array of the same shape and dtype.
zeros, zeros_like   Like ones and ones_like but producing arrays of 0's instead
empty, empty_like   Create new arrays by allocating new memory, but do not populate with any values like
                    ones and zeros
eye, identity       Create a square N x N identity matrix (1's on the diagonal and 0's elsewhere)
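
A few of the functions in Table 4-1 are easiest to tell apart by trying them. The following is a small sketch, with array names invented for the example, showing the copy-versus-no-copy difference between array and asarray along with zeros_like, ones, and eye.

```python
import numpy as np

base = np.arange(6, dtype=np.float64).reshape((2, 3))

copied = np.array(base)      # np.array copies its input by default
aliased = np.asarray(base)   # np.asarray hands back the same object for an ndarray

zeros_same = np.zeros_like(base)            # same shape and dtype as base, filled with 0
ones_int = np.ones((2, 2), dtype=np.int32)  # shape given as a tuple, dtype given explicitly
ident = np.eye(3)                           # 3 x 3 identity matrix

print(copied is base, aliased is base)
print(zeros_same.shape, zeros_same.dtype, ones_int.dtype)
```

The practical upshot is that asarray is the cheaper choice inside functions that only need to read their input, while array is the safer choice when the result will be modified.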

Data Types for ndarrays

The data type or dtype is a special object containing the information the ndarray needs
to interpret a chunk of memory as a particular type of data:
In [27]: arr1 = np.array([1, 2, 3], dtype=np.float64)
In [28]: arr2 = np.array([1, 2, 3], dtype=np.int32)
In [29]: arr1.dtype
Out[29]: dtype('float64')
In [30]: arr2.dtype
Out[30]: dtype('int32')
Dtypes are part of what make NumPy so powerful and flexible. In most cases they map
directly onto an underlying machine representation, which makes it easy to read and
write binary streams of data to disk and also to connect to code written in a low-level
language like C or Fortran. The numerical dtypes are named the same way: a type name,
like float or int, followed by a number indicating the number of bits per element. A
standard double-precision floating point value (what's used under the hood in Python's
float object) takes up 8 bytes or 64 bits. Thus, this type is known in NumPy as
float64. See Table 4-2 for a full listing of NumPy's supported data types.

Don't worry about memorizing the NumPy dtypes, especially if you're

a new user. It's often only necessary to care about the general kind of
data you're dealing with, whether floating point, complex, integer,
boolean, string, or general Python object. When you need more control
over how data are stored in memory and on disk, especially large data
sets, it is good to know that you have control over the storage type.

Table 4-2. NumPy data types

Type                                Type Code      Description
int8, uint8                         i1, u1         Signed and unsigned 8-bit (1 byte) integer types
int16, uint16                       i2, u2         Signed and unsigned 16-bit integer types
int32, uint32                       i4, u4         Signed and unsigned 32-bit integer types
int64, uint64                       i8, u8         Signed and unsigned 64-bit integer types
float16                             f2             Half-precision floating point
float32                             f4 or f        Standard single-precision floating point. Compatible with C float
float64                             f8 or d        Standard double-precision floating point. Compatible with C double
                                                   and Python float object
float128                            f16 or g       Extended-precision floating point
complex64, complex128, complex256   c8, c16, c32   Complex numbers represented by two 32, 64, or 128 floats, respectively
bool                                ?              Boolean type storing True and False values
object                              O              Python object type
string_                             S              Fixed-length string type (1 byte per character). For example, to create
                                                   a string dtype with length 10, use 'S10'.
unicode_                            U              Fixed-length unicode type (number of bytes platform specific). Same
                                                   specification semantics as string_ (e.g. 'U10').
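
One detail of Table 4-2 that is easy to trip over: the type names count bits, while the digits in the type codes count bytes. Here is a brief sketch that checks a few of the pairings and shows how the element size drives total memory use; the array names are illustrative only.

```python
import numpy as np

# Type-code strings and NumPy type objects name the same dtypes.
f8 = np.dtype('f8')   # 8 bytes  -> float64
i4 = np.dtype('i4')   # 4 bytes  -> int32
u1 = np.dtype('u1')   # 1 byte   -> uint8

small = np.zeros(1000, dtype='i1')
large = np.zeros(1000, dtype='f8')

print(f8 == np.float64, i4 == np.int32, u1 == np.uint8)
print(small.itemsize, small.nbytes)   # bytes per element, total bytes
print(large.itemsize, large.nbytes)
```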

You can explicitly convert or cast an array from one dtype to another using ndarray's
astype method:
In [31]: arr = np.array([1, 2, 3, 4, 5])
In [32]: arr.dtype
Out[32]: dtype('int64')
In [33]: float_arr = arr.astype(np.float64)
In [34]: float_arr.dtype
Out[34]: dtype('float64')
In this example, integers were cast to floating point. If I cast some floating point
numbers to be of integer dtype, the decimal part will be truncated:
In [35]: arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
In [36]: arr
Out[36]: array([  3.7,  -1.2,  -2.6,   0.5,  12.9,  10.1])
In [37]: arr.astype(np.int32)
Out[37]: array([ 3, -1, -2,  0, 12, 10], dtype=int32)
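
Truncation is not the same thing as rounding, which matters most for negative numbers and for values like 12.9. As a minimal sketch, and assuming you actually want the nearest integer, you can round first with np.rint (my choice here, not something the text prescribes) and cast afterwards:

```python
import numpy as np

arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])

truncated = arr.astype(np.int32)          # drops the fractional part (toward zero)
rounded = np.rint(arr).astype(np.int32)   # rounds to the nearest integer, then casts

print(truncated)
print(rounded)
```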
Should you have an array of strings representing numbers, you can use astype to convert
them to numeric form:
In [38]: numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.string_)
In [39]: numeric_strings.astype(float)
Out[39]: array([  1.25,  -9.6 ,  42.  ])
If casting were to fail for some reason (like a string that cannot be converted to
float64), a ValueError will be raised. Note that I was a bit lazy and wrote float instead of
np.float64; NumPy is smart enough to alias the Python types to the equivalent dtypes.
You can also use another array's dtype attribute:
In [40]: int_array = np.arange(10)
In [41]: calibers = np.array([.22, .270, .357, .380, .44, .50], dtype=np.float64)
In [42]: int_array.astype(calibers.dtype)
Out[42]: array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
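
Coming back to casts that fail: the sketch below shows what that looks like in practice. It uses plain Python strings rather than an explicit string dtype, and the array names are made up for the example. In the NumPy releases I am aware of, a string that cannot be parsed as a number raises a ValueError.

```python
import numpy as np

good = np.array(['1.25', '-9.6', '42'])
bad = np.array(['1.25', 'not a number', '42'])

converted = good.astype(float)   # every string parses, so this succeeds

try:
    bad.astype(float)
    failed = False
except ValueError as exc:
    failed = True
    print('cast failed:', exc)
```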
There are shorthand type code strings you can also use to refer to a dtype:
In [43]: empty_uint32 = np.empty(8, dtype='u4')
In [44]: empty_uint32
Out[44]:
array([       0, 65904672,        0, 64856792,        0,
       39438163,        0], dtype=uint32)

Calling astype always creates a new array (a copy of the data), even if
the new dtype is the same as the old dtype.

It's worth keeping in mind that floating point numbers, such as those
in float64 and float32 arrays, are only capable of approximating
fractional quantities. In complex computations, you may accrue some
floating point error, making comparisons only valid up to a certain
number of decimal places.
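
In practice this means comparing floating point arrays with a tolerance rather than with ==. A small sketch using np.isclose and np.allclose with their default tolerances (the sample values are mine):

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0 / 3.0])
b = np.array([0.3, 0.3333333333])

exact = a == b             # exact comparison is fragile
close = np.isclose(a, b)   # elementwise comparison within a tolerance
all_close = np.allclose(a, b)

print(exact, close, all_close)
```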

Operations between Arrays and Scalars

Arrays are important because they enable you to express batch operations on data
without writing any for loops. This is usually called vectorization. Any arithmetic
operation between equal-size arrays applies the operation elementwise:
In [45]: arr = np.array([[1., 2., 3.], [4., 5., 6.]])
In [46]: arr
Out[46]:
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

In [47]: arr * arr
Out[47]:
array([[  1.,   4.,   9.],
       [ 16.,  25.,  36.]])

In [48]: arr - arr
Out[48]:
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
Arithmetic operations with scalars are as you would expect, propagating the value to
each element:
In [49]: 1 / arr
Out[49]:
array([[ 1.    ,  0.5   ,  0.3333],
       [ 0.25  ,  0.2   ,  0.1667]])

In [50]: arr ** 0.5
Out[50]:
array([[ 1.    ,  1.4142,  1.7321],
       [ 2.    ,  2.2361,  2.4495]])
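
To see what vectorization replaces, here is a sketch that computes the elementwise square twice, once with explicit Python loops and once as a single array expression. The loop version is only there for comparison; the names looped and vectorized are mine.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

# Explicit Python loops over every element
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = arr[i, j] * arr[i, j]

# The same computation as one array expression
vectorized = arr * arr

print(np.array_equal(looped, vectorized))
```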

Operations between differently sized arrays are called broadcasting and will be discussed
in more detail in Chapter 12. Having a deep understanding of broadcasting is not
necessary for most of this book.
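
Just to give a flavor of broadcasting without going into the rules, here is a minimal sketch in which a one-dimensional array of length 3 is combined with a 2 x 3 array. The example arrays are my own.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)
row = np.array([10., 20., 30.])                # shape (3,)

# row is stretched across both rows of arr
shifted = arr + row

print(shifted)
```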

Basic Indexing and Slicing

NumPy array indexing is a rich topic, as there are many ways you may want to select
a subset of your data or individual elements. One-dimensional arrays are simple; on
the surface they act similarly to Python lists:
In [51]: arr = np.arange(10)
In [52]: arr
Out[52]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [53]: arr[5]
Out[53]: 5
In [54]: arr[5:8]
Out[54]: array([5, 6, 7])
In [55]: arr[5:8] = 12
In [56]: arr
Out[56]: array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])
As you can see, if you assign a scalar value to a slice, as in arr[5:8] = 12, the value is
propagated (or broadcasted henceforth) to the entire selection. An important first
distinction from lists is that array slices are views on the original array. This means that
the data is not copied, and any modifications to the view will be reflected in the source
array:
In [57]: arr_slice = arr[5:8]
In [58]: arr_slice[1] = 12345
In [59]: arr
Out[59]: array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,     9])

In [60]: arr_slice[:] = 64

In [61]: arr
Out[61]: array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])
If you are new to NumPy, you might be surprised by this, especially if you have used
other array programming languages which copy data more zealously. As NumPy has
been designed with large data use cases in mind, you could imagine performance and
memory problems if NumPy insisted on copying data left and right.

If you want a copy of a slice of an ndarray instead of a view, you will
need to explicitly copy the array; for example arr[5:8].copy().
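
The difference between a view and a copy can be checked directly. This sketch contrasts a Python list slice (which copies), an array slice (a view), and an explicit copy, using np.shares_memory to confirm which arrays share a buffer. The variable names are chosen for the example.

```python
import numpy as np

lst = list(range(10))
lst_slice = lst[5:8]     # list slicing copies
lst_slice[0] = 99        # lst is unchanged

arr = np.arange(10)
view = arr[5:8]          # a view: no data is copied
copy = arr[5:8].copy()   # an independent copy

view[:] = 64             # writes through to arr
copy[:] = -1             # leaves arr untouched

print(lst)
print(arr)
print(np.shares_memory(arr, view), np.shares_memory(arr, copy))
```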

With higher dimensional arrays, you have many more options. In a two-dimensional
array, the elements at each index are no longer scalars but rather one-dimensional
arrays:
In [62]: arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
In [63]: arr2d[2]
Out[63]: array([7, 8, 9])
Thus, individual elements can be accessed recursively. But that is a bit too much work,
so you can pass a comma-separated list of indices to select individual elements. So these
are equivalent:
In [64]: arr2d[0][2]
Out[64]: 3
In [65]: arr2d[0, 2]
Out[65]: 3
See Figure 4-1 for an illustration of indexing on a 2D array.

                  axis 1
                0     1     2
           0   0,0   0,1   0,2
axis 0     1   1,0   1,1   1,2
           2   2,0   2,1   2,2

Figure 4-1. Indexing elements in a NumPy array

In multidimensional arrays, if you omit later indices, the returned object will be a
lower-dimensional ndarray consisting of all the data along the higher dimensions. So in the
2 x 2 x 3 array arr3d:
In [66]: arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
In [67]: arr3d
Out[67]:
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]])
arr3d[0] is a 2 x 3 array:
In [68]: arr3d[0]
Out[68]:
array([[1, 2, 3],
       [4, 5, 6]])
Both scalar values and arrays can be assigned to arr3d[0]:
In [69]: old_values = arr3d[0].copy()
In [70]: arr3d[0] = 42
In [71]: arr3d
Out[71]:
array([[[42, 42, 42],
        [42, 42, 42]],
       [[ 7,  8,  9],
        [10, 11, 12]]])
In [72]: arr3d[0] = old_values
In [73]: arr3d
Out[73]:
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]])
Similarly, arr3d[1, 0] gives you all of the values whose indices start with (1, 0),
forming a 1-dimensional array:
In [74]: arr3d[1, 0]
Out[74]: array([7, 8, 9])
Note that in all of these cases where subsections of the array have been selected, the
returned arrays are views.

Indexing with slices

Like one-dimensional objects such as Python lists, ndarrays can be sliced using the
familiar syntax:
In [75]: arr[1:6]
Out[75]: array([ 1, 2, 3, 4, 64])
Higher dimensional objects give you more options as you can slice one or more axes
and also mix integers. Consider the 2D array above, arr2d. Slicing this array is a bit
different:
In [76]: arr2d
Out[76]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [77]: arr2d[:2]
Out[77]:
array([[1, 2, 3],
       [4, 5, 6]])
As you can see, it has sliced along axis 0, the first axis. A slice, therefore, selects a range
of elements along an axis. You can pass multiple slices just like you can pass multiple
indexes:
In [78]: arr2d[:2, 1:]
Out[78]:
array([[2, 3],
       [5, 6]])
When slicing like this, you always obtain array views of the same number of dimensions.
By mixing integer indexes and slices, you get lower dimensional slices:
In [79]: arr2d[1, :2]
Out[79]: array([4, 5])

In [80]: arr2d[2, :1]
Out[80]: array([7])
See Figure 4-2 for an illustration. Note that a colon by itself means to take the entire
axis, so you can slice only higher dimensional axes by doing:
In [81]: arr2d[:, :1]
Out[81]:
array([[1],
       [4],
       [7]])
Of course, assigning to a slice expression assigns to the whole selection:
In [82]: arr2d[:2, 1:] = 0
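
The rule of thumb from this section, that a slice keeps an axis while an integer index drops it, is easy to verify by looking at shapes. A short sketch on a fresh 3 x 3 array (names are mine):

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row_1d = arr2d[1, :2]     # integer index drops axis 0 -> shape (2,)
row_2d = arr2d[1:2, :2]   # slice keeps axis 0         -> shape (1, 2)
col_2d = arr2d[:, :1]     # first column, still 2D     -> shape (3, 1)

print(row_1d.shape, row_2d.shape, col_2d.shape)
```

Using a length-one slice such as 1:2 instead of the integer 1 is a handy way to keep the result two-dimensional when later code expects rows and columns.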

Boolean Indexing
Let's consider an example where we have some data in an array and an array of names
with duplicates. I'm going to use here the randn function in numpy.random to generate
some random normally distributed data:
In [83]: names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

In [84]: data = np.random.randn(7, 4)

In [85]: names
Out[85]:
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'],
      dtype='|S4')
In [86]: data
Out[86]:
array([[-0.048 ,  0.5433, -0.2349,  1.2792],
       [-0.268 ,  0.5465,  0.0939, -2.0445],
       [-0.047 , -2.026 ,  0.7719,  0.3103],
       [ 2.1452,  0.8799, -0.0523,  0.0672],
       [-1.0023, -0.1698,  1.1503,  1.7289],
       [ 0.1913,  0.4544,  0.4519,  0.5535],
       [ 0.5994,  0.8174, -0.9297, -1.2564]])

Expression Shape
arr[:2, 1:] (2, 2)

arr[2] (3,)
arr[2, :] (3,)
arr[2:, :] (1, 3)

arr[:, :2] (3, 2)

arr[1, :2] (2,)

arr[1:2, :2] (1, 2)

Figure 4-2. Two-dimensional array slicing

Suppose each name corresponds to a row in the data array and we wanted to select all
the rows with corresponding name 'Bob'. Like arithmetic operations, comparisons
(such as ==) with arrays are also vectorized. Thus, comparing names with the string
'Bob' yields a boolean array:
In [87]: names == 'Bob'
Out[87]: array([ True, False, False,  True, False, False, False], dtype=bool)
This boolean array can be passed when indexing the array:
In [88]: data[names == 'Bob']
Out[88]:
array([[-0.048 ,  0.5433, -0.2349,  1.2792],
       [ 2.1452,  0.8799, -0.0523,  0.0672]])
The boolean array must be of the same length as the axis it's indexing. You can even
mix and match boolean arrays with slices or integers (or sequences of integers, more
on this later):
In [89]: data[names == 'Bob', 2:]
Out[89]:
array([[-0.2349,  1.2792],
       [-0.0523,  0.0672]])

In [90]: data[names == 'Bob', 3]
Out[90]: array([ 1.2792,  0.0672])
To select everything but 'Bob', you can either use != or negate the condition using ~:
In [91]: names != 'Bob'
Out[91]: array([False,  True,  True, False,  True,  True,  True], dtype=bool)
In [92]: data[~(names == 'Bob')]
Out[92]:
array([[-0.268 ,  0.5465,  0.0939, -2.0445],
       [-0.047 , -2.026 ,  0.7719,  0.3103],
       [-1.0023, -0.1698,  1.1503,  1.7289],
       [ 0.1913,  0.4544,  0.4519,  0.5535],
       [ 0.5994,  0.8174, -0.9297, -1.2564]])
To select two of the three names by combining multiple boolean conditions, use boolean
arithmetic operators like & (and) and | (or):
In [93]: mask = (names == 'Bob') | (names == 'Will')
In [94]: mask
Out[94]: array([ True, False,  True,  True,  True, False, False], dtype=bool)
In [95]: data[mask]
Out[95]:
array([[-0.048 ,  0.5433, -0.2349,  1.2792],
       [-0.047 , -2.026 ,  0.7719,  0.3103],
       [ 2.1452,  0.8799, -0.0523,  0.0672],
       [-1.0023, -0.1698,  1.1503,  1.7289]])
Selecting data from an array by boolean indexing always creates a copy of the data,
even if the returned array is unchanged.
The Python keywords and and or do not work with boolean arrays.
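
Both of these points, the copy and the failure of and/or, can be seen in a few lines. The sketch below uses a stand-in integer array for data (not the random values above) so that the result is reproducible; the parentheses around each comparison are needed because & and | bind more tightly than ==.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape((7, 4))   # stand-in values, one row per name

mask = (names == 'Bob') | (names == 'Will')
not_bob = ~(names == 'Bob')            # elementwise negation

selected = data[mask]                  # boolean indexing returns a copy
selected[:] = -1                       # so this does not modify data

try:
    (names == 'Bob') or (names == 'Will')
    keyword_failed = False
except ValueError as exc:
    keyword_failed = True
    print('or failed:', exc)
```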

Setting values with boolean arrays works in a common-sense way. To set all of the
negative values in data to 0 we need only do:
In [96]: data[data < 0] = 0
In [97]: data
Out[97]:
array([[ 0.    ,  0.5433,  0.    ,  1.2792],
       [ 0.    ,  0.5465,  0.0939,  0.    ],
       [ 0.    ,  0.    ,  0.7719,  0.3103],
       [ 2.1452,  0.8799,  0.    ,  0.0672],
       [ 0.    ,  0.    ,  1.1503,  1.7289],
       [ 0.1913,  0.4544,  0.4519,  0.5535],
       [ 0.5994,  0.8174,  0.    ,  0.    ]])

Setting whole rows or columns using a 1D boolean array is also easy:
In [98]: data[names != 'Joe'] = 7
In [99]: data
Out[99]:
array([[ 7.    ,  7.    ,  7.    ,  7.    ],
       [ 0.    ,  0.5465,  0.0939,  0.    ],
       [ 7.    ,  7.    ,  7.    ,  7.    ],
       [ 7.    ,  7.    ,  7.    ,  7.    ],
       [ 7.    ,  7.    ,  7.    ,  7.    ],
       [ 0.1913,  0.4544,  0.4519,  0.5535],
       [ 0.5994,  0.8174,  0.    ,  0.    ]])

Fancy Indexing
Fancy indexing is a term adopted by NumPy to describe indexing using integer arrays.
Suppose we had an 8 x 4 array:
In [100]: arr = np.empty((8, 4))
In [101]: for i in range(8):
   .....:     arr[i] = i
In [102]: arr
Out[102]:
array([[ 0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.],
       [ 4.,  4.,  4.,  4.],
       [ 5.,  5.,  5.,  5.],
       [ 6.,  6.,  6.,  6.],
       [ 7.,  7.,  7.,  7.]])
To select out a subset of the rows in a particular order, you can simply pass a list or
ndarray of integers specifying the desired order:
In [103]: arr[[4, 3, 0, 6]]
Out[103]:
array([[ 4.,  4.,  4.,  4.],
       [ 3.,  3.,  3.,  3.],
       [ 0.,  0.,  0.,  0.],
       [ 6.,  6.,  6.,  6.]])

Hopefully this code did what you expected! Using negative indices selects rows from
the end:
In [104]: arr[[-3, -5, -7]]
Out[104]:
array([[ 5.,  5.,  5.,  5.],
       [ 3.,  3.,  3.,  3.],
       [ 1.,  1.,  1.,  1.]])

Passing multiple index arrays does something slightly different; it selects a 1D array of
elements corresponding to each tuple of indices:
# more on reshape in a later chapter
In [105]: arr = np.arange(32).reshape((8, 4))
In [106]: arr
Out[106]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [107]: arr[[1, 5, 7, 2], [0, 3, 1, 2]]
Out[107]: array([ 4, 23, 29, 10])
Take a moment to understand what just happened: the elements (1, 0), (5, 3), (7,
1), and (2, 2) were selected. The behavior of fancy indexing in this case is a bit different
from what some users might have expected (myself included), which is the rectangular
region formed by selecting a subset of the matrix's rows and columns. Here is one way
to get that:
In [108]: arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]
Out[108]:
array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])
Another way is to use the np.ix_ function, which converts two 1D integer arrays to an
indexer that selects the square region:
In [109]: arr[np.ix_([1, 5, 7, 2], [0, 3, 1, 2])]
Out[109]:
array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])
Keep in mind that fancy indexing, unlike slicing, always copies the data into a new array.
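
Here is a compact sketch that puts the two selection styles side by side and confirms the copy behavior. The names rows, cols, pairwise, and block are mine.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

pairwise = arr[rows, cols]        # one element per (row, col) pair
block = arr[np.ix_(rows, cols)]   # every listed row combined with every listed column

block[0, 0] = -1                  # block is a copy, so arr is unaffected

print(pairwise)
print(block.shape, arr[1, 0])
```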

Transposing Arrays and Swapping Axes

Transposing is a special form of reshaping which similarly returns a view on the
underlying data without copying anything. Arrays have the transpose method and also
the special T attribute:
In [110]: arr = np.arange(15).reshape((3, 5))
In [111]: arr
Out[111]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [112]: arr.T
Out[112]:
array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])
When doing matrix computations, you will do this very often, like for example
computing the inner matrix product XᵀX using np.dot:
In [113]: arr = np.random.randn(6, 3)
In [114]: np.dot(arr.T, arr)
Out[114]:
array([[ 2.584 ,  1.8753,  0.8888],
       [ 1.8753,  6.6636,  0.3884],
       [ 0.8888,  0.3884,  3.9781]])
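
Notice that the result above is symmetric, which is always the case for a product of this form. The sketch below checks that property on a fresh random matrix; it uses a seeded generator (an arbitrary choice of mine, so the numbers differ from the output above) and also shows the @ operator as an alternative spelling of the same product.

```python
import numpy as np

rng = np.random.default_rng(0)    # seeded generator, seed chosen arbitrarily
X = rng.standard_normal((6, 3))

gram = np.dot(X.T, X)             # shape (3, 3)
gram_op = X.T @ X                 # the @ operator computes the same product

print(gram.shape, np.allclose(gram, gram.T), np.allclose(gram, gram_op))
```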
For higher dimensional arrays, transpose will accept a tuple of axis numbers to permute
the axes (for extra mind bending):
In [115]: arr = np.arange(16).reshape((2, 2, 4))
In [116]: arr
Out[116]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],
       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])
In [117]: arr.transpose((1, 0, 2))
Out[117]:
array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],
       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])
Simple transposing with .T is just a special case of swapping axes. ndarray has the
method swapaxes which takes a pair of axis numbers:
In [118]: arr
Out[118]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],
       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])

In [119]: arr.swapaxes(1, 2)
Out[119]:
array([[[ 0,  4],
        [ 1,  5],
        [ 2,  6],
        [ 3,  7]],
       [[ 8, 12],
        [ 9, 13],
        [10, 14],
        [11, 15]]])
swapaxes similarly returns a view on the data without making a copy.
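
Since these operations return views, writing into the swapped array changes the original. A minimal sketch, with names of my own, that also shows swapaxes(1, 2) is the same as spelling out the permutation with transpose:

```python
import numpy as np

arr = np.arange(16).reshape((2, 2, 4))

swapped = arr.swapaxes(1, 2)          # shape (2, 4, 2)
permuted = arr.transpose((0, 2, 1))   # the same permutation spelled out
same_layout = np.array_equal(swapped, permuted)

swapped[0, 0, 1] = -4                 # writes through to arr[0, 1, 0]

print(swapped.shape, same_layout)
print(arr[0, 1, 0], np.shares_memory(arr, swapped))
```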
