Numpy Cheat Sheet
Python Package
Created By: Arianne Colton and Sean Chen

Numpy (Numerical Python)

What is NumPy?
Foundation package for scientific computing in Python.

Why NumPy?
* Numpy ndarray is a much more efficient way of storing and manipulating numerical data than the built-in Python data structures.
* Libraries written in lower-level languages, such as C, can operate on data stored in a Numpy ndarray without copying any data.

N-DIMENSIONAL ARRAY (NDARRAY)

What is NdArray?
Fast and space-efficient multidimensional array (container for homogeneous data) providing vectorized arithmetic operations.

Create NdArray                   np.array(seq1)
                                 # seq1 is any sequence-like object, e.g. [1, 2, 3]
Create Special NdArray           1, np.zeros(10)
                                 # one-dimensional ndarray with 10 elements of value 0
                                 2, np.ones((2, 3))
                                 # two-dimensional ndarray with 6 elements of value 1
                                 3, np.empty((3, 4, 5)) *
                                 # three-dimensional ndarray of uninitialized values
                                 4, np.eye(N) or np.identity(N)
                                 # creates N by N identity matrix
NdArray version of Python's range   np.arange(1, 10)
Get # of Dimensions              ndarray1.ndim
Get Dimension Size               dim1size, dim2size, .. = ndarray1.shape
Get Data Type **                 ndarray1.dtype
Explicit Casting                 ndarray2 = ndarray1.astype(np.int32) ***

*   Cannot assume empty() will return all zeros. It could be garbage values.
**  Default data type is np.float64. This is equivalent to Python's float type, which is 8 bytes (64 bits); thus the name float64.
*** If casting fails for some reason, an exception is raised (ValueError for values that cannot be converted, TypeError for unsupported conversions).

SLICING (INDEXING/SUBSETTING)

* Slicing (e.g. ndarray1[2:6]) is a view on the original array. Data is NOT copied. Any modifications (e.g. ndarray1[2:6] = 8) to the view will be reflected in the original array.
* Instead of a view, make an explicit copy of the slice via:
      ndarray1[2:6].copy()
* Multidimensional array indexing notation:
      ndarray1[0][2] or ndarray1[0, 2]
* Boolean indexing:
      ndarray1[(names == 'Bob') | (names == 'Will'), 2:]
      # 2: means select from the 3rd column on
  * Selecting data by boolean indexing ALWAYS creates a copy of the data.
  * The and and or keywords do NOT work with boolean arrays. Use & and |.
* Fancy indexing (aka indexing using integer arrays)
  Select a subset of rows in a particular order:
      ndarray1[ [3, 8, 4] ]
      ndarray1[ [-1, 6] ]
      # negative indices select rows from the end
  * Fancy indexing ALWAYS creates a copy of the data.
* Setting data with assignment:
      ndarray1[ndarray1 < 0] = 0
  * If ndarray1 is two-dimensional, ndarray1 < 0 creates a two-dimensional boolean array.

COMMON OPERATIONS

1. Transposing
* A special form of reshaping which returns a view on the underlying data without copying anything.
      ndarray1.transpose() or
      ndarray1.T or
      ndarray1.swapaxes(0, 1)

2. Vectorized wrappers (for functions that take scalar values)
* math.sqrt() works on only a scalar.
      np.sqrt(seq1)
      # accepts any sequence (list, ndarray, etc.) and returns an ndarray

3. Vectorized expressions
* np.where(cond, x, y) is a vectorized version of the expression x if condition else y.
      np.where([True, False], [1, 2], [2, 3])
      => ndarray (1, 3)
* Common Usages:
      np.where(matrixArray > 0, 1, -1)
      => a new array (same shape) of 1 or -1 values
      np.where(cond, 1, 0).argmax()
      => find the first True element
  Note: argmax() can be used to find the index of the maximum element. Example usage: find the first element that has a price > number in an array of price data.

4. Aggregations/Reductions Methods (e.g. mean, sum, std)

Compute mean                     ndarray1.mean() or np.mean(ndarray1)
Compute statistics over axis     ndarray1.mean(axis = 1)
                                 ndarray1.sum(axis = 0)

  Note: axis = 0 computes down the rows (one result per column); axis = 1 computes across the columns (one result per row).

5. Boolean arrays methods

Count # of Trues in boolean array    (ndarray1 > 0).sum()
If at least one value is True        ndarray1.any()
If all values are True               ndarray1.all()

  Note: These methods also work with non-boolean arrays, where non-zero elements evaluate to True.

6. Sorting

Inplace sorting                          ndarray1.sort()
Return a sorted copy instead of inplace  sorted1 = np.sort(ndarray1)

7. Set methods

Return sorted unique values                      np.unique(ndarray1)
Test membership of ndarray1 values in [2, 3, 6]  resultBooleanArray = np.in1d(ndarray1, [2, 3, 6])

* Other set methods: intersect1d(), union1d(), setdiff1d(), setxor1d()

8. Random number generation (np.random)
* Supplements the built-in Python random module with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions.
      samples = np.random.normal(size = (3, 3))
  Note: Python's built-in random ONLY samples one value at a time.

Created by Arianne Colton and Sean Chen
www.datasciencefree.com
Based on content from Python for Data Analysis by Wes McKinney
Updated: August 18, 2016
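
The array-creation and casting entries in the NDARRAY section can be tried out with a short sketch like the one below. The variable names and sample values are illustrative assumptions, not part of the cheat sheet. Note that multi-dimensional shapes are passed as a single tuple.

```python
import numpy as np

# Shapes with more than one dimension are passed as a tuple.
zeros = np.zeros(10)            # 1-D, ten 0.0 values
ones = np.ones((2, 3))          # 2-D, six 1.0 values
empty = np.empty((3, 4, 5))     # 3-D, contents are uninitialized
eye = np.eye(3)                 # 3 x 3 identity matrix
counting = np.arange(1, 10)     # 1, 2, ..., 9 (stop value excluded)

# Inspecting an array.
n_dims = empty.ndim             # number of dimensions
d1, d2, d3 = empty.shape        # size of each dimension
default_dtype = zeros.dtype     # float64 unless told otherwise

# Explicit casting returns a new array; float -> int truncates toward zero.
floats = np.array([1.7, -2.2, 3.9])     # hypothetical sample values
ints = floats.astype(np.int32)
```

Because astype() returns a new array, floats keeps its original values and dtype after the cast.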
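
The view-versus-copy rules from the SLICING section are easy to get wrong, so here is a minimal sketch that checks them. The array contents and variable names are assumptions chosen for illustration; np.shares_memory is used only to show whether two arrays share the same data.

```python
import numpy as np

ndarray1 = np.arange(10)                 # 0, 1, ..., 9

# A slice is a view: writing through it changes the original array.
view = ndarray1[2:6]
view[:] = 8
slice_is_view = np.shares_memory(view, ndarray1)

# An explicit copy is independent of the original.
copied = ndarray1[2:6].copy()
copied[:] = -1                           # ndarray1 is not affected

# Boolean and fancy indexing both return copies.
bool_selected = ndarray1[ndarray1 > 5]
fancy_selected = ndarray1[[3, 8, 4]]
bool_selected[:] = 100                   # ndarray1 is not affected
fancy_selected[:] = 100                  # ndarray1 is not affected
```

Only the write through view reaches ndarray1; the writes to copied, bool_selected and fancy_selected leave it unchanged.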
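
As a rough illustration of the np.where and argmax() usages listed under Vectorized expressions, the sketch below uses a made-up prices array and threshold (both are assumptions). One caveat worth keeping in mind: argmax() also returns 0 when no element satisfies the condition, so it helps to check cond.any() before trusting the index.

```python
import numpy as np

prices = np.array([9.5, 10.2, 8.7, 12.1, 11.3])   # hypothetical price data
threshold = 10.0                                   # hypothetical cutoff

# Element-wise "x if condition else y".
picked = np.where([True, False], [1, 2], [2, 3])

# Same-shape array of 1 / -1 flags.
signs = np.where(prices > threshold, 1, -1)

# Index of the first element above the threshold.
cond = prices > threshold
first_idx = np.where(cond, 1, 0).argmax()

# argmax() yields 0 for an all-False condition, so guard with any().
none_match = prices > 1000.0
ambiguous_idx = np.where(none_match, 1, 0).argmax()
has_match = none_match.any()
```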
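
The axis argument and the boolean array methods are often easier to follow on a small concrete matrix. The following is a sketch with an arbitrary 2 x 3 array (the values and names are assumptions); it also applies the boolean-mask assignment from the SLICING section to a copy so the original stays intact.

```python
import numpy as np

m = np.array([[1, -2, 3],
              [4, 5, -6]])       # hypothetical 2 x 3 data

col_sums = m.sum(axis=0)         # one result per column
row_means = m.mean(axis=1)       # one result per row

positive = m > 0                 # 2-D boolean array, same shape as m
n_positive = positive.sum()      # True counts as 1
any_positive = positive.any()
all_positive = positive.all()

# Boolean-mask assignment, done on a copy to keep m unchanged.
clipped = m.copy()
clipped[clipped < 0] = 0
```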
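
The sorting and set methods can be sketched as follows, again with made-up data. One substitution to be aware of: the membership test here uses np.isin, which recent NumPy releases recommend in place of the np.in1d call shown in the cheat sheet; for 1-D input the two are expected to give the same result.

```python
import numpy as np

ndarray1 = np.array([6, 2, 9, 2, 3])       # hypothetical data

sorted1 = np.sort(ndarray1)                # sorted copy; ndarray1 unchanged so far
unique_vals = np.unique(ndarray1)          # sorted, duplicates removed

# Membership test (np.isin swapped in for np.in1d, see note above).
is_member = np.isin(ndarray1, [2, 3, 6])

common = np.intersect1d(ndarray1, [2, 3, 7])
either = np.union1d(ndarray1, [2, 3, 7])
only_left = np.setdiff1d(ndarray1, [2, 3, 7])

original_order = ndarray1.tolist()         # snapshot before sorting in place
ndarray1.sort()                            # in-place sort, returns None
```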