Rapids Cheatsheet

This is a cheat sheet for RAPIDS, a GPU-accelerated machine learning and data science package developed by NVIDIA.

Uploaded by

Pratyaksh Singh

Cheat Sheet

www.RAPIDS.ai

TIDY DATA
A foundation for wrangling in pandas.

In a tidy data set, each variable is saved in its own column and each observation is saved in its own row.

Tidy data complements pandas' vectorized operations. pandas will automatically preserve observations as you manipulate variables. No other format works as intuitively.

INGESTING AND RESHAPING DATA
Change the layout of a data set.

gdf = cudf.read_csv(filename, delimiter=",", names=col_names, dtype=col_types)
    Read a CSV file into a GPU DataFrame.

gdf.sort_values('mpg')
    Order rows by values of a column (low to high).

gdf.sort_values('mpg', ascending=False)
    Order rows by values of a column (high to low).

df.rename(columns={'y': 'year'})
    Rename the columns of a DataFrame. (Planned for future release)

df.pivot(columns='var', values='val')
    Spread rows into columns. (Planned for future release)

gdf.sort_index()
    Sort the index of a DataFrame.

gdf.set_index()
    Return a new DataFrame with a new index.

gdf.drop_column('Length')
    Drop a column from a DataFrame.

cudf.concat([gdf1, gdf2])
    Append rows of DataFrames.

gdf.add_column('name', gdf1['name'])
    Append columns of DataFrames.
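The sorting and renaming calls above can be tried on the CPU with pandas, whose API cuDF mirrors; a minimal sketch with toy mileage data (the `mpg`/`year` values are made up, not from the sheet):

```python
import pandas as pd

# Toy data; with cuDF this would be a gdf built via cudf.read_csv.
df = pd.DataFrame({"mpg": [21.0, 33.9, 14.3], "year": [70, 74, 70]})

low_to_high = df.sort_values("mpg")                   # ascending by default
high_to_low = df.sort_values("mpg", ascending=False)  # descending

renamed = df.rename(columns={"year": "model_year"})   # rename a column

print(list(low_to_high["mpg"]))   # lowest mpg first
print(list(renamed.columns))
```

Note that `sort_values` and `rename` both return new frames rather than modifying `df` in place, which is what makes the method chaining shown later work.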

SYNTAX
Creating DataFrames.

gdf = cudf.DataFrame([
    ('a', [4, 5, 6]),
    ('b', [7, 8, 9]),
    ('c', [10, 11, 12])
])
    Specify values for each column.

gdf = cudf.DataFrame.from_records(
    [[4, 7, 10],
     [5, 8, 11],
     [6, 9, 12]],
    index=[1, 2, 3],
    columns=['a', 'b', 'c'])
    Specify values for each row.

METHOD CHAINING

Most pandas methods return a DataFrame, so another pandas method can be applied to the result. This improves code readability.

gdf = (cudf.from_pandas(df)
       .query('val >= 200')
       .nlargest(3, 'val'))

SUBSET OBSERVATIONS (ROWS)

gdf.query('Length > 7')
    Extract rows that meet logical criteria.

df.drop_duplicates()
    Remove duplicate rows (only considers columns).

df.sample(frac=0.5)
    Randomly select a fraction of rows. (Planned for future release)

df.sample(n=10)
    Randomly select n rows. (Planned for future release)

df.iloc[10:20]
    Select rows by position. (Planned for future release)

df.head(n)
    Select first n rows.

df.tail(n)
    Select last n rows.

gdf.nlargest(n, 'value')
    Select and order top n entries.

gdf.nsmallest(n, 'value')
    Select and order bottom n entries.

SUBSET VARIABLES (COLUMNS)

gdf[['width', 'length', 'species']]
    Select multiple columns with specific names.

gdf['width'] or gdf.width
    Select a single column with a specific name.

df.filter(regex='regex')
    Select columns whose names match a regular expression. (Planned for future release)

gdf.loc[2:5, ['x2', 'x4']]
    Select rows from index 2 to index 5 in columns 'x2' and 'x4'.

df.iloc[:, [1, 2, 5]]
    Select columns in positions 1, 2 and 5 (first column is 0). (Planned for future release)

df.loc[df['a'] > 10, ['a', 'c']]
    Select rows meeting a logical condition, and only the specified columns.

REGEX (REGULAR EXPRESSIONS) EXAMPLES

'\.'               Matches strings containing a period '.'
'Length$'          Matches strings ending with the word 'Length'
'^Sepal'           Matches strings beginning with the word 'Sepal'
'^x[1-5]$'         Matches strings beginning with 'x' and ending with 1, 2, 3, 4 or 5
'^(?!Species$).*'  Matches strings except the string 'Species'

LOGIC IN PYTHON (AND PANDAS)

<     Less than
>     Greater than
==    Equals
<=    Less than or equal to
>=    Greater than or equal to
!=    Not equal to
df.column.isin(values)    Group membership
pd.isnull(obj)            Is NaN
pd.notnull(obj)           Is not NaN
&, |, ~, ^, df.any(), df.all()    Logical and, or, not, xor, any, all
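The method-chaining pattern above can be sketched with pandas (cuDF would start from `cudf.from_pandas(df)` instead); the `val` numbers are toy data:

```python
import pandas as pd

df = pd.DataFrame({"val": [120, 250, 310, 90, 205]})

# Each method returns a new DataFrame, so calls chain left to right:
top = (df
       .query("val >= 200")   # keep only rows meeting the condition
       .nlargest(3, "val"))   # then take the three largest values

print(list(top["val"]))
```

Wrapping the chain in parentheses lets each step sit on its own line without backslash continuations.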
.nlargest(‘val’,3)
SUMMARIZE DATA

gdf['w'].value_counts()
    Count the number of rows with each unique value of a variable.

len(gdf)
    Number of rows in a DataFrame.

gdf['w'].unique_count()
    Number of distinct values in a column.

df.describe()
    Basic descriptive statistics for each column (or GroupBy). (Planned for future release)

cuDF provides a set of summary functions that operate on different kinds of objects (DataFrame columns, Series, GroupBy) and produce a single value for each of the groups. When applied to a DataFrame, the result is returned as a Series with one entry per column. Examples:

sum()                     Sum values of each object.
count()                   Count non-NA/null values of each object.
median()                  Median value of each object.
quantile([0.25, 0.75])    Quantiles of each object.
applymap(function)        Apply function to each object.
min()                     Minimum value in each object.
max()                     Maximum value in each object.
mean()                    Mean value of each object.
var()                     Variance of each object.
std()                     Standard deviation of each object.

HANDLING MISSING DATA

df.dropna()
    Drop rows with any column having NA/null data. (Planned for future release)

gdf['length'].fillna(value)
    Replace all NA/null data with value.

MAKE NEW COLUMNS

df.assign(Area=lambda df: df.Length * df.Height)
    Compute and append one or more new columns. (Planned for future release)

gdf['Volume'] = gdf.Length * gdf.Height * gdf.Depth
    Add a single column.

pd.qcut(df.col, n, labels=False)
    Bin a column into n buckets. (Planned for future release)

cuDF provides a large set of vector functions that operate on all columns of a DataFrame or on a single selected column (a cuDF Series). These functions produce a vector of values for each column, or a single Series for an individual Series. Examples:

max(axis=1)                  Element-wise max.
min(axis=1)                  Element-wise min.
clip(lower=-10, upper=10)    Trim values at input thresholds. (Planned for future release)
abs()                        Absolute value.

COMBINE DATA SETS: STANDARD JOINS

gdf1.merge(gdf2, how='left', on='x1')
    Join matching rows from gdf2 to gdf1.

gdf1.merge(gdf2, how='right', on='x1')
    Join matching rows from gdf1 to gdf2.

gdf1.merge(gdf2, how='inner', on='x1')
    Join data. Retain only rows that appear in both sets.

gdf1.merge(gdf2, how='outer', on='x1')
    Join data. Retain all values, all rows.
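The four join variants behave the same way in pandas, which cuDF mirrors; a sketch with two toy frames shaped like the x1/x2/x3 tables in the sheet:

```python
import pandas as pd

# Toy frames: gdf1 has keys A, B, C; gdf2 has keys A, B, D.
gdf1 = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
gdf2 = pd.DataFrame({"x1": ["A", "B", "D"], "x3": ["T", "F", "T"]})

left = gdf1.merge(gdf2, how="left", on="x1")    # A, B, C; C gets NaN for x3
inner = gdf1.merge(gdf2, how="inner", on="x1")  # only A, B
outer = gdf1.merge(gdf2, how="outer", on="x1")  # A, B, C, D

print(list(left["x1"]), list(inner["x1"]), sorted(outer["x1"]))
```

The unmatched key on either side is what distinguishes the variants: `left` keeps C with a null `x3`, `inner` drops it, and `outer` keeps both C and D.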
Define a kernel function to compute new columns row by row:

def kernel(in1, in2, in3, out1, out2, extra1, extra2):
    for i, (x, y, z) in enumerate(zip(in1, in2, in3)):
        out1[i] = extra2 * x - extra1 * y
        out2[i] = y - extra1 * z

Call the kernel with apply_rows:

outdf = gdf.apply_rows(kernel,
                       incols=['in1', 'in2', 'in3'],
                       outcols=dict(out1=np.float64,
                                    out2=np.float64),
                       kwargs=dict(extra1=2.3, extra2=3.4))

GROUP DATA

gdf.groupby('col')
    Return a GroupBy object, grouped by values in the column named 'col'.

df.groupby(level='ind')
    Return a GroupBy object, grouped by values in the index level named 'ind'. (Planned for future release)

FILTERING JOINS

adf[adf.x1.isin(bdf.x1)]
    All rows in adf that have a match in bdf. (Planned for future release)

adf[~adf.x1.isin(bdf.x1)]
    All rows in adf that do not have a match in bdf.
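GroupBy usage can be sketched with pandas, whose grouping API cuDF follows; the `col`/`val` data below is a toy example, not from the sheet:

```python
import pandas as pd

df = pd.DataFrame({"col": ["a", "a", "b"], "val": [1, 2, 10]})

grouped = df.groupby("col")
sums = grouped["val"].sum()       # one total per group
means = grouped["val"].agg("mean")  # agg(function), as listed above
sizes = grouped.size()            # rows per group

print(dict(sums))
```

Summary functions applied to a GroupBy return one value per group, which is the behavior the "applied to a group" note below describes.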
WINDOWS

df.expanding()
    Return an Expanding object allowing summary functions to be applied cumulatively. (Planned for future release)

df.rolling(n)
    Return a Rolling object allowing summary functions to be applied to windows of length n. (Planned for future release)

All of the summary functions listed above can be applied to a group. Additional GroupBy functions:

size()           Size of each group.
agg(function)    Aggregate a group using function.

The examples below can also be applied to groups. In this case, the function is applied on a per-group basis, and the returned vectors are of the length of the original DataFrame:

shift(1)                Copy with values shifted by 1.
shift(-1)               Copy with values lagged by 1.
rank(method='dense')    Ranks with no gaps.
rank(method='min')      Ranks. Ties get min rank.
rank(pct=True)          Ranks rescaled to the interval [0, 1].
rank(method='first')    Ranks. Ties go to first value.
cumsum()                Cumulative sum.
cummax()                Cumulative max.
cummin()                Cumulative min.
cumprod()               Cumulative product.
(Some of these are planned for a future release.)

SET-LIKE OPERATIONS

gdf1.merge(gdf2, how='inner')
    Rows that appear in both gdf1 and gdf2 (intersection).

gdf1.merge(gdf2, how='outer')
    Rows that appear in either or both gdf1 and gdf2 (union).

pd.merge(gdf1, gdf2, how='outer', indicator=True)
  .query('_merge == "left_only"')
  .drop(columns=['_merge'])
    Rows that appear in gdf1 but not in gdf2 (set difference). (Planned for future release)

ONE-HOT ENCODING

cuDF can easily convert pandas category data types into one-hot encoded (dummy) variables:

pet_owner = [1, 2, 3, 4, 5]
pet_type = ['fish', 'dog', 'fish', 'bird', 'fish']
df = pd.DataFrame({'pet_owner': pet_owner, 'pet_type': pet_type})
df.pet_type = df.pet_type.astype('category')
my_gdf = cudf.DataFrame.from_pandas(df)
my_gdf['pet_codes'] = my_gdf.pet_type.cat.codes
codes = my_gdf.pet_codes.unique()
enc_gdf = my_gdf.one_hot_encoding('pet_codes', 'pet_dummy', codes)
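The one-hot result can be previewed on the CPU with `pd.get_dummies`, the pandas analogue of the cuDF `one_hot_encoding` call shown above, using the same toy pet data:

```python
import pandas as pd

pet_owner = [1, 2, 3, 4, 5]
pet_type = ["fish", "dog", "fish", "bird", "fish"]
df = pd.DataFrame({"pet_owner": pet_owner, "pet_type": pet_type})
df.pet_type = df.pet_type.astype("category")

# One indicator column per category; dtype=int gives 0/1 instead of booleans.
dummies = pd.get_dummies(df.pet_type, prefix="pet", dtype=int)
print(list(dummies.columns))
```

Each row gets a 1 in exactly one indicator column, matching the `pet_dummy` columns the cuDF call would add.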

This cheat sheet was inspired by the RStudio Data Wrangling Cheat Sheet (https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf). Written by Irv Lustig, Princeton Consultants.
