[go: up one dir, main page]

0% found this document useful (0 votes)
186 views41 pages

Keyrus - 2021 Alteryx Designer Onboarding

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
186 views41 pages

Keyrus - 2021 Alteryx Designer Onboarding

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 41

WORKSHOP ONBOARDING

October - 2021

© Copyright 2021 – Keyrus 1


SESSION 1
• Getting Started
• Connecting to Data
• Data Preparation
ABOUT ALTERYX

" Alteryx is a drag and drop environment


that puts advanced data analysis tools in
reach of those who are not specialists "

3
DATA PREPARATION

Connecting
Formatting & Transform Output Data
The ability to connect to
The ability to change how the Output data in different
multiple data sources in
data is presented. formats
multiple data format

Cleansing & Profiling Combining


The ability to profile data The ability to combina and
and detect formatting and enrich data
misleading data issues

4
ALTERYX INTERFACE
The Alteryx user interface consists of several components that can be moved, re-sized, reorganized, or closed

1 2
1. Main menu 3
2. Search bar
3. Tool palette
4. Configuration window 5
4
5. Canvas
6. Results window

5
TOOL CATEGORIES

The Tools palette consists of tools organized into categories

The most common tools used to create workflows are in the favorites category

6
CONNECT, SELECT, AND MAP DATA TYPES

• The Input Data tool will be the one that allows you to connect to the data

•The Browse tool will be the one that allows you to look at preliminary results as you develop a workflow

• Alteryx classifies the data into several types that allow you to know the characteristics of the data, for
example, number or text.

• The tool Autofield allows you to "refine" the type of data according to the content that each field has.

• Users can also view or change the data type by connecting Tool Select to their dataset.

• There are approximately 15 different data types grouped into the following main groups: Boolean,
numeric, string, date/time, and spatial object

7
HOW TO ORGANIZE THE DATA USING SELECT TOOL

• Fields can be reordered using the UP and DOWN arrow or using the Sort or Move functions available

through the Options menu.

•You can de-select fields using this too. Fields that are Removed will be ignored by any additional functions

connected to the workflow

• With the rename property users will have the ability to rename the fields that appear in the output window

•It is possible to include descriptions of fields that are only visible in the Select Tool configuration window.

8
EXERCISE 1A: INPUT DATA AND SET FIELD TYPES

Use Input Data to connect to Dataset.csv


Use Select to rename fields as ‘Product Name’ and ‘Postcode’
Use Auto Field to set all field types
Add a Browse Tool to view results

9
RECORD ID Y CROSS TAB

•If your dataset does not have a unique identifier, the Record ID tool will take care of

this by creating a column of sequential integers.

•The Cross Tab tool is similar to the Pivot Table function in Excel, allowing users many

different ways to combine, manipulate, and display data

10
EXERCISE 1B: CALCULATE TOTAL BY DAY

Use Record ID to assign a unique identifier to each record


Use Cross Tab to calculate the Sum of Sales by Product for each Date

11
TOOLS FILTER Y SUMMARIZE
•The Filter tool splits data into two separate sets based on user criteria, True and False
• The True set consists of all data that match user criteria

• The False set consists of all data that does not match user criteria

•The Summarize tool aggregates data using a variety of actions, such as Group By,
Average, Sum, Count, etc.

•The fields that are shown as a result of the Summarize action are the only fields that
will be brought forward in the workflow

12
EJERCICIO 1C: FILTER Y SUMMARIZE

Use Summarize to find out how many rows relate to Tridesta


Use Filter to remove all Tridesta rows

13
FORMATTING DATE VALUES

•The DateTime tool provides a quick way of extracting a DateTime value from a String field

•Dates are properly represented in Alteryx as YYYY-MM-DD

•The DateTime tool must be configured with the format of the incoming date string

14
EXERCISE 1D: PARSE DATES

Use DateTime Parse to convert the date from MM/dd/yyyy

Add a Select tool to remove the old Date field and rename DateTime_Out as Date

15
TOOL OUTPUT DATA

•The Tool Output Data will be the one that allows to generate an
output of my already prepared data.

•It offers different types of output: files, writes to database tables, etc.

•It can be configured depending on the type of output format.

16
RESULTS WINDOW

•Errors inform users when a tool is not configured correctly

•Conversion Errors inform users when data exceeds files limits or cannot be
converted correctly

•Warnings will occur in a variety of circumstances, such as when a field is missing or


computer processing power is low

•Messages provide general feedback on the workflow.

•Files displays all files linked to, in, or from the workflow.

17
ANNOTATIONS, COMMENTS AND CONTAINERS

•The Annotation icon in the configuration window allows users to rename individual tools

•The Comment tool allows users to place free text on the canvas

•A Container can be used to package workflows or portions of a workflow to organize the


canvas

18
EXERCISE 1E – OUTPUT DATA AND TIDY WORKFLOW

Add Annotations to each tool

Add Tool Containers to your workflow

Add an Output Data tool to create the file Dataset.yxdb

19
TEXT TO COLUMNS

Use Text To Columns to split delimited data into separate fields

It is possible to indicate more than one delimiter but they will be


treated as a unique character

20
EXERCISE 2

Open Exercise 2 Start File.yxmd


Use Alteryx to strip out the delimiters as superfluous and format the data as represented in the
output

From:

To:

21
SESSION 2 •

Data Cleansing
Union, Join & Append
DATA CLEANSING

•The data cleansing tool performs basic data cleansing operations such as replacing null
values and removing punctuation

•Use Data Profiling in the Browse tool to check if your data is clean

23
EXERSICE 3: DATA CLEANSING

•Connect to Messy Data.csv


•Use Data Profiling to locate messy data
•Use the Data Cleansing tool to clean the messy data

24
UNION JOINS

•The Union tool is used for merging files that have similar structures

•Data from input files will be merged vertically so that data from like fields will be stacked on top
of each other

•Data will be merged in the order that input datasets are connected to the Union Join tool as a
default

25
EXERCISE 4: COMBINING FILES WITH A UNION JOIN

Use 3 Input tools to connect to the CSV files in ‘Exercise 4’

Add a Union tool to combine the files

Bonus: Use a wildcard in the Input Tool to simplify the process


26
JOIN DATA
•The Join tool can only accept two inputs
•Data is merged based on matches in one or fields as specified by the user
•The Join tool will produce three output datasets
•The L and R outputs consist of entries that are unique to the correlating input
datasets
•The J output consists of entries that in both the L and R datasets. This is
known as the Inner Join

27
BASIC JOIN TYPES

28
EXERCISE 5: INNER JOIN

Connect to the Customers dataset


Join the combined Sales data and Customer data using a Join tool,
filtering null values where appropriate

29
EXCERCISE 6: JOINS
Connect to Customers, Sales and Events files
Build a dataset of those customers (whether or not they have bought tickets for
events) adding information about the events of their tickets.

30
APPENDING DATA

•The Append Fields tool allows users to merge datasets that do not have any common fields

•The Append Fields tool can only accept two inputs – a Target and a Source

•Data from the Source dataset will be brought onto the Target dataset

31
EXERCISE 7: APPENDING DATA
Calculate the total sales per product and include a column containing the total sales
reported in the source file.

32
EXERCISE 8: RECREATE JOIN TYPES

•Open Exercise 8 Start File.yxmd


•Use the provided input files to recreate each of the outputs
• Inner Join
• Left Join
• Right Join
• Left Outer Join
• Right Outer Join
• Full Join
• Outer Join
•Hint: use both Join and Union

33
SESSION 3
• Formula Tool
• Conditional Functions
• String Functions
FORMULA TOOL

•The Formula tool allows users to apply formulas to existing datasets

•Alteryx has many built-in standard expressions to aid formula creation

•Formula results can either go into a new column or overwrite existing data

•The Data Preview box shows a preview output, allowing users to see if the formula is correctly
constructed

•We will focus on two types of formula: Conditional functions and String functions

35
CONDITIONAL FUNCTIONS

•Conditional functions allow us to make changes to our data based on logical conditions

•These are expressed using IF, THEN, ELSE

•Conditional functions always begin with IF and finish with ENDIF

•We can add comments by prefixing them with //

36
EXERCISE 9: CONDITIONAL FUNCTIONS

•Connect to the file Iris_data.csv


•Use a formula tool to add a new field called Units
•If the Class is Setosa units should be inches, everything else is cm
•Use a second formula tool to add ”Length in cm” by multiplying inches

37
STRING FUNCTIONS

•String functions allows us to make changes to or find information within string fields

•String functions are often referred to as ”parsing”. Parsing data is simply the process of
separating data into specific components

•Datasets that are not consistent, including multiple separator types and/or components of
varying lengths, should be parsed using expressions in the Formula tool

38
EXERCISE 10: STRING FUNCTIONS

The Input contains dates formatted as year, month, day where the first character
details if the year begins with 19 or 20.

It is 19 when the first character is 0 and 20 when the character is 1.

The remainder of the date following the 0 or 1 is the remaining year digits followed
by month followed by day.

1 = 2000

0 = 1900

39
EXERCISE 11: POPULAR BABY NAMES

• Given the provided dataset, determine the most popular names for Males and
Females for each available year.

•The column "Field_1" contains three concatenated values: the name, the associated
gender (Male or Female) and the number of occurrences that the name appeared in
birth records.

•The column "FileName" contains the name of the file in which the record is found;
the data was read in from a zip file that contained text files for each year (1880-2017)
of records.

40
HINT: YOU WILL NEED…

•Formula Tool
•Text to Columns Tool
•Auto Field Tool
•Summarize Tool
•Join Tool(s)
•Select Tool(s)

You might also like