0% found this document useful (0 votes)

5 views17 pages

Summary - Advanced Power Query and Data Modelling-4142

Uploaded by

Rana

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

5 views17 pages

Summary - Advanced Power Query and Data Modelling-4142

Uploaded by

Rana

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 17

Summary: Advanced Power Query and Data Modelling

SESSION OVERVIEW:

By the end of this session, students will be able to:

● Handle missing values and errors using Power Query Editor
● Merge and append tables in Power Query Editor
● Understand major data models in Power BI and the need of data models
● Define relationships in data models and the cardinality

KEY TOPICS AND EXAMPLES:

1. Power Query - 3: Handling missing values/errors and

merging/appending tables

Many times situations arise that require us to handle missing values, errors and outliers.
Power Query editor has numerous tools that enable us to handle such situations.

a. Handling Missing Values

Power BI automatically assigns a ‘null’ value to all columns wherever the data is
missing. We can also see the number of null/empty values through column profiling.
This is the green bar below each column that shows how many values are valid, have
errors or are empty.

Note: By default, Power Query performs data profiling over the first 1,000 rows of
our dataset. If we want it to operate over the entire data set, select the Column
Profiling based on the top 1000 rows button in the lower part of your editor to
change the column profiling to Column Profiling based on entire data set.

Example: This can be seen in the sales dataset provided. The Customer Region
column has 35% empty values. On hovering over the column profile, it shows the
number (and %) of empty values and provides an option to remove empty values, as
shown below.

1
How do we handle such missing values?
Here are some ways to do them:
i. Remove Empty: This option that is provided by the column profiling bar
will simply filter out all rows which have empty values in that particular
column. This is usually not desirable as in this case we lose the information
stored in the other columns for that filtered out row.
ii. Replace Values: Another way to handle missing values is using the Replace
Values option. We can replace missing values with a dummy value (e.g. 0) or
NA.
Example: This is shown for the Sales Dataset below, for the Customer
Region column. We can replace the ‘null’ values by “Not Defined”.

Note: The value with which we are replacing must be of the same data type as the
value we intend to replace. For example if we are replacing null values in a number
data type column with the String value “NA” then it will throw an error. To handle

2
such errors it is important to first convert the column data type to the required data
type (in this case String) and then use the “Replace Values” operation.

iii. Conditional Column: We can also create a conditional column that returns a
standard value or a value from another column in case of null value. In all
other cases, it returns the non-null value from the column.

Example: This can be seen below for the dataset. If customer age is greater
than 30 then it returns “>30”, if it is less than or equal to 30, it returns “<=30”
else it returns null.

b. Handling Errors

In certain situations we get errors in our columns when Power BI is not able to
recognise the format of a value. For example, in a number column, if one row has text
then Power Query will show an error in that cell.

Whenever we see an error in a column, the column profile bar will change colour and
on hovering over the bar, we can see the exact metrics: number of errors.
Example: In the Sales Dataset, we can see the column profile for the Customer Age
column.

3
There are multiple ways to handle errors:

i. Remove Errors: Power Query also provides an option to Remove Errors. It
depends on stakeholders if this approach can be used, but in general this
approach is not recommended as it removes the entire row and hence there is
a loss of data from other columns for that row.

There are 2 ways to remove errors.

1. Column Profile bar shows an option to Remove Errors (as shown in
image above).
2. Right click on column → Remove Errors

ii. Replace Errors: We can replace all the errors in a column by a fixed value
(0/1/NA).
Example: In this Sales dataset, we can see that one of the values in the
customer age column is a data entry error. ‘Abc’ is mentioned as customer
age instead of a numerical value. One of the ways in which we can handle
this value is by right clicking on the column → Replace Errors → Enter
34.36 as it is the average age (we can also replace by median or any other
suitable value).

Note: Here, if we try to replace this value by NA or any other string value, it
will throw an error as shown below. This is because Customer Age is a
numerical column and it accepts only numbers. A replacement of any
numerical value cannot be a string/text.

4
c. Merging Tables
When we have data in two tables and we want to bring columns from one table to the
other, we may need to merge tables. Merging tables functions like Excel VLOOKUP
or SQL joins. This can be done using the Merge Queries option in the Combine
group in the Home tab.

Example: In the example below, there are two tables:

- Sales: The CountryID field is an identifier from the Countries table.
- Countries: This table contains the CountryID and the country name.
We can join these tables using the column column, in this case Country ID.

5
There are multiple types of joins that Power BI provides:
● Left outer: Returns all rows from the left table and matching rows from the right
table.
● Right outer: Returns all rows from the right table and matching rows from the left
table.
● Full outer: Returns all rows from both the tables.
● Inner: Returns only matching rows from both the left and the right tables.
● Left anti: Returns only the rows from the left table that do not match with rows from
the right table.
● Right anti: Returns only the rows from the right table that do not match with rows
from the left table.

6
Example: To practice merging tables, this dataset can be used. It has two sheets: Sales and
Customers. These steps need to be followed to merge these tables:
1. Go to Merge Queries
2. Select the Customers table from the dropdown menu in the Merge window
3. Select the customer ID column in both tables
4. click OK

5. It shows an additional column: Customers (shown below). Click on the icon on the top right
of the Customers column.

6. Select only the Customer Name column from the dropdown list. Make sure the Expand option
is selected instead of aggregate.
a. Expand: When you choose "Expand," Power BI brings in all individual rows from
the merged table and creates a new column within your existing table.

7
b. Aggregate: Choosing "Aggregate" allows you to summarize the data from the
merged table before incorporating it into your main table. You can choose various
aggregation functions like SUM, and COUNT

7. On clicking OK, the Customers.Customer Name column appears in your existing data. In this
manner we have merged both tables.

d. Appending Queries
If we have multiple sheets or datasets in the same format and they need to be placed
on top of each other, we use the Append functionality. Key salient features include:
● The data type of each column and the order of columns should be the same in
the tables that are to be appended.

8
● It is similar to the UNION functionality in SQL.
● The append dialogue box presents 2 options:
○ Append Queries: This is the default option, allowing you to select
multiple queries (tables) you want to combine.
○ Append Queries as New: Choosing this option creates a new query
with the appended data, keeping the original tables intact.

For example: Get data from a new data source: Sales - Part 2 Excel file and use the
append operation on the Sales table in the Sales dataset. This is demonstrated below.

Step 1: In the Power Query Editor, Get data from Sales - Part 2 Excel file.

Step 2: In the Sales dataset, click on Append queries. In the Append window, select
the Sales - Part 2 dataset as shown below.

9
Step 3: As can be seen in this image, the queries (tables) have been appended.

2. Models in Power BI

Data Modelling in Power BI enables us to connect multiple data sources using relationships.
Relationships define how data sources are connected to each other. Note: In Excel, Data
Modelling is called Power Pivot.

Types of models in Power BI:

A. Star Schema
a. It is a design approach that is widely used by data engineers as it presents a
user-friendly structure and high performance analytics queries.
b. It classifies model tables as fact or dimension. A fact table forms the center of the star
while dimension tables form the points of the star.
c. Fact Table: It stores events or observations. It includes dimension key columns and
numeric measure columns that can be summarised. The main goal of a fact table is
to summarise.
d. Dimension Table: It stores business entities. It includes a key column and
descriptive columns for filtering and grouping.
e. Dimension tables are usually related to fact tables by using one-to-many
relationships. This allows filters applied to dimension table columns to propagate to
the fact table. We can hence directly apply the filters on the fact table to get the
desired output.

10
Note: Star schema is an example of a type of dimensional data modelling. Another type of
dimensional modelling type is the Snowflake schema.

The above model comprises 7 tables, the central one of which is the fact table (Sales) and the
rest 6 are dimension tables:
● Sales Order
● Reseller
● Customer
● Sales Territory
● Date
● Product

B. ER Model
It represents entities and relationships in a diagrammatic form. This model is widely used in
relational databases. Entities are represented by tables, attributes by columns and relationships
by lines connecting entities.

11
Phases of a Data Model:
1. Conceptual data model: It defines the overall structure of the business and the data. We
define at the table level and data level (entities and entity relationships). For instance, you
may have customer, employee, and product data.
2. Logical data model: Here we look at specific attributes within the entities: the
columns/attributes of a table. For instance, Customer A buys Product B from Sales Associate
C.
3. Physical data model: Here we look at the data type and size of the attributes of the table. It
is the actual implementation of the data model.

Need of a Good Data Model in Power BI:

● Saves time, effort and money
● Increases performance and speed of analytics queries
● Reduces errors
● Maintains quality of data

Example of Data Model:

Use the Sales dataset, Product Information dataset and Sales Detail dataset for implementing a data
model practically in Power BI.

12
Once you import the datasets in Power BI, it will automatically detect the relationships and create a
star schema model, as shown below. The central table is the fact table and the rest are dimension
tables.

Power BI has automatically identified the common column: Customer ID from the four tables and has
created all necessary relationships
● Sales table in the Sales dataset
● Customers table in the Sales dataset
● Product Information table
● Sales Detail table

3. Relationships and Cardinality

● After we have defined the model, as explained in the previous section, we need to define
relationships among our data sources. In the Models section of Power BI Desktop, we can see
the relationship between our data sources. Power BI tries to automatically detect the
relationship between different data sources. These relationships are defined using columns in
the dataset.

● We can also define a new relationship. To create a new relationship, we need to drag a column
name from one data source to the respective column in the other data source. Another way to
do this is to go to the Manage Relationships option under the Home tab and click on the New
button. This is shown in the image below:

13
● We can change an existing relationship below selecting a relationship and clicking on Edit to
edit it. This is shown in the image below:

● The relationship view can also be used to hide a column in the report. Right click the column
name in the data source and select “Hide in report view”.

Example: In our dataset, we have a relationship between Sales and customers through Customer ID
column, as shown below.

14
Cardinality:
Each model relationship has a particular cardinality associated with it. Cardinality refers to the
relationship or the number of distinct values in the relationship. It determines how rows in one table
relate to rows in another table based on specific columns. “One” means the column has unique values
while “many” means that the column can contain duplicate values. The four types of cardinality are:
● One-to-many (1:*)
● Many-to-one (*:1)
● One-to-one (1:1)
● Many-to-many (*:*)
Power BI detects the best cardinality for each relationship in your data model. You can also change
this in the data model.

Note: Many to many relationships are not recommended. This is due to the following reasons:
● The model doesn’t provide flexibility in the ways you use the filter or group functionalities.
● It also makes it difficult to create valid calculations and aggregations, slows down queries,
and impacts overall report performance.
● It can lead to unpredictable results, and lack of normalization.

Hence, many to many relationships should be avoided. It should be noted that the cardinality defaults
to many-to-many when it determines neither table contains unique values for the relationship
columns.

Example: In our dataset, Power BI has intelligently identified a one-to-one cardinality on the
Customer ID column in Sales sheet as well as Customers sheet.

A. Direction of Data Flow

Relationships are defined using a cross filter direction, which determines the direction that
filters will propagate. There are 2 types of cross filter directions:

15
Single
Only one table in the relationship can filter data across to the other table. If we apply filters in the
other table, they are not applied in the first table.

Bi-directional
Data can be filtered in either direction. If we filter data in the first table, the other table also gets
filtered, and vice versa. This type is not recommended as it can create ambiguous filter paths.

As can be seen from the image below, Power BI has automatically classified some relationships as
single directional and some as bi-directional.

B. Active/Inactive Relationships

As can be seen from the image above, we have an option (toggle) to make a relationship
active/inactive in the Data Modelling window. What if you need more than one connection
between tables? This is where active and inactive relationships come in.

Active Relationships: The Workhorses:

An active relationship is the default connection between two tables. It's symbolized by a solid
line in the Power BI model view.
● When you filter data in one table, the filter automatically propagates to the connected
table through the active relationship. This allows you to explore related data
seamlessly.
● Each table can only have one active relationship with another table. This prevents
ambiguity and ensures clear filter flow.

16
Inactive Relationships: Hidden Paths:
Inactive relationships are additional connections you create between tables, represented by
dashed lines.
● They don't participate in automatic filter propagation.
● The power of inactive relationships lies in their controlled activation.
● You can use the USERELATIONSHIP function in DAX expressions to activate a
specific inactive relationship for calculations or custom visuals.

Why Use Inactive Relationships?

Imagine a "Customers" table with a "Customer Type" column. You can have separate inactive
relationships for "Customer Type" connecting to a "Sales" table and a "Support Tickets" table,
allowing for role-based analysis.

Active relationships are the backbone for filter propagation in your Power BI model. But don't
underestimate the power of inactive relationships. They offer flexibility and control for
complex data analysis scenarios, especially when dealing with multiple relationships or
role-playing dimensions.

Session 4 - Details
No ratings yet
Session 4 - Details
18 pages
Daily Running Notes - Power Query
No ratings yet
Daily Running Notes - Power Query
21 pages
PBI 1.2 (Final)
No ratings yet
PBI 1.2 (Final)
27 pages
Basic To Advanced Excel Session 6
No ratings yet
Basic To Advanced Excel Session 6
10 pages
Master Advanced MS Excel - 25 Key Interview Questions
No ratings yet
Master Advanced MS Excel - 25 Key Interview Questions
30 pages
POWER BI - Participant
No ratings yet
POWER BI - Participant
35 pages
Dta102 - Bid102 - Day 2
No ratings yet
Dta102 - Bid102 - Day 2
21 pages
Lecture Sheet-Power Query
No ratings yet
Lecture Sheet-Power Query
17 pages
Excel Power Query Course Notes: Leila Gharani (Microsoft Excel MVP)
100% (2)
Excel Power Query Course Notes: Leila Gharani (Microsoft Excel MVP)
43 pages
Data Analysis Using Excel Handouts
No ratings yet
Data Analysis Using Excel Handouts
95 pages
Lab 02 Manual
No ratings yet
Lab 02 Manual
9 pages
pl300 240727 164137
No ratings yet
pl300 240727 164137
39 pages
powerBI - Lesson2
No ratings yet
powerBI - Lesson2
33 pages
LQ UZO TQjav7mTv7fI2Zg C3W1-Quick-Reference
No ratings yet
LQ UZO TQjav7mTv7fI2Zg C3W1-Quick-Reference
1 page
Real Power BI Interview Questions
No ratings yet
Real Power BI Interview Questions
21 pages
PowerBi VideoNotes PDF
No ratings yet
PowerBi VideoNotes PDF
9 pages
Power Query of Excel 13
No ratings yet
Power Query of Excel 13
4 pages
Day-2 Power Bi
No ratings yet
Day-2 Power Bi
31 pages
Excel BI Tools for Data Analysts
No ratings yet
Excel BI Tools for Data Analysts
29 pages
Data Analytics With Financial Accounting Information: Winter 2022 Session 4
No ratings yet
Data Analytics With Financial Accounting Information: Winter 2022 Session 4
36 pages
Power Pivot & Query Essentials
No ratings yet
Power Pivot & Query Essentials
19 pages
Advanced Power Query
No ratings yet
Advanced Power Query
90 pages
Week 2 Stats
No ratings yet
Week 2 Stats
46 pages
Power BI Data Analyst
No ratings yet
Power BI Data Analyst
51 pages
in Built Row and Column Transformations
No ratings yet
in Built Row and Column Transformations
11 pages
BSYS - Final Notes
No ratings yet
BSYS - Final Notes
10 pages
Power Query ETL Guide for Power BI
No ratings yet
Power Query ETL Guide for Power BI
22 pages
Power BI Interview Questions and Answers
No ratings yet
Power BI Interview Questions and Answers
7 pages
in Built Row and Column Transformations
No ratings yet
in Built Row and Column Transformations
11 pages
Power Query Transforming Data in Power BI Like A Pro
No ratings yet
Power Query Transforming Data in Power BI Like A Pro
31 pages
Notes BAE
No ratings yet
Notes BAE
9 pages
Power BI Data Analysis Guide
No ratings yet
Power BI Data Analysis Guide
24 pages
Course Content - SQL MSPBI
No ratings yet
Course Content - SQL MSPBI
12 pages
Introduction To Power Query Week 2
No ratings yet
Introduction To Power Query Week 2
35 pages
Automate Bank Reconciliations With Excel Power Quer1
No ratings yet
Automate Bank Reconciliations With Excel Power Quer1
5 pages
Power BI Unit-1 Complete Study Guide
No ratings yet
Power BI Unit-1 Complete Study Guide
5 pages
Excel PowerQuery PowerPivot DAX
No ratings yet
Excel PowerQuery PowerPivot DAX
105 pages
Power+BI Cheat+Sheet
No ratings yet
Power+BI Cheat+Sheet
1 page
Power BI Blog 4
No ratings yet
Power BI Blog 4
3 pages
Module 2 - Connecting & Shaping Data
No ratings yet
Module 2 - Connecting & Shaping Data
25 pages
Power Query s1
No ratings yet
Power Query s1
9 pages
Data Cleaning in Power BI
100% (1)
Data Cleaning in Power BI
9 pages
Data Transformation in Excel
No ratings yet
Data Transformation in Excel
5 pages
Wa0001.
No ratings yet
Wa0001.
24 pages
Power BI Syllabus
No ratings yet
Power BI Syllabus
12 pages
PowerBI Detailed Data Preparation Notes
No ratings yet
PowerBI Detailed Data Preparation Notes
3 pages
UNIT IV Data Cleaning Techhniques
No ratings yet
UNIT IV Data Cleaning Techhniques
43 pages
Power BI Curriculum by Laxman
No ratings yet
Power BI Curriculum by Laxman
8 pages
Data Analyst
No ratings yet
Data Analyst
13 pages
Data Cleaning and Formatting in Power BI (Slides)
No ratings yet
Data Cleaning and Formatting in Power BI (Slides)
12 pages
DS41 Power BI Technical Live Session
No ratings yet
DS41 Power BI Technical Live Session
14 pages
Manual Transforming Data Into Intelligence With Power BI and AI Integration
No ratings yet
Manual Transforming Data Into Intelligence With Power BI and AI Integration
131 pages
PowerQuery PowerPivot DAX
100% (6)
PowerQuery PowerPivot DAX
114 pages
Data Cleaning in Excel
No ratings yet
Data Cleaning in Excel
4 pages
PowerBI Excel Import Practical NO 3
No ratings yet
PowerBI Excel Import Practical NO 3
3 pages
Powerquery Tip Card
No ratings yet
Powerquery Tip Card
1 page
Macaw Power BI Cheat Sheet EN
No ratings yet
Macaw Power BI Cheat Sheet EN
2 pages
Radiomuseum Mitsubishi Electric 8 X 584 A 543720
No ratings yet
Radiomuseum Mitsubishi Electric 8 X 584 A 543720
2 pages
Shariah Compliant Mutual Funds
No ratings yet
Shariah Compliant Mutual Funds
6 pages
Simulation and Vibration Analysis of Shaft Cracks
No ratings yet
Simulation and Vibration Analysis of Shaft Cracks
7 pages
IE506 Assignment1
No ratings yet
IE506 Assignment1
2 pages
RESEARCH MANUSCRIPT - Group 4
No ratings yet
RESEARCH MANUSCRIPT - Group 4
53 pages
Thesis SW
No ratings yet
Thesis SW
17 pages
Chapter 8 Requirements Documentation
No ratings yet
Chapter 8 Requirements Documentation
42 pages
Case-4-Fvclu-Vs-Sanama 2
No ratings yet
Case-4-Fvclu-Vs-Sanama 2
3 pages
Is 13730 0 6 2012
No ratings yet
Is 13730 0 6 2012
21 pages
50154-50189 Re SSC 10th Class Social Study Materialcompressed
No ratings yet
50154-50189 Re SSC 10th Class Social Study Materialcompressed
99 pages
Student Result System SRS
No ratings yet
Student Result System SRS
14 pages
Can Bus Protocol Description 220419
No ratings yet
Can Bus Protocol Description 220419
86 pages
BrassRing User Guide PDF
100% (1)
BrassRing User Guide PDF
32 pages
Training Report of Sonaklika International Hoshiarpur
No ratings yet
Training Report of Sonaklika International Hoshiarpur
18 pages
MC302X DIN Rail Motion Controller
No ratings yet
MC302X DIN Rail Motion Controller
30 pages
Annotated Bibliography Fast Food Restaurants
100% (1)
Annotated Bibliography Fast Food Restaurants
5 pages
LLM Course Notes
No ratings yet
LLM Course Notes
3 pages
List & Bundle of The Claimant's Documents-City Interscope Claim
No ratings yet
List & Bundle of The Claimant's Documents-City Interscope Claim
29 pages
Text Classification and Processing Using NLP
No ratings yet
Text Classification and Processing Using NLP
21 pages
Mahatma Jyotiba Phule Rohilkhand University, Bareilly: Under-Graduate (UG) Admission Registration Form
No ratings yet
Mahatma Jyotiba Phule Rohilkhand University, Bareilly: Under-Graduate (UG) Admission Registration Form
1 page
1st Draft Ict Diploma November 2024 Exam Timetable
No ratings yet
1st Draft Ict Diploma November 2024 Exam Timetable
5 pages
IA-17 UAV Brochure
No ratings yet
IA-17 UAV Brochure
4 pages
Ecosystem PDF
No ratings yet
Ecosystem PDF
17 pages
Penalty Fares: South Western Railway April 2018
No ratings yet
Penalty Fares: South Western Railway April 2018
2 pages
Modbus Manual
No ratings yet
Modbus Manual
11 pages
Urban Geography Study Guide
No ratings yet
Urban Geography Study Guide
98 pages
Elite Entrepreneur Admin Guide
100% (5)
Elite Entrepreneur Admin Guide
14 pages
Actions in Case of Breach of Contract of Carriage
No ratings yet
Actions in Case of Breach of Contract of Carriage
11 pages
Chapter 10
No ratings yet
Chapter 10
36 pages
Peran Tenaga Fisioterapi Pada Kasus Anak Delayed
No ratings yet
Peran Tenaga Fisioterapi Pada Kasus Anak Delayed
15 pages

Summary - Advanced Power Query and Data Modelling-4142

Uploaded by

Summary - Advanced Power Query and Data Modelling-4142

Uploaded by

Summary: Advanced Power Query and Data Modelling

By the end of this session, students will be able to:

KEY TOPICS AND EXAMPLES:

1.​ Power Query - 3: Handling missing values/errors and

a.​ Handling Missing Values

b.​ Handling Errors

There are 2 ways to remove errors.

Example: In the example below, there are two tables:

2.​ Models in Power BI

Types of models in Power BI:

Need of a Good Data Model in Power BI:

Example of Data Model:

3. Relationships and Cardinality

A.​ Direction of Data Flow

B.​ Active/Inactive Relationships

Active Relationships: The Workhorses:

Why Use Inactive Relationships?

You might also like

1. Power Query - 3: Handling missing values/errors and

a. Handling Missing Values

b. Handling Errors

2. Models in Power BI

A. Direction of Data Flow

B. Active/Inactive Relationships