[go: up one dir, main page]

0% found this document useful (0 votes)
90 views51 pages

BI Journal

Uploaded by

speedyrohan8
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
90 views51 pages

BI Journal

Uploaded by

speedyrohan8
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 51

PRACTICAL 1

Import the legacy data from different sources such as ( Excel , SqlServer, Oracle etc.) and load in the
target system. ( You can download sample database such as Adventureworks, Northwind, foodmart
etc.)

Step 1: Open Power BI

Step 2: Click on Get data following list will be displayed → select Excel
Step 3: Select required file and click on Open, Navigator screen appears

Step 4: Select file and click on edit


Step 5: Power query editor appears

Step 6: Again, go to Get Data and select OData feed


Step 7: Paste url as

http://services.odata.org/V3/Northwind/Northwind.svc/ Click on ok

Step 8: Select orders table

And click on edit

Note: If you just want to see preview you can just click on table name without clicking on checkbox

Click on edit to view table


PRACTICAL 2

Perform the Extraction Transformation and Loading (ETL) process to construct the database in the
Power BI. Step 1: Open Power BI, Click on Get Data → OData Feed

Paste Url : http://services.odata.org/V3/Northwind/Northwind.svc/

And Click OK

Step 2: Click on Check Box of Products table and then click on Edit
1) Remove other columns to only display columns of interest In Query Editor, select the ProductID,
ProductName, QuantityPerUnit, and UnitsInStock columns (use Ctrl+Click to select more than
one column, or Shift+Click to select columns that are beside each other).

Select Remove Columns > Remove Other Columns from the ribbon, or rightclick on a column header and
click Remove Other Columns
After selecting Remove Other Columns only selected four columns are displayed other columns are
discarded.

2. Change the data type of the UnitsInStock column a) Select the UnitsInStock column.

Check if the data type


of selected column is a
Whole number
b) Select the Data Type drop-down button in the Home ribbon.

c) If not already a Whole Number, select Whole Number for data type from the drop down (the
Data Type: button also displays the data type for the current selection).

After clicking on Whole number, you can see the changed Datatype in column header of UnitsInStock.
After above step, close query editor and click on Yes to save changes.

Now you can view fields of Products table on right side, check all the fields of table to get
representation in charts form.

3. Expand the Orders table Once You have loaded a data source, you can click on Recent Sources
to select desired table (Orders).
After selecting the URL, Navigator window will appear from which you can select Orders table.

Click on Edit.
Query Editor Window will appear

1. In the Query View, scroll to the Order_Details column.

2. In the Order_Details column, select the expand icon .

3. In the Expand drop-down:

a. Select (Select All Columns) to clear all columns.

b. Select ProductID, UnitPrice, and Quantity.

c. Click OK.

After clicking on OK following screen appears with combined columns

4. Calculate the line total for each Order_Details row


Power BI Desktop lets you to create calculations based on the columns you are importing, so you can
enrich the data that you connect to. In this step, you create a Custom Column to calculate the line total
for each Order_Details row.

Calculate the line total for each Order_Details row:

a) In the Add Column ribbon tab, click Add Custom Column.

b) In the Custom Column dialog box, in the Custom Column Formula textbox, enter
[Order_Details.UnitPrice] * [Order_Details.Quantity] by selecting from available columns and click on
insert for each column.

c) In the New column name textbox, enter LineTotal.

d) Click OK.
5. Rename and reorder columns in the query

In this step you finish making the model easy to work with when creating reports, by renaming the final
columns and changing their order.

a) In Query Editor, drag the LineTotal column to the left, after ShipCountry.

b) Remove the Order_Details. prefix from the Order_Details.ProductID, Order_Details.UnitPrice and


Order_Details.Quantity columns, by double-clicking on each column header, and then deleting that text
from the column name.
6. Combine the Products and Total Sales queries

Power BI Desktop does not require you to combine queries to report on them. Instead, you can create
relationships between datasets. These relationships can be created on any column that is common to
your datasets.

We have Orders and Products data that share a common 'ProductID' field, so we need to ensure there's
a relationship between them in the model we're using with Power BI Desktop. Simply specify in Power BI
Desktop that the columns from each table are related (i.e. columns that have the same values). Power BI
Desktop works out the direction and cardinality of the relationship for you. In some cases, it will even
detect the relationships automatically.

In this task, you confirm that a relationship is established in Power BI Desktop between the Products
and Total Sales queries

Step 1: Confirm the relationship between Products and Total Sales 1. First, we need to load the model
that we created in Query Editor into Power BI Desktop. From the Home ribbon of Query Editor, select
Close & Apply.
Step 2: Power BI Desktop loads the data from the two queries.

Step 3: Once the data is loaded, select the Manage Relationships button Home ribbon

Step 4. Select the New… button


Step 5: When we attempt to create the relationship, we see that one already exists! As shown in the
Create Relationship dialog (by the shaded columns), the ProductsID fields in each query already have an
established relationship.

Step 6: Select Cancel, and then select Relationship view in Power BI Desktop.
Step 7: We see the following, which visualizes the relationship between the queries.

Step 8: When you double-click the arrow on the line that connects the to queries, an Edit Relationship
dialog appears.
Step 9: No need to make any changes, so we'll just select Cancel to close the Edit Relationship dialog.
PRACTICAL 3

Perform the Extraction Transformation and Loading (ETL) process to construct the database in the SQL
server.

Software requirements: SQL SERVER 2012 FULL VERSION (SQLServer2012SP1-FullSlipstream-ENU-x86)

Step 1:

Open SQL Server Management Studio to restore backup file

Step 2: Right click on Databases → Restore Database


Step 3: Select Device → click on icon towards end of device box

Step 4: Click on Add → Select path of backup files

Step 5: Select files.


Step 6 : Click ok and in select backup devices window Add both files of AdventureWorks

Step 7: Open SQL Server Data Tools

Select File → New → Project → Business Intelligence → Integration Services

Project & give appropriate project name.


Environment consists of SQL Server Integration Services(SSIS)

Step 8: Right click on Connection Managers in solution explorer and click

on New Connection Manager. Add SSIS connection manager window appears.

Step 9: Select OLEDB Connection Manager and Click on Add


Step 10: Configure OLE DB Connection Manager window appears → Click on New

Step 11: Select Server name(as per your machine) from drop down and database name and click on Test
connection.

If test connection succeeded click on OK.

Step 12: Click on OK


Connection is added to connection manager

Step 13: Drag and drop Data Flow Task in Control Flow tab
Step 14: Drag OLE DB Source from Other Sources and drop into Data Flow tab

Step 15: Double click on OLE DB source → OLE DB Source Editor appears→ click on New to add
connection manager. Select [Sales].[Store] table from drop down → ok
Step 16: Drag ole db destination in data flow tab and connect both
Step 17: Double click on OLE DB destination Click on New to run the query to get [OLE DB Destination] in
Name of the table or the view.
Click on ok

Step 18: Click on start.


Step 19: Go to SQL Server Management Studio In database tab → Adventureworks → Right click on
[dbo].[OLE DB Destination] → Scrip Table as → SELECT To → New Query Editor Window

Step 20: Execute following query to get output.

USE [AdventureWorks2012]

GO

SELECT [BusinessEntityID] ,[Name] ,[SalesPersonID] ,[Demographics] ,[rowguid] ,[ModifiedDate] FROM


[dbo].[OLE DB Destination] GO
PRACTICAL 4

Import the data warehouse data in Microsoft Excel and create the Pivot table and Pivot Chart

(Ms Office Professional is used to make sure Power View is enabled for visualization.)

Step 1: Open Excel 2013 (Professional)

Go to Data tab → Get External Data → From Other Sources → From Data

Connection Wizard

Step 2: In Data Connection Wizard → Select Microsoft SQL Server → Click on

Next

Step 3: In connect to Database Server provide Server name( Microsoft SQL Server Name)

Provide password for sa account as given during installation of SQL Server 2012 full version)

Password: admin123
Click on Next

Step 4: In Select Database and Table→ Select Sales_DW (already created in SQL) → check all dimensions
and import relationships between selected tables.

Step 5: In save data connection files browse path and click on Finish
Step 6: In import data select Pivot Chart and click on OK

Step 7: In fields put SalesDateKey in filters, FullDateUK in axis and Sum of ProductActualCost in values
Step 8: In Insert Tab → go to Pivot Table

Step 9: Click on Choose Connection to select existing connection with Sales_DW and click on open
Pivot table and Pivot chart is created
PRACTICAL 5

Apply the what – if Analysis for data visualization. Design and generate necessary reports based on
the data warehouse data.

A book store and have 100 books in storage. You sell a certain % for the highest price of $50 and a
certain % for the lower price of $20.

If you sell 60% for the highest price, cell D10 calculates a total profit of 60 * 50 + 40 * 20 = 3800.

Create Different Scenarios But what if you sell 70% for the highest price? And what if you sell 80% for
the highest price? Or 90%, or even 100%? Each different percentage is a different scenario. You can use
the Scenario Manager to create these scenarios.

Note: To type different percentage into cell C4 to see the corresponding result of a scenario in cell D10
we use what if analysis.

What-if analysis enables you to easily compare the results of different scenarios.

Step 1: In Excel, On the Data tab, in the Data tools group, click What-If Analysis
Step 2: Click on What –if-Analysis and select scenario manager.

The Scenario Manager Dialog box appears.

Step 3: Add a scenario by clicking on Add.


Step 4: Type a name (60percent), select cell F10 (% sold for the highest price) for the Changing cells and
click on OK.

Click on icon which is circled.

Select F10 cell

Click back on the icon again and then click OK

Step 5: Enter the corresponding value 0.6 and click on OK again.


Step 6: To apply scenarios click on Show

Step 7: Next, add 4 other scenarios (70%, 80%, 90% and 100%) Finally, your Scenario Manager should be
consistent with the picture below:
PRACTICAL 6

Perform the data classification using classification algorithm.

OR Data Analysis using Time Series Analysis

Software required: R 3.5.1

Time series is a series of data points in which each data point is associated with a timestamp. A simple
example is the price of a stock in the stock market at different points of time on a given day. Another
example is the amount of rainfall in a region at different months of the year. R language uses many
functions to create, manipulate and plot the time series data. The data for the time series is stored in an
R object called time-series object. It is also a R data object like a vector or data frame.

The time series object is created by using the ts() function.

Syntax The basic syntax for ts() function in time series analysis is –

timeseries.object.name <- ts(data, start, end, frequency)

Following is the description of the parameters used –

• data is a vector or matrix containing the values used in the time series.

• start specifies the start time for the first observation in time series.

• end specifies the end time for the last observation in time series.

• frequency specifies the number of observations per unit time. Except the parameter "data" all other
parameters are optional.

Consider the annual rainfall details at a place starting from January 2012. We create an R time series
object for a period of 12 months and plot it.

Code to run In R

# Get the data points in form of a R vector.

rainfall <- c(799,1174.8,865.1,1334.6,635.4,918.5,685.5,998.6,784.2,985,882.8,1071)

# Convert it to a time series object.

rainfall.timeseries <- ts(rainfall,start = c(2012,1),frequency = 12)

# Print the timeseries data.

print(rainfall.timeseries)
# Give the chart file a name.

png(file = "rainfall.png")

# Plot a graph of the time series.

plot(rainfall.timeseries)

# Save the file.

dev.off()

After this again plot to get chart plot(rainfall.timeseries)’

Output:
PRACTICAL 7

Perform the data clustering using clustering algorithm.

k-means clustering using R

#apply K means to iris and store result

newiris <- iris

newiris$Species <- NULL

(kc <- kmeans(newiris,3))

#Compare the Species label with the clustering result

table(iris$Species,kc$cluster)

#Plot the clusters and their centers

plot(newiris[c("Sepal.Length","Sepal.Width")],col=kc$cluster)

points(kc$centers[,c("Sepal.Length","Sepal.Width")],col=1:3,pch=8,cex=2)

dev.off()

#Plot the clusters and their centre

plot(newiris[c("Sepal.Length","Sepal.Width")],col=kc$cluster)
PRACTICAL 8

Perform the Linear regression on the given data warehouse data.

Input Data Below is the sample data representing the observations –

# Values of height

151, 174, 138, 186, 128, 136, 179, 163, 152, 131

# Values of weight.

63, 81, 56, 91, 47, 57, 76, 72, 62, 48

lm() Function : This function creates the relationship model between the predictor and the response
variable.

Syntax : The basic syntax for lm() function in linear regression is −

lm(formula, data)

Following is the description of the parameters used :−

• formula is a symbol presenting the relation between x and y.

• data is the vector on which the formula will be applied.

A. Create Relationship Model & get the Coefficients

# Values of height

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)

# Values of width

y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.

relation <- lm(y~x)

print(relation)

OUTPUT:-
B. Get the Summary of the Relationship

# Values of height

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)

# Values of width

y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.

relation <- lm(y~x)

print(summary(relation))

OUTPUT:-

predict() Function

Syntax The basic syntax for predict() in linear regression is –

predict(object, newdata)

Following is the description of the parameters used –

• object is the formula which is already created using the lm() function.

• newdata is the vector containing the new value for predictor variable.

C. Predict the weight of new persons

# The predictor vector.

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
# The response vector.

y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.

relation <- lm(y~x)

# Find weight of a person with height 170.

a <- data.frame(x = 170)

result <- predict(relation,a)

print(result)

OUTPUT:-

D. Visualize the Regression Graphically

# Create the predictor and response variable.

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)

y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

relation <- lm(y~x)

# Give the chart file a name.

png(file = "linearregression.png")

# Plot the chart.

plot(y,x,col = "blue",main = "Height & Weight Regression", abline(lm(x~y)),cex = 1.3,pch = 16,xlab =


"Weight in Kg",ylab = "Height in cm")

# Save the file.

dev.off()

OUTPUT:-
# Plot the chart.

plot(y,x,col = "blue",main = "Height & Weight Regression", abline(lm(x~y)),cex = 1.3,pch = 16,xlab =


"Weight in Kg",ylab = "Height in cm")

You might also like