Project Report 1

It is the report about data analytics

Uploaded by

sanyamtwn

A

Summer Training Project

On

FLIPKART SALES DASHBOARD

Submitted in partial fulfilment of the requirements

for the award of the degree of

Bachelor of Computer Applications

To

Guru Gobind Singh Indraprastha University, Delhi

Guide: Mr. Divyank Chauhan

SUBMITTED BY:

RIMJHIM RANA

(00224402024)

SWATI

(00324402024)

SANYA MOTWANI

(01524402024)

ANANYA SRIVASTAVA

(02824402024)

Institute of Innovation in Technology & Management,

New Delhi - 110058


Acknowledgement

We are profoundly thankful to everyone who contributed to the successful
completion of our summer training project.

Firstly, we would like to extend our deepest gratitude to Mr. Divyank Chauhan,
whose invaluable guidance and insights were essential throughout the training.
His expertise and patience were crucial in helping us navigate the complexities
of advanced data science and machine learning.

We also wish to sincerely thank IITM Janakpuri for collaborating with Shape
MySkills Pvt. Ltd. to provide this exceptional learning opportunity. Special
appreciation goes to our course coordinator, Dr. Meenu, for her dedicated efforts
and support during the training. Furthermore, we are deeply grateful to ,
Head of the Department, for her encouragement and for providing the necessary
resources and environment to facilitate this learning experience.

Finally, we acknowledge the unwavering support of our family and friends, who
have been our pillars of strength and encouragement throughout this journey.

Thank you all for making this experience valuable and memorable.

Sincerely,

Rimjhim

Swati

Sanya

Ananya
Certificate

This is to certify that Rimjhim, Swati, Sanya and Ananya have successfully
completed the project titled “FLIPKART SALES DASHBOARD” as part of
the Data Analytics summer training organised by IITM Janakpuri in
collaboration with ShapeMySkills Pvt. Ltd.

This project was conducted under the esteemed guidance of Mr. Divyank
Chauhan, whose expertise and mentorship were instrumental in its successful
completion. The project exemplifies a thorough understanding of data
analytics techniques, highlighting the skills acquired during the training
program.

We commend Rimjhim, Swati, Sanya and Ananya for their dedication, hard
work, and enthusiasm throughout the project duration.

Coordinator: Dr. Meenu

Head of Department:

Date:

Signature:
INDEX

S No.   Chapter     Title
1       Chapter 1   Introduction to Python
2       Chapter 2   Introduction to Python Libraries
3       Chapter 3   SQLite Basics and Data Operations
4       Chapter 4   Advanced Excel
5       Chapter 5   Introduction to Power BI
6       Chapter 6   Dashboard

List of Abbreviations

S No.   Name
1       OOP: Object-Oriented Programming
2       I/O: Input/Output
3       NumPy: Numerical Python
4       Pandas: Panel Data
5       CSV: Comma-Separated Values
6       SQL: Structured Query Language
7       JSON: JavaScript Object Notation
8       BI: Business Intelligence (Power BI)
9       VLOOKUP: Vertical Lookup (Excel function)
10      PY: Python
11      PIVOT: Pivot Table
12      DAX: Data Analysis Expressions
13      KPI: Key Performance Indicator


Chapter 1:
Introduction to Python
Python is a powerful, high-level programming language created by
Guido van Rossum and first released in 1991. It is known for its simple,
readable syntax and versatility. Python is widely used in industries
such as finance, healthcare, marketing, and technology, particularly
in data science, artificial intelligence (AI), and machine learning
(ML).

Python’s philosophy emphasizes code readability, which means code
written in Python is often easier to write, debug, and maintain
than equivalent code in many other languages.

Key Features of Python

• Simple Syntax – Python’s syntax is close to plain English, making it
beginner-friendly.
• Extensive Libraries – Includes libraries for data manipulation
(Pandas), scientific computing (NumPy), machine learning (Scikit-learn),
and visualization (Matplotlib, Seaborn).
• Interpreted Language – Python runs code through an interpreter with no
separate compile step, so snippets can be tested and debugged interactively.
• Cross-Platform – Python works on Windows, macOS, and Linux.
• Strong Community Support – Millions of users worldwide share
tools, documentation, and resources.

Why Python for Data Analytics?

• Extensive libraries such as NumPy, Pandas, Matplotlib, and Seaborn.
• Integrates well with databases, Excel, and web APIs.
• Well suited to both exploratory and statistical data analysis.
• High demand in the job market.

Loops in Python

Loops are used to execute a block of code multiple times. In data
analytics they are essential for processing tasks such as:

• Cleaning or modifying rows in a dataset.
• Automating repetitive tasks (e.g., generating multiple plots).
• Filtering or transforming data manually.

In Python, the two main types of loops are:

1. for Loop

The for loop is used to iterate over a sequence like a list, tuple, string,
or range.

Syntax:
for item in sequence:
    # code block

Example 1: Loop through a list

fruits = ["apple", "banana", "mango"]
for fruit in fruits:
    print(fruit)

Example 2: Loop with range()

for i in range(1, 6):
    print(i)

Note: range(1, 6) generates the numbers 1 to 5; the stop value 6 is excluded.

2. while Loop
The while loop keeps running as long as a given condition is true.

Syntax:
while condition:
    # code block

Example:
count = 0
while count < 5:
    print("Count is:", count)
    count += 1

• Loop Control Statements

Keyword     Description
break       Exits the loop immediately.
continue    Skips the rest of the current iteration and moves to the next.
pass        Does nothing; a placeholder where a statement is required.

Example with break:

for num in range(10):
    if num == 5:
        break
    print(num)

Example with continue:

for num in range(5):
    if num == 2:
        continue
    print(num)

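
To connect the loop control statements back to the data-cleaning tasks listed earlier, here is a minimal sketch. The list raw_prices and the "END" sentinel are made up for illustration; they are not part of the project data.

```python
# Hypothetical raw values, e.g. a price column read from a messy file.
raw_prices = ["499", " 1299 ", "", "N/A", "250", "END", "999"]

clean_prices = []
for value in raw_prices:
    text = value.strip()          # remove surrounding spaces
    if text == "END":
        break                     # stop at the sentinel; later values are ignored
    if not text.isdigit():
        continue                  # skip blanks and non-numeric entries
    clean_prices.append(int(text))

print(clean_prices)
```

Here continue drops the bad rows one at a time, while break ends the whole loop as soon as the sentinel appears.
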
Variables and Data Types in Python

• Variables in Python
• A variable is a name that refers to a value stored in memory.
• You don’t need to declare the data type; Python infers it from the
assigned value.

name = "Alice"
age = 25
pi = 3.14
is_valid = True

Common Data Types

Type     Example        Description
int      10             Whole numbers
float    3.14           Decimal numbers
str      "hello"        Text (strings)
bool     True, False    Boolean values

Dynamic Typing

x = 5        # int
x = "five"   # str – the same name can be rebound to a value of another type
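
Dynamic typing does not mean that types are ignored. The short sketch below (variable names are illustrative) checks the type of x before and after rebinding and shows that mixing incompatible types still raises an error at run time.

```python
x = 5
print(type(x).__name__)      # name of the current type of x

x = "five"
print(type(x).__name__)      # the same name now refers to a string

# Types are still enforced at run time: adding a str and an int fails.
try:
    total = "5" + 5
except TypeError as exc:
    total = None
    print("TypeError:", exc)

# An explicit conversion makes the intent clear.
total = int("5") + 5
print(total)
```
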
Operators in Python
Chapter 2:
Introduction to Python Libraries

• NumPy
NumPy (Numerical Python) is a powerful open-source Python library
used for performing numerical computations efficiently. It is the
foundational package for scientific computing in Python and is widely
used in data analytics, machine learning, and big data applications.
NumPy provides support for arrays, matrices, and a vast collection of
mathematical functions.

Key features:

• High-performance multidimensional array object: ndarray.
• Broadcasting, which applies operations across arrays of different but
compatible shapes.
• Mathematical and statistical functions.
• Linear algebra, Fourier transform, and random number capabilities.
• Integration with C, C++, and Fortran code.

Why Use NumPy in Data Analytics?

1. Efficient Memory Usage: NumPy arrays store elements of a single type
in one contiguous block, so they typically consume less memory than
Python lists.
2. Fast Computation: Vectorized operations run faster than
equivalent Python for loops.
3. Ease of Use: Easy syntax and access to comprehensive
functionality.
4. Data Handling: Simplifies data wrangling, cleaning, and
preprocessing.
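
As a rough illustration of point 2, the sketch below computes the same per-order revenue with a plain Python loop and with a vectorized NumPy expression. The prices and quantities are hypothetical values, not figures from the project.

```python
import numpy as np

prices = [499.0, 1299.0, 250.0, 999.0]   # hypothetical unit prices
quantities = [2, 1, 4, 3]                # hypothetical units sold

# Loop version: multiply element by element in pure Python.
revenue_loop = []
for p, q in zip(prices, quantities):
    revenue_loop.append(p * q)

# Vectorized version: one expression over whole arrays.
revenue_vec = np.array(prices) * np.array(quantities)

print(revenue_loop)
print(revenue_vec.sum())
```

Both versions give the same numbers; the vectorized form is shorter and, on large arrays, usually much faster because the loop runs in compiled code.
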

Creating and Working with Arrays

1. Creating NumPy Arrays:

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr)

2. Array Types:

• 1D array: np.array([10, 20, 30])
• 2D array: np.array([[1, 2], [3, 4]])
• Zeros: np.zeros((2, 3))
• Ones: np.ones((3, 3))
• Random: np.random.rand(2, 2)

3. Array Operations:

Element-wise Operations:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b)   # [5 7 9]

Mathematical Functions:
np.mean(arr)
np.median(arr)
np.std(arr)
np.sum(arr)
np.max(arr)
np.min(arr)

4. Array Manipulation:
• Reshape: arr.reshape(2, 2) (the new shape must hold the same number of
elements; arr above has 4)
• Flatten: arr.flatten()
• Transpose: arr.T

• Pandas
Pandas is a fast, powerful, and flexible open-source data analysis and
data manipulation library for Python. It is built on top of NumPy and
designed for working with structured data such as tables, Excel files,
CSVs, and databases. It introduces two primary data structures: Series
and DataFrame.

Key Features:

• Fast and efficient DataFrame object for data manipulation.
• Tools for reading and writing data in various formats (CSV, Excel,
SQL, JSON).
• Handling of missing data.
• Data alignment and integrated handling of time series data.

Why Use Pandas in Data Analytics?

1. Data Cleaning: Easily handle missing, duplicate, or inconsistent
data.
2. Data Exploration: Offers descriptive statistics, summaries, and
value counts.
3. Data Transformation: Supports filtering, grouping, merging,
pivoting, and reshaping.
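
A minimal sketch of three of the transformations named in point 3 (filtering, merging, grouping). The orders and products tables and their column names are hypothetical and only loosely modelled on a sales dataset.

```python
import pandas as pd

# Hypothetical tables; names and values are for illustration only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "product_id": [101, 102, 101, 103, 102],
    "sales": [12000, 1500, 9000, 3000, 2500],
})
products = pd.DataFrame({
    "product_id": [101, 102, 103],
    "category": ["Mobiles", "Fashion", "Home"],
})

large = orders[orders["sales"] > 2000]                         # filtering
merged = orders.merge(products, on="product_id", how="left")   # merging
by_category = merged.groupby("category")["sales"].sum()        # grouping

print(large)
print(by_category)
```
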

• Core Data Structures in Pandas

1. Series – 1D labeled array
import pandas as pd
s = pd.Series([10, 20, 30])
print(s)

2. DataFrame – 2D labeled data structure (like a table)
data = {'Name': ['Amit', 'Siya'], 'Age': [23, 21]}
df = pd.DataFrame(data)
print(df)

• Reading and Writing Data

# Read from CSV
df = pd.read_csv('data.csv')

# Read from Excel
df = pd.read_excel('data.xlsx')

# Write to CSV
df.to_csv('output.csv', index=False)

• Data Exploration
df.head()       # First 5 rows
df.tail()       # Last 5 rows
df.info()       # Column types, non-null counts, memory usage
df.describe()   # Summary statistics for numeric columns
df.columns      # Column names
df.shape        # (rows, columns)

• Data Selection and Indexing
df['Name']            # Access single column
df[['Name', 'Age']]   # Access multiple columns
df.loc[0]             # Access row by index label
df.iloc[1]            # Access row by integer position
df[df['Age'] > 20]    # Conditional filtering
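
The difference between loc and iloc is easier to see when the index labels are not the default 0, 1, 2. The sketch below uses made-up labels "a", "b", "c" and one extra hypothetical row.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Amit", "Siya", "Ravi"], "Age": [23, 21, 19]},
    index=["a", "b", "c"],          # custom labels instead of 0, 1, 2
)

by_label = df.loc["b"]              # row whose label is "b"
by_position = df.iloc[1]            # second row, whatever its label
adults = df[df["Age"] > 20]         # boolean mask keeps matching rows

print(by_label["Name"], by_position["Name"])
print(adults)
```

With the default integer index the two accessors often return the same row, which is why they are easy to confuse.
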

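• Data Cleaning and Preparation

df.dropna()            # Return a copy without rows that have missing values
df.fillna(0)           # Return a copy with missing values replaced by 0
df.duplicated()        # Boolean Series marking duplicate rows
df.drop_duplicates()   # Return a copy without duplicate rows
df.rename(columns={'Name': 'FullName'}, inplace=True)   # Rename a column in place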
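
These calls are often chained into one cleaning step. A small sketch with a made-up table follows; note that each method returns a new DataFrame unless inplace=True is passed, so the original stays unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one duplicate row and one missing age.
raw = pd.DataFrame({
    "Name": ["Amit", "Siya", "Siya", "Ravi"],
    "Age": [23, 21, 21, np.nan],
})

clean = (
    raw.drop_duplicates()                       # remove the repeated row
       .fillna({"Age": 0})                      # fill missing ages with 0
       .rename(columns={"Name": "FullName"})    # rename a column
)

print(raw.shape, clean.shape)
print(clean)
```

Filling a missing age with 0 is only a placeholder choice here; in a real analysis the fill value depends on what the column means.
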

• Matplotlib
Matplotlib is a widely used data visualization library in Python. It enables
users to create static, animated, and interactive plots with high flexibility
and customization. It is particularly useful for data analysts to represent
data insights visually through graphs and charts.

• Often used alongside NumPy and Pandas.
• The most commonly used module is pyplot, conventionally imported as plt:

import matplotlib.pyplot as plt

Why Use Matplotlib?

• Create line, bar, scatter, pie, and histogram plots.
• Visualize trends, distributions, and comparisons.
• Customize plots with titles, labels, legends, colors, and styles.
• Save plots as image files (PNG, JPG, PDF).

Basic Plotting Examples

1. Line Plot:
x = [1, 2, 3, 4]
y = [10, 20, 25, 30]
plt.plot(x, y)
plt.title("Simple Line Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()

2. Bar Chart:
labels = ['A', 'B', 'C']
values = [10, 15, 7]
plt.bar(labels, values)
plt.title("Bar Chart")
plt.show()

3. Scatter Plot:
plt.scatter(x, y, color='red')   # reuses x and y from the line plot
plt.title("Scatter Plot")
plt.show()

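
The examples above display plots on screen. Saving a plot to an image file, which the feature list mentions, can be sketched as follows. The data, the file name monthly_sales.png, and the choice of the temporary directory are assumptions for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")              # non-interactive backend, no window needed
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]      # hypothetical data
sales = [10, 20, 25, 30]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(months, sales, marker="o", label="Sales")
ax.set_title("Monthly Sales")
ax.set_xlabel("Month")
ax.set_ylabel("Sales (units)")
ax.legend()

# Write the figure to a PNG file in the system's temporary directory.
out_path = os.path.join(tempfile.gettempdir(), "monthly_sales.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)
print(out_path)
```

The file extension passed to savefig decides the format, so the same call with a .pdf name would produce a PDF instead.
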

• Seaborn
Seaborn is a Python data visualization library built on top of
Matplotlib, designed to make statistical graphics easier and more
attractive. It works seamlessly with Pandas DataFrames, making it
ideal for data analytics tasks.

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

Why Use Seaborn?

• Simplifies complex visualizations
• Beautiful default styles and color schemes
• Built-in statistical plots
• Better support for multi-variable and grouped data

Common Plots in Seaborn

Plot Type     Function
Line Plot     sns.lineplot()
Bar Plot      sns.barplot()
Histogram     sns.histplot()
Box Plot      sns.boxplot()
Count Plot    sns.countplot()
Swarm Plot    sns.swarmplot()

Example:

# Assumes df is a DataFrame with 'category' and 'value' columns
sns.boxplot(x='category', y='value', data=df)
plt.title("Box Plot Example")
plt.show()

Chapter 3:
SQLite Basics and Data Operations

What is SQLite? Key Points

• SQLite is an embedded, serverless, and lightweight SQL database engine.
• It stores the entire database as a single file on disk.
• Written in C and released into the public domain (free for any use).
• Ships with Python’s standard library through the sqlite3 module.

Main Features
• Serverless – No need to install or run a separate database server.
• Zero Configuration – No setup, just connect and use.
• Self-contained – The engine is a single library, and each database is a
single file.
• Portable – Database files can be copied between systems easily.
• Cross-platform – Runs on Windows, Linux, macOS, Android, and iOS.
• Fast & Efficient – Performs well for small to medium datasets.
• Reliable – Fully ACID-compliant, supports transactions.

Use Cases
• Mobile apps (e.g., WhatsApp, Android apps)
• Desktop software (e.g., Firefox, Chrome)
• Embedded systems & IoT devices (e.g., Raspberry Pi)
• Data analysis & quick prototypes (Python projects)
• Educational or test projects without setting up a full DBMS
Setting Up SQLite in Python

o Importing the SQLite Module
To start using SQLite in Python, import the built-in sqlite3 module.
It provides the functions needed to connect to, create, and manage
a SQLite database without installing any external libraries.

import sqlite3

o Creating a Database
You can create a new SQLite database using the connect() function. If the
specified file (e.g., students.db) does not exist, SQLite will automatically create
it. If it already exists, it will simply connect to the existing one.

conn = sqlite3.connect('students.db')   # Creates or opens a database file

o Creating a Cursor Object
A cursor object allows you to execute SQL statements and interact with the
database.

cursor = conn.cursor()

o Creating a Table
cursor.execute('''
    CREATE TABLE IF NOT EXISTS students (
        id INTEGER PRIMARY KEY,
        name TEXT,
        age INTEGER,
        grade TEXT
    )
''')
conn.commit()

o Closing the Connection

conn.close()

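
The same setup steps can be combined so that the connection is always closed and the transaction is committed automatically. This is one possible arrangement, not the only one; it uses an in-memory database (":memory:") as an assumption so that no file is created while experimenting.

```python
import sqlite3
from contextlib import closing

# ":memory:" creates a temporary database in RAM, so no file is left behind.
with closing(sqlite3.connect(":memory:")) as conn:
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "CREATE TABLE IF NOT EXISTS students ("
            "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, grade TEXT)"
        )
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(students)")]

print(columns)
```

Using "with conn:" only manages the transaction; closing() is what actually closes the connection at the end of the block.
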
SQLite CRUD Operations in Python

CRUD stands for Create, Read, Update, and Delete: the four
basic operations for managing data in a database. With SQLite and
Python, these operations are simple and efficient using the built-in
sqlite3 module. Below are examples of each operation using a sample
table called students.

1. CREATE (Insert Data)

To add new records to the database, we use the INSERT INTO SQL
statement. Passing values through ? placeholders (a parameterized
query) prevents SQL injection.

import sqlite3
conn = sqlite3.connect('school.db')
cursor = conn.cursor()

cursor.execute('''
    CREATE TABLE IF NOT EXISTS students (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        age INTEGER,
        grade TEXT
    )
''')

cursor.execute("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)",
               ('Sanya', 20, 'A'))
conn.commit()

2. READ (Retrieve Data)

To fetch and view data, use the SELECT statement. Data can be
retrieved fully or conditionally and looped over for display.

cursor.execute("SELECT * FROM students")
rows = cursor.fetchall()

for row in rows:
    print(row)

You can also filter records:

cursor.execute("SELECT name, grade FROM students WHERE age > 18")

3. UPDATE (Modify Data)

The UPDATE statement modifies existing data in the table. You must
specify a WHERE condition; without one, every record is updated.

cursor.execute("UPDATE students SET grade = 'A+' WHERE name = 'Sanya'")
conn.commit()

4. DELETE (Remove Data)

Use the DELETE FROM command to remove data. As with UPDATE,
use a condition to target specific rows.

cursor.execute("DELETE FROM students WHERE name = 'Sanya'")
conn.commit()

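
The four operations can be run end to end in one short script. The sketch below uses an in-memory database and two hypothetical extra students (Ravi and Meera), and passes every value through ? placeholders, including the ones in the WHERE clauses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
cur = conn.cursor()
cur.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, age INTEGER, grade TEXT)"
)

# CREATE: insert several rows at once.
rows = [("Sanya", 20, "A"), ("Ravi", 19, "B"), ("Meera", 17, "A")]
cur.executemany("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)", rows)

# READ: filter with a parameter instead of a value typed into the SQL text.
min_age = 18
cur.execute("SELECT name, grade FROM students WHERE age > ? ORDER BY name", (min_age,))
adults = cur.fetchall()

# UPDATE: rowcount reports how many rows were changed.
cur.execute("UPDATE students SET grade = ? WHERE name = ?", ("A+", "Sanya"))
updated = cur.rowcount

# DELETE: remove one student by name.
cur.execute("DELETE FROM students WHERE name = ?", ("Meera",))
conn.commit()

remaining = cur.execute("SELECT name, grade FROM students ORDER BY name").fetchall()
conn.close()
print(adults, updated, remaining)
```

Checking rowcount after an UPDATE or DELETE is a simple way to confirm that the WHERE condition matched the rows you expected.
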

Advantages of SQLite

1. Serverless Architecture
SQLite is self-contained and does not require a separate server to
operate. All you need is the database file, which can be easily created
and accessed using a simple API.
2. Lightweight and Fast
SQLite is extremely lightweight (just a few hundred kilobytes in size)
and performs very well for read-heavy operations or small-to-medium
datasets.
3. Zero Configuration
There is no setup or installation required. You can start using SQLite
immediately without configuring user accounts, permissions, or a
server.
4. Cross-Platform Compatibility
SQLite works seamlessly on Windows, macOS, Linux, Android, iOS,
and embedded systems like Raspberry Pi or Arduino.

Disadvantages of SQLite

1. Not Suitable for High-Concurrency Environments
SQLite supports multiple readers but only one writer at a time. This
makes it less suitable for applications with many simultaneous writes,
such as high-traffic web apps.
2. Limited Scalability
Although SQLite can in theory handle databases of 140 TB or more,
performance may degrade with very large datasets or complex multi-user
workloads.
3. No User Management or Access Control
Unlike MySQL or PostgreSQL, SQLite lacks built-in user accounts
and permission management. Access control must be handled by the
host application.

Chapter 4:
Advanced Excel
What is Advanced Excel?
Advanced Excel refers to powerful features and tools beyond basic
spreadsheet use. These include formulas, pivot tables, data visualization,
advanced functions, macros, and data analysis tools. Mastering these tools
enhances your ability to clean, explore, and visualize data efficiently.

Why Use Excel for Data Analytics?
• User-friendly interface for handling structured data.
• Excellent for data cleaning, manipulation, and visualization.
• Supports advanced calculations and automation (macros, VBA).
• Ideal for quick analysis, dashboards, and reports.
• Commonly used in industry for finance, marketing, HR, and operations.

Real-World Applications

Industry      Use Case
Finance       Budgeting, forecasting, investment tracking
Marketing     Campaign analysis, ROI reports
HR            Attendance, payroll, performance tracking
Sales         Sales pipeline, region-wise performance
Education     Grading systems, student analytics
Logistics     Inventory control, supply chain monitoring

How Advanced Excel Enhances Analytics

With Advanced Excel skills, users can:

• Clean messy or unstructured data efficiently
• Transform and reshape datasets for analysis
• Build interactive dashboards for visual insights

Data Cleaning and Formatting in Excel

Before analyzing data, it's important to clean and format it properly. Raw
data often includes duplicates, blanks, inconsistent formats, or errors,
which can affect the accuracy of your analysis. Excel provides several
easy-to-use tools for preparing data efficiently.

Common Data Cleaning Tools

1. Text to Columns
Splits combined data (e.g., "John Smith") into separate columns like "First
Name" and "Last Name".
Data → Text to Columns
2. Remove Duplicates
Eliminates repeated rows to ensure data accuracy.
Data → Remove Duplicates
3. Find & Replace
Helps quickly fix errors or standardize values (e.g., change "NY" to "New
York").
Shortcut: Ctrl + H
4. Flash Fill
Fills in patterns automatically (e.g., initials from names, reformatting
phone numbers).
Data → Flash Fill (shortcut: Ctrl + E)
5. Data Validation
Sets rules for data entry (e.g., only dates or values from a dropdown list).
Data → Data Validation

Key Formatting Techniques

• Conditional Formatting
Visually highlight important values (e.g., low sales in red, high grades in
green).
Home → Conditional Formatting
• Number Formatting
Format numbers as currency, percentages, or custom date styles.
• Freeze Panes
Keeps headers visible while scrolling through large datasets.
View → Freeze Panes

Formulas and Functions

Logical Functions

• IF(): Returns one value if a condition is true, another if false.
• IFERROR(): Catches an error and returns a custom value or message instead.
• AND(), OR(), NOT(): Combine or negate conditions.

=IF(A2>5000, "High", "Low")

Lookup & Reference Functions

Used to search for and return data from tables or ranges.

• VLOOKUP() – Searches the first column of a table and returns a value
from the same row
=VLOOKUP(101, A2:D10, 2, FALSE)

• INDEX() + MATCH() – A powerful combo for dynamic lookups
=INDEX(B2:B10, MATCH("Riya", A2:A10, 0))

• XLOOKUP() – A modern alternative to VLOOKUP (available in Excel 365
and Excel 2021 or later)
=XLOOKUP("Riya", A2:A10, B2:B10)
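
For readers coming from the Python chapters, an exact-match lookup can be expressed in pandas in a similar way. This is a rough analogy rather than an exact equivalent of the Excel functions; the table and the name "Neha" are made up for illustration.

```python
import pandas as pd

# Hypothetical lookup table: names in one column, marks in another.
table = pd.DataFrame({
    "Name": ["Amit", "Riya", "Siya"],
    "Marks": [72, 88, 65],
})

# Exact-match lookup, similar in spirit to =XLOOKUP("Riya", A2:A4, B2:B4).
marks_by_name = table.set_index("Name")["Marks"]
riya_marks = marks_by_name["Riya"]

# A missing key with a fallback value, similar to wrapping a lookup in IFERROR.
missing = marks_by_name.get("Neha", "Not found")

print(riya_marks, missing)
```
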

Mathematical & Statistical Functions

Excel offers a wide range of functions to calculate totals, averages, and
counts:

• SUM(), AVERAGE(), MIN(), MAX() – Basic calculations
• COUNT() – Counts numeric values
• COUNTA() – Counts non-empty cells
• COUNTIF() – Counts based on a condition

=COUNTIF(C2:C20, ">70")

These are helpful in survey results, financial statements, and data
summaries.
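
The IF and COUNTIF formulas shown in this chapter have close counterparts in NumPy and pandas. The sketch below uses hypothetical values standing in for columns A and C; it is meant only to show how the same logic reads in Python.

```python
import numpy as np
import pandas as pd

sales = pd.Series([4200, 7300, 5000, 9100])   # hypothetical values for column A
scores = pd.Series([55, 71, 70, 90, 84])      # hypothetical values for column C

# =IF(A2>5000, "High", "Low") applied down the whole column.
labels = np.where(sales > 5000, "High", "Low")

# =COUNTIF(C2:C20, ">70"): count the values that satisfy the condition.
count_above_70 = int((scores > 70).sum())

print(list(labels), count_above_70)
```

As in Excel, the comparisons are strict, so a value of exactly 5000 is labelled "Low" and a score of exactly 70 is not counted.
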

Data Analysis with PivotTables & Charts

In data analytics, Excel’s PivotTables and charts are essential tools for
quickly summarizing and visualizing large datasets. These features help
identify trends, patterns, and outliers in a few clicks, without writing
complex formulas.

Pivot Tables
A Pivot Table allows you to automatically group, filter, and summarize
your data. It’s perfect for answering questions like:

• What is the total sales per region?
• How many students passed each subject?
• What’s the average order value by category?
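
The first and third questions above can also be answered with pandas, whose pivot_table function follows the same idea as an Excel PivotTable. The orders table below is hypothetical and the column names are chosen only for this sketch.

```python
import pandas as pd

# Hypothetical order data.
orders = pd.DataFrame({
    "region":      ["North", "North", "South", "South", "South"],
    "category":    ["Mobiles", "Fashion", "Mobiles", "Home", "Fashion"],
    "order_value": [12000, 1500, 9000, 3000, 2500],
})

# Total sales per region: rows = region, values = sum of order_value.
sales_by_region = orders.pivot_table(
    index="region", values="order_value", aggfunc="sum"
)

# Average order value by category.
avg_by_category = orders.pivot_table(
    index="category", values="order_value", aggfunc="mean"
)

print(sales_by_region)
print(avg_by_category)
```

The index argument plays the role of the Rows area in Excel, and aggfunc corresponds to the "Summarize values by" setting.
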

Pivot Charts
A Pivot Chart is a dynamic chart connected to your PivotTable. As you
change your table (e.g., filter years), the chart updates automatically.

Common chart types include:

• Column or Bar Chart – Compare totals across categories.
• Line Chart – Show trends over time.
• Pie Chart – Show proportions (use with caution).

Pivot Charts make interactive dashboards easy to build and interpret.

Slicers and Filters

• Slicers are visual tools to filter your PivotTable or chart by field (like
region, category, or year) with just a click.
• Timelines are slicers specifically for date-based filtering.
• They make dashboards user-friendly for non-technical audiences.

Advantages of Excel
1. User-Friendly Interface
Excel is easy to learn and use, even for beginners. Its tab-based layout, intuitive
features, and familiar design make it highly accessible.
2. Powerful Analytical Tools
With formulas, functions, PivotTables, and data analysis add-ins, Excel
supports advanced analytics, modeling, and reporting without requiring
programming skills.
3. Data Visualization
Excel provides a wide variety of charts, graphs, conditional formatting, and
dashboards to help visualize and interpret data effectively.
4. Flexible and Versatile
It can handle different types of data (numeric, textual, date/time) and is used
in fields like finance, HR, education, marketing, and logistics.
5. Integration with Other Tools
Excel integrates with Power BI, SQL, Python, R, and online services, making it
useful for both standalone and collaborative workflows.

Disadvantages of Excel
1. Limited Scalability
Excel is not designed for handling very large datasets efficiently. A
worksheet holds at most 1,048,576 rows, and performance may degrade
well before that limit as size and complexity grow.
2. Error-Prone with Manual Entry
Manual data entry and formula writing increase the risk of human errors,
especially in large or complex spreadsheets.
3. No Real-Time Collaboration (Offline Versions)
While Excel 365 supports collaboration, offline versions lack true
real-time multi-user support like Google Sheets.
4. Security Concerns
Excel files can be easily copied, edited, or shared without restrictions. It
lacks strong, built-in access control and auditing features.

Chapter 5:
Introduction to Power BI
What is Power BI?
• A business intelligence and data visualization tool developed by
Microsoft.
• Converts raw data into interactive dashboards and meaningful insights.
• Allows easy connection to multiple data sources (Excel, SQL, Azure,
etc.).
• Offers a drag-and-drop interface for designing reports without coding.
• Used for creating charts, graphs, KPIs, and maps for data storytelling.
• Supports real-time data updates and monitoring through dashboards.
• Empowers both technical and non-technical users to analyze data.

Importance in Data Analytics

• Helps transform data into understandable visuals and insights.
• Enables businesses to monitor KPIs and performance in real time.
• Reduces time spent on manual reporting and data analysis.
• Allows self-service analytics for faster, decentralized decision-making.
• Offers AI-powered features like quick insights and natural language
Q&A.
• Supports collaboration by allowing shared access to dashboards.
• Encourages a data-driven culture across teams and departments.
• Helps identify patterns, trends, and outliers easily.

Power BI Family
• Power BI Desktop – Used to build, model, and design interactive
reports.
• Power BI Service – Cloud-based platform to share, publish, and
collaborate.
• Power BI Mobile – Mobile apps to access dashboards on phones and
tablets.
• Power BI Report Server – On-premises version for hosting reports
securely.
Power BI Components and Architecture

1. Power BI Desktop
• A free Windows-based application used to create and design reports.
• Allows users to import, clean, model, and visualize data.
• Offers a user-friendly drag-and-drop interface for building visuals.
• Enables creation of calculated columns and measures using DAX.
• Most commonly used tool in the Power BI ecosystem for development.

2. Power BI Service
• A cloud-based platform used for publishing and sharing reports.
• Allows users to create dashboards and collaborate in real time.
• Supports scheduled data refresh and alert notifications.
• Provides workspace features for managing user access and roles.
• Enables sharing reports across teams, departments, or entire
organizations.

3. Power BI Gateway
• Acts as a bridge between on-premises data sources and the Power BI
Service.
• Keeps cloud-based reports updated with the latest on-premises data.
• Two modes: personal mode (for individual use) and standard mode,
formerly called the enterprise gateway (for teams).
• Essential for organizations with hybrid data infrastructure.

4. Power BI Mobile
• Available for iOS, Android, and Windows devices.
• Lets users view and interact with dashboards on the go.
• Supports touch-enabled navigation, filtering, and drill-downs.
• Sends mobile alerts based on predefined thresholds or conditions.

Features of Power BI
1. Data Connectivity
Power BI supports a wide variety of data connectors, allowing users to
import data from sources like Excel, CSV, SQL Server, MySQL, Oracle,
Azure, SharePoint, Google Analytics, Salesforce, and even web APIs. This
extensive connectivity makes it flexible for almost any data environment. It
enables users to combine data from multiple sources into a unified data
model for reporting and analysis.

2. Power Query for Data Transformation

Power BI includes Power Query Editor, a built-in tool that helps users clean
and shape their data before loading it into reports. It supports operations like
filtering rows, removing duplicates, merging tables, renaming columns, and
changing data types. These steps are recorded automatically and can be
reused or modified later, making data preparation efficient and repeatable.

3. Data Modeling with DAX


Data modeling is one of the core features of Power BI. Users can create
relationships between tables and build complex calculations using DAX
(Data Analysis Expressions). DAX is a powerful formula language that helps
in creating measures, calculated columns, KPIs, and time-based calculations.
With a well-designed data model, users can perform deeper and more
accurate analysis.

4. Interactive Visualizations
Power BI provides a rich set of built-in visuals including bar charts, line
graphs, pie charts, maps, tables, matrices, cards, gauges, and more. Users can
simply drag and drop fields to create visuals, and customize them with
colors, labels, filters, and tooltips. These visuals are interactive: clicking on
one visual updates others to reflect related data.

5. Custom Visuals
In addition to built-in visuals, Power BI supports importing custom visuals
from Microsoft AppSource or even creating your own. These visuals can be
used to meet specific business needs or create unique dashboards.

Power BI in Action – Real-World Usage

1. Business Decision-Making
Organizations use Power BI to:

• Track sales and marketing performance.
• Monitor KPIs in real time.
• Analyze customer behavior.

2. Financial Analysis
Power BI helps finance teams to:

• Analyze profit and loss statements.
• Monitor budgets and forecasts.
• Detect anomalies in expenditures.

3. HR Analytics
Used for:

• Monitoring employee performance.
• Tracking hiring trends and turnover rates.
• Visualizing training and development progress.

4. Education and Research
Power BI is used in academics for:

• Analyzing student performance data.
• Visualizing research data trends.
• Sharing results with stakeholders.

Case Example: Retail Store

A retail company uses Power BI to:

• Integrate data from sales, inventory, and customer feedback.
• Identify underperforming products.
• Optimize supply chain and restocking decisions.

Advantages, Limitations & Career Scope

Advantages of Power BI
• User-Friendly Interface: Drag-and-drop functionality.
• Cost-Effective: Free version available with rich features.
• Real-Time Analytics: Live dashboards with streaming data.
• Integration: Works seamlessly with Excel, Teams, SharePoint.
• Custom Visuals: Marketplace for importing new visuals.
• Cloud & Mobile Access: Anytime, anywhere access to insights.

Limitations
• Limited Export Options: Exporting large reports to PDF can be
restricted.
• Complex DAX Syntax: Learning curve for advanced calculations.
• Data Model Size: Performance issues with very large datasets.
• Custom Visuals Licensing: Some visuals require paid licenses.

Career Opportunities
Learning Power BI opens doors in:

• Business Intelligence
• Data Analytics
• Data Science
• Financial Analysis
• Project Management

Job Roles
• Power BI Developer
• Data Analyst
• Business Analyst
• BI Consultant

Chapter 6:
Dashboard
What is a Dashboard in Power BI?
A dashboard in Power BI is a single-page, interactive view that displays
key insights and metrics from various reports and datasets. Often referred
to as a “canvas,” a dashboard allows users to pin visuals from different
reports and sources into one unified view. It’s ideal for monitoring
business performance at a glance and quickly identifying trends or issues.

Key Features of Power BI Dashboards

• Single Page Layout: Dashboards are designed to be concise and focus
on the most important data points.
• Data from Multiple Sources: Visuals on a dashboard can come from
various datasets and reports.
• Interactive Tiles: Each visual, or "tile," on the dashboard is clickable
and can direct the user to the underlying report for more detail.
• Real-Time Data: Dashboards can be updated in real time using
streaming data sources for live monitoring.
• Q&A Box: Users can ask questions in natural language, and Power BI
will generate visuals based on the query.
• Alerts: Set up data-driven alerts to receive notifications when certain
thresholds are met.

Benefits of Dashboards
• Provide a quick overview of KPIs and business metrics.
• Save time by summarizing complex data in a single view.
• Encourage collaboration by enabling sharing across teams.
• Help executives and managers make fast, informed decisions.

Difference Between Dashboards and Reports

• Dashboards: Single-page, summary view, can combine data from
multiple reports.
• Reports: Multi-page, detailed analysis based on a single dataset.
