Dev Lab Record
AIM
To write the steps to install the data analysis and visualization tools R, Python, Tableau Public,
and Power BI.
PROCEDURE
1. R:
Download R:
Visit the official R website (https://cran.r-project.org/) and download the
installer for your operating system (Windows, macOS, or Linux).
Install R by following the instructions provided in the installer.
2. Python:
Download Python:
Visit the official website (https://www.python.org/downloads/) and download the
Python installer for your OS (Windows, macOS, or Linux).
Install Python by running the installer and making sure to check the option to add Python to
your system’s PATH during installation.
jupyter-lab
Jupyter notebook
SciPy (Scientific Python) is a Python library for solving mathematical problems and running
numerical algorithms. It is built on top of the NumPy library.
Pandas is a Python Package that provides fast, flexible, and expressive data structures
designed to make working with “relational” or “labeled” data both easy and intuitive.
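Once Python is installed, it can help to confirm which of the libraries mentioned above are importable. The following is a minimal sketch using only the standard library; the helper name check_packages and the package list are illustrative assumptions, not part of the original record.

```python
import importlib.util
import sys


def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


print("Python version:", sys.version.split()[0])

# Hypothetical package list; adjust it to the tools you installed.
for name, found in check_packages(["numpy", "scipy", "pandas", "jupyterlab"]).items():
    status = "installed" if found else "missing"
    print(f"{name}: {status}")
```

Any package reported as missing can usually be added with pip, for example pip install pandas.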
3. Tableau Public:
Tableau Public
Tableau Public can be used in the browser, so no installation is required. Simply visit the Tableau
Public website (https://public.tableau.com/s/gallery) and create an account to start
using it.
4. Power BI:
Download Power BI Desktop:
Go to the official Power BI website (https://powerbi.microsoft.com/en-us/desktop/)
and download Power BI Desktop.
Install Power BI Desktop by running the installer.
PROGRAM 1
import numpy as np
import pandas as pd
hafeez = ['Hafeez', 19]
aslam = ['Aslam', 21]
kareem = ['kareem', 18]
dataframe = pd.DataFrame([hafeez, aslam, kareem], columns=['Name', 'Age'])
print(dataframe)
Output 1
PROGRAM 2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("CountryData.csv")
plt.hist(data)
plt.xlabel("code")
plt.ylabel("Total_personal_income")
plt.show()
First, create a CSV file in Excel with the columns ‘code’ and ‘Total_personal_income’.
Save the file under the name used above, “CountryData”, with the .csv extension.
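If Excel is not available, the same file can be produced from Python. The following is a minimal sketch, assuming made-up sample values and a temporary folder; only the file name and the two column names come from the text above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample values; replace them with your own data.
rows = pd.DataFrame({
    "code": [1, 2, 3, 4],
    "Total_personal_income": [52000, 61000, 47000, 58000],
})

# Write to a temporary folder so the sketch does not overwrite a real file.
csv_path = os.path.join(tempfile.mkdtemp(), "CountryData.csv")
rows.to_csv(csv_path, index=False)

# Read the file back the same way the program above does.
data = pd.read_csv(csv_path)
print(data.columns.tolist())
print(data)
```

To use the file with the program above, save it in the folder from which the script is run instead of a temporary folder.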
Output 2
RESULT:
Ex. No:2 WORKING WITH NUMPY ARRAYS, PANDAS DATA FRAMES, BASIC PLOTS
USING MATPLOTLIB
AIM
To write the steps for working with NumPy arrays, Pandas data frames, and basic plots using Matplotlib.
PROCEDURE
1. NumPy:
NumPy is a fundamental library for numerical computing in Python. It provides support for
multi-dimensional arrays and various mathematical functions. To get started, install NumPy
if you haven’t already, using pip:
pip install numpy
# Basic operations
mean = np.mean(arr)
total = np.sum(arr)  # named total to avoid shadowing the built-in sum()
print("\nMean of the array:", mean)
print("Sum of the array:", total)
# Mathematical functions
square_root = np.sqrt(arr)
exponential = np.exp(arr)
print("\nSquare root of the array:")
print(square_root)
# Array Operations
combined_array = np.concatenate([arr, sub_array])
print("\nCombined array:")
print(combined_array)
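The fragment above relies on arr and sub_array, whose definitions do not appear in this record. The following is a minimal runnable sketch of the same operations, assuming a small one-dimensional array and a slice of it as stand-ins; both inputs are assumptions.

```python
import numpy as np

# Assumed inputs: the record does not show how these were defined.
arr = np.array([1, 4, 9, 16, 25])
sub_array = arr[1:4]  # slice holding the elements 4, 9, 16

# Basic operations
mean = np.mean(arr)
total = np.sum(arr)

# Mathematical functions are applied element by element
square_root = np.sqrt(arr)
exponential = np.exp(arr)

# Array operations: join the two arrays end to end
combined_array = np.concatenate([arr, sub_array])

print("Mean:", mean)
print("Sum:", total)
print("Square root:", square_root)
print("Combined:", combined_array)
```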
OUTPUT:
Pandas:
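import pandas as pd
# Sample data, reconstructed from the DataFrame shown in the OUTPUT below
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [25, 30, 35, 28, 22],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Miami'],
}
df = pd.DataFrame(data)
print("DataFrame:")
print(df)
# Selecting a column
print(df['Name'])
# Filtering data
# Sorting by a column
print(df.sort_values(by='Age', ascending=False))
# Aggregating data
print("\nAverage age:")
print(df['Age'].mean())
# Grouping needs a 'Salary' column, which the DataFrame shown in the OUTPUT does not have
if 'Salary' in df.columns:
    grouped_data = df.groupby('City')['Salary'].mean()
    print(grouped_data)
# Adding a derived column
df['Age_Squared'] = df['Age'].apply(lambda x: x ** 2)
# Removing a column
df = df.drop(columns=['Age_Squared'])
# Saving to CSV and reading the file back
df.to_csv('output.csv', index=False)
new_df = pd.read_csv('output.csv')
print(new_df)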
OUTPUT
DataFrame:
Name Age City
0 Alice 25 New York
1 Bob 30 Los Angeles
2 Charlie 35 Chicago
3 David 28 Houston
4 Emily 22 Miami
Average age:
28.0
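The listing names a filtering step but shows no code for it. The following is a minimal sketch of boolean-mask filtering on the DataFrame from the output above; the threshold of 25 and the variable name older_than_25 are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Emily"],
    "Age": [25, 30, 35, 28, 22],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Miami"],
})

# The comparison builds a True/False mask; indexing with it keeps the True rows.
older_than_25 = df[df["Age"] > 25]
print(older_than_25)

# The filtered rows can be sorted like any other DataFrame, oldest first.
print(older_than_25.sort_values(by="Age", ascending=False))
```

Several conditions can be combined with & (and) or | (or), with each condition wrapped in parentheses.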
2. Matplotlib:
Matplotlib is a popular library for creating static, animated, or interactive plots and
graphs. Install Matplotlib using pip:
pip install matplotlib
import numpy as np
import matplotlib.pyplot as plt
# Sample data
data = np.random.normal(0, 1, 1000) # Normally distributed data for histogram and box plot
x = np.linspace(0, 10, 100) # Linear space for scatter plot
y = np.sin(x) # Sine wave for scatter plot
# 1. Histogram
plt.figure(figsize=(8, 6))
plt.hist(data, bins=30, color='skyblue', edgecolor='black')
plt.title('Histogram of Normally distributed Data')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()
# 2. Box Plot
plt.figure(figsize=(8, 6))
plt.boxplot(data, vert=False, patch_artist=True, boxprops=dict(facecolor='lightgreen'))
plt.title('Box Plot of Normally Distributed Data')
plt.xlabel('Value')
plt.show()
# 3. Scatter Plot
plt.figure(figsize=(8, 6))
plt.scatter(x, y, color='red', marker='o')
plt.title('Scatter Plot of Sine Wave')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.show()
RESULT:
3. Problem Analysis: Take the wine datasets and plot different charts and graphs.
Exploratory Data Analysis (EDA) consists of methods for analyzing data in order to
extract meaningful insights and other useful characteristics of the data.
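Before any plotting, a first EDA pass usually looks at the shape, missing values, summary statistics, and correlations. The following is a minimal sketch using pandas only; the DataFrame df_demo and its values are made up for illustration and merely stand in for the wine data, which is not included in this record.

```python
import pandas as pd

# Hypothetical stand-in for the wine data; the values are invented.
df_demo = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0],
    "pH": [3.51, 3.20, 3.26, 3.16, 3.30],
    "quality": [5, 5, 6, 6, 7],
})

print(df_demo.shape)         # number of rows and columns
print(df_demo.isna().sum())  # missing values per column
print(df_demo.describe())    # count, mean, std, min, quartiles, max

# Pairwise Pearson correlations; this is the matrix a heatmap would display.
corr = df_demo.corr()
print(corr.round(2))
```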
4. Algorithm:
plt.show()
sns.heatmap(df_red.corr(), annot=True, fmt='.2f', linewidths=2)
plt.show()
# histplot with kde=True replaces the deprecated distplot
sns.histplot(df_red['alcohol'], kde=True)
plt.show()
Output:
RESULT:
AIM
To learn basic Tableau functions for dashboards and to create basic data visualizations such as
a line chart and a bar graph.
PROCEDURE
Step 2: Under the Sheets tab, three sheets become visible, namely Orders, People, and Returns.
Double-click the Orders sheet, and it opens up just like a spreadsheet.
Step 3: Use the Data Interpreter, also present under the Sheets tab. Click on it to get a formatted
sheet.
Step 4: Go to the worksheet. Click on the Sheet 1 tab at the bottom left of the Tableau workspace.
Step 5: Under Dimensions in the Data pane, drag Order Date to the Columns shelf.
Step 6: From the Measures tab, drag the Sales field onto the Rows shelf.
OUTPUT:
AIM
To create a bar chart in Tableau.
PROCEDURE
Step 2: Add State and Country from the Data pane to Detail on the Marks card. We obtain the map
view.
Step 3: Drag Region to the Filters shelf, and then filter down to South only. The map view now
zooms in to the South region only, and a mark represents each state.
Step 4: Drag the Sales measure to the Color tab on the Marks card. We obtain a filled map with the
colors showing the range of sales in each state.
Step 5: We can change the color scheme by clicking Color on the Marks card and selecting Edit
Colors. We can experiment with the available palettes.
Step 6: In the Data pane, drag a field and drop it directly on top of another field or right-click the
field and select.
Step 7: Duplicate the Profit Map worksheet and name it Negative Profit Bar Chart.
Step 8: Click Show Me on the Negative Profit Bar Chart worksheet. Show Me presents the
ways in which a graph can be plotted from the items used in the worksheet. From Show Me,
select the horizontal bars option, and the view updates instantly from vertical to horizontal bars.
OUTPUT:
RESULT:
AIM :
To add filters and colors to the data set.
PROCEDURE:
Step 1: Category is present under the Dimensions pane. Drag it to the Columns shelf and place
it next to Year (Order Date).
Step 2: The category should be placed to the right of the year.
Step 3: The chart type changes from a line chart to a bar chart. The chart shows the overall sales
for every product by year.
Step 4: To add labels to the view, click Show Mark Labels on the toolbar.
Step 5: Double-click or drag the Sub-Category dimension to the Columns shelf.
Step 6: The view displays a bar for every sub-category, broken down by category and year.
Step 7: Under Dimensions, right-click Order Date and select Show Filter. Repeat for the
Sub-Category field as well.
Step 8 : In the Data pane, under Measures, drag Profit to Color on the Marks card.
OUTPUT:
RESULT:
AIM
To create an interactive dashboard using the Superstore data set in Tableau.
PROCEDURE
Step 2: Drag the Sales in the South worksheet, which was created earlier, to the empty dashboard.
Step 3: Drag the Profit Map worksheet to the dashboard, and drop it on top of the Sales in the South
view. Both views can be seen at once. We can arrange the dashboard to our liking so that the data
is presented in a way others can understand.
Step 4: On the Sales South worksheet in the dashboard view, click under the Region
OUTPUT:
RESULT:
3. Problem Analysis: Take a company sales data set and plot it using a lollipop
chart. Plotting is a method for analyzing data in order to extract meaningful insights
and other useful characteristics of the data.
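A lollipop chart is a bar chart drawn with a thin stem and a round marker at the value. The following is a minimal Matplotlib sketch, not the record's own listing; the product names, sales figures, and output file name are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical company sales figures, for illustration only.
sales = pd.DataFrame({
    "Product": ["A", "B", "C", "D", "E"],
    "Sales": [120, 340, 210, 90, 275],
}).sort_values(by="Sales")

fig, ax = plt.subplots(figsize=(8, 6))
positions = list(range(len(sales)))

# The stem is a thin line from zero up to the value; the marker sits on top.
ax.vlines(x=positions, ymin=0, ymax=sales["Sales"], color="skyblue")
ax.scatter(positions, sales["Sales"], color="steelblue", zorder=3)

ax.set_xticks(positions)
ax.set_xticklabels(sales["Product"])
ax.set_xlabel("Product")
ax.set_ylabel("Sales")
ax.set_title("Lollipop Chart of Sales by Product")

# Save the figure to a temporary folder (hypothetical file name).
out_path = os.path.join(tempfile.mkdtemp(), "lollipop.png")
fig.savefig(out_path)
```

Sorting the rows by value before plotting makes the ranking easy to read, which is the main reason to prefer a lollipop chart over an unsorted bar chart.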
4. Algorithm:
5. Coding:
Output:
RESULT:
4. Algorithm:
5. Coding:
data.replace([np.nan], data.b.mean(), inplace=True)
print(data)
Output:
0 False
1 False
2 False
3 False
4 True
5 True
dtype: bool
     a  b
0  one  1
1  two  1
2  one  2
3  two  3
     a  b    c
0  one  1  2.0
1  two  1  2.0
2  one  2  2.0
3  two  3  2.0
4  one  2  2.0
5  two  3  2.0
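Most of the listing that produced this output is missing from the record. The following is a minimal sketch that produces the same three results, assuming the input values inferred from the output and assuming that column c started out entirely missing; both are assumptions.

```python
import numpy as np
import pandas as pd

# Assumed input, inferred from the output above.
data = pd.DataFrame({
    "a": ["one", "two", "one", "two", "one", "two"],
    "b": [1, 1, 2, 3, 2, 3],
})

# duplicated() marks every row that repeats an earlier row.
dup_mask = data.duplicated()
print(dup_mask)

# drop_duplicates() keeps only the first occurrence of each row.
unique_rows = data.drop_duplicates()
print(unique_rows)

# Assumption: column "c" starts out entirely missing.
data["c"] = np.nan

# Replace every missing value with the mean of column b.
data = data.replace(np.nan, data["b"].mean())
print(data)
```

Here the result is reassigned to data rather than passing inplace=True, which keeps each step's input and output explicit.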
RESULT:
4. Algorithm:
5. Coding:
# Numpy 1D and 2D Array manipulation
# Python program for
# Creation of Arrays
import numpy as np
# Creating a rank 1 Array
arr = np.array([1, 2, 3])
print("Array with Rank 1: \n", arr)
# Creating a rank 2 Array
arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Array with Rank 2: \n", arr)
Output:
[1 3 2]
Initial Array:
[[-1.   2.   0.   4. ]
 [ 4.  -0.5  6.   0. ]
 [ 2.6  0.   7.   8. ]
 [ 3.  -7.   4.   2. ]]
Elements at indices (1, 3), (1, 2), (0, 1), (3, 0):
[0. 6. 2. 3.]
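The last part of this output comes from indexing with index arrays, for which the listing is not shown. The following is a minimal sketch that picks the same elements, with the array values taken from the output above; the variable names rows, cols, and picked are illustrative.

```python
import numpy as np

arr = np.array([[-1, 2, 0, 4],
                [4, -0.5, 6, 0],
                [2.6, 0, 7, 8],
                [3, -7, 4, 2]])
print("Initial Array:")
print(arr)

# Index arrays: the i-th result is arr[rows[i], cols[i]].
rows = [1, 1, 0, 3]
cols = [3, 2, 1, 0]
picked = arr[rows, cols]
print("Elements at indices (1, 3), (1, 2), (0, 1), (3, 0):")
print(picked)
```

Passing two lists of equal length selects individual elements pair by pair, unlike slicing, which selects whole rows or columns.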
RESULT: