
Data Analytics.23031 Assignment


DATA ANALYTICS FOR BUILT ENVIRONMENT
ASSIGNMENT

NAME: ADARSH R
ENROLLMENT NO.: A70059023054
BATCH: 2023-2025

Q1) Find the mean, median and mode of the data in Figure 1.

Figure 1: Histogram for Problem 1

ANSWER:

Mean (Average):

Mean = ∑ Values / Number of Values = 43 / 15 ≈ 2.87

The mean value is approximately 2.87.

Median:
Median = Middle Value of Sorted Dataset
The sorted dataset is [0,0,0,0,1,2,2,3,4,4,4,5,5,6,7].
The median (middle value) is 3.

Mode:
Mode = Most Frequent Value

The value 0 appears 4 times, which is the highest frequency.

Thus, the mode is 0.
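
The three results above can be cross-checked with a short sketch using Python's standard `statistics` module. The list is the sorted dataset quoted in the answer; the variable names are illustrative only.

```python
from statistics import mean, median, mode

# Sorted dataset exactly as listed in the answer above.
values = [0, 0, 0, 0, 1, 2, 2, 3, 4, 4, 4, 5, 5, 6, 7]

data_mean = mean(values)      # sum of values / number of values = 43 / 15
data_median = median(values)  # middle (8th) value of the 15 sorted values
data_mode = mode(values)      # most frequent value

print(round(data_mean, 2), data_median, data_mode)
```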

Q2) What do you mean by Dispersion? State any two
measures of dispersion.

ANSWER:

Dispersion refers to the extent to which the values in a distribution differ from the
average of the distribution. It complements a measure of central tendency by describing
how far the data points lie from one another. In other words, dispersion indicates the
scattering of the data and shows how much the individual items vary around the central value.

• Two common measures of dispersion are:

1. Range: It is the simplest measure of dispersion and is defined as the difference
between the largest and the smallest item in a distribution. The range is
calculated by subtracting the smallest value from the largest value:

Range = Y max – Y min

2. Standard Deviation: It is the square root of the arithmetic average of the squares of
the deviations measured from the mean. The standard deviation is given by the formula:

σ = √( ∑ (Yi – Ȳ)² / N )

where Ȳ is the mean of the N values Yi.
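
As an illustration only, the two measures can be computed for the dataset from Q1. Reusing that dataset here is an assumption made for the example, since Q2 itself refers to no data. The sketch applies the definition directly and compares it with `statistics.pstdev`, which divides by N as the definition above does.

```python
from math import sqrt
from statistics import pstdev

# Dataset borrowed from Q1 purely for illustration (an assumption; Q2 gives no data).
values = [0, 0, 0, 0, 1, 2, 2, 3, 4, 4, 4, 5, 5, 6, 7]

# Range: largest value minus smallest value.
data_range = max(values) - min(values)

# Standard deviation from the definition: square root of the average
# squared deviation from the mean (dividing by N, not N - 1).
n = len(values)
mean_value = sum(values) / n
std_dev = sqrt(sum((v - mean_value) ** 2 for v in values) / n)

# The library's population standard deviation should agree with the manual result.
library_std_dev = pstdev(values)

print(data_range, round(std_dev, 3))
```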

Q3) Write any two assumptions of Normal Distribution.

ANSWER:
Two assumptions of the normal distribution are:

1. Symmetry: The normal distribution is symmetric around the mean, meaning that
the distribution is balanced on both sides of the mean. This is reflected in the
bell-shaped curve where most of the data points are clustered around the mean,
with fewer points at the extremes.

2. Kurtosis: The normal distribution has a kurtosis of 3, which means that the
distribution is mesokurtic; it is the reference shape against which the peakedness
of other distributions is judged. This contrasts with distributions that are
platykurtic (kurtosis below 3, flatter) or leptokurtic (kurtosis above 3, more peaked).
Q4) Draw and compare right skewed and left skewed
distribution. Discuss the behavior of mean for the above
two types of skewness.

ANSWER:

• Right Skewed Distribution:

A right skewed distribution is characterized by a longer tail on the right side of the
distribution. This type of skewness is also known as positive skewness. In a right
skewed distribution, the mean is greater than the median, and the mode is often less
than the median (mode < median < mean).

• Left Skewed Distribution:

A left skewed distribution is characterized by a longer tail on the left side of the
distribution. This type of skewness is also known as negative skewness. In a left
skewed distribution, the mean is less than the median, and the mode is often greater
than the median (mean < median < mode).

• Comparison of Right and Left Skewed Distributions:

FACTORS    RIGHT SKEWED DISTRIBUTION        LEFT SKEWED DISTRIBUTION
MEAN       Greater than the median.         Less than the median.
MEDIAN     Less than the mean.              Greater than the mean.
MODE       Often less than the median.      Often greater than the median.
TAIL       Longer on the right side.        Longer on the left side.
EXAMPLES   Household income, individual     Age at retirement, scores on
           income distribution.             an easy exam.

• Behavior of Mean in Right and Left Skewed Distributions:

• Right Skewed Distribution: The mean is pulled away from the median by the
extreme values on the right side, making it greater than the median.

• Left Skewed Distribution: The mean is pulled away from the median by the
extreme values on the left side, making it less than the median.
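
The pull of the tail on the mean can be seen in a small sketch. The two lists below are hypothetical data invented for illustration, not values from the assignment; the second list is simply the mirror image of the first.

```python
from statistics import mean, median

# Hypothetical illustrative data (not from the assignment).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]        # one extreme value in the right tail
left_skewed = [21 - v for v in right_skewed]    # mirror image: long left tail

right_mean, right_median = mean(right_skewed), median(right_skewed)
left_mean, left_median = mean(left_skewed), median(left_skewed)

# Right skew: the extreme high value drags the mean above the median.
print("right skewed:", right_mean, ">", right_median)
# Left skew: the extreme low value drags the mean below the median.
print("left skewed: ", left_mean, "<", left_median)
```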
SECTION – B

Q5) When two balanced dice are rolled, find the probability that
a. the sum of the dice is 11
b. the sum of the dice is 1

ANSWER:

a. The sum of the dice is 11:

There are a total of 36 equally likely outcomes when rolling two dice, and two of them
give a sum of 11: (5,6) and (6,5). The probability is therefore:

P(sum = 11) = 2/36 = 1/18 ≈ 0.056

b. The sum of the dice is 1:

It is not possible to roll a sum of 1 with two balanced dice. The smallest possible sum is
2, which is achieved by rolling a 1 on each die. The probability is therefore:

P(sum = 1) = 0/36 = 0
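
Both parts can be verified by listing all 36 outcomes. The sketch below is one way to do this; the helper name `probability_of_sum` is hypothetical and introduced only for this example.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def probability_of_sum(target):
    """Return P(sum of the two dice == target) as an exact fraction."""
    favourable = [roll for roll in outcomes if sum(roll) == target]
    return Fraction(len(favourable), len(outcomes))

p_sum_11 = probability_of_sum(11)  # favourable outcomes: (5, 6) and (6, 5)
p_sum_1 = probability_of_sum(1)    # no outcome sums to 1

print(p_sum_11, p_sum_1)
```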

Q6) Case Description: Cyclone Wind Speeds. From Tropical
Cyclone Reports, published by the National Hurricane
Center, we obtained the data shown in the table below, in
miles per hour (mph), for one year's tropical cyclones in
the Atlantic Basin.
a. Prepare the data for computing the three quartiles.
b. Calculate Q1, Q2 & Q3.
c. Find out IQR.

ANSWER:

1. List all the data points:

[60, 70, 85, 65, 100, 60, 110, 45, 80, 40, 105, 80, 115, 90, 50, 45, 90, 115,
50]

2. Sort the data:

[40, 45, 45, 50, 50, 60, 60, 65, 70, 80, 80, 85, 90, 90, 100, 105, 110, 115,
115]

3. Calculate Quartiles:

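With n = 19 sorted values, the median is the 10th value. Q1 and Q3 are taken as the
medians of the nine values below and the nine values above the median.

Q1 (25th percentile): the middle (5th) value of the lower half
[40, 45, 45, 50, 50, 60, 60, 65, 70].

Q1 = 50

Q2 (Median or 50th percentile): the middle (10th) value of the sorted data.

Q2 = 80

Q3 (75th percentile): the middle (5th) value of the upper half
[80, 85, 90, 90, 100, 105, 110, 115, 115], i.e. the 15th value overall.

Q3 = 100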

4. Calculate the IQR:

IQR = Q3 - Q1 = 100 - 50 = 50

Thus Q1 = 50 mph, Q2 = 80 mph, Q3 = 100 mph, and the interquartile range is 50 mph.
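
The steps above can be sketched in code. This is a minimal sketch that assumes the "median of each half" convention for quartiles, with the overall median left out of both halves when n is odd; other quartile conventions can give slightly different values for other datasets.

```python
from statistics import median

# Wind speeds (mph) as listed in step 1.
speeds = [60, 70, 85, 65, 100, 60, 110, 45, 80, 40,
          105, 80, 115, 90, 50, 45, 90, 115, 50]

ordered = sorted(speeds)
n = len(ordered)
half = n // 2

# Split around the median; for odd n the median itself belongs to neither half.
lower_half = ordered[:half]
upper_half = ordered[half + 1:] if n % 2 else ordered[half:]

q1 = median(lower_half)
q2 = median(ordered)
q3 = median(upper_half)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```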

SECTION – C
Q7) Fitting a linear regression model to two variables
yielded the trend equation ŷ = 345 + 77x with an R-squared
value of 0.12. Estimate y for x = 15, considering x to be
within the data collection range.

ANSWER:

Given the linear regression equation ŷ = 345 + 77x, we can estimate the value of
y for x = 15.

Linear Regression Model:

The equation provided:

ŷ = 345 + 77x

This equation represents a straight line where:

• ŷ is the estimated or predicted value of y.
• 345 is the y-intercept of the regression line.
• 77 is the slope of the regression line.
• x is the independent variable.

Estimating y for x = 15:

• To find the estimated ŷ when x = 15, substitute x = 15 into the regression equation:

ŷ = 345 + 77 × 15 = 345 + 1155 = 1500

• So, the estimated ŷ for x = 15 is 1500.

R-squared Value:

• The R-squared value is 0.12. This indicates that approximately 12% of the
variability in the dependent variable y is explained by the independent
variable x. A low R-squared does not change the point estimate ŷ for a given x,
but it does mean that the model fits the data poorly, so the estimate carries
considerable uncertainty.

Conclusion:

• Using the linear regression model ŷ = 345 + 77x, the estimated value of y
when x = 15 is 1500.
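
The substitution can be written as a small function. This is only a sketch of the point estimate; the function name `predict` is hypothetical, and the intercept and slope are the values given in the question.

```python
def predict(x, intercept=345.0, slope=77.0):
    """Point estimate from the trend line y_hat = intercept + slope * x."""
    return intercept + slope * x

y_hat = predict(15)  # 345 + 77 * 15
print(y_hat)
```

Because R-squared is only 0.12, this value is best read as the centre of a wide band of plausible y values rather than a precise forecast.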

Q8) A snack-food company produces a 454-g bag of
pretzels. Although the actual net weights deviate slightly
from 454 g and vary from one bag to another, the
company insists that the mean net weight of the bags be
454 g. As part of its program, the quality assurance
department periodically performs a hypothesis test to
decide whether the packaging machine is working
properly, that is, to decide whether the mean net weight
of all bags packaged is 454 g.

a) Determine the null hypothesis for the hypothesis test.

b) Determine the alternative hypothesis for the hypothesis test.

c) Classify the hypothesis test as two tailed, left tailed, or right tailed.

ANSWER:

When conducting a hypothesis test to determine if the mean net weight of the bags is
454 grams as claimed by the snack-food company, we need to establish both the null
hypothesis (H0) and the alternative hypothesis (H1).

a) Determine the Null Hypothesis (H0)

The null hypothesis represents the status quo or the claim that is being tested. In this
case, the company claims that the mean net weight of the bags is 454 grams.

H0: μ=454 g

where μ represents the true mean net weight of the bags.

b) Determine the Alternative Hypothesis (H1)

The alternative hypothesis represents the statement that we are trying to find evidence
for. In this scenario, the quality assurance department is checking if there is a deviation
from the mean weight of 454 grams. This can be either an increase or decrease,
suggesting that the packaging machine may not be working properly.

H1: μ ≠ 454 g

c) Classify the Hypothesis Test

The hypothesis test examines whether the mean weight is different from 454 grams,
regardless of whether it is higher or lower. Therefore, we are looking for any deviation
from 454 grams, which means we are testing for both directions (greater than or less
than).

This makes the hypothesis test a two-tailed test.

Summary

• Null Hypothesis (H0): The mean net weight of the bags is 454 grams.

H0: μ = 454 g

• Alternative Hypothesis (H1): The mean net weight of the bags is not equal to
454 grams.

H1: μ ≠ 454 g

• Type of Test: Two-tailed test, because we are interested in detecting deviations
in both directions (whether the mean is less than or greater than 454 grams).
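
To show what "two-tailed" means in practice, here is a minimal sketch of a two-tailed z-test for H0: μ = 454 g. Everything numeric except 454 is a hypothetical assumption made for illustration: the sample mean, the sample size, the population standard deviation (treated as known) and the 5% significance level do not come from the question. When the standard deviation is unknown, a t-test would normally be used instead.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p_value) for H0: mu = mu0 against H1: mu != mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-tailed: count deviations in both directions, hence the factor 2 and abs(z).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 25 bags averaging 450 g, assumed known sigma of 7.8 g.
z, p_value = two_tailed_z_test(sample_mean=450.0, mu0=454.0, sigma=7.8, n=25)
reject_h0 = p_value < 0.05  # assumed 5% significance level

print(round(z, 2), round(p_value, 4), reject_h0)
```

A sample mean of 458 g gives the same p-value as 450 g, because a two-tailed test treats equal deviations above and below 454 g alike.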

Q9) The Association of American Medical Colleges (AAMC)
compiles data on medical school faculty and publishes the
results in AAMC Faculty Roster. The following contingency
table cross-classifies medical school faculty by the
characteristics gender and rank.

a) Find P(R3).

b) Find P(R3 | G1).

c) Are events G1 and R3 independent? Explain your answer.

ANSWER:

To analyze the data from the provided table, we will perform the following steps:

a) Find P(R3)

P(R3) represents the probability of selecting an Assistant Professor regardless of gender.

From the table:

• The total number of Assistant Professors (R3) is 40,379.
• The total number of faculty members is 98,993.

Thus, P(R3) is calculated as:

P(R3) = 40,379 / 98,993 ≈ 0.408

b) Find P(R3∣G1)

P(R3∣G1) represents the conditional probability of selecting an Assistant Professor given that the
faculty member is male.

From the table:

• The number of male Assistant Professors (R3 and G1) is 25,888.
• The total number of male faculty members (G1) is 70,000.

Thus, P(R3∣G1) is calculated as:

P(R3∣G1) = P(R3 and G1) / P(G1) = 25,888 / 70,000 ≈ 0.370

c) Are events G1 and R3 independent?

Two events A and B are independent if:

P(B∣A) = P(B)

For our problem, the events are:

• A: Being male (G1).
• B: Being an Assistant Professor (R3).

To determine whether G1 and R3 are independent, we compare P(R3∣G1) and P(R3).
If P(R3∣G1) = P(R3), then G1 and R3 are independent.

From our calculations:

• P(R3) ≈ 0.408
• P(R3∣G1) ≈ 0.370

Since P(R3∣G1) is not equal to P(R3), the events G1 and R3 are not independent. Knowing
that a faculty member is male changes (here, lowers) the probability that they are an
Assistant Professor compared with the overall probability.
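
The three parts can be reproduced from the four counts quoted in the answer. This is a sketch only: the variable names are illustrative, and the tolerance used to decide whether two probabilities are equal is a hypothetical choice, not part of the question.

```python
# Counts as quoted in the answer above (the contingency table itself is not reproduced here).
assistant_professors = 40_379        # R3, all genders
all_faculty = 98_993
male_assistant_professors = 25_888   # R3 and G1
male_faculty = 70_000                # G1

# a) Marginal probability of being an Assistant Professor.
p_r3 = assistant_professors / all_faculty

# b) Conditional probability of being an Assistant Professor given male.
p_r3_given_g1 = male_assistant_professors / male_faculty

# c) Independence requires P(R3 | G1) == P(R3); the tolerance is an assumed
#    allowance for rounding, chosen only for this example.
independent = abs(p_r3_given_g1 - p_r3) < 0.0005

print(round(p_r3, 3), round(p_r3_given_g1, 3), independent)
```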
