Chapter 1 INTRODUCTION TO DATA

Notes for computer science

Uploaded by

zarahrasheed1


Statistics is the science of collecting, summarizing, analyzing, interpreting, and presenting
data in order to draw valid conclusions. It is divided into two branches: descriptive and
inferential.

Descriptive Statistics: Methods for collecting data and presenting it using graphs and
numerical summaries.

Inferential Statistics: The use of probability to generalize from a sample to the larger
population from which it was drawn, in order to draw conclusions about that population.

DATA AND DATA SOURCES

Statistical data are the raw facts of statistics. They may relate to an activity under study, a
phenomenon, or a situation of interest. Statistical data are obtained by measuring, counting,
and/or observing. An activity or phenomenon that generates data is termed a variable. In other
words, a variable is a characteristic that takes on different values upon successive
measurements. In statistics, data are classified into two categories: quantitative data and
qualitative data. This classification is based on the kind of characteristic being measured.

Quantitative Data: These are data that can be expressed numerically, that is, quantified in
definite units of measurement.
Examples: the ages of students taking STS 102, UTME exam scores, etc.
Depending on the nature of the variable observed, quantitative data can be further categorized
as continuous or discrete.

Qualitative Data: These are data that cannot be expressed in numbers or quantified in units of
measurement. Examples include blood group, sex, and nationality. These data are further
classified as nominal and rank (ordinal) data.

DATA SOURCES

Sources of data are divided into two: primary and secondary.

Primary Data: These are data collected directly from respondents. They are regarded as
first-hand information collected by the researcher. Primary data can be obtained from, for
example:
• Census
• Survey

Secondary Data: These are data that already exist in published or unpublished sources. They
are available from published source(s), but not necessarily in the form actually required.
Examples of secondary data sources include:
• Journal publications
• Research or media organizations

Methods of Data Collection

The method of data collection depends on the problem at hand. Common methods of collecting
data include:
• Interviewing
• Questionnaire
• Observation
• Telephone

Data Presentation

Raw data are organized numerically for ease of analysis and presentation. This is done by
creating a frequency table, also known as a frequency distribution. Presenting data in tables,
charts, and graphs makes their meaning clearer.

Basic Terms

Class Interval: A symbol defining a class, e.g. 60–62, is called a class interval. The end
numbers, 60 and 62, are called class limits; the smaller number (60) is the lower class limit,
and the larger number (62) is the upper class limit.

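Class Boundaries: A class boundary is obtained by adding the upper limit of one class interval
to the lower limit of the next-higher class interval and dividing by 2. For example, if the
class 60–62 is followed by the class 63–65, the boundary between them is (62 + 63)/2 = 62.5.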

Class Width or Class Size: The width of a class interval is the difference between its upper
and lower class boundaries; it is also referred to as the class size or class length. If all
class intervals of a frequency distribution have equal widths, this common width is denoted by
c. In that case, c is equal to the difference between two successive lower class limits or two
successive upper class limits.

Class Mark: The class mark is the midpoint of the class interval, obtained by adding the lower
and upper class limits and dividing by 2. It is also called the class midpoint.
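The definitions of class boundaries, class width, and class mark can be checked with a short Python sketch. Only the class 60–62 comes from the notes; the classes 63–65 and 66–68, and all variable names, are assumed here purely for illustration. Extending the outermost classes by half a gap is likewise an assumption.

```python
# Class limits as (lower, upper) pairs. Only 60-62 comes from the notes;
# the other two classes are assumed for illustration.
classes = [(60, 62), (63, 65), (66, 68)]

# Gap between one upper limit and the next lower limit (1 for whole-number data).
gap = classes[1][0] - classes[0][1]

# Each boundary lies halfway across the gap, e.g. (62 + 63) / 2 = 62.5.
# Extending the first and last classes by half a gap is an assumption.
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi in classes]

# Class width: upper boundary minus lower boundary.
widths = [upper - lower for lower, upper in boundaries]

# Class mark: midpoint of the lower and upper class limits.
marks = [(lo + hi) / 2 for lo, hi in classes]

for (lo, hi), (lb, ub), w, m in zip(classes, boundaries, widths, marks):
    print(f"{lo}-{hi}: boundaries {lb}-{ub}, width {w}, mark {m}")
```

Under these assumptions the common width equals the difference between successive lower limits (63 – 60 = 3), in line with the definition of c.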

Frequency: The frequency of a value is the number of times that value occurs in the data.

Relative Frequency: The relative frequency of a value is the ratio (fraction or proportion) of
the number of times the value occurs to the total number of outcomes. To find the relative
frequencies, divide each frequency by the total number of observations in the sample, n.

Cumulative Frequency: The cumulative frequency of a class is the sum of the frequency of that
class and the frequencies of all the classes before it.
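As a minimal sketch of these two definitions, the Python snippet below computes relative and cumulative frequencies. The three frequencies are hypothetical values chosen for illustration, not data from these notes.

```python
from itertools import accumulate

# Hypothetical class frequencies, chosen only for illustration.
frequencies = [4, 6, 10]

# n is the total number of observations in the sample.
n = sum(frequencies)

# Relative frequency: each frequency divided by n.
relative = [f / n for f in frequencies]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

for f, rf, cf in zip(frequencies, relative, cumulative):
    print(f"f = {f:>2}   relative = {rf:.2f}   cumulative = {cf:>2}")
```

The relative frequencies sum to 1, and the last cumulative frequency equals n; both make quick checks on a hand-built table.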

Frequency Distribution

Frequency distributions are classified as grouped or ungrouped.

Ungrouped Frequency Distribution: This is used for quantitative data sets and works best when
the range of the data is less than 10 units. The range is the difference between the largest
and the smallest data values. For example, twenty students were asked how many hours they
worked per day. Their responses, in hours, are as follows:

5; 6; 3; 3; 2; 4; 8; 5; 2; 3; 5; 6; 5; 4; 4; 3; 5; 2; 5; 3.

Range = 8 – 2 = 6

Since the range is 6, we keep each data value separate rather than grouping values together.
Creating an ungrouped frequency distribution is simple: list the data values from smallest to
largest in the first column, without skipping any values, and place the frequency (the count
of each data value) in the corresponding row of the second column.

The table below shows the data values in ascending order with their frequencies. Notice that
all values in the range are listed, including 7, which does not appear in the original data
set.

Data Values    Frequency (f)
2              3
3              5
4              3
5              6
6              2
7              0
8              1

Frequency distribution of students' work hours
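The ungrouped table can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and `collections.Counter` is one convenient way to count values, not a method prescribed by the notes.

```python
from collections import Counter

# Daily work hours reported by the twenty students.
hours = [5, 6, 3, 3, 2, 4, 8, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

# Count how many times each value occurs.
counts = Counter(hours)

# List every value from the smallest to the largest without skipping any;
# a Counter returns 0 for values that never occur (here, 7).
table = [(value, counts[value]) for value in range(min(hours), max(hours) + 1)]

print("Data Values  Frequency (f)")
for value, freq in table:
    print(f"{value:<12} {freq}")
```

Iterating over the full range, rather than over the observed values only, is what keeps the zero-frequency row for 7 in the table.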

Grouped Frequency Distribution

This second type of frequency distribution is also used for quantitative data, but when the
range is large and the data values need to be grouped together. For example, 28 students were
asked how many hours they worked per week. Their responses, in hours, are as follows:

15; 26; 13; 33; 22; 14; 27; 15; 32; 23; 5; 26; 25; 14; 34; 13; 15; 22; 15; 28; 10; 18; 21; 24;
20; 18; 34; 20.

Here there are too many different data values to list separately as in an ungrouped frequency
distribution. Notice that the range is 29 (highest – lowest = 34 – 5). We therefore construct a
grouped frequency distribution, grouping the data values into classes.

A class is an interval whose lowest value is known as the lower limit and whose highest value
is known as the upper limit.

Guidelines for classes:
• There should be between 5 and 20 classes
• Classes must be mutually exclusive (no overlap of data values)
• Classes must be all-inclusive and continuous
• Classes must be equal in width

Constructing a Grouped Frequency Distribution:

1. Find the range (R): highest data value – lowest data value.
2. Determine the number of classes (C), usually a minimum of 5 and a maximum of 20. There are
several suggested guidelines to help one decide how many class intervals to employ. Two such
rules are:
(a) C = 1 + 3.322 log10(n)
(b) C = √n
where n is the number of observations.
3. Determine the width of the class interval (W), given by W = R/C, where R is the range of
values and C is the number of classes.
Note: The class width is rounded up for the chosen number of classes.
4. Choose the first lower limit (usually the lowest data value).
5. Create the other lower limits by adding the class width to the previous lower limit.
6. Create the upper limits so that the classes do not overlap.
7. Determine the number of observations falling into each class interval, i.e. find the class
frequencies.
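Steps 2 and 3 can be sketched in Python as follows. The function names are hypothetical, and rounding the number of classes to the nearest whole number is an assumption; the notes only state that the class width is rounded up. The values n = 50 and R = 26 are used purely as an illustration.

```python
import math

def classes_sturges(n):
    """Rule (a): C = 1 + 3.322 * log10(n), rounded to the nearest whole number (assumed)."""
    return round(1 + 3.322 * math.log10(n))

def classes_sqrt(n):
    """Rule (b): C = sqrt(n), rounded to the nearest whole number (assumed)."""
    return round(math.sqrt(n))

def class_width(data_range, num_classes):
    """Step 3: W = R / C, rounded up to a whole number."""
    return math.ceil(data_range / num_classes)

n, data_range = 50, 26          # illustrative values
c_a = classes_sturges(n)        # 1 + 3.322 * log10(50) is about 6.64
c_b = classes_sqrt(n)           # sqrt(50) is about 7.07
print(c_a, c_b, class_width(data_range, c_b))
```

For these values both rules suggest 7 classes, and 26/7 rounds up to a class width of 4. The two rules need not agree for other sample sizes.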

Example 1: The following are the marks of 50 students in STS 102:

48 70 60 47 51 55 59 63 68 63 47 53 72 53 67 62 64 70 57 56
48 51 58 63 65 62 49 64 53 59 63 50 61 67 72 56 64 66 49 52
61 71 58 53 63 69 59 64 73 56

(a) Construct a frequency table for the above data.
(b) Answer the following questions using the table obtained:
(i) How many students scored between 51 and 62 (inclusive)?
(ii) How many students scored above 50?
(iii) What is the probability that a student selected at random from the class scored less
than 63?

Solution:

(a) Range (R) = largest value – smallest value = 73 – 47 = 26

Number of classes (C) = √n = √50 = 7.07 ≈ 7

Class size or width (W) = R/C = 26/7 = 3.7 ≈ 4 (rounded up)

Frequency Table

Marks    Tally           Frequency (f)
47-50    |||| ||         7
51-54    |||| ||         7
55-58    |||| ||         7
59-62    |||| |||        8
63-66    |||| |||| |     11
67-70    |||| |          6
71-74    ||||            4

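(b) (i) Students scoring between 51 and 62: 7 + 7 + 8 = 22
(ii) Students scoring above 50: 7 + 7 + 8 + 11 + 6 + 4 = 43
(iii) Students scoring less than 63: 7 + 7 + 7 + 8 = 29
Total number of students = 50
Prob(less than 63) = 29/50 = 0.58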
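The answers to part (b) can be cross-checked against the raw marks instead of the table. The sketch below assumes "between 51 and 62" includes both end values, which matches the classes 51-54, 55-58, and 59-62 used in the solution; the variable names are illustrative.

```python
# Marks of the 50 students in STS 102, as listed in Example 1.
marks = [
    48, 70, 60, 47, 51, 55, 59, 63, 68, 63, 47, 53, 72, 53, 67, 62, 64, 70, 57, 56,
    48, 51, 58, 63, 65, 62, 49, 64, 53, 59, 63, 50, 61, 67, 72, 56, 64, 66, 49, 52,
    61, 71, 58, 53, 63, 69, 59, 64, 73, 56,
]

# (i) Marks from 51 to 62, both ends included (assumed reading of "between").
between_51_and_62 = sum(51 <= m <= 62 for m in marks)

# (ii) Marks strictly above 50.
above_50 = sum(m > 50 for m in marks)

# (iii) Relative frequency of marks below 63, used as the probability.
below_63 = sum(m < 63 for m in marks)
prob_below_63 = below_63 / len(marks)

print(between_51_and_62, above_50, below_63, prob_below_63)
```

The counts agree with the table here because each question's cut-off coincides with a class limit; a cut-off falling inside a class could not be answered exactly from the grouped table.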

Example 2: Twenty-eight students were asked how many hours they worked per week. Their
responses, in hours, are as follows:

15; 26; 13; 33; 22; 14; 27; 15; 32; 23; 5; 26; 25; 14; 34; 13; 15; 22; 15; 28; 10; 18; 21; 24;
20; 18; 34; 20.

Construct a grouped frequency distribution using 5 classes.

Solution:

1. Range = 34 – 5 = 29
2. Use 5 classes
3. Class width = 29/5 = 5.8, rounded up to 6
4. The first lower limit is 5, the minimum data value
5. The other lower limits are 11, 17, 23, and 29, each obtained by adding the class width of 6
to the previous lower limit
6. The first upper limit is 10, since the next class begins at 11. Adding the class width
again, the other upper limits are 16, 22, 28, and 34

Class    Tally        Frequency (f)
5-10     ||           2
11-16    |||| |||     8
17-22    |||| ||      7
23-28    |||| ||      7
29-34    ||||         4
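Steps 4 to 7 of this solution can be sketched in Python. The snippet assumes whole-number data, so each upper limit is taken as one less than the next lower limit; the variable names are illustrative.

```python
# Weekly work hours reported by the 28 students in Example 2.
hours = [15, 26, 13, 33, 22, 14, 27, 15, 32, 23, 5, 26, 25, 14, 34, 13,
         15, 22, 15, 28, 10, 18, 21, 24, 20, 18, 34, 20]

num_classes = 5   # given in the question
width = 6         # 29 / 5 = 5.8, rounded up

# Steps 4 and 5: start at the minimum and add the width repeatedly.
lower_limits = [min(hours) + i * width for i in range(num_classes)]

# Step 6: for whole-number data (assumed), each upper limit is one
# less than the next class's lower limit.
upper_limits = [lo + width - 1 for lo in lower_limits]

# Step 7: count the observations falling in each class.
frequencies = [sum(lo <= h <= hi for h in hours)
               for lo, hi in zip(lower_limits, upper_limits)]

for lo, hi, f in zip(lower_limits, upper_limits, frequencies):
    print(f"{lo}-{hi}: {f}")
```

The frequencies add up to 28, which confirms that every observation falls in exactly one class.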

ASSIGNMENT 1

The following data represent the ages (in years) of people living in a housing estate in
Abeokuta.

18 31 30 6 16 17 18 43 2 8 32 33 9 18 33 19 21 13 13 14
14 6 52 45 61 23 26 15 14 15 14 27 36 19 37 11 12 11 20 12
39 20 40 69 63 29 64 27 15 28

Present the above data in a frequency table with the following columns: class interval, class
boundary, class mark (midpoint), tally, frequency, and cumulative frequency.

ASSIGNMENT 2

The grade points of 40 students are given below. Using 8 classes, construct a frequency
distribution and a relative frequency distribution.

48 70 60 47 51 55 59 63 68 63 47 53 72 53 67 62 64 70 57 56

48 51 58 63 65 62 49 64 53 59 63 50 61 67 72 56 64 66 49 52
