STA 100 Lab Assignment 1

Ling Ai

2024-10-03

R Workout Lab Assignment: Patient Data Analysis

Instructions: For each question below, you will be working with the patients101.csv dataset. Follow the
steps as outlined and submit your work on Gradescope. The assignment will be graded as complete, partially
complete, or missing. Your TA will go over the material in class, and you are expected to follow along and
complete each step.

Import Data and Inspect the Data Set

CSV (Comma-Separated Values) is a common format for datasets. To load a CSV file into R, you can use
the read.csv() function.
data = read.csv("path/to/your/file.csv")
• path/to/your/file.csv: Replace this with the actual file path of your CSV file.
• data: This is the variable where the data will be stored as a data frame.
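As a minimal, self-contained sketch of the import step, the snippet below writes a tiny CSV to a temporary file and reads it back with read.csv(). The file contents and the names tmp and demo are made up for illustration; they are not part of patients101.csv.

```r
# Hypothetical three-row CSV written to a temporary file (not the lab data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("age,gender", "52,F", "63,F", "21,M"), tmp)

# read.csv() treats the first line as column names by default (header = TRUE)
demo <- read.csv(tmp)

nrow(demo)   # number of data rows read
names(demo)  # column names taken from the header line
```

Checking nrow() and names() right after an import is a quick way to confirm that the file was found and the header was parsed as expected.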
data = read.csv("/Users/lindaai/Downloads/patients101.csv")
data

## age totalchol sysBP weight height sedmins obese marriage gender

## 1 52 193 128 92.3 152.1 60 obese other F
## 2 63 194 112 71.1 151.7 300 obese married F
## 3 48 225 128 58.1 162.9 480 normal divorced F
## 4 21 145 106 79.8 170.0 120 overweight married M
## 5 66 224 124 116.2 160.0 480 obese widowed F
## 6 31 270 118 77.5 165.8 480 overweight married M
## 7 64 165 158 88.0 183.7 20 overweight married M
## 8 73 241 124 75.8 170.2 240 overweight married M
## 9 39 240 122 100.8 170.3 2 obese married F
## 10 73 183 196 81.2 160.6 240 obese married F
## 11 50 185 106 75.7 171.5 300 overweight married M
## 12 71 292 146 83.8 149.7 480 obese divorced F
## 13 35 330 104 63.8 160.0 360 overweight married F
## 14 44 212 116 93.0 177.0 480 overweight married M
## 15 40 202 112 75.3 156.1 120 obese divorced F
## 16 23 199 114 115.7 182.3 720 obese nevermarried M
## 17 53 251 114 59.7 164.9 300 normal married F
## 18 37 176 100 64.3 161.3 300 normal married F
## 19 60 250 96 54.5 156.4 420 normal married F
## 20 80 140 124 70.4 148.5 480 obese widowed F
## 21 67 154 114 93.6 172.1 180 obese married M

## 22 58 179 142 106.0 167.2 600 obese married F
## 23 57 142 156 101.3 159.9 600 obese married F
## 24 72 148 132 94.3 177.4 360 obese widowed M
## 25 39 217 132 90.7 150.3 480 obese married F
## 26 35 211 136 78.8 166.6 480 overweight married F
## 27 62 216 110 96.5 182.5 240 overweight married M
## 28 52 184 130 75.0 179.8 360 normal nevermarried M
## 29 75 184 138 76.7 166.6 120 overweight married M
## 30 61 193 130 55.2 160.6 600 normal married F
## 31 73 269 180 77.7 169.7 480 overweight divorced M
## 32 20 187 124 99.5 165.0 600 obese nevermarried F
## 33 60 264 118 78.7 159.0 300 obese divorced F
## 34 56 150 184 77.0 172.5 120 overweight other M
## 35 54 171 134 67.6 180.7 120 normal nevermarried F
## 36 39 173 108 73.9 160.1 90 overweight other F
## 37 31 201 106 80.4 175.8 600 overweight nevermarried F
## 38 28 144 110 74.3 188.5 180 normal nevermarried M
## 39 31 170 100 60.0 158.2 120 normal nevermarried F
## 40 39 239 132 115.8 186.2 300 obese married M
## 41 41 182 106 65.7 160.6 720 overweight divorced F
## 42 51 184 132 77.7 183.0 240 normal other M
## 43 29 155 112 63.4 160.4 240 normal other F
## 44 31 227 126 106.9 163.0 300 obese divorced F
## 45 20 175 114 75.5 170.3 180 overweight nevermarried F
## 46 39 267 122 91.5 173.5 120 obese other M
## 47 48 241 114 60.6 159.3 240 normal married F
## 48 76 133 118 94.2 169.1 480 obese married M
## 49 80 227 120 69.0 175.5 120 normal married M
## 50 28 154 110 79.9 168.9 360 overweight nevermarried M
## 51 32 212 138 109.5 183.3 780 obese other M
## 52 27 153 112 98.4 167.1 60 obese married F
## 53 51 144 132 103.9 178.5 180 obese married M
## 54 27 196 92 70.3 166.6 360 overweight nevermarried F
## 55 44 192 126 62.3 165.8 180 normal divorced M
## 56 20 227 114 106.6 181.5 600 obese nevermarried M
## 57 30 228 104 64.7 159.5 120 overweight married F
## 58 26 164 110 55.7 168.3 360 normal other M
## 59 80 158 150 82.8 168.0 240 overweight married M
## 60 48 243 146 142.1 182.2 240 obese divorced M
## 61 37 237 120 78.0 153.1 180 obese married M
## 62 33 173 104 66.5 165.2 240 normal married F
## 63 80 165 148 49.0 158.8 120 normal married F
## 64 28 236 124 100.8 169.7 90 obese married M
## 65 44 232 102 58.2 173.9 300 normal nevermarried M
## 66 56 238 170 108.5 161.1 600 obese married F
## 67 42 264 110 82.8 171.7 120 overweight married M
## 68 48 298 116 81.5 172.8 180 overweight married M
## 69 38 200 104 71.4 158.0 360 overweight married F
## 70 75 152 124 71.3 173.8 480 normal widowed M
## 71 30 148 116 72.7 183.6 360 normal nevermarried M
## 72 45 180 120 129.2 173.4 180 obese other M
## 73 41 182 92 67.8 165.3 540 normal married F
## 74 49 202 112 82.8 164.8 240 obese nevermarried F
## 75 38 186 108 99.8 177.6 360 obese other F

## 76 69 205 104 100.6 184.6 120 overweight married M
## 77 61 275 154 58.2 145.8 120 overweight married F
## 78 74 217 132 99.2 156.8 360 obese widowed F
## 79 69 163 134 122.3 176.2 360 obese married M
## 80 78 276 128 75.2 168.8 240 overweight divorced F
## 81 71 196 146 89.1 148.4 180 obese married F
## 82 80 194 170 62.5 160.1 180 normal divorced F
## 83 23 198 96 66.9 163.9 180 normal nevermarried M
## 84 62 194 130 55.7 148.6 120 overweight married F
## 85 41 239 118 100.6 164.4 180 obese married M
## 86 76 162 148 70.0 148.2 360 obese widowed F
## 87 75 242 128 58.6 169.9 180 normal married M
## 88 58 204 132 87.1 170.8 300 overweight married M
## 89 45 178 116 90.2 172.8 600 obese divorced F
## 90 39 170 100 62.2 182.8 840 normal married M
## 91 73 148 176 91.3 167.4 300 obese married M
## 92 62 240 174 76.9 169.6 420 overweight nevermarried M
## 93 38 226 144 71.8 170.2 240 normal other M
## 94 26 188 106 110.5 155.1 240 obese married F
## 95 46 298 128 75.4 152.9 90 obese married F
## 96 30 203 106 100.1 161.0 420 obese other F
## 97 59 266 138 78.0 166.1 600 overweight widowed F
## 98 39 152 118 80.1 169.0 240 overweight married F
## 99 20 162 114 68.9 153.4 360 overweight married F
## 100 76 253 140 93.3 177.0 240 overweight widowed M
A data frame in R is a two-dimensional table-like structure used to store data. It’s similar to a spreadsheet
or a database table where each column represents a variable, and each row represents an observation or data
point.
Here’s a breakdown of the data frame structure:
• Columns: Each column contains data of a specific type (e.g., numeric, character, or factor). For example,
in the patients101.csv dataset, age would be a numeric column, and gender would be a character or factor
column.
• Rows: Each row is a single observation. In our case, each row in the dataset represents one patient’s
information.
You can think of it as an organized collection of variables (columns) where each observation (row) holds
values for those variables.
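To make the row and column picture concrete, here is a small sketch that builds a data frame by hand. The name patients is hypothetical, and the three rows only borrow a few values from the listing above.

```r
# Hypothetical three-patient data frame; each argument becomes one column
patients <- data.frame(
  age    = c(52L, 63L, 21L),
  weight = c(92.3, 71.1, 79.8),
  gender = c("F", "F", "M")
)

dim(patients)            # rows (observations), then columns (variables)
sapply(patients, class)  # one type per column
```

Every column must have the same length, which is what keeps each row aligned as a single observation.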
Example: Here’s what the first 6 rows of our dataset look like:
head(data,6)

## age totalchol sysBP weight height sedmins obese marriage gender
## 1 52 193 128 92.3 152.1 60 obese other F
## 2 63 194 112 71.1 151.7 300 obese married F
## 3 48 225 128 58.1 162.9 480 normal divorced F
## 4 21 145 106 79.8 170.0 120 overweight married M
## 5 66 224 124 116.2 160.0 480 obese widowed F
## 6 31 270 118 77.5 165.8 480 overweight married M
To see the type of each column, use str():
str(data)

## 'data.frame': 100 obs. of 9 variables:
## $ age : int 52 63 48 21 66 31 64 73 39 73 ...
## $ totalchol: int 193 194 225 145 224 270 165 241 240 183 ...
## $ sysBP : int 128 112 128 106 124 118 158 124 122 196 ...
## $ weight : num 92.3 71.1 58.1 79.8 116.2 ...
## $ height : num 152 152 163 170 160 ...
## $ sedmins : int 60 300 480 120 480 480 20 240 2 240 ...
## $ obese : chr "obese" "obese" "normal" "overweight" ...
## $ marriage : chr "other" "married" "divorced" "married" ...
## $ gender : chr "F" "F" "F" "M" ...
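The same inspection steps work on any data frame. As a rough sketch, the snippet below uses mtcars, a dataset that ships with R, as a stand-in for the patient data; the name first3 is only for illustration.

```r
# mtcars ships with R; it stands in for the patient data here
first3 <- head(mtcars, 3)  # first 3 rows, all columns
nrow(first3)

str(mtcars)                # compact summary: dimensions, column names, types
dim(mtcars)                # number of rows, then number of columns
```

head() is useful for a quick look at actual values, while str() is the faster way to check that each column was read with the type you expect.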
In R, there are several ways to select specific columns from a data frame:
1. Using the $ operator:
• This is one of the easiest ways to access a single column. You type the name of the data frame,
followed by $, and then the column name.
data$age

## [1] 52 63 48 21 66 31 64 73 39 73 50 71 35 44 40 23 53 37 60 80 67 58 57 72 39
## [26] 35 62 52 75 61 73 20 60 56 54 39 31 28 31 39 41 51 29 31 20 39 48 76 80 28
## [51] 32 27 51 27 44 20 30 26 80 48 37 33 80 28 44 56 42 48 38 75 30 45 41 49 38
## [76] 69 61 74 69 78 71 80 23 62 41 76 75 58 45 39 73 62 38 26 46 30 59 39 20 76
2. Using square brackets []:
• Data frames can be treated like matrices where rows and columns are accessed using square
brackets. You can select columns by specifying their index (position) or name.
data[,c("age","sysBP")]

## age sysBP
## 1 52 128
## 2 63 112
## 3 48 128
## 4 21 106
## 5 66 124
## 6 31 118
## 7 64 158
## 8 73 124
## 9 39 122
## 10 73 196
## 11 50 106
## 12 71 146
## 13 35 104
## 14 44 116

## 15 40 112
## 16 23 114
## 17 53 114
## 18 37 100
## 19 60 96
## 20 80 124
## 21 67 114
## 22 58 142
## 23 57 156
## 24 72 132
## 25 39 132
## 26 35 136
## 27 62 110
## 28 52 130
## 29 75 138
## 30 61 130
## 31 73 180
## 32 20 124
## 33 60 118
## 34 56 184
## 35 54 134
## 36 39 108
## 37 31 106
## 38 28 110
## 39 31 100
## 40 39 132
## 41 41 106
## 42 51 132
## 43 29 112
## 44 31 126
## 45 20 114
## 46 39 122
## 47 48 114
## 48 76 118
## 49 80 120
## 50 28 110
## 51 32 138
## 52 27 112
## 53 51 132
## 54 27 92
## 55 44 126
## 56 20 114
## 57 30 104
## 58 26 110
## 59 80 150
## 60 48 146
## 61 37 120
## 62 33 104
## 63 80 148
## 64 28 124
## 65 44 102
## 66 56 170
## 67 42 110
## 68 48 116

## 69 38 104
## 70 75 124
## 71 30 116
## 72 45 120
## 73 41 92
## 74 49 112
## 75 38 108
## 76 69 104
## 77 61 154
## 78 74 132
## 79 69 134
## 80 78 128
## 81 71 146
## 82 80 170
## 83 23 96
## 84 62 130
## 85 41 118
## 86 76 148
## 87 75 128
## 88 58 132
## 89 45 116
## 90 39 100
## 91 73 176
## 92 62 174
## 93 38 144
## 94 26 106
## 95 46 128
## 96 30 106
## 97 59 138
## 98 39 118
## 99 20 114
## 100 76 140
• The first position in the brackets [,] refers to rows, and the second position refers to columns. If you
leave the row position blank (as shown), you select all rows for the selected columns.
By using these methods, you can focus on analyzing specific variables in your dataset without dealing with
the entire data frame.
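The selection methods above can be compared side by side in a short sketch. The data frame df is a hypothetical miniature, not the lab data, and the variable names are only for illustration.

```r
# Hypothetical miniature data frame (not the lab data)
df <- data.frame(age = c(52, 63, 48), sysBP = c(128, 112, 128))

v1 <- df$age                     # a plain vector
v2 <- df[, "age"]                # also a vector: a single column is simplified
d1 <- df[, "age", drop = FALSE]  # stays a one-column data frame
d2 <- df[, c("age", "sysBP")]    # two columns by name: a data frame
d3 <- df[, c(1, 2)]              # the same two columns by position

identical(v1, v2)  # both forms return the same vector
identical(d2, d3)  # name and position select the same columns
```

Selecting by name is usually safer than selecting by position, because the code keeps working if the column order in the file changes.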

Question
(a) Find the average systolic blood pressure of all subjects.
avg.sysBP = mean(data$sysBP)
avg.sysBP

## [1] 125.12
• Answer: The average systolic blood pressure of all subjects is 125.12.

(b) Find the standard deviation of systolic blood pressure of all subjects.
sd(data$sysBP)

## [1] 20.91893
• Answer: 20.91893
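To show what mean() and sd() compute, the sketch below reproduces both by hand on the first five sysBP values from the listing. The names x, n, m, and s are only for illustration. Note that sd() returns the sample standard deviation, which divides by n - 1 rather than n.

```r
x <- c(128, 112, 128, 106, 124)  # first five sysBP values from the listing

n <- length(x)
m <- sum(x) / n                      # same as mean(x)
s <- sqrt(sum((x - m)^2) / (n - 1))  # sample SD with the n - 1 denominator

all.equal(m, mean(x))
all.equal(s, sd(x))
```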
(c) Find the average weight by gender.

mean(data$weight[data$gender=="M"])

## [1] 86.54681
mean(data$weight[data$gender=="F"])

## [1] 78.26415
k = aggregate(weight ~ gender,data,mean)
knitr::kable(k)

gender weight
F 78.26415
M 86.54681

• Answer: Male: 86.54681, Female: 78.26415
(d) Find the standard deviation of height by gender.
sd(data$height[data$gender=="M"])

## [1] 7.204293
sd(data$height[data$gender=="F"])

## [1] 7.721785
aggregate(height~gender,data,sd)

## gender height
## 1 F 7.721785
## 2 M 7.204293
• Answer: Male: 7.204293, Female: 7.721785
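Parts (c) and (d) use two approaches: logical subsetting and aggregate(). As a hedged sketch, base R's tapply() is a third option that gives the same group-wise result. The data frame df below is a hypothetical miniature that borrows values from rows 1 to 4 of the listing.

```r
# Hypothetical miniature data frame (values borrowed from rows 1 to 4)
df <- data.frame(
  weight = c(92.3, 71.1, 58.1, 79.8),
  gender = c("F", "F", "F", "M")
)

# Logical subsetting: keep only the weights where the condition is TRUE
mean_f <- mean(df$weight[df$gender == "F"])

# tapply() applies a function to each group in one call
by_gender <- tapply(df$weight, df$gender, mean)

by_gender[["F"]]  # matches mean_f
```

Logical subsetting is easy to read for one group at a time; tapply() and aggregate() scale better when there are many groups.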
(e) Which marriage category has the most subjects?
g = table(data$marriage)
g

##
## divorced married nevermarried other widowed
## 12 52 16 12 8
aggregate(age~marriage,data,length)

## marriage age
## 1 divorced 12
## 2 married 52
## 3 nevermarried 16
## 4 other 12
## 5 widowed 8
• Answer: married
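Instead of reading the largest count off the table by eye, it can be extracted in code. The sketch below rebuilds the marriage column from the counts shown above (the vector marriage and the name top are only for illustration) and uses which.max() to find the most frequent category.

```r
# Counts copied from the table() output above
marriage <- rep(c("divorced", "married", "nevermarried", "other", "widowed"),
                times = c(12, 52, 16, 12, 8))
counts <- table(marriage)

top <- names(which.max(counts))  # label of the first largest count
top

sort(counts, decreasing = TRUE)  # all categories, largest first
```

which.max() returns only the first maximum, so if two categories were tied, sorting the table would be the safer way to see both.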

Submission Instructions:
• Make sure your code runs without errors and produces the correct output.

• Upload your PDF to Gradescope under the corresponding assignment.
• Your assignment will be graded as complete, partially complete, or missing.

Grading Rubric:
• Complete: All questions are answered with correct and functional code.
• Partially complete: Some questions are answered, but there are errors or missing parts in the code.
• Missing: No code is provided or no attempt is made to answer the questions.
