R Programming 101:
Nuts and Bolts
Jocelyn Mara
Discipline of Sport and Exercise Science
R
• A free programming language and software environment
• Primarily used for statistical computing and graphics
• Uses command line interface for most processes
• RStudio is the graphical interface (but still heavily reliant on CLI)
• Runs on Windows, macOS and Linux
• Users can use the built-in functions or create their own
https://www.r-project.org
First step
Download and install R and RStudio using the
guide provided
RStudio
The Prompt >
• Informally stands for “what’s next”
• R is waiting for you to give it some instructions
Calculations in R
• We can use R as a calculator
> 4 + 3
[1] 7
> 20 / 5
[1] 4
> 5 * 4
[1] 20
> 64 - 57
[1] 7
> 8^2
[1] 64
Value assignment
• The <- symbol is the assignment operator
> x <- 7
> print(x)
[1] 7
> x
[1] 7
• The [1] indicates that x is a vector and 7 is the first element
Value assignment
> x <- 1:20
> x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
• The : operator is used to create integer sequences
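As a small illustrative sketch (not part of the original slides), the : operator can be compared with seq(), which is one way to get a step size other than 1. The variable names here are only for illustration.

```r
# Integer sequence from 1 to 20 using the : operator
x <- 1:20
length(x)   # number of elements in x
class(x)    # sequences made with : from whole numbers are integers

# The : operator can also count down
countdown <- 10:1

# seq() gives more control, e.g. a step of 5
steps <- seq(from = 0, to = 20, by = 5)
steps
```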
Value assignment
> x <- 7
> x + 3
[1] 10
> x + y
Error: object 'y' not found
> y <- 3
> x + y
[1] 10
Objects
• Anything we manipulate/analyse/encounter in R is an object
• Single values (e.g. x <- 7)
• Vectors (e.g. Numerical, Matrices, Dataframes)
• Plots
Object Classes
Classes in R describe the type of values within an object
• Numeric (real numbers, e.g. 2.73)
• Integers (whole numbers, e.g. 2, 7, 68)
• Factor (e.g. 1 = Male, 2 = Female)
• Logical (TRUE/FALSE)
• Character (e.g. "Hello World")
• Complex (e.g. 2 + 3i)
Object Classes
• If you want a number to be an integer you need to use the suffix ‘L’
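A minimal sketch of the difference the L suffix makes, assuming two illustrative variables x and y:

```r
x <- 7     # without a suffix, R stores the value as a numeric (double)
y <- 7L    # the L suffix asks for an integer

class(x)        # "numeric"
class(y)        # "integer"
is.integer(x)   # FALSE
is.integer(y)   # TRUE
x == y          # TRUE: the values are equal even though the classes differ
```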
Object Classes
• If you want to check the class of an object you can use the class function
> class(x)
[1] "numeric"
> class(y)
[1] "integer"
Vectors
• Vectors are objects which contain multiple values of the same class (with
the exception of lists and dataframes)
> x <- rnorm(n = 20)
> x
[1] 1.31 2.35 0.91 -1.06 -0.54 0.86 -0.15 1.19 -1.40 -0.60 -1.44 -1.70 2.02
[14] -0.50 1.72 0.23 -0.61 -2.78 1.41 -1.57
> class(x)
[1] "numeric"
Vectors
> y <- x > 0
> y
[1] TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
[12] FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE FALSE
> class(y)
[1] "logical"
Creating Vectors
• Use the c() function to create vectors (combine values)
> x <- c(12.3, 27.8)
> x
[1] 12.3 27.8
Creating Vectors
• Use the c() function to create vectors (combine values)
> x <- c(TRUE, FALSE, TRUE)
> x
[1] TRUE FALSE TRUE
Creating Vectors
• Use the c() function to create vectors (combine values)
> x <- c("This", "is", "Fun!")
> x
[1] "This" "is" "Fun!"
Mixing Classes
• When values of different classes are mixed in a vector, coercion occurs so
that every element in the vector is of the same class
> x <- c(12.3, TRUE, "foo")
> class(x)
[1] "character"
> x
[1] "12.3" "TRUE" "foo"
Mixing Classes
• When values of different classes are mixed in a vector, coercion occurs so
that every element in the vector is of the same class
> x <- c(TRUE, 1.7, FALSE)
> class(x)
[1] "numeric"
> x
[1] 1.0 1.7 0.0
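A short sketch of why implicit coercion is useful in practice: because TRUE becomes 1 and FALSE becomes 0, a logical vector can be summed or averaged directly. The vector names are illustrative only.

```r
flags <- c(TRUE, FALSE, TRUE, TRUE)

n_true    <- sum(flags)    # number of TRUE values
prop_true <- mean(flags)   # proportion of TRUE values

# Coercion moves towards the more general class:
# logical -> integer -> numeric -> character
class(c(1L, TRUE))    # "integer"
class(c(1L, 2.5))     # "numeric"
class(c(2.5, "a"))    # "character"
```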
Explicit Coercion
• Objects can be explicitly coerced from one class to another using the as.*
functions
> x <- 0:6
> class(x)
[1] "integer"
> as.numeric(x)
[1] 0 1 2 3 4 5 6
> as.logical(x)
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
> as.character(x)
[1] "0" "1" "2" "3" "4" "5" "6"
Explicit Coercion
• A coercion that doesn’t make sense will result in NAs
> x <- c("a", "b", "c")
> as.numeric(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
> as.logical(x)
[1] NA NA NA
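One possible way to find which values failed to convert, sketched with a made-up input vector (raw is hypothetical, not data from the slides):

```r
# Hypothetical text input where one entry is not a number
raw <- c("12.5", "7", "n/a")

# suppressWarnings() hides the "NAs introduced by coercion" warning
nums <- suppressWarnings(as.numeric(raw))
nums

# Positions that could not be converted
failed <- which(is.na(nums))
failed

# as.logical() only understands a few spellings of true/false
lgl <- as.logical(c("TRUE", "false", "T", "yes"))
lgl
```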
Matrices
• A matrix is a vector with a dimensions attribute (nrow, ncol)
> mat <- matrix(x, nrow = 5, ncol = 4)
> mat
      [,1]  [,2]  [,3]  [,4]
[1,]  1.31  0.86 -1.44  0.23
[2,]  2.35 -0.15 -1.70 -0.61
[3,]  0.91  1.19  2.02 -2.78
[4,] -1.06 -1.40 -0.50  1.41
[5,] -0.54 -0.60  1.72 -1.57
Matrices
• Use the dim function to check the dimensions of a matrix
> mat <- matrix(x, nrow = 5, ncol = 4)
> dim(mat)
[1] 5 4
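To illustrate the idea that a matrix is a vector with a dimensions attribute, here is a small sketch using illustrative objects v and m:

```r
v <- 1:6
dim(v)               # NULL: a plain vector has no dimensions

# Adding a dim attribute turns the vector into a 2 x 3 matrix
dim(v) <- c(2, 3)
is.matrix(v)         # TRUE
v

# matrix() fills column by column unless byrow = TRUE is given
m <- matrix(1:6, nrow = 2, byrow = TRUE)
m[1, ]               # first row
```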
Dataframes
• Are lists of equal-length vectors, with variable names and row names stored as attributes
• Arranged with each column as a variable and each row a case
> df <- as.data.frame(mat)
> df
     V1    V2    V3    V4
1  1.31  0.86 -1.44  0.23
2  2.35 -0.15 -1.70 -0.61
3  0.91  1.19  2.02 -2.78
4 -1.06 -1.40 -0.50  1.41
5 -0.54 -0.60  1.72 -1.57
Dataframes
• Can contain different classes
• But all values within a column (variable) must have the same class
> df
Subject  Position  Distance
1        Centre    1200
2        Back      1759
3        Forward   1680
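A sketch of how a dataframe like the one above could be built by hand, assuming the name players (hypothetical) and using the three rows shown on the slide:

```r
# Each argument becomes one column; all columns have the same length
players <- data.frame(
  Subject  = 1:3,
  Position = c("Centre", "Back", "Forward"),
  Distance = c(1200, 1759, 1680),
  stringsAsFactors = FALSE   # keep Position as character in every R version
)

# Check the class of each column
col_classes <- sapply(players, class)
col_classes

dim(players)     # rows, then columns
names(players)   # variable names
```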
Dataframes
• Use the dim function to check dimensions
• Use the names function to check variable names
> dim(df)
[1] 5 4
> names(df)
[1] "V1" "V2" "V3" "V4"
Lists
• Vectors that can contain elements of different classes
> x <- list(c(17.1, 23.2), TRUE, "a")
> x
[[1]]
[1] 17.1 23.2
[[2]]
[1] TRUE
[[3]]
[1] "a"
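A brief sketch of pulling elements back out of the list above. The named list at the end is a hypothetical variation, not from the slides.

```r
x <- list(c(17.1, 23.2), TRUE, "a")

first  <- x[[1]]      # double brackets return the element itself
second <- x[[1]][2]   # then index into that element as usual

class(x[1])     # "list": single brackets return a shorter list
class(x[[1]])   # "numeric"

# List elements can also be given names and accessed with $
named <- list(distances = c(17.1, 23.2), starter = TRUE)
named$distances
```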
Attributes
• Names (variable names or dim names)
• Dimensions (nrow, ncol)
• Length (n values if vector with no dim or a matrix, ncol if dataframe)
• Class
Attributes
• Use the attributes function to check the attributes of an object
> attributes(df)
$names
[1] "V1" "V2" "V3" "V4"
$row.names
[1] 1 2 3 4 5
$class
[1] "data.frame"
Missing Values
• Missing values represented by NA
• NaN (not a number) is used for undefined mathematical operations (e.g.
0/0)
• is.na( ) is used to test if there are missing values in an object
• is.nan( ) is used to test for NaN
• A NaN value is also NA, but an NA is not NaN
Missing Values
> x <- c(1, 2, NA, 10, 3)
> is.na(x)
[1] FALSE FALSE TRUE FALSE FALSE
> is.nan(x)
[1] FALSE FALSE FALSE FALSE FALSE
> y <- c(1, 2, NaN, NA, 4)
> is.na(y)
[1] FALSE FALSE TRUE TRUE FALSE
> is.nan(y)
[1] FALSE FALSE TRUE FALSE FALSE
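A small sketch of why missing values matter once you start summarising data: most summary functions return NA unless told to drop missing values with na.rm = TRUE.

```r
x <- c(1, 2, NA, 10, 3)

n_missing <- sum(is.na(x))          # how many values are missing
n_missing

mean(x)                             # NA, because one value is unknown
mean_known <- mean(x, na.rm = TRUE) # mean of the remaining four values
mean_known

is.nan(0 / 0)                       # TRUE: 0/0 is undefined
```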
Basic Functions
So far in this lesson I’ve used:
• class( )
• rnorm( )
• c( )
• as.numeric( )
• as.logical( )
• as.character( )
• as.data.frame( )
• matrix( )
• dim( )
• attributes( )
• is.na( )
Basic Functions
Rather than doing this to find the mean...
> (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10) / 10
[1] 5.5
... I can do this...
> x <- 1:10
> mean(x)
[1] 5.5
Basic Functions
Some other examples:
• sd( )
• min( )
• max( )
• median( )
• range( )
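These functions can be tried on the same numbers used for the mean. This is a sketch that assumes x holds the values 1 to 10.

```r
x <- 1:10

sd(x)       # standard deviation, roughly 3.03
min(x)      # smallest value
max(x)      # largest value
median(x)   # middle value, 5.5 here
range(x)    # smallest and largest value together
```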
Basic Functions
Functions have this format...
function-name(arg 1, arg 2, ...)
Example:
mean(x, trim = 0, na.rm = FALSE)
mean: the function name (describes what we’re doing)
x: the object we’re applying the function to
trim = 0, na.rm = FALSE: other arguments
Function Arguments
• Functions have named arguments which sometimes have default values
mean(x, trim = 0, na.rm = FALSE)
• If I just typed mean(x)...
.... this would be equivalent to mean(x, trim = 0, na.rm = FALSE)
Argument Matching
• Function arguments can be matched by position or by name
• E.g. the following calls are all equivalent:
> mydata <- 1:20
> mean(x = mydata, trim = 0, na.rm = FALSE)
> mean(mydata, 0, FALSE)
> mean(na.rm = FALSE, trim = 0, x = mydata)
> mean(mydata, trim = 0, FALSE)
> mean(mydata)
But don’t mess around with the argument order too much, as it makes code harder to read
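A quick way to convince yourself that the calls match, sketched with illustrative result names. The last lines show a case where changing an argument does change the answer.

```r
mydata <- 1:20

by_name     <- mean(x = mydata, trim = 0, na.rm = FALSE)
by_position <- mean(mydata, 0, FALSE)
shuffled    <- mean(na.rm = FALSE, trim = 0, x = mydata)
defaults    <- mean(mydata)

# All four calls give the same result
all(c(by_name, by_position, shuffled, defaults) == 10.5)

# trim = 0.1 drops the lowest and highest 10% of values before averaging
outlier <- c(1:9, 100)
mean(outlier)                       # pulled up by the value 100
trimmed <- mean(outlier, trim = 0.1)
trimmed
```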
Function Arguments
• To see the arguments for a function you can use the args( ) function
> args(lm)
function (formula, data, subset, weights, na.action, method = "qr", model = TRUE, x =
FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)
Function Arguments
• You can also use ?function-name to see more information about the function and its
arguments
> ?mean
The “...” Argument
• Generic functions use “...” so that extra arguments can be passed through
to other functions (methods)
> x <- rnorm(n = 20)
> y <- rnorm(n = 20)
> args(plot)
function (x, y, ...)
NULL
> plot(x, y)
The “...” Argument
• Generic functions use “...” so that extra arguments can be passed through
to other functions (methods)
> plot(x, y, col = "red")
The “...” Argument
• The “...” argument is also necessary when the number of arguments
passed to the function is not known in advance
> args(paste)
function (..., sep = " ", collapse = NULL)
NULL
> paste("This", "is", "Fun", sep = " ", collapse = NULL)
[1] "This is Fun"
The “...” Argument
• The catch – any arguments coming after the “...” must be explicitly named
> paste("This", "is", "Fun", " ", NULL)
[1] "This is Fun  "
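The same rule applies to functions you write yourself. Below is a minimal sketch using a hypothetical helper, label_values(), which is not a built-in R function; it simply forwards “...” to paste().

```r
# Hypothetical helper: joins any number of values with a separator
label_values <- function(..., sep = "-") {
  paste(..., sep = sep)
}

default_sep <- label_values("Centre", 1200)
default_sep    # "Centre-1200"

named_sep <- label_values("Centre", 1200, sep = ": ")
named_sep      # "Centre: 1200"

# Without the name, ": " is swallowed by "..." and treated as a value
unnamed_sep <- label_values("Centre", 1200, ": ")
unnamed_sep    # "Centre-1200-: "
```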
Summary
• Value assignment
• Objects, classes, attributes
• Missing values
• Basic functions and their arguments