UNIT1 INTRODUCTION TO R PROGRAMMING

Uploaded by

divyashree

R Programming Language – Introduction



The R language is a powerful tool for statistical computing and data
analysis. Widely used by statisticians, data scientists, and
researchers, it offers an extensive suite of packages and libraries
for data manipulation, statistical modeling, and visualization.
R is an implementation of the S programming language, combined with
lexical scoping semantics (a variable's scope is determined by its
position in the source code) inspired by Scheme. The project was
conceived in 1992, with an initial version released in 1995 and a
stable beta version in 2000.

R Programming Language
What is R Programming Language?
R programming is a leading tool for machine learning, statistics, and
data analysis, allowing for the easy creation of objects, functions,
and packages. Designed by Ross Ihaka and Robert Gentleman at the
University of Auckland and developed by the R Development Core
Team, R Language is platform-independent and open-source,
making it accessible for use across all operating systems without
licensing costs. Beyond its capabilities as a statistical package, R
integrates with other languages like C and C++, facilitating
interaction with various data sources and statistical tools. With a
growing community of users and high demand in the Data Science
job market, R is one of the most sought-after programming
languages today. Originating as an implementation of the S
programming language with influences from Scheme, R has evolved
since its conception in 1992, with its first stable beta version
released in 2000.

Why Use R Language?


The R language is a powerful tool widely used for data analysis,
statistical computing, and machine learning. Here are several
reasons why professionals across various fields prefer R:
1. Comprehensive Statistical Analysis:
• R is specifically designed for statistical analysis and provides a
vast array of statistical techniques and tests, making it ideal for
data-driven research.
2. Extensive Packages and Libraries:
• R has a rich ecosystem of packages and libraries that extend its
capabilities, allowing users to perform advanced data manipulation,
visualization, and machine learning tasks with ease.
3. Strong Data Visualization Capabilities:
• R excels in data visualization, offering powerful tools like
ggplot2 and plotly, which enable the creation of detailed and
aesthetically pleasing graphs and plots.
4. Open Source and Free:
• As an open-source language, R is free to use, which makes it
accessible to everyone, from individual researchers to large
organizations, without the need for costly licenses.
5. Platform Independence:
• R is platform-independent, meaning it can run on various operating
systems, including Windows, macOS, and Linux, providing flexibility
in development environments.
6. Integration with Other Languages:
• R can integrate with other programming languages such as C, C++,
Python, and Java, allowing for interaction with different data
sources and statistical packages.
7. Growing Community and Support:
• R has a large and active community of users and developers who
contribute to its continuous improvement and provide extensive
support through forums, mailing lists, and online resources.
8. High Demand in Data Science:
• R is one of the most requested programming languages in the data
science job market, making it a valuable skill for professionals
looking to advance their careers in this field.
Features of R Programming Language
The R Language is renowned for its extensive features that make it
a powerful tool for data analysis, statistical computing, and
visualization. Here are some of the key features of R:
1. Comprehensive Statistical Analysis:
• R provides a wide array of statistical techniques, including linear
and nonlinear modeling, classical statistical tests, time-series
analysis, classification, and clustering.
2. Advanced Data Visualization:
• With packages like ggplot2, plotly, and lattice, R excels at
creating complex and aesthetically pleasing data visualizations,
including plots, graphs, and charts.
3. Extensive Packages and Libraries:
• The Comprehensive R Archive Network (CRAN) hosts thousands of
packages that extend R's capabilities in areas such as machine
learning, data manipulation, bioinformatics, and more.
4. Open Source and Free:
• R is free to download and use, making it accessible to everyone.
Its open-source nature encourages community contributions and
continuous improvement.
5. Platform Independence:
• R is platform-independent, running on various operating systems,
including Windows, macOS, and Linux, which ensures flexibility
and ease of use across different environments.
6. Integration with Other Languages:
• R can integrate with other programming languages such as C, C++,
Python, Java, and SQL, allowing for interaction with various data
sources and computational processes.
7. Powerful Data Handling and Storage:
• R efficiently handles and stores data, supporting various data
types and structures, including vectors, matrices, data frames,
and lists.
8. Robust Community and Support:
• R has a vibrant and active community that provides extensive
support through forums, mailing lists, and online resources,
contributing to its rich ecosystem of packages and
documentation.
9. Interactive Development Environment (IDE):
• RStudio, the most popular IDE for R, offers a user-friendly
interface with features like syntax highlighting, code completion,
and integrated tools for plotting, history, and debugging.
10. Reproducible Research:
• R supports reproducible research practices with tools like R
Markdown and knitr, enabling users to create dynamic reports,
presentations, and documents that combine code, text, and
visualizations.
Advantages of R language
• R is one of the most comprehensive statistical analysis packages,
and new statistical techniques often appear first in R.
• R is open source, so you can install and run it anywhere without
licensing restrictions.
• R is cross-platform and runs on GNU/Linux, Windows, and other
major operating systems.
• Everyone is welcome to contribute new packages, bug fixes, and
code enhancements.
Disadvantages of R language
• The quality of some R packages is less than perfect.
• R commands pay little attention to memory management, so R may
consume all available memory.
• R has no formal vendor support, so there is nobody to complain to
if something doesn't work.
• R can be slower than other programming languages such as Python
and MATLAB, particularly for code that is not vectorized.
Applications of R language
• R is used for data science. It offers a broad variety of libraries
related to statistics and provides an environment for statistical
computing and design.
• Many quantitative analysts use R as their programming tool, for
example for importing and cleaning data.
• R is widely used by data analysts and research programmers, and
it serves as a fundamental tool in finance.
• Companies such as Google, Facebook, Bing, Twitter, Accenture, and
Wipro use R.

Interesting Facts about R Programming Language

R is an open-source programming language that is widely used as
statistical software and as a data analysis tool. R generally comes
with a command-line interface, is available on widely used platforms
like Windows, Linux, and macOS, and remains under active development.
It was designed by Ross Ihaka and Robert Gentleman at the University
of Auckland, New Zealand, and is currently developed by the R
Development Core Team. Here are some interesting facts about the R
programming language:
• R is an implementation of the S programming language, combined
with lexical scoping semantics inspired by Scheme. It is named
partly after the first names of the first two R authors and partly
as a play on the name of S.
• R supports both procedural and object-oriented programming.
Procedural programming is built from procedures, records, modules,
and procedure calls, while object-oriented programming is built
from classes, objects, and generic functions.
• R is an interpreted language rather than a compiled one, so code
runs directly without a separate step that compiles it into an
executable program. This shortens the cycle between writing an R
script and running it.
• Many thousands of R packages are available through CRAN and
GitHub, and many of them do a great deal with just one line of
code, covering everything from regression to Bayesian analysis.
• R has been reported to be one of the fastest-growing data science
languages, the most-used data science language after SQL, and in
use by 70% of data miners.
• The rmarkdown package helps you create reproducible Word
documents and reproducible PowerPoint presentations from your R
Markdown code just by changing one line in the YAML header.
(YAML stands for "YAML Ain't Markup Language".)
• It is easy to connect R to almost any common database using the
dbplyr package, which lets an R user pull data independently from
most common database types. You can also use packages like
bigrquery to work directly with BigQuery and other
high-performance data stores.
• You can build and host interactive web apps in just a few lines of
code in R, for example with the flexdashboard package. Using the
rsconnect package you can host your web apps on your own server
or, even easier, on a cloud server.
• You can even give web apps a retro video-game look. The nessy
package helps you create Shiny apps styled like the NES (Nintendo
Entertainment System) and deploy them just like any other Shiny
app.
• You can build APIs and serve them from R. The plumber package
helps you convert R functions to web APIs that can be integrated
into downstream applications.
• According to the PYPL (PopularitY of Programming Language) index,
R is #7 of all programming languages. R is the #1 Google search
for advanced analytics software, and its more than 3 million users
worldwide form a huge community.
• The origin of the R programming language can be traced back to
1993, when Ross Ihaka and Robert Gentleman at the University of
Auckland, New Zealand introduced it.
• R is open source and available for free for everyone to use for
statistical and graphical purposes.
• R has a supportive and enthusiastic user community that provides
ample resources and assistance to users.
• The widespread usage of R in fields such as data science, machine
learning, and statistical modeling has made it one of the most
sought-after programming languages.
• R has a wealth of packages and libraries, allowing users to
perform complex tasks easily and extend its functionality.
• Industries such as finance, healthcare, pharmaceuticals, and
marketing make use of R for data analysis and modeling.
• In academic research, R has become a crucial tool across various
disciplines such as biology, psychology, and economics.
• R runs on different platforms like Windows, macOS, and Linux,
making it accessible to users regardless of the operating system
they use.

R vs Python



The R programming language and Python are both used extensively
for data science, and both are open source. For data analysis,
statistical computing, and machine learning, both languages are
strong tools with sizable communities and large libraries for data
science work. A theoretical comparison between R and Python is
provided below:

In this article, we will cover the following topics:

• R Programming Language
• Python Programming Language
• Difference between R Programming and Python Programming
• Ecosystem in R Programming and Python Programming
• Advantages and disadvantages in R Programming and Python
Programming
• R and Python usages in Data Science
• Example in R and Python
R Programming Language
The R programming language is used for machine learning
algorithms, linear regression, time series, statistical inference, etc. It
was designed by Ross Ihaka and Robert Gentleman in 1993. R is an
open-source programming language that is widely used as
statistical software and as a data analysis tool. R generally comes
with a command-line interface, is available on widely used
platforms like Windows, Linux, and macOS, and continues to be
actively developed.
Python Programming Language
Python is a widely-used general-purpose, high-level programming
language. It was created by Guido van Rossum in 1991 and further
developed by the Python Software Foundation. It was designed with
an emphasis on code readability, and its syntax allows programmers
to express their concepts in fewer lines of code.
Difference between R Programming and Python Programming
Below are some major differences between R and Python:

Feature | R | Python
Introduction | R is a language and environment for statistical programming, which includes statistical computing and graphics. | Python is a general-purpose programming language for data analysis and scientific computing.
Objective | It has many features which are useful for statistical analysis and representation. | It can be used to develop GUI applications and web applications as well as with embedded systems.
Workability | It has many easy-to-use packages for performing tasks. | It can easily perform matrix computation as well as optimization.
Integrated development environment | Various popular R IDEs are RStudio, RKWard, R Commander, etc. | Various popular Python IDEs are Spyder, Eclipse+PyDev, Atom, etc.
Libraries and packages | There are many packages and libraries like ggplot2, caret, etc. | Some essential packages and libraries are Pandas, NumPy, SciPy, etc.
Scope | It is mainly used for complex data analysis in data science. | It takes a more streamlined approach for data science projects.
Ecosystem in R Programming and Python Programming
Python has a very large general-purpose community, and data analysis
is one of its most common uses, primarily because of the ecosystem of
data-centric Python packages. Pandas and NumPy are two of the
packages that make importing, analyzing, and visualizing data much
easier.
R has a rich ecosystem for standard machine learning and data mining
techniques. It is well suited to the statistical analysis of large
datasets, offers a number of different options for exploring data,
and makes it easier to use probability distributions and apply
different statistical tests.

R vs Python

Features | R | Python
Data collection | Data analysts use it to import data from Excel, CSV, and text files. | It works with all kinds of data formats, including SQL tables.
Data exploration | It is optimized for the statistical analysis of large datasets. | You can explore data with Pandas.
Data modeling | It supports the Tidyverse, making it easy to import, manipulate, visualize, and report on data. | You can use NumPy, SciPy, scikit-learn, TensorFlow.
Data visualization | You can use ggplot2 and related tools to plot complex scatter plots with regression lines. | You can use Matplotlib, Pandas, Seaborn.

Statistical Analysis and Machine Learning in R and Python
Statistical analysis and machine learning are critical components of
data science, involving the application of statistical methods,
models, and techniques to extract insights, identify patterns, and
draw meaningful conclusions from data. Both R and Python are
widely used programming languages for statistical analysis, each
offering a variety of libraries and packages to perform diverse
statistical and machine learning tasks. The table below compares the
statistical analysis and modeling capabilities of R and Python.

Capability | R | Python
Basic Statistics | Built-in functions (mean, median, etc.) | NumPy (mean, median, etc.)
Linear Regression | lm() function and formulas | Statsmodels (OLS), Ordinary Least Squares (OLS) method
Generalized Linear Models (GLM) | glm() function | Statsmodels (GLM)
Time Series Analysis | Time series packages (forecast) | Statsmodels (Time Series)
ANOVA and t-tests | Built-in functions (aov, t.test) | SciPy (ANOVA, t-tests)
Hypothesis Tests | Built-in functions (wilcox.test, etc.) | SciPy (Mann-Whitney, Kruskal-Wallis)
Principal Component Analysis (PCA) | princomp() function | scikit-learn (PCA)
Clustering (K-Means, Hierarchical) | kmeans(), hclust() | scikit-learn (KMeans, AgglomerativeClustering)
Decision Trees | rpart() function | scikit-learn (DecisionTreeClassifier)
Random Forest | randomForest() function | scikit-learn (RandomForestClassifier)
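
A few entries from the R column can be tried directly in a session. The sketch below is illustrative only: it assumes the mtcars and iris datasets that ship with R, and the names fit, tt, and km are made up for this example.

```r
# Linear regression with lm() and a formula: fuel economy as a function of weight
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)                      # intercept and slope

# Two-sample t-test: mpg for automatic (am = 0) vs manual (am = 1) cars
tt <- t.test(mpg ~ am, data = mtcars)
tt$p.value

# K-means clustering on the four numeric iris measurements
set.seed(42)                   # k-means starts from random centers
km <- kmeans(iris[, 1:4], centers = 3)
table(km$cluster)              # number of flowers in each cluster
```

All three functions are part of a standard R installation, so no extra packages are needed for this sketch.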

Advantages in R Programming and Python Programming

R Programming | Python Programming
It supports large datasets for statistical analysis. | It is a general-purpose programming language that can be used for data analysis.
Primary users are scholars and R&D staff. | Primary users are programmers and developers.
It supports packages like tidyverse, ggplot2, caret, zoo. | It supports packages like pandas, scipy, scikit-learn, TensorFlow.
It is supported by RStudio and has a wide range of statistics, general data analysis, and visualization capabilities. | It supports Conda environments with Spyder and IPython Notebook.

Disadvantages in R Programming and Python Programming

R Programming | Python Programming
R can be harder to learn than Python because it is designed mainly for statistics. | Python has fewer libraries dedicated to statistics than R.
R might not be as fast as languages like Python, especially for computationally intensive tasks and large-scale data processing. | Python might not be as specialized for statistics and data analysis as R. Some statistical functions and visualization capabilities might be more streamlined in R.
Memory management in R might not be as efficient as in some other languages, which can lead to performance issues and memory-related errors. | Python visualization capabilities might not be as polished and streamlined as those offered by R's ggplot2.
R Syntax
Syntax
To output text in R, use single or double quotes:

Example
"Hello World!"

To output numbers, just type the number (without quotes):

Example
5
10
25

To do simple calculations, add numbers together:

Example
5 + 5

However, R does have a print() function available if you want to use it. This
might be useful if you are familiar with other programming languages, such
as Python, which often use a print() function to produce output.

Example
print("Hello World!")

And there are times you must use the print() function to produce output, for
example when working with for loops (which you will learn more about in a later
chapter):

Example
for (x in 1:10) {
print(x)
}
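
The difference between auto-printing and print() can be seen in a small sketch. The variable name total is made up for this illustration.

```r
for (x in 1:3) {
  x              # evaluated, but nothing is shown: auto-printing only happens at top level
}

total <- 0
for (x in 1:3) {
  total <- total + x
  print(total)   # explicit print() shows the running total: 1, then 3, then 6
}
```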

R Comments
Comments
Comments can be used to explain R code and to make it more readable. They can
also be used to prevent execution when testing alternative code.

Comments start with a #. When executing code, R ignores everything from
the # to the end of the line.

This example uses a comment before a line of code:

Example
# This is a comment
"Hello World!"

This example uses a comment at the end of a line of code:

Example
"Hello World!" # This is a comment

Comments do not have to be text that explains the code; they can also be used
to prevent R from executing code:

Example
# "Good morning!"
"Good night!"
Multiline Comments
Unlike other programming languages, such as Java, R has no syntax for
multiline comments. However, we can just insert a # at the start of each line
to create multiline comments:

Example
# This is a comment
# written in
# more than just one line
"Hello World!"

R Variables
Creating Variables in R
Variables are containers for storing data values.

R does not have a command for declaring a variable. A variable is created the
moment you first assign a value to it. To assign a value to a variable, use
the <- sign. To output (or print) the variable value, just type the variable name:

Example
name <- "John"
age <- 40

name # output "John"
age # output 40

In the example above, name and age are variables,
while "John" and 40 are values.

In other programming languages, it is common to use = as an assignment
operator. In R, we can use both = and <- as assignment operators.

However, <- is preferred in most cases because = cannot be used for
assignment in some contexts in R. Inside a function call, for example, =
names an argument instead of creating a variable.
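
One place where the two operators behave differently is inside a function call. In the sketch below (the names scores_eq and scores_arrow are hypothetical and assumed not to exist beforehand), = only labels an argument, while <- creates a variable:

```r
# Inside a call, `=` names an argument; no variable is created
sum(scores_eq = c(1, 2, 3))
exists("scores_eq")        # FALSE in a fresh session

# `<-` inside a call assigns first, then passes the value on
sum(scores_arrow <- c(1, 2, 3))
exists("scores_arrow")     # TRUE
scores_arrow
```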

Print / Output Variables


Compared to many other programming languages, you do not have to use a
function to print/output variables in R. You can just type the name of the
variable:

Example
name <- "John Doe"

name # auto-print the value of the name variable

However, R does have a print() function available if you want to use it. This
might be useful if you are familiar with other programming languages, such
as Python, which often use a print() function to output variables.

Example
name <- "John Doe"

print(name) # print the value of the name variable

And there are times you must use the print() function to produce output, for
example when working with for loops (which you will learn more about in a
later chapter):

Example
for (x in 1:10) {
print(x)
}

R Concatenate Elements
Concatenate Elements
You can also concatenate, or join, two or more elements, by using
the paste() function.

To combine text and a variable, pass both to paste(), separated by a comma (,):

Example
text <- "awesome"

paste("R is", text)

You can also use , to join one variable to another:

Example
text1 <- "R is"
text2 <- "awesome"

paste(text1, text2)

For numbers, the + character works as a mathematical operator:

Example
num1 <- 5
num2 <- 10

num1 + num2

If you try to combine a string (text) and a number, R will give you an error:

Example
num <- 5
text <- "Some text"

num + text

Result:

Error in num + text : non-numeric argument to binary operator
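
If you do need text and a number in one string, one option is to let paste() do the conversion. The following is a minimal sketch that reuses the variables from the example above; the name result and the use of tryCatch() to capture the error are additions for illustration.

```r
num <- 5
text <- "Some text"

paste(text, num)                # "Some text 5": paste() converts the number to text
paste(text, num, sep = ": ")    # "Some text: 5"

# The failing addition can be caught instead of stopping the script
result <- tryCatch(num + text, error = function(e) conditionMessage(e))
result                          # the error message, returned as a string
```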

Multiple Variables
R allows you to assign the same value to multiple variables in one line:

Example
# Assign the same value to multiple variables in one line
var1 <- var2 <- var3 <- "Orange"
# Print variable values
var1
var2
var3

R Variable Names
(Identifiers)

Variable Names
A variable can have a short name (like x and y) or a more descriptive name
(age, carname, total_volume). Rules for R variables are:

• A variable name must start with a letter or a period (.) and can be a
combination of letters, digits, periods (.), and underscores (_). If it
starts with a period, the period cannot be followed by a digit.
• A variable name cannot start with a number or underscore (_)
• Variable names are case-sensitive (age, Age and AGE are three different
variables)
• Reserved words cannot be used as variables (TRUE, FALSE, NULL, if...)

# Legal variable names:
myvar <- "John"
my_var <- "John"
myVar <- "John"
MYVAR <- "John"
myvar2 <- "John"
.myvar <- "John"

# Illegal variable names:
2myvar <- "John"
my-var <- "John"
my var <- "John"
_my_var <- "John"
my_v@ar <- "John"
TRUE <- "John"
Remember that variable names are case-sensitive!
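
One way to test a candidate name is base R's make.names(), which rewrites a string into a syntactically valid name; a name is valid if it comes back unchanged. The helper is_valid_name below is a hypothetical convenience written for this sketch, not a built-in function.

```r
# Hypothetical helper: TRUE when make.names() leaves the name untouched
is_valid_name <- function(name) {
  make.names(name) == name
}

is_valid_name("my_var")    # TRUE
is_valid_name(".myvar")    # TRUE
is_valid_name("2myvar")    # FALSE: starts with a digit
is_valid_name("my var")    # FALSE: contains a space
is_valid_name("_my_var")   # FALSE: starts with an underscore
is_valid_name("TRUE")      # FALSE: reserved word
```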

R Data Types

Data Types
In programming, data type is an important concept.

Variables can store data of different types, and different types can do different
things.

In R, variables do not need to be declared with any particular type, and can
even change type after they have been set:

Example
my_var <- 30 # my_var is of type numeric
my_var <- "Sally" # my_var is now of type character (aka string)

R has a variety of data types and object classes. You will learn much more about
these as you continue to get to know R.

Basic Data Types


Basic data types in R can be divided into the following types:

• numeric - (10.5, 55, 787)
• integer - (1L, 55L, 100L, where the letter "L" declares this as an integer)
• complex - (9 + 3i, where "i" is the imaginary part)
• character (a.k.a. string) - ("k", "R is exciting", "FALSE", "11.5")
• logical (a.k.a. boolean) - (TRUE or FALSE)

We can use the class() function to check the data type of a variable:
Example
# numeric
x <- 10.5
class(x)

# integer
x <- 1000L
class(x)

# complex
x <- 9i + 3
class(x)

# character/string
x <- "R is exciting"
class(x)

# logical/boolean
x <- TRUE
class(x)
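
class() reports how R treats a value at a high level. The sketch below also uses typeof(), which shows the internal storage type, and is.numeric(); both are base R functions that the text above does not cover, and the variable names are arbitrary.

```r
x <- 10.5
class(x)        # "numeric"
typeof(x)       # "double": numeric values are stored in double precision

y <- 1000L
class(y)        # "integer"
is.numeric(y)   # TRUE: integers also count as numeric

z <- "11.5"
class(z)        # "character": the quotes make it text, not a number
is.numeric(z)   # FALSE
```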

R Numbers

Numbers
There are three number types in R:

• numeric
• integer
• complex

Variables of number types are created when you assign a value to them:

Example
x <- 10.5 # numeric
y <- 10L # integer
z <- 1i # complex
Numeric
A numeric data type is the most common type in R, and contains any number
with or without a decimal, like: 10.5, 55, 787:

Example
x <- 10.5
y <- 55

# Print values of x and y


x
y

# Print the class name of x and y


class(x)
class(y)

Integer
Integers are numeric data without decimals. Use this type when you are certain
that the variable will never need to contain decimals. To create
an integer variable, you must use the letter L after the integer value:

Example
x <- 1000L
y <- 55L

# Print values of x and y


x
y

# Print the class name of x and y


class(x)
class(y)

Complex
A complex number is written with an "i" as the imaginary part:

Example
x <- 3+5i
y <- 5i

# Print values of x and y


x
y

# Print the class name of x and y


class(x)
class(y)

Type Conversion
You can convert from one type to another with the following functions:

• as.numeric()
• as.integer()
• as.complex()

Example
x <- 1L # integer
y <- 2 # numeric

# convert from integer to numeric:
a <- as.numeric(x)

# convert from numeric to integer:
b <- as.integer(y)

# print values of x and y
x
y

# print the class name of a and b
class(a)
class(b)
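
Conversion can change the value as well as the class. The following sketch shows a few cases worth knowing; the variable name bad and the use of suppressWarnings() are choices made for this example.

```r
as.integer(2.9)       # 2: the decimal part is dropped, not rounded
as.integer(-2.9)      # -2
as.numeric("11.5")    # 11.5: text that looks like a number is parsed
as.complex(2)         # 2+0i

# Text that is not a number becomes NA (normally with a warning)
bad <- suppressWarnings(as.integer("abc"))
is.na(bad)            # TRUE
```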
R Math
Simple Math
In R, you can use operators to perform common mathematical operations on
numbers.

The + operator is used to add together two values:

Example
10 + 5

And the - operator is used for subtraction:

Example
10 - 5
You will learn more about available operators in our R Operators Tutorial.

Built-in Math Functions


R also has many built-in math functions that allow you to perform
mathematical tasks on numbers.

For example, the min() and max() functions can be used to find the lowest or
highest number in a set:

Example
max(5, 10, 15)

min(5, 10, 15)


sqrt()
The sqrt() function returns the square root of a number:

Example
sqrt(16)

abs()
The abs() function returns the absolute (positive) value of a number:

Example
abs(-4.7)

ceiling() and floor()


The ceiling() function rounds a number upwards to the nearest integer, and
the floor() function rounds a number downwards to the nearest integer:

Example
ceiling(1.4)

floor(1.4)
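
Negative numbers and halves are where the rounding functions differ most. The sketch below adds trunc() and round(), two base R functions that are not covered in the text above, for comparison.

```r
ceiling(-1.4)   # -1: "upwards" means towards positive infinity
floor(-1.4)     # -2
trunc(-1.4)     # -1: trunc() simply drops the decimal part
round(1.4)      # 1
round(2.5)      # 2: halves are rounded to the even digit
round(3.14159, digits = 2)   # 3.14
```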

R Strings
String Literals
Strings are used for storing text.

A string is surrounded by either single quotation marks or double quotation
marks:

"hello" is the same as 'hello':

Example
"hello"
'hello'

Assign a String to a Variable


Assigning a string to a variable is done with the variable followed by
the <- operator and the string:

Example
str <- "Hello"
str # print the value of str

Multiline Strings
You can assign a multiline string to a variable like this:

Example
str <- "Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."

str # print the value of str

However, note that R stores each line break as "\n" and shows it that way when
the string is auto-printed. The backslash is an escape character, and the n
indicates a new line.

If you want the line breaks to be inserted at the same position as in the code,
use the cat() function:

Example
str <- "Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."

cat(str)

String Length
There are many useful string functions in R.

For example, to find the number of characters in a string, use
the nchar() function:

Example
str <- "Hello World!"

nchar(str)

Check a String
Use the grepl() function to check if a character or a sequence of characters
is present in a string:

Example
str <- "Hello World!"

grepl("H", str)
grepl("Hello", str)
grepl("X", str)
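
grepl() treats its first argument as a regular expression by default, which matters for characters such as . that have a special meaning in patterns. A short sketch using the fixed and ignore.case arguments of base R's grepl():

```r
str <- "Hello World!"

grepl(".", str)                           # TRUE: as a regex, "." matches any character
grepl(".", str, fixed = TRUE)             # FALSE: there is no literal "." in the string
grepl("hello", str)                       # FALSE: matching is case-sensitive
grepl("hello", str, ignore.case = TRUE)   # TRUE
```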

Combine Two Strings


Use the paste() function to merge/concatenate two strings:

Example
str1 <- "Hello"
str2 <- "World"

paste(str1, str2)
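
paste() puts a space between its arguments unless told otherwise. The sketch below shows the sep argument and the related paste0() function, both from base R:

```r
str1 <- "Hello"
str2 <- "World"

paste(str1, str2)               # "Hello World": a space is the default separator
paste(str1, str2, sep = ", ")   # "Hello, World"
paste0(str1, str2)              # "HelloWorld": paste0() uses no separator
```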

R Escape Characters
Escape Characters
To insert characters that are illegal in a string, you must use an escape
character.

An escape character is a backslash \ followed by the character you want to
insert.

An example of an illegal character is a double quote inside a string that is
surrounded by double quotes:

Example
str <- "We are the so-called "Vikings", from the north."

str

Result:

Error: unexpected symbol in "str <- "We are the so-called "Vikings"

To fix this problem, use the escape character \":

Example
The escape character allows you to use double quotes when you normally would
not be allowed:

str <- "We are the so-called \"Vikings\", from the north."

str
cat(str)

Note that auto-printing the str variable will print the backslash in the output.
You can use the cat() function to print it without backslash.
Other escape characters in R:

Code  Result
\\    Backslash
\n    New Line
\r    Carriage Return
\t    Tab
\b    Backspace
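
Some of these escape sequences can be checked with cat() and nchar(). The string below is a made-up value for illustration only.

```r
str <- "Name:\tR\nPath:\tC:\\Temp"

cat(str)        # escapes are interpreted: a tab, a line break, a single backslash
nchar("\n")     # 1: an escape sequence counts as a single character
nchar("\\")     # 1
```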

R Booleans / Logical Values


Booleans (Logical Values)
In programming, you often need to know if an expression is true or false.

You can evaluate a comparison in R and get one of two
answers, TRUE or FALSE.

When you compare two values, the expression is evaluated and R returns the
logical answer:

Example
10 > 9 # TRUE because 10 is greater than 9
10 == 9 # FALSE because 10 is not equal to 9
10 < 9 # FALSE because 10 is greater than 9

You can also compare two variables:


Example
a <- 10
b <- 9

a > b

You can also run a condition in an if statement, which you will learn much more
about in the if..else chapter.

Example
a <- 200
b <- 33

if (b > a) {
print ("b is greater than a")
} else {
print("b is not greater than a")
}

R Operators
Operators
Operators are used to perform operations on variables and values.

In the example below, we use the + operator to add together two values:

Example
10 + 5

R divides the operators into the following groups:

• Arithmetic operators
• Assignment operators
• Comparison operators
• Logical operators
• Miscellaneous operators
R Arithmetic Operators

Arithmetic operators are used with numeric values to perform common
mathematical operations:

Operator  Name                               Example
+         Addition                           x + y
-         Subtraction                        x - y
*         Multiplication                     x * y
/         Division                           x / y
^         Exponent                           x ^ y
%%        Modulus (remainder from division)  x %% y
%/%       Integer division                   x %/% y
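
A short sketch of the less familiar operators, using arbitrary values for x and y:

```r
x <- 7
y <- 3

x / y      # 2.333333
x %/% y    # 2: integer division discards the remainder
x %% y     # 1: the remainder
x ^ y      # 343

# The operators are linked: (x %/% y) * y + (x %% y) gives x back
(x %/% y) * y + (x %% y) == x   # TRUE
```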

R Assignment Operators
Assignment operators are used to assign values to variables:

Example
my_var <- 3

my_var <<- 3

3 -> my_var

3 ->> my_var

my_var # print my_var
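
The double-arrow forms differ from <- mainly inside functions, where they assign to a variable in an enclosing environment instead of creating a local one. A minimal sketch follows; the names counter, add_local, and add_global are made up for this example.

```r
counter <- 0

add_local <- function() {
  counter <- counter + 1    # creates a local copy; the outer counter is untouched
}
add_global <- function() {
  counter <<- counter + 1   # updates the counter defined outside the function
}

add_local()
counter    # 0
add_global()
counter    # 1
```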

R Comparison Operators
Comparison operators are used to compare two values:

Operator  Name                      Example
==        Equal                     x == y
!=        Not equal                 x != y
>         Greater than              x > y
<         Less than                 x < y
>=        Greater than or equal to  x >= y
<=        Less than or equal to     x <= y

R Logical Operators
Logical operators are used to combine conditional statements:

Operator Description

& Element-wise logical AND. Returns TRUE for each position where both elements are TRUE

&& Logical AND for single values. Returns TRUE if both statements are TRUE

| Element-wise logical OR. Returns TRUE for each position where at least one of the elements is TRUE

|| Logical OR for single values. Returns TRUE if at least one of the statements is TRUE

! Logical NOT. Returns FALSE if the statement is TRUE, and TRUE if it is FALSE
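The difference between the single and double forms is easiest to see side by side. The following is a minimal sketch; the vectors v1 and v2 and the other variable names are made up for this illustration:

```r
# Element-wise operators compare vectors position by position
v1 <- c(TRUE, FALSE, TRUE)
v2 <- c(TRUE, TRUE, FALSE)

elementwise_and <- v1 & v2   # one answer per position
elementwise_or  <- v1 | v2

# The double forms work on single values and give one answer
a <- 200
b <- 33
single_and <- (a > b) && (b > 0)
negated    <- !single_and

print(elementwise_and)
print(elementwise_or)
print(single_and)
print(negated)
```

Because an if statement needs a single TRUE or FALSE, && and || are the usual choice inside if conditions, while & and | are meant for working with whole vectors.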

R Miscellaneous Operators
Miscellaneous operators are used to manipulate data:

Operator Description Example

: Creates a series of numbers in a sequence x <- 1:10

%in% Finds out if an element belongs to a vector x %in% y

%*% Matrix multiplication x <- Matrix1 %*% Matrix2
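Here is a short sketch of the three miscellaneous operators in use. Matrix1 and Matrix2 are filled with made-up numbers; Matrix2 is chosen as an identity matrix so that the product is easy to check by hand:

```r
# : creates a sequence
x <- 1:5

# %in% tests membership
is_member  <- 3 %in% x   # 3 is one of 1, 2, 3, 4, 5
not_member <- 9 %in% x   # 9 is not

# %*% multiplies two matrices
Matrix1 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
Matrix2 <- matrix(c(1, 0, 0, 1), nrow = 2, ncol = 2)  # identity matrix
product <- Matrix1 %*% Matrix2   # multiplying by the identity returns Matrix1

print(is_member)
print(not_member)
print(product)
```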

R If ... Else
Conditions and If Statements
R supports the usual logical conditions from mathematics:

Operator Name Example

== Equal x == y

!= Not equal x != y

> Greater than x > y

< Less than x < y

>= Greater than or equal to x >= y

<= Less than or equal to x <= y

These conditions can be used in several ways, most commonly in "if statements"
and loops.

The if Statement
An "if statement" is written with the if keyword, and it is used to specify a block
of code to be executed if a condition is TRUE:
Example
a <- 33
b <- 200

if (b > a) {
print("b is greater than a")
}

In this example we use two variables, a and b, which are used as a part of the if
statement to test whether b is greater than a. As a is 33, and b is 200, we know
that 200 is greater than 33, and so we print to screen that "b is greater than a".

R uses curly brackets { } to group the statements that belong to the if block.

Else If
The else if keyword is R's way of saying "if the previous conditions were not
true, then try this condition":

Example
a <- 33
b <- 33

if (b > a) {
print("b is greater than a")
} else if (a == b) {
print ("a and b are equal")
}

In this example a is equal to b, so the first condition is not true, but the else
if condition is true, so we print to screen that "a and b are equal".

You can use as many else if statements as you want in R.

If Else
The else keyword catches anything which isn't caught by the preceding
conditions:

Example
a <- 200
b <- 33

if (b > a) {
print("b is greater than a")
} else if (a == b) {
print("a and b are equal")
} else {
print("a is greater than b")
}

In this example, a is greater than b, so the first condition is not true, also
the else if condition is not true, so we go to the else condition and print to
screen that "a is greater than b".

You can also use else without else if:

Example
a <- 200
b <- 33

if (b > a) {
print("b is greater than a")
} else {
print("b is not greater than a")
}

R Nested If
Nested If Statements
You can also have if statements inside if statements. These are
called nested if statements.
Example
x <- 41

if (x > 10) {
  print("Above ten")
  if (x > 20) {
    print("and also above 20!")
  } else {
    print("but not above 20.")
  }
} else {
  print("Not above ten.")
}

R - AND OR Operators
AND
The & symbol (and) is a logical operator, and is used to combine
conditional statements:

Example
Test if a is greater than b, AND if c is greater than a:

a <- 200
b <- 33
c <- 500
if (a > b & c > a) {
  print("Both conditions are true")
}

OR
The | symbol (or) is a logical operator, and is used to combine conditional
statements:
Example
Test if a is greater than b, or if c is greater than a:

a <- 200
b <- 33
c <- 500

if (a > b | a > c) {
print("At least one of the conditions is true")
}

R While Loop
Loops can execute a block of code as long as a specified condition is TRUE.

Loops are handy because they save time, reduce errors, and they make code
more readable.

R has two main loop commands:

• while loops
• for loops

R While Loops
With the while loop we can execute a set of statements as long as a condition is
TRUE:

Example
Print i as long as i is less than 6:

i <- 1
while (i < 6) {
print(i)
i <- i + 1
}

In the example above, the loop will continue to produce numbers ranging from 1
to 5. The loop will stop at 6 because 6 < 6 is FALSE.

The while loop requires relevant variables to be ready. In this example we need
to define an indexing variable, i, which we set to 1.

Note: remember to increment i, or else the loop will continue forever.

Break
With the break statement, we can stop the loop even if the while condition is
TRUE:

Example
Exit the loop if i is equal to 4.

i <- 1
while (i < 6) {
print(i)
i <- i + 1
if (i == 4) {
break
}
}

The loop will stop after printing 3 because we have chosen to finish the loop by using
the break statement when i is equal to 4 (i == 4).

Next
With the next statement, we can skip an iteration without terminating the loop:

Example
Skip the value of 3:
i <- 0
while (i < 6) {
i <- i + 1
if (i == 3) {
next
}
print(i)
}

When the loop passes the value 3, it will skip it and continue to loop.

Yahtzee!
If .. Else Combined with a While Loop
To demonstrate a practical example, let us say we play a game of Yahtzee!

Example
Print "Yahtzee!" If the dice number is 6:

dice <- 1
while (dice <= 6) {
if (dice < 6) {
print("No Yahtzee")
} else {
print("Yahtzee!")
}
dice <- dice + 1
}

If the loop passes the values ranging from 1 to 5, it prints "No Yahtzee".
Whenever it passes the value 6, it prints "Yahtzee!".

R For Loop
For Loops
A for loop is used for iterating over a sequence:
Example
for (x in 1:10) {
print(x)
}

This is less like the for keyword in some other programming languages, and works
more like an iterator method as found in object-oriented programming
languages.

With the for loop we can execute a set of statements, once for each item in a
vector, array, list, etc..

Example
Print every item in a list:

fruits <- list("apple", "banana", "cherry")

for (x in fruits) {
print(x)
}

Example
Print each number on a dice:

dice <- c(1, 2, 3, 4, 5, 6)

for (x in dice) {
print(x)
}

The for loop does not require an indexing variable to be set beforehand, unlike
while loops.
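When the position of each item is needed as well as its value, one common approach is to loop over the index numbers instead. The sketch below uses the built-in seq_along() function for this; the labels vector is only an assumption made for the illustration:

```r
fruits <- c("apple", "banana", "cherry")

# Prepare an empty character vector to hold one label per fruit
labels <- character(length(fruits))

# seq_along(fruits) gives the index numbers 1, 2, 3
for (i in seq_along(fruits)) {
  labels[i] <- paste(i, fruits[i])   # combine position and value
  print(labels[i])
}
```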

Break
With the break statement, we can stop the loop before it has looped through all
the items:

Example
Stop the loop at "cherry":

fruits <- list("apple", "banana", "cherry")

for (x in fruits) {
if (x == "cherry") {
break
}
print(x)
}

The loop will stop at "cherry" because we have chosen to finish the loop by
using the break statement when x is equal to "cherry" (x == "cherry").

Next
With the next statement, we can skip an iteration without terminating the loop:

Example
Skip "banana":

fruits <- list("apple", "banana", "cherry")

for (x in fruits) {
if (x == "banana") {
next
}
print(x)
}

When the loop passes "banana", it will skip it and continue to loop.
Yahtzee!
If .. Else Combined with a For Loop
To demonstrate a practical example, let us say we play a game of Yahtzee!

Example
Print "Yahtzee!" If the dice number is 6:

dice <- 1:6

for(x in dice) {
if (x == 6) {
print(paste("The dice number is", x, "Yahtzee!"))
} else {
print(paste("The dice number is", x, "Not Yahtzee"))
}
}

If the loop reaches the values ranging from 1 to 5, it prints the number and "Not Yahtzee".
When it reaches the value 6, it prints the number and "Yahtzee!".

R Nested Loops
Nested Loops
It is also possible to place a loop inside another loop. This is called a nested
loop:

Example
Print the adjective of each fruit in a list:

adj <- list("red", "big", "tasty")

fruits <- list("apple", "banana", "cherry")


for (x in adj) {
for (y in fruits) {
print(paste(x, y))
}
}

R Functions
A function is a block of code which only runs when it is called.

You can pass data, known as parameters, into a function.

A function can return data as a result.

Creating a Function
To create a function, use the function() keyword:

Example
# Create a function with the name my_function
my_function <- function() {
  print("Hello World!")
}

Call a Function
To call a function, use the function name followed by parentheses,
like my_function():

Example
my_function <- function() {
print("Hello World!")
}

my_function() # call the function named my_function

Arguments
Information can be passed into functions as arguments.
Arguments are specified after the function name, inside the parentheses. You
can add as many arguments as you want, just separate them with a comma.

The following example has a function with one argument (fname). When the
function is called, we pass along a first name, which is used inside the function
to print the full name:

Example
my_function <- function(fname) {
paste(fname, "Griffin")
}

my_function("Peter")
my_function("Lois")
my_function("Stewie")

Parameters or Arguments?
The terms "parameter" and "argument" can be used for the same thing:
information that is passed into a function.

From a function's perspective:

A parameter is the variable listed inside the parentheses in the function definition.

An argument is the value that is sent to the function when it is called.

Number of Arguments
By default, a function must be called with the correct number of arguments.
This means that if your function expects 2 arguments, you have to call the function
with 2 arguments, not more, and not less:

Example
This function expects 2 arguments, and gets 2 arguments:

my_function <- function(fname, lname) {


paste(fname, lname)
}
my_function("Peter", "Griffin")

If you try to call the function with 1 or 3 arguments, you will get an error:

Example
This function expects 2 arguments, and gets 1 argument:

my_function <- function(fname, lname) {


paste(fname, lname)
}

my_function("Peter")
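To look at the error without stopping a script, the call can be wrapped in the built-in tryCatch() function. This is only a sketch of one way to do it; the result variable and the "Error caught:" text are assumptions made for the illustration:

```r
my_function <- function(fname, lname) {
  paste(fname, lname)
}

# Call the function with one argument too few and catch the error
result <- tryCatch(
  my_function("Peter"),
  error = function(e) {
    # conditionMessage() extracts the text of the error
    paste("Error caught:", conditionMessage(e))
  }
)

print(result)
```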

Default Parameter Value


The following example shows how to use a default parameter value.

If we call the function without an argument, it uses the default value:

Example
my_function <- function(country = "Norway") {
paste("I am from", country)
}

my_function("Sweden")
my_function("India")
my_function() # will get the default value, which is Norway
my_function("USA")

Return Values
To let a function return a result, use the return() function. Without it, a function returns the value of the last expression it evaluates:

Example
my_function <- function(x) {
return (5 * x)
}

print(my_function(3))
print(my_function(5))
print(my_function(9))

The output of the code above will be:

[1] 15
[1] 25
[1] 45
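Earlier examples such as paste(fname, "Griffin") gave back a result without calling return(). The sketch below puts the two styles next to each other; the function names times_five_explicit and times_five_implicit are made up for this comparison:

```r
# Explicit: return() hands back the value
times_five_explicit <- function(x) {
  return(5 * x)
}

# Implicit: the value of the last expression is handed back
times_five_implicit <- function(x) {
  5 * x
}

print(times_five_explicit(3))
print(times_five_implicit(3))
```

Both calls give the same result, so which style to use is mostly a matter of readability.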

R Nested Functions
Nested Functions
There are two ways to create a nested function:

• Call a function within another function.
• Write a function within a function.

Example
Call a function within another function:

Nested_function <- function(x, y) {


a <- x + y
return(a)
}

Nested_function(Nested_function(2,2), Nested_function(3,3))

Example Explained

The function adds x and y.

The first input Nested_function(2,2) is "x" of the main function.


The second input Nested_function(3,3) is "y" of the main function.

The output is therefore (2+2) + (3+3) = 10.

Example
Write a function within a function:

Outer_func <- function(x) {


Inner_func <- function(y) {
a <- x + y
return(a)
}
return (Inner_func)
}
output <- Outer_func(3) # To call the Outer_func
output(5)

Example Explained

You cannot call Inner_func directly because it has been defined
(nested) inside Outer_func.

We need to call Outer_func first in order to call Inner_func as a second step.

Calling Outer_func(3) returns Inner_func with x set to 3, and we store that
function in a new variable called output.

We then call output with the desired value of "y", which in this case is 5.

The output is therefore 8 (3 + 5).

R Function Recursion
Recursion
R also accepts function recursion, which means a defined function can call itself.

Recursion is a common mathematical and programming concept. It means that
a function calls itself. This has the benefit that you can loop through
data to reach a result.
The developer should be very careful with recursion as it can be quite easy to
slip into writing a function which never terminates, or one that uses excess
amounts of memory or processor power. However, when written correctly,
recursion can be a very efficient and mathematically-elegant approach to
programming.

In this example, tri_recursion() is a function that we have defined to call itself
("recurse"). We use the k variable as the data, which decrements (-1) every time
we recurse. The recursion ends when k is no longer greater than 0 (i.e.
when it is 0).

For a new developer it can take some time to work out exactly how this works.
The best way to find out is by testing and modifying it.

Example
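tri_recursion <- function(k) {
  if (k > 0) {
    result <- k + tri_recursion(k - 1)
    print(result)  # print() also returns its argument, so this branch returns result
  } else {
    result <- 0    # base case: the recursion stops here
    return(result)
  }
}
tri_recursion(6)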
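To check the idea without the printed intermediate steps, here is a sketch of a variant that only returns the final sum. The name tri_sum is made up for this illustration, and the comparison with k * (k + 1) / 2 relies on the well-known formula for the sum 1 + 2 + ... + k:

```r
# Recursive sum of 1 + 2 + ... + k, without printing
tri_sum <- function(k) {
  if (k <= 0) {
    return(0)          # base case: nothing left to add
  }
  k + tri_sum(k - 1)   # add k to the sum of the smaller numbers
}

recursive_result <- tri_sum(6)
formula_result   <- 6 * (6 + 1) / 2

print(recursive_result)
print(formula_result)
```

Both values should be the same, which is a quick way to test that the recursion stops at the right place.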

R Global Variables
Global Variables
Variables that are created outside of a function are known as global variables.

Global variables can be used anywhere, both inside of functions and outside.

Example
Create a variable outside of a function and use it inside the function:

txt <- "awesome"


my_function <- function() {
paste("R is", txt)
}
my_function()

If you create a variable with the same name inside a function, this variable will
be local, and can only be used inside the function. The global variable with the
same name will remain as it was, global and with the original value.

Example
Create a variable inside of a function with the same name as the global variable:

txt <- "global variable"


my_function <- function() {
txt = "fantastic"
paste("R is", txt)
}

my_function()

txt # print txt

If you try to print txt, it will return "global variable" because we are
printing txt outside the function.

The Global Assignment Operator


Normally, when you create a variable inside a function, that variable is local,
and can only be used inside that function.

To create a global variable inside a function, you can use the global
assignment operator <<-

Example
If you use the assignment operator <<-, the variable belongs to the global scope:

my_function <- function() {


txt <<- "fantastic"
paste("R is", txt)
}
my_function()

print(txt)

Also, use the global assignment operator if you want to change a global
variable inside a function:

Example
To change the value of a global variable inside a function, refer to the variable
by using the global assignment operator <<-:

txt <- "awesome"


my_function <- function() {
txt <<- "fantastic"
paste("R is", txt)
}

my_function()

paste("R is", txt)


R Vectors
Vectors
A vector is simply a list of items that are of the same type.

To combine the list of items to a vector, use the c() function and separate the
items by a comma.

In the example below, we create a vector variable called fruits, that combines
strings:

Example
# Vector of strings
fruits <- c("banana", "apple", "orange")

# Print fruits
fruits

In this example, we create a vector that combines numerical values:

Example
# Vector of numerical values
numbers <- c(1, 2, 3)

# Print numbers
numbers

To create a vector with numerical values in a sequence, use the : operator:

Example
# Vector with numerical values in a sequence
numbers <- 1:10

numbers

You can also create numerical values with decimals in a sequence, but note that
if the last element does not belong to the sequence, it is not used:

Example
# Vector with numerical decimals in a sequence
numbers1 <- 1.5:6.5
numbers1

# Vector with numerical decimals in a sequence where the last element is not used
numbers2 <- 1.5:6.3
numbers2

Result:

[1] 1.5 2.5 3.5 4.5 5.5 6.5


[1] 1.5 2.5 3.5 4.5 5.5

In the example below, we create a vector of logical values:

Example
# Vector of logical values
log_values <- c(TRUE, FALSE, TRUE, FALSE)

log_values

Vector Length
To find out how many items a vector has, use the length() function:

Example
fruits <- c("banana", "apple", "orange")

length(fruits)
Sort a Vector
To sort items in a vector alphabetically or numerically, use the sort() function:

Example
fruits <- c("banana", "apple", "orange", "mango", "lemon")
numbers <- c(13, 3, 5, 7, 20, 2)

sort(fruits) # Sort a string


sort(numbers) # Sort numbers

Access Vectors
You can access the vector items by referring to its index number inside
brackets []. The first item has index 1, the second item has index 2, and so on:

Example
fruits <- c("banana", "apple", "orange")

# Access the first item (banana)


fruits[1]

You can also access multiple elements by referring to different index positions
with the c() function:

Example
fruits <- c("banana", "apple", "orange", "mango", "lemon")

# Access the first and third item (banana and orange)


fruits[c(1, 3)]

You can also use negative index numbers to access all items except the ones
specified:

Example
fruits <- c("banana", "apple", "orange", "mango", "lemon")

# Access all items except for the first item


fruits[c(-1)]

Change an Item
To change the value of a specific item, refer to the index number:

Example
fruits <- c("banana", "apple", "orange", "mango", "lemon")

# Change "banana" to "pear"


fruits[1] <- "pear"

# Print fruits
fruits

Repeat Vectors
To repeat vectors, use the rep() function:

Example
Repeat each value:

repeat_each <- rep(c(1,2,3), each = 3)

repeat_each
Example
Repeat the sequence of the vector:

repeat_times <- rep(c(1,2,3), times = 3)

repeat_times
Example
Repeat each value independently:

repeat_independent <- rep(c(1,2,3), times = c(5,2,1))

repeat_independent
Generating Sequenced Vectors
One of the examples above showed you how to create a vector with numerical
values in a sequence with the : operator:

Example
numbers <- 1:10

numbers

To make bigger or smaller steps in a sequence, use the seq() function:

Example
numbers <- seq(from = 0, to = 100, by = 20)

numbers

Note: The example uses three parameters of the seq() function: from is where the sequence
starts, to is where the sequence stops, and by is the interval of the sequence.
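The seq() function also accepts a length.out parameter, which asks for a given number of values instead of a given step size. The sketch below builds the same sequence both ways; the variable names by_step and by_count are made up for this example:

```r
# Five values from 0 to 1, described by the step size
by_step <- seq(from = 0, to = 1, by = 0.25)

# The same five values, described by how many we want
by_count <- seq(from = 0, to = 1, length.out = 5)

print(by_step)
print(by_count)
```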

R Lists
Lists
A list in R can contain many different data types inside it. A list is a collection of
data which is ordered and changeable.

To create a list, use the list() function:

Example
# List of strings
thislist <- list("apple", "banana", "cherry")

# Print the list


thislist
Access Lists
You can access the list items by referring to their index number, inside brackets.
The first item has index 1, the second item has index 2, and so on. Single brackets [] return a list that contains the selected item:

Example
thislist <- list("apple", "banana", "cherry")

thislist[1]
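Lists can be indexed with single brackets [] or double brackets [[]], and the two give different results. Here is a small sketch of the difference; the variable names sub_list and item are made up for this illustration:

```r
thislist <- list("apple", "banana", "cherry")

# Single brackets: the result is still a list (with one item)
sub_list <- thislist[1]

# Double brackets: the result is the item itself
item <- thislist[[1]]

print(class(sub_list))
print(class(item))
print(item)
```

Use double brackets when you want to work with the value itself, for example to pass it to paste().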

Change Item Value


To change the value of a specific item, refer to the index number:

Example
thislist <- list("apple", "banana", "cherry")
thislist[1] <- "blackcurrant"

# Print the updated list


thislist

List Length
To find out how many items a list has, use the length() function:

Example
thislist <- list("apple", "banana", "cherry")

length(thislist)

Check if Item Exists


To find out if a specified item is present in a list, use the %in% operator:

Example
Check if "apple" is present in the list:

thislist <- list("apple", "banana", "cherry")

"apple" %in% thislist

Add List Items


To add an item to the end of the list, use the append() function:

Example
Add "orange" to the list:

thislist <- list("apple", "banana", "cherry")

append(thislist, "orange")

To add an item to the right of a specified index, add " after=index number" in
the append() function:

Example
Add "orange" to the list after "banana" (index 2):

thislist <- list("apple", "banana", "cherry")

append(thislist, "orange", after = 2)
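Keep in mind that append() returns a new list and leaves the original list unchanged. To keep the result, store it in a variable. A minimal sketch, where the name longer_list is made up for this example:

```r
thislist <- list("apple", "banana", "cherry")

# Store the result of append() in a new variable
longer_list <- append(thislist, "orange")

print(length(thislist))      # the original list is unchanged
print(length(longer_list))   # the new list has one more item
```

To update the original list instead, assign the result back to it: thislist <- append(thislist, "orange").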

Remove List Items


You can also remove list items. The following example creates a new, updated
list without an "apple" item:
Example
Remove "apple" from the list:

thislist <- list("apple", "banana", "cherry")

newlist <- thislist[-1]

# Print the new list


newlist

Range of Indexes
You can specify a range of indexes by specifying where to start and where to
end the range, by using the : operator:

Example
Return the second, third, fourth and fifth item:

thislist <- list("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")

thislist[2:5]

Note: The search will start at index 2 (included) and end at index 5 (included).

Remember that the first item has index 1.

Loop Through a List


You can loop through the list items by using a for loop:

Example
Print all items in the list, one by one:

thislist <- list("apple", "banana", "cherry")

for (x in thislist) {
print(x)
}

Join Two Lists


There are several ways to join, or concatenate, two or more lists in R.

The most common way is to use the c() function, which combines two elements
together:

Example
list1 <- list("a", "b", "c")
list2 <- list(1,2,3)
list3 <- c(list1,list2)

list3

R Matrices
Matrices
A matrix is a two dimensional data set with columns and rows.
A column is a vertical representation of data, while a row is a horizontal
representation of data.

A matrix can be created with the matrix() function. Specify
the nrow and ncol parameters to set the number of rows and columns:

Example
# Create a matrix
thismatrix <- matrix(c(1,2,3,4,5,6), nrow = 3, ncol = 2)

# Print the matrix


thismatrix

Note: Remember the c() function is used to concatenate items together.
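By default, matrix() fills the matrix column by column. The byrow parameter of matrix() changes this so that the values are filled in row by row. The sketch below compares the two; the names by_column and by_row are made up for this example:

```r
# Default: values fill the first column, then the second
by_column <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)

# byrow = TRUE: values fill the first row, then the second, and so on
by_row <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = TRUE)

print(by_column)
print(by_row)

# The item in row 1, column 2 differs between the two
print(by_column[1, 2])
print(by_row[1, 2])
```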

You can also create a matrix with strings:

Example
thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow
= 2, ncol = 2)

thismatrix

Access Matrix Items


You can access the items by using [ ] brackets. The first number "1" in the
bracket specifies the row-position, while the second number "2" specifies the
column-position:

Example
thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow
= 2, ncol = 2)

thismatrix[1, 2]

The whole row can be accessed if you specify a comma after the number in the
bracket:
Example
thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow
= 2, ncol = 2)

thismatrix[2,]

The whole column can be accessed if you specify a comma before the number
in the bracket:

Example
thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow
= 2, ncol = 2)

thismatrix[,2]

Access More Than One Row


More than one row can be accessed if you use the c() function:

Example
thismatrix <-
matrix(c("apple", "banana", "cherry", "orange","grape", "pineapple"
, "pear", "melon", "fig"), nrow = 3, ncol = 3)

thismatrix[c(1,2),]

Access More Than One Column


More than one column can be accessed if you use the c() function:

Example
thismatrix <-
matrix(c("apple", "banana", "cherry", "orange","grape", "pineapple"
, "pear", "melon", "fig"), nrow = 3, ncol = 3)

thismatrix[, c(1,2)]
Add Rows and Columns
Use the cbind() function to add additional columns in a Matrix:

Example
thismatrix <-
matrix(c("apple", "banana", "cherry", "orange","grape", "pineapple"
, "pear", "melon", "fig"), nrow = 3, ncol = 3)

newmatrix <- cbind(thismatrix,


c("strawberry", "blueberry", "raspberry"))

# Print the new matrix


newmatrix

Note: The new column must have the same number of cells as the matrix has rows.

Use the rbind() function to add additional rows in a Matrix:

Example
thismatrix <-
matrix(c("apple", "banana", "cherry", "orange","grape", "pineapple"
, "pear", "melon", "fig"), nrow = 3, ncol = 3)

newmatrix <- rbind(thismatrix,


c("strawberry", "blueberry", "raspberry"))

# Print the new matrix


newmatrix

Note: The new row must have the same number of cells as the matrix has columns.

Remove Rows and Columns


Use negative index numbers with the c() function to remove rows and columns in a Matrix:
Example
thismatrix <- matrix(c("apple", "banana", "cherry", "orange", "mango", "pineapple"),
  nrow = 3, ncol = 2)

#Remove the first row and the first column


thismatrix <- thismatrix[-c(1), -c(1)]

thismatrix

Check if an Item Exists


To find out if a specified item is present in a matrix, use the %in% operator:

Example
Check if "apple" is present in the matrix:

thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow


= 2, ncol = 2)

"apple" %in% thismatrix

Number of Rows and Columns


Use the dim() function to find the number of rows and columns in a Matrix:

Example
thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow
= 2, ncol = 2)

dim(thismatrix)

Matrix Length
Use the length() function to find the total number of cells in a Matrix:

Example
thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow
= 2, ncol = 2)

length(thismatrix)

The total number of cells in the matrix is the number of rows multiplied by the number of columns.

In the example above: Length = 2*2 = 4.

Loop Through a Matrix


You can loop through a Matrix using a for loop. The loop will start at the first
row, moving right:

Example
Loop through the matrix items and print them:

thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow


= 2, ncol = 2)

for (rows in 1:nrow(thismatrix)) {


for (columns in 1:ncol(thismatrix)) {
print(thismatrix[rows, columns])
}
}

Combine two Matrices


Again, you can use the rbind() or cbind() function to combine two or more
matrices together:

Example
# Combine matrices
Matrix1 <- matrix(c("apple", "banana", "cherry", "grape"), nrow
= 2, ncol = 2)
Matrix2 <- matrix(c("orange", "mango", "pineapple", "watermelon"),
nrow = 2, ncol = 2)
# Adding it as a rows
Matrix_Combined <- rbind(Matrix1, Matrix2)
Matrix_Combined

# Adding it as a columns
Matrix_Combined <- cbind(Matrix1, Matrix2)
Matrix_Combined

R Arrays
Arrays
Compared to matrices, arrays can have more than two dimensions.

We can use the array() function to create an array, and the dim parameter to
specify the dimensions:

Example
# A vector with values ranging from 1 to 24
thisarray <- c(1:24)
thisarray

# An array with more than one dimension


multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray

Example Explained
In the example above we create an array with the values 1 to 24.

How does dim=c(4,3,2) work?

The first and second numbers in the bracket specify the number of rows and
columns.
The last number in the bracket specifies how many matrices of that size we want
(the size of the third dimension).

Note: Arrays can only have one data type.
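To see how the three numbers in dim fit together, the sketch below builds the same 4 x 3 x 2 array and inspects it. The name first_layer is made up for this illustration:

```r
multiarray <- array(1:24, dim = c(4, 3, 2))

# dim() reports rows, columns, and the number of matrices
print(dim(multiarray))

# Leaving the row and column positions empty selects a whole matrix
first_layer <- multiarray[, , 1]   # a 4 x 3 matrix holding the values 1 to 12
print(first_layer)

# Row 2, column 3 of the second matrix
print(multiarray[2, 3, 2])
```

The values fill the first matrix column by column (1 to 12) and then continue in the second matrix (13 to 24), which is why row 2, column 3 of the second matrix holds 22.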

Access Array Items


You can access the array elements by referring to the index position. You can
use the [] brackets to access the desired elements from an array:

Example
thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))

multiarray[2, 3, 2]

The syntax is as follows: array[row position, column position, matrix level]

You can also access the whole row or column from a matrix in an array, by using
the c() function:

Example
thisarray <- c(1:24)

# Access all the items from the first row from matrix one
multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray[c(1),,1]

# Access all the items from the first column from matrix one
multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray[,c(1),1]

A comma (,) before c() means that we want to access the column.

A comma (,) after c() means that we want to access the row.


Check if an Item Exists


To find out if a specified item is present in an array, use the %in% operator:

Example
Check if the value "2" is present in the array:

thisarray <- c(1:24)


multiarray <- array(thisarray, dim = c(4, 3, 2))

2 %in% multiarray

Amount of Rows and Columns


Use the dim() function to find the amount of rows and columns in an array:

Example
thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))

dim(multiarray)

Array Length
Use the length() function to find the total number of items in an array:

Example
thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))

length(multiarray)

Loop Through an Array


You can loop through the array items by using a for loop:

Example
thisarray <- c(1:24)
multiarray <- array(thisarray, dim = c(4, 3, 2))

for(x in multiarray){
print(x)
}

R Data Frames
Data Frames
Data Frames are data displayed in a table format.

Data Frames can have different types of data inside them. While the first column
can be character, the second and third can be numeric or logical. However, each
column should have the same type of data.

Use the data.frame() function to create a data frame:

Example
# Create a data frame
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

# Print the data frame


Data_Frame

Summarize the Data


Use the summary() function to summarize the data from a Data Frame:

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

Data_Frame
summary(Data_Frame)

You will learn more about the summary() function in the statistical part of the R
tutorial.

Access Items
We can use single brackets [ ], double brackets [[ ]] or $ to access columns
from a data frame:

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

Data_Frame[1]

Data_Frame[["Training"]]

Data_Frame$Training
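The three ways of accessing a column do not return the same kind of object. The sketch below shows the difference using the Pulse column; the variable names one_col_df, col_vec, col_dollar and pulse_2 are made up for this illustration:

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

# Single brackets: the result is a data frame with one column
one_col_df <- Data_Frame["Pulse"]

# Double brackets and $: the result is the column as a plain vector
col_vec    <- Data_Frame[["Pulse"]]
col_dollar <- Data_Frame$Pulse

# A row position and a column name together select a single cell
pulse_2 <- Data_Frame[2, "Pulse"]

print(class(one_col_df))
print(col_vec)
print(pulse_2)
```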

Add Rows
Use the rbind() function to add new rows in a Data Frame:

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

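# Add a new row (a one-row data frame keeps Pulse and Duration numeric)
New_row_DF <- rbind(Data_Frame, data.frame(Training = "Strength", Pulse = 110, Duration = 110))

# Print the data frame with the new row
New_row_DF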
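One thing to watch with rbind(): a new row written with c() can only hold one data type, so mixing text and numbers turns every value into text, and the numeric columns of the result become character columns. The sketch below compares this with adding the row as a one-row data frame; the names coerced and typed are made up for this illustration:

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

# c() turns 110 into the text "110", so Pulse becomes a character column
coerced <- rbind(Data_Frame, c("Strength", 110, 110))
print(class(coerced$Pulse))

# A one-row data frame keeps each column's type
typed <- rbind(Data_Frame,
               data.frame(Training = "Strength", Pulse = 110, Duration = 110))
print(class(typed$Pulse))
```

Keeping the columns numeric matters as soon as you want to calculate with them, for example with mean(typed$Pulse).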

Add Columns
Use the cbind() function to add new columns in a Data Frame:

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

# Add a new column


New_col_DF <- cbind(Data_Frame, Steps = c(1000, 6000, 2000))

# Print the new column


New_col_DF

Remove Rows and Columns


Use negative index numbers with the c() function to remove rows and columns in a Data Frame:

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

# Remove the first row and column


Data_Frame_New <- Data_Frame[-c(1), -c(1)]

# Print the new data frame


Data_Frame_New

Amount of Rows and Columns


Use the dim() function to find the amount of rows and columns in a Data Frame:

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

dim(Data_Frame)

You can also use the ncol() function to find the number of columns
and nrow() to find the number of rows:

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

ncol(Data_Frame)
nrow(Data_Frame)

Data Frame Length


Use the length() function to find the number of columns in a Data Frame (similar
to ncol()):

Example
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

length(Data_Frame)

Combining Data Frames


Use the rbind() function to combine two or more data frames in R vertically:

Example
Data_Frame1 <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

Data_Frame2 <- data.frame (


Training = c("Stamina", "Stamina", "Strength"),
Pulse = c(140, 150, 160),
Duration = c(30, 30, 20)
)

New_Data_Frame <- rbind(Data_Frame1, Data_Frame2)


New_Data_Frame

And use the cbind() function to combine two or more data frames in R
horizontally:

Example
Data_Frame3 <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)

Data_Frame4 <- data.frame (


Steps = c(3000, 6000, 2000),
Calories = c(300, 400, 300)
)

New_Data_Frame1 <- cbind(Data_Frame3, Data_Frame4)


New_Data_Frame1

R Factors
Factors
Factors are used to categorize data. Examples of factors are:

• Demography: Male/Female
• Music: Rock, Pop, Classic, Jazz
• Training: Strength, Stamina

To create a factor, use the factor() function and add a vector as argument:

Example
# Create a factor
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic",
  "Pop", "Jazz", "Rock", "Jazz"))

# Print the factor


music_genre

Result:

[1] Jazz Rock Classic Classic Pop Jazz Rock Jazz


Levels: Classic Jazz Pop Rock

You can see from the example above that the factor has four levels
(categories): Classic, Jazz, Pop and Rock.

To only print the levels, use the levels() function:

Example
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic",
  "Pop", "Jazz", "Rock", "Jazz"))

levels(music_genre)

Result:

[1] "Classic" "Jazz" "Pop" "Rock"
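Since a factor groups values into categories, a natural next step is to count how many items fall into each level. The sketch below uses the built-in table() and nlevels() functions for this; the variable name genre_counts is made up for this illustration:

```r
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic",
                        "Pop", "Jazz", "Rock", "Jazz"))

# Count how many items belong to each level
genre_counts <- table(music_genre)
print(genre_counts)

# Number of levels (categories)
print(nlevels(music_genre))
```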

You can also set the levels, by adding the levels argument inside
the factor() function:

Example
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic",
  "Pop", "Jazz", "Rock", "Jazz"),
  levels = c("Classic", "Jazz", "Pop", "Rock", "Other"))

levels(music_genre)

Result:

[1] "Classic" "Jazz" "Pop" "Rock" "Other"
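
One reason to set the levels explicitly is that unused categories are still reported when the data is summarised. A brief sketch, assuming the same genres as above; table() is a base R function that counts the items per level:

```r
music_genre <- factor(
  c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"),
  levels = c("Classic", "Jazz", "Pop", "Rock", "Other")
)

# One count per level, including levels that no item uses
counts <- table(music_genre)
print(counts)

# Number of levels (not the number of items)
print(nlevels(music_genre))
```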

Factor Length
Use the length() function to find out how many items there are in the factor:

Example
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

length(music_genre)

Result:

[1] 8

Access Factors
To access the items in a factor, refer to the index number, using [] brackets:

Example
Access the third item:

music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

music_genre[3]

Result:

[1] Classic
Levels: Classic Jazz Pop Rock
Change Item Value
To change the value of a specific item, refer to the index number:

Example
Change the value of the third item:

music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

music_genre[3] <- "Pop"

music_genre[3]

Result:

[1] Pop
Levels: Classic Jazz Pop Rock

Note that you cannot change the value of a specific item to a value that is not
already one of the factor's levels. The following example produces a warning,
and the item is set to NA:

Example
Trying to change the value of the third item ("Classic") to a value that is not
a predefined level ("Opera"):

music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

music_genre[3] <- "Opera"

music_genre[3]

Result:

Warning message:
In `[<-.factor`(`*tmp*`, 3, value = "Opera") :
invalid factor level, NA generated
However, if you have already specified it inside the levels argument, it will
work:

Example
Change the value of the third item:

music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"),
levels = c("Classic", "Jazz", "Pop", "Rock", "Opera"))

music_genre[3] <- "Opera"

music_genre[3]

Result:

[1] Opera
Levels: Classic Jazz Pop Rock Opera
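
When the factor already exists, a new level can also be added afterwards. The sketch below uses the base R replacement form levels(x) <- ...; appending to the existing levels is one possible approach, not the only one:

```r
music_genre <- factor(
  c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz")
)

# Append "Opera" to the existing levels, then assign it to the third item
levels(music_genre) <- c(levels(music_genre), "Opera")
music_genre[3] <- "Opera"

print(music_genre[3])
print(levels(music_genre))
```

Because "Opera" is a level before the assignment, no warning is raised and no NA is generated.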
R - Switch Statement
A switch statement allows a variable to be tested for equality against a list of
values. Each value is called a case, and the variable being switched on is
checked for each case.

Syntax
The basic syntax for creating a switch statement in R is −

switch(expression, case1, case2, case3....)


The following rules apply to a switch statement −

• If the value of expression is not a character string, it is coerced to integer.
• You can have any number of cases within a switch. Cases are separated by
commas; for matching against a character string, each case is written as
name = value.
• If the integer is between 1 and nargs()−1 (the number of cases supplied),
the corresponding case is evaluated and the result returned. Otherwise NULL
is returned.
• If expression evaluates to a character string, that string is matched
(exactly) to the names of the elements.
• If there is more than one match, the first matching element is returned.
• There is no default keyword.
• In the case of no match, if there is an unnamed element of ... its value is
returned, so it acts as the default. (If there is more than one such argument,
an error is signalled.)

Example
x <- switch(
3,
"first",
"second",
"third",
"fourth"
)
print(x)

When the above code is executed, it produces the following result −

[1] "third"
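
The string-matching rules above can be sketched as follows. The function name `describe_genre` and its cases are hypothetical; an empty case falls through to the next non-missing value, and the final unnamed argument acts as the fallback:

```r
describe_genre <- function(genre) {
  switch(
    genre,
    "Rock" = ,                  # empty case: falls through to the next value
    "Pop" = "popular music",
    "Jazz" = "improvised music",
    "unknown genre"             # unnamed element: used when nothing matches
  )
}

print(describe_genre("Rock"))
print(describe_genre("Jazz"))
print(describe_genre("Opera"))

# With a number, an index outside the range of cases gives NULL
print(is.null(switch(5, "first", "second")))
```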

R - Packages
R packages are collections of R functions, compiled code and sample data. They are stored
under a directory called "library" in the R environment. By default, R installs a set of
packages during installation. More packages can be added later, when they are needed for some
specific purpose. When we start the R console, only the default packages are available.
Other packages which are already installed have to be loaded explicitly before an R program
can use them.
All the packages available in R language are listed at R Packages.

Below is a list of commands to be used to check, verify and use the R packages.

Check Available R Packages


Get library locations containing R packages

.libPaths()

When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.

[2] "C:/Program Files/R/R-3.2.2/library"

Get the list of all the packages installed


library()
When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.

Packages in library ‘C:/Program Files/R/R-3.2.2/library’:

base        The R Base Package
boot        Bootstrap Functions (Originally by Angelo Canty
            for S)
class       Functions for Classification
cluster     "Finding Groups in Data": Cluster Analysis
            Extended Rousseeuw et al.
codetools   Code Analysis Tools for R
compiler    The R Compiler Package
datasets    The R Datasets Package
foreign     Read Data Stored by 'Minitab', 'S', 'SAS',
            'SPSS', 'Stata', 'Systat', 'Weka', 'dBase', ...
graphics    The R Graphics Package
grDevices   The R Graphics Devices and Support for Colours
            and Fonts
grid        The Grid Graphics Package
KernSmooth  Functions for Kernel Smoothing Supporting Wand
            & Jones (1995)
lattice     Trellis Graphics for R
MASS        Support Functions and Datasets for Venables and
            Ripley's MASS
Matrix      Sparse and Dense Matrix Classes and Methods
methods     Formal Methods and Classes
mgcv        Mixed GAM Computation Vehicle with GCV/AIC/REML
            Smoothness Estimation
nlme        Linear and Nonlinear Mixed Effects Models
nnet        Feed-Forward Neural Networks and Multinomial
            Log-Linear Models
parallel    Support for Parallel computation in R
rpart       Recursive Partitioning and Regression Trees
spatial     Functions for Kriging and Point Pattern
            Analysis
splines     Regression Spline Functions and Classes
stats       The R Stats Package
stats4      Statistical Functions using S4 Classes
survival    Survival Analysis
tcltk       Tcl/Tk Interface
tools       Tools for Package Development
utils       The R Utils Package

Get all packages currently loaded in the R environment

search()

When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.

[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "package:methods"   "Autoloads"         "package:base"

Install a New Package


There are two ways to add new R packages. One is to install directly from the CRAN repository;
the other is to download the package to your local system and install it manually.

Install directly from CRAN


The following command downloads the package directly from CRAN and installs it in the
R environment. You may be prompted to choose the nearest mirror. Choose the one
appropriate to your location.

install.packages("Package Name")
# Install the package named "XML".
install.packages("XML")
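
A common pattern is to install a package only when it is not already present. The helper `install_if_missing` below is a hypothetical name, not part of R; it relies on the base function requireNamespace() and assumes a CRAN mirror is reachable whenever an installation is actually needed:

```r
install_if_missing <- function(pkg) {
  # requireNamespace() returns TRUE when the package can be loaded
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package is usable after the attempt
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so nothing is downloaded here
already_there <- install_if_missing("stats")
print(already_there)
```

Calling the helper with a package such as "XML" would trigger a download only on the first run.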

Install package manually


Go to the link R Packages to download the package needed. Save the package as a .zip file in a
suitable location in the local system.

Now you can run the following command to install this package in the R environment.

install.packages(file_name_with_path, repos = NULL, type = "source")

# Install the package named "XML"
install.packages("E:/XML_3.98-1.3.zip", repos = NULL, type = "source")

Load Package to Library


Before a package can be used in code, it must be loaded into the current R environment. This
also applies to a package that was installed previously but is not loaded in the current
session.

A package is loaded using the following command −

library("package Name", lib.loc = "path to library")

# Load the package named "XML"
library("XML")
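
To confirm that a package has been attached, its entry can be looked up in the output of search(). A minimal sketch using "tools", which ships with R but is not attached at startup in a default session (the XML package from the text is not assumed to be installed here):

```r
# Attach a package that is installed together with R
library("tools")

# Attached packages appear in the search path as "package:<name>"
is_attached <- "package:tools" %in% search()
print(is_attached)

# Functions from the package can now be called by name
print(file_ext("report.csv"))
```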