Engineering Data Analysis

Introduction to R

Department of Mathematics and Statistics


College of Science

What is R?
R is an interpreted programming language for statistical computing and graphics. It
is widely used in academia, by large companies, and by researchers, and it is
designed specifically for statistical analysis.
Why use R for data analysis?
- R is free to download and use.
- R is open-source.
- Output from R is easy to share.
- Data processing in R is straightforward.
- R has extensive data visualization tools.
- Advanced functionality that scientists often use in practice is available in R.
- R makes your analyses reproducible.

What do we need?

Install R:
Go to R Link, select the version of R for your operating system, and install it.

Install RStudio:
After R is installed, go to R Studio Link, download the RStudio installer for your
operating system, and install it.

The Difference between R and RStudio

RStudio is an integrated development environment (IDE) for R: it runs the R software
underneath and adds a user-friendly graphical interface. When you use RStudio, you
are still using the full version of R, with better usability from the improved
interface. R can be used on its own, but working in RStudio is far more convenient.
Because RStudio depends on R, download and install R first, then install RStudio.
On your computer, R and RStudio appear as separate installed programs. For data
analysis you will open and work in RStudio; R must remain installed for RStudio to
work, even though you will likely never open R itself.

Four Panes of RStudio

- Console pane
- R Script (Source) pane
- Environment and History pane
- The final pane

- Console pane: where output and error messages are displayed.
- R Script (Source) pane: where you type and save your commands and make notes to
  yourself about projects.

Assigning values to variables:
Give each variable a name that is descriptive and understandable to another reader.

The assignment operators:

<- or =

Reminders:
- R is case sensitive: it distinguishes uppercase from lowercase letters.
- Follow the naming rules for objects:
  - no spaces
  - must not start with a number, an underscore, or any special character
  - avoid the names of built-in functions
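
As a minimal sketch of the rules above, the following assigns a few values. The variable names and numbers are illustrative only and do not come from the slides.

```r
# Assign with <- (the conventional operator) or with =
sample_size <- 25
mean_height = 170.2

# R is case sensitive: total and Total are two different objects
total <- 10
Total <- 20

# Names may contain letters, digits, dots, and underscores,
# but must not start with a digit or an underscore
reading_2 <- total + Total
```

Both operators create the object in the workspace; `<-` is the form most R code uses for assignment.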

Exercise 1: Assign the values 1, 2, and 3 to variables x, y, and z respectively.


Printing the values of a variable:
To print a variable means to display its value in the console.

When you assign a value to an object, R does not print anything. To display a value,
use any of the following:
- (<object name/expression>)
- <object name>
- print(<object name/expression>)
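
The three forms above can be sketched as follows. The object names `radius` and `area` are illustrative assumptions, not names used in the slides.

```r
radius <- 2               # plain assignment: nothing is displayed

radius                    # typing the object name displays its value
print(radius)             # print() displays it explicitly
(area <- pi * radius^2)   # parentheses around an assignment assign and display

print("radius")           # a quoted name is text, not the object
```

Note that typing a bare object name displays its value at the console or at the top level of a script, but not inside a function or loop; there, `print()` is needed.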

Exercise 2: Print the variable x and the letter "x" in 3 ways.

Four Panes of RStudio
- Environment and History pane: where you will see the objects you create and the
  datasets you import.

Related functions:
- attach() - adds a data frame to the R search path, so that its columns can be
  referred to by name
- library() - loads an installed package into the current session

- The final pane contains everything else, including help, plots, and packages.

Steps to Install R Packages
1. Go to your final pane. Click Packages, then click Install.
2. In the Install Packages dialog, write the name of the package you want to install
   under the Packages field and then click Install. This will install the package
   you searched for or give you a list of matching packages based on your package
   text.
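
The same installation can also be done from the console with `install.packages()`. The sketch below wraps it in a small helper, `install_if_missing()`, which is a hypothetical name introduced here for illustration and is not part of R.

```r
# Hypothetical helper (not part of R): install a package only if it is missing
install_if_missing <- function(pkg) {
  already_installed <- requireNamespace(pkg, quietly = TRUE)
  if (!already_installed) {
    install.packages(pkg)   # needs an internet connection
  }
  invisible(already_installed)
}

# "stats" ships with R, so this call installs nothing
was_installed <- install_if_missing("stats")

# For the package used later in these slides, one would run:
# install_if_missing("readxl")
# library(readxl)
```

Installing is needed only once per computer, while `library()` must be called in every new session that uses the package.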

Loading Data into R

Import Excel File


1. Download and install the readxl package, which reads Excel files.
2. Click "Import Dataset" in the Environment pane, then select "From Excel".
   A dialog box will appear.
3. Select the Excel file you want to import.
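
The same import can be written as code with `read_excel()` from readxl. This is a sketch that assumes readxl is installed; it reads a sample workbook bundled with the package, since the path to your own file is not known here. The object name `sample_data` is an illustrative assumption.

```r
# Assumes readxl is installed; the block is skipped otherwise
if (requireNamespace("readxl", quietly = TRUE)) {
  # readxl_example() returns the path to a sample workbook bundled with readxl;
  # replace it with the path to your own file, e.g. "my_data.xlsx" (hypothetical)
  path <- readxl::readxl_example("datasets.xlsx")
  sample_data <- readxl::read_excel(path)   # reads the first sheet by default
  print(head(sample_data))
}
```

Keeping the import as a line of code in your script means the analysis can be rerun without clicking through the dialog again.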

Reminders:
- To refer to the columns of an imported dataset by name alone, attach it to the R
  search path with attach(). R then searches the dataset when evaluating a variable
  name. Detach the dataset with detach() when you are done.
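
A minimal sketch of attaching and detaching, using a small data frame built by hand in place of an imported file. The name `measurements` and its columns are illustrative assumptions.

```r
# A small data frame standing in for an imported dataset
measurements <- data.frame(
  height_cm = c(160, 170, 180),
  weight_kg = c(55, 68, 80)
)

attach(measurements)                 # columns are now found by name alone
mean_height_cm <- mean(height_cm)    # no need to write measurements$height_cm
detach(measurements)                 # remove it from the search path when done

print(mean_height_cm)
```

After `detach()`, the column names are no longer visible on their own, which avoids confusion with other objects that happen to share a name.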

Extracting Column from a Data Frame in R

Extracting means selecting columns. An Excel data file imported into R is stored as
a data frame. Using the $ operator, we can extract a single column from a data frame
with the following syntax:

<dataframe>$<column>
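
As a sketch of this syntax, the example below builds a small data frame by hand and extracts one column. The names `exam_results`, `student`, and `score` are illustrative assumptions and are unrelated to the SAMPLE DATA file.

```r
# A small data frame with two columns
exam_results <- data.frame(
  student = c("Ana", "Ben", "Cara"),
  score   = c(88, 92, 79)
)

# <dataframe>$<column> returns the column as a vector
scores <- exam_results$score
print(scores)
```

The result is an ordinary vector, so it can be passed directly to functions such as `mean()` or `summary()`.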

Exercise 3: Select a column from the SAMPLE DATA Excel file, save it to the
variable column_1, and print it.

Basic Data Types in R

R works with numerous data types. Some of the most basic types to get started are:
- Numbers such as 3.6 are called numerics. They are typed as is.
- Boolean values are called logicals. They are written TRUE and FALSE (abbreviated
  T and F).
- Text (or string) values are called characters. They are written inside double
  quotes "" or single quotes ''.
To check the data type of a variable, use the class function.
Syntax: class(<expression>)
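
A short sketch of `class()` applied to each of the three types above. The values and the name `is_valid` are illustrative.

```r
class(3.6)        # "numeric"
class(TRUE)       # "logical"
class("hello")    # "character"
class('hello')    # single quotes also create a character value

# class() also works on stored objects and on expressions
is_valid <- 5 > 3
class(is_valid)   # "logical"
```

A number placed inside quotes, such as "3.6", is a character value, so arithmetic on it fails until it is converted with `as.numeric()`.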

Thank You!
