5th Sem Notes: R Programming, Units 1-5 (BCA / BSc)
R - Overview
R is a programming language and software environment for statistical analysis,
graphics representation and reporting. R was created by Ross Ihaka and Robert
Gentleman at the University of Auckland, New Zealand, and is currently
developed by the R Development Core Team.
R is freely available under the GNU General Public License, and pre-compiled
binary versions are provided for various operating systems like Linux, Windows
and Mac.
R is free software distributed under a GNU-style copyleft, and an official part of
the GNU project called GNU S.
Evolution of R
Since mid-1997 there has been a core group (the "R Core Team") who can
modify the R source code archive.
Features of R
The following are the important features of R −
_________________________________________________________________________
Prepared by MUHAMMAD YOUSUF
________________________________________________________________
R - Basic Syntax
As a convention, we will start learning R programming by writing a "Hello,
World!" program. Depending on your needs, you can program either at the R
command prompt or in an R script file. Let's check both one by one.
R Command Prompt
Once you have the R environment set up, it is easy to start the R command
prompt by typing the following command at your command prompt −
$ R
This will launch the R interpreter and you will get a prompt > where you can start
typing your program. A first program consists of two statements: the first defines
a string variable myString and assigns it the string "Hello, World!", and the
second uses print() to print the value stored in myString.
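The two statements described above are not shown in these notes. A minimal sketch, using the variable name myString from the text, could look like this:

```r
# Assign the string to a variable (name taken from the text above).
myString <- "Hello, World!"

# Print the value stored in myString.
print(myString)
```

Typed at the > prompt, the second statement prints [1] "Hello, World!".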
R Script File
Usually, you will do your programming by writing your programs in script files
and then executing those scripts at your command prompt with the help of the R
interpreter called Rscript. So let's start by writing the following code in a text file
called test.R −
myString <- "Hello, World!"
print(myString)
Save the above code in a file test.R and execute it at the Linux command prompt as
given below. Even if you are using Windows or another system, the syntax remains
the same.
$ Rscript test.R
When we run the above program, it produces the following result −
[1] "Hello, World!"
Comments
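Comments are like helping text in your R program and they are ignored by the
interpreter while executing your actual program. A single-line comment is written
using # at the beginning of the statement.
R has no dedicated multi-line comment syntax. One workaround is to wrap the text
in a quoted string inside an if(FALSE) block, as follows −
if(FALSE) {
   "This is a demo for multi-line comments and it should be put
   inside either a single or double quote"
}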
Though the above text is parsed by the R interpreter, the if(FALSE) block is never
executed, so it will not interfere with your actual program. You should put such
comments inside either single or double quotes.
_____________________________________________________
R - Data Types
The frequently used R-objects are −
Vectors
Lists
Matrices
Arrays
Factors
Data Frames
The simplest of these objects is the vector object, and there are six data types
of these atomic vectors, also termed the six classes of vectors. The other
R-objects are built upon the atomic vectors.
Logical
v <- TRUE
print(class(v))
it produces the following result −
[1] "logical"
Numeric
v <- 23.5
print(class(v))
it produces the following result −
[1] "numeric"
Character
v <- "TRUE"
print(class(v))
it produces the following result −
[1] "character"
Raw ("Hello" is stored as 48 65 6c 6c 6f)
v <- charToRaw("Hello")
print(class(v))
it produces the following result −
[1] "raw"
In R programming, the most basic data types are the R-objects called vectors,
which hold elements of the classes shown above. Note that in R the
number of classes is not confined to the above six types. For example, we
can use many atomic vectors to create an array whose class will be array.
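As a compact check of the six atomic classes mentioned above, the following sketch builds one value of each class and asks R for its class. The sample values and the names values, classes and a are chosen only for illustration.

```r
# One example value for each of the six atomic vector classes.
values <- list(
  logical   = TRUE,
  numeric   = 23.5,
  integer   = 2L,        # the L suffix makes the literal an integer
  complex   = 2 + 5i,
  character = "TRUE",
  raw       = charToRaw("Hello")
)

# class() reports the class of each value.
classes <- sapply(values, class)
print(classes)

# An array built from an atomic vector gets a class of its own.
a <- array(1:8, dim = c(2, 2, 2))
print(class(a))
```

The last call illustrates the remark that classes are not limited to the six atomic ones: a three-dimensional array reports the class "array".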
_______________________________________________
Vectors
When you want to create a vector with more than one element, you should use the
c() function, which combines the elements into a vector.
# Create a vector.
apple <- c('red','green',"yellow")
print(apple)
When we execute the above code, it produces the following result −
[1] "red"    "green"  "yellow"
___________________________________________________________
R - Matrices
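Matrices are the R objects in which the elements are arranged in a
two-dimensional rectangular layout. They contain elements of the same
atomic type. Though we can create a matrix containing only characters or
only logical values, they are not of much use. We use matrices containing
numeric elements to be used in mathematical calculations.
A matrix is created using the matrix() function.
Syntax
matrix(data, nrow, ncol, byrow, dimnames)
Following is the description of the parameters used −
data is the input vector which becomes the data elements of the
matrix.
nrow is the number of rows to be created.
ncol is the number of columns to be created.
byrow is a logical value. If TRUE, the input vector elements are arranged by row.
dimnames is the names assigned to the rows and columns.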
Example
Create a matrix taking a vector of numbers as input.
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames =
   list(rownames, colnames))
print(P)
When we execute the above code, it produces the following result −
     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    6    7    8
[3,]    9   10   11
[4,]   12   13   14
     [,1] [,2] [,3]
[1,]    3    7   11
[2,]    4    8   12
[3,]    5    9   13
[4,]    6   10   14
     col1 col2 col3
row1    3    4    5
row2    6    7    8
row3    9   10   11
row4   12   13   14
Elements of a matrix can be accessed by using the column and row index of
the element. We consider the matrix P above to find the specific elements
below.
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
# Access the element at 3rd column and 1st row.
print(P[1,3])
# Access the element at 2nd column and 4th row.
print(P[4,2])
# Access only the 2nd row.
print(P[2,])
# Access only the 3rd column.
print(P[,3])
When we execute the above code, it produces the following result −
[1] 5
[1] 13
col1 col2 col3
   6    7    8
row1 row2 row3 row4
   5    8   11   14
Matrix Computations
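Various mathematical operations can be performed on matrices using the R
operators, and the result of each operation is also a matrix.
The dimensions (number of rows and columns) should be the same for the
matrices involved in the operation.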
The second input matrix and the results of element-wise addition and
subtraction are printed as follows −
     [,1] [,2] [,3]
[1,]    5    0    3
[2,]    2    9    4
Result of addition
     [,1] [,2] [,3]
[1,]    8   -1    5
[2,]   11   13   10
Result of subtraction
     [,1] [,2] [,3]
[1,]   -2   -1   -1
[2,]    7   -5    2
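With matrix1 and matrix2 being two matrices of the same dimensions,
multiplication and division are also carried out element by element −
result <- matrix1 * matrix2
cat("Result of multiplication","\n")
print(result)
result <- matrix1 / matrix2
cat("Result of division","\n")
print(result)
The division step produces the following result −
     [,1]      [,2]      [,3]
[1,]  0.6      -Inf 0.6666667
[2,]  4.5 0.4444444 1.5000000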
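The code that creates the two input matrices is missing from these notes. The sketch below is a reconstruction under an assumption: matrix2 is the matrix printed above, and matrix1 is inferred by subtracting it from the printed addition result. The result variable names are chosen only for illustration.

```r
# matrix2 as printed in the notes; matrix1 is inferred (sum minus matrix2).
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)

# All four operators work element by element on same-sized matrices.
add_result <- matrix1 + matrix2
sub_result <- matrix1 - matrix2
mul_result <- matrix1 * matrix2
div_result <- matrix1 / matrix2

cat("Result of addition", "\n")
print(add_result)
cat("Result of subtraction", "\n")
print(sub_result)
cat("Result of multiplication", "\n")
print(mul_result)
cat("Result of division", "\n")
print(div_result)
```

Note that * is element-wise multiplication, not the matrix product (which R writes as %*%), and that dividing -1 by 0 yields -Inf rather than an error.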
_______________________________________________________
R - Arrays
Arrays are the R data objects which can store data in more than two
dimensions. For example − if we create an array of dimension (2, 3, 4) then
it creates 4 rectangular matrices, each with 2 rows and 3 columns. Arrays can
store only one data type.
An array is created using the array() function. It takes vectors as input and
uses the values in the dim parameter to create an array.
Example
The following example creates an array of two 3x3 matrices, each with 3 rows
and 3 columns. Printing the array produces the following result −
, , 1

     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15

, , 2

     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15
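The code for this example did not survive in the notes. The printed matrices are consistent with the following sketch, which assumes the input vectors vector1 and vector2 used in the later examples:

```r
# Create two vectors of different lengths.
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)

# Nine values are supplied for 3 x 3 x 2 = 18 cells,
# so R recycles them and both matrices come out identical.
result <- array(c(vector1, vector2), dim = c(3, 3, 2))
print(result)
```

The values fill each matrix column by column, which is why 5, 9, 3 appear as the first column.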
Naming Columns and Rows
We can give names to the rows, columns and matrices in the array by using
the dimnames parameter.
# vector1, vector2 and the three name vectors are assumed to be defined beforehand.
result <- array(c(vector1, vector2), dim = c(3, 3, 2),
   dimnames = list(row.names, column.names, matrix.names))
print(result)
For the second matrix, this prints −
, , Matrix2

     COL1 COL2 COL3
ROW1    5   10   13
ROW2    9   11   14
ROW3    3   12   15
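A self-contained sketch of naming the dimensions follows. The labels ROW1 to ROW3, COL1 to COL3 and Matrix2 come from the output above; the label Matrix1 and the variable names row.names and column.names are assumptions made for illustration.

```r
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)

# Name vectors for the three dimensions (variable names are assumed).
row.names    <- c("ROW1", "ROW2", "ROW3")
column.names <- c("COL1", "COL2", "COL3")
matrix.names <- c("Matrix1", "Matrix2")

result <- array(c(vector1, vector2), dim = c(3, 3, 2),
                dimnames = list(row.names, column.names, matrix.names))
print(result)

# Named dimensions can be used for indexing instead of positions.
print(result["ROW3", "COL2", "Matrix2"])
```

Indexing by name is often easier to read than numeric positions once an array has more than two dimensions.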
Accessing Array Elements
# Create two vectors of different lengths.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15)
# Take these vectors as input to the array.
result <- array(c(vector1,vector2),dim = c(3,3,2))
# Print the third row of the second matrix of the array.
print(result[3,,2])
# Print the element in the 1st row and 3rd column of the 1st matrix.
print(result[1,3,1])
When we execute the above code, it produces the following result −
[1]  3 12 15
[1] 13
Manipulating Array Elements
As an array is made up of matrices in multiple dimensions, operations on the
elements of an array are carried out by accessing the elements of the matrices.
# Create two vectors of different lengths.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15)
array1 <- array(c(vector1,vector2),dim = c(3,3,2))
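The rest of this example is missing from the notes. As a sketch of the idea, the code below extracts one matrix from each of two arrays and adds them. The second array, array2, is a hypothetical one invented for this illustration.

```r
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)
array1 <- array(c(vector1, vector2), dim = c(3, 3, 2))

# A second, hypothetical array used only for this illustration.
array2 <- array(1:18, dim = c(3, 3, 2))

# Pull the second matrix out of each array.
matrix1 <- array1[, , 2]
matrix2 <- array2[, , 2]

# Ordinary matrix arithmetic then applies to the extracted matrices.
result <- matrix1 + matrix2
print(result)
```

Leaving the first two indices empty, as in array1[, , 2], selects every row and column of the chosen matrix.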
Calculations Across Array Elements
We can do calculations across the elements in an array using the apply()
function.
Syntax
apply(x, margin, fun)
Following is the description of the parameters used −
x is an array.
margin is the dimension (or vector of dimensions) over which the function is
applied, for example 1 for rows and 2 for columns.
fun is the function to be applied across the elements of the array.
Example
We use the apply() function below to calculate the sum of the elements in
the rows of an array across all the matrices.
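# Create two vectors of different lengths.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15)
new.array <- array(c(vector1,vector2),dim = c(3,3,2))
print(new.array)
# Use apply to calculate the sum of the rows across all the matrices.
result <- apply(new.array, c(1), sum)
print(result)
When we execute the above code, it produces the following result −
, , 1

     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15

, , 2

     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15

[1] 56 68 60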
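To show how the margin argument changes the result, the sketch below applies sum() over different dimensions of the same array. The result variable names are chosen only for illustration.

```r
# Same array as in the example above.
new.array <- array(c(5, 9, 3, 10, 11, 12, 13, 14, 15), dim = c(3, 3, 2))

row_sums    <- apply(new.array, 1, sum)        # one total per row
col_sums    <- apply(new.array, 2, sum)        # one total per column
matrix_sums <- apply(new.array, 3, sum)        # one total per matrix
cell_sums   <- apply(new.array, c(1, 2), sum)  # one total per (row, column) cell

print(row_sums)
print(col_sums)
print(matrix_sums)
print(cell_sums)
```

With margin 1 the result matches the row totals 56 68 60 shown above; a vector margin such as c(1, 2) keeps both of those dimensions and sums only across the matrices.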
_______________________________________________________
Non numerics
In R programming, non-numeric values refer to data types and values that are not
represented as numbers. Non-numeric values are essential for working with diverse
types of data, including text, categorical data, logical values, and more. Here are
some common non-numeric data types in R:
Logical Values:
● Logical values represent binary data with only two possible values:
TRUE or FALSE.
● They are often used for conditional statements and logical operations.
● Example: is_raining <- TRUE
Date and Time Data:
● R provides data types for working with date and time information,
including Date, POSIXct, and POSIXlt.
● You can manipulate and perform calculations with dates and times
using these data types.
● Example: my_date <- as.Date("2023-11-09")
Complex Data:
● Complex numbers have a real part and an imaginary part.
● Example: complex_number <- 3 + 2i
Missing Values:
● NA represents missing or unavailable data.
● Example: my_value <- NA
Special Values:
● R includes special values such as NaN (Not-a-Number) and Inf (Infinity)
for specific mathematical situations.
● NaN is used to indicate undefined or unrepresentable results in
calculations.
● Inf represents positive or negative infinity.
● Example: result <- 1 / 0 # Results in Inf
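The one-line examples in the lists above can be tried together. This sketch reuses their variable names and adds a few inspection calls; later_date is a name introduced only for illustration.

```r
is_raining     <- TRUE
my_date        <- as.Date("2023-11-09")
complex_number <- 3 + 2i
my_value       <- NA

# Logical values drive conditional statements.
if (is_raining) print("Take an umbrella")

# Dates support arithmetic: adding a number adds that many days.
later_date <- my_date + 30
print(class(my_date))
print(later_date)

# Re() and Im() extract the real and imaginary parts.
print(Re(complex_number))
print(Im(complex_number))

# is.na() tests for a missing value.
print(is.na(my_value))
```

Thirty days after 9 November 2023 is 9 December 2023, which is what later_date holds.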
_____________________________________________
Lists
A list is an R-object which can contain many different types of elements inside it,
like vectors, functions and even another list.
# Create a list.
list1 <- list(c(2,5,3),21.3,sin)
# Print the list.
print(list1)
When we execute the above code, it produces the following result −
[[1]]
[1] 2 5 3

[[2]]
[1] 21.3

[[3]]
function (x)  .Primitive("sin")
R - Data Frames
A data frame is a table, or two-dimensional array-like structure, in which each
column contains the values of one variable and each row contains one set of
values from each column.
# Create the data frame.
emp.data <- data.frame(
   emp_id = c (1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   start_date = as.Date(c("2012-01-01", "2013-09-23",
      "2014-11-15", "2014-05-11",
      "2015-03-27")),
   stringsAsFactors = FALSE
)
# Print the data frame.
print(emp.data)
The structure of the data frame can be seen by using the str() function.
# Get the structure of the data frame created above.
str(emp.data)
When we execute the above code, it produces the following result −
'data.frame':   5 obs. of  4 variables:
 $ emp_id    : int  1 2 3 4 5
 $ emp_name  : chr  "Rick" "Dan" "Michelle" "Ryan" ...
 $ salary    : num  623 515 611 729 843
 $ start_date: Date, format: "2012-01-01" "2013-09-23" ...
The statistical summary and nature of the data can be obtained by applying the
summary() function.
# Create the data frame.
emp.data <- data.frame(
   emp_id = c (1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   start_date = as.Date(c("2012-01-01","2013-09-23","2014-11-15","2014-05-11","2015-03-27")),
   stringsAsFactors = FALSE
)
# Print the summary.
print(summary(emp.data))
When we execute the above code, it produces the following result −
     emp_id    emp_name             salary        start_date
 Min.   :1   Length:5           Min.   :515.2   Min.   :2012-01-01
 1st Qu.:2   Class :character   1st Qu.:611.0   1st Qu.:2013-09-23
 Median :3   Mode  :character   Median :623.3   Median :2014-05-11
 Mean   :3                      Mean   :664.4   Mean   :2014-01-14
 3rd Qu.:4                      3rd Qu.:729.0   3rd Qu.:2014-11-15
 Max.   :5                      Max.   :843.2   Max.   :2015-03-27
Extract specific columns from a data frame using the column names
# Create the data frame.
emp.data <- data.frame(
   emp_id = c (1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   start_date =
      as.Date(c("2012-01-01","2013-09-23","2014-11-15","2014-05-11",
      "2015-03-27")),
   stringsAsFactors = FALSE
)
# Extract Specific columns.
result <- data.frame(emp.data$emp_name,emp.data$salary)
print(result)
When we execute the above code, it produces the following result −
  emp.data.emp_name emp.data.salary
1              Rick          623.30
2               Dan          515.20
3          Michelle          611.00
4              Ryan          729.00
5              Gary          843.25
Extract the first two rows and then all columns
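# Create the data frame.
emp.data <- data.frame(
   emp_id = c (1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   start_date = as.Date(c("2012-01-01", "2013-09-23",
      "2014-11-15", "2014-05-11",
      "2015-03-27")),
   stringsAsFactors = FALSE
)
# Extract the first two rows.
result <- emp.data[1:2,]
print(result)
When we execute the above code, it produces the following result −
  emp_id emp_name salary start_date
1      1     Rick  623.3 2012-01-01
2      2      Dan  515.2 2013-09-23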
Extract 3rd and 5th row with 2nd and 4th column
# Create the data frame.
emp.data <- data.frame(
   emp_id = c (1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   start_date = as.Date(c("2012-01-01", "2013-09-23",
      "2014-11-15", "2014-05-11",
      "2015-03-27")),
   stringsAsFactors = FALSE
)
# Extract 3rd and 5th row with 2nd and 4th column.
result <- emp.data[c(3,5),c(2,4)]
print(result)
When we execute the above code, it produces the following result −
  emp_name start_date
3 Michelle 2014-11-15
5     Gary 2015-03-27
Add Column
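Just add the column vector using a new column name.
# Create the data frame.
emp.data <- data.frame(
   emp_id = c (1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   start_date = as.Date(c("2012-01-01","2013-09-23","2014-11-15","2014-05-11","2015-03-27")),
   stringsAsFactors = FALSE
)
# Add the "dept" column.
emp.data$dept <- c("IT","Operations","IT","HR","Finance")
v <- emp.data
print(v)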
Add Row
In the example below we create a data frame with new rows and combine it
with the existing data frame, using rbind(), to create the final data frame.
# Create the first data frame.
emp.data <- data.frame(
   emp_id = c (1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   start_date = as.Date(c("2012-01-01", "2013-09-23",
      "2014-11-15", "2014-05-11",
      "2015-03-27")),
   dept = c("IT","Operations","IT","HR","Finance"),
   stringsAsFactors = FALSE
)
# Create the second data frame (the name emp.newdata and the ids 6:8 are assumed).
emp.newdata <- data.frame(
   emp_id = c (6:8),
   emp_name = c("Rasmi","Pranab","Tusar"),
   salary = c(578.0,722.5,632.8),
   start_date =
      as.Date(c("2013-05-21","2013-07-30","2014-06-17")),
   dept = c("IT","Operations","Finance"),
   stringsAsFactors = FALSE
)
# Bind the two data frames.
emp.finaldata <- rbind(emp.data,emp.newdata)
print(emp.finaldata)
SPECIAL VALUES
In R programming, special values refer to specific values that are used in special cases,
often in mathematical and computational contexts. These values have distinct meanings
and are used to represent exceptional situations. Some of the commonly used special
values in R are:
NA (Not Available):
● NA represents missing or undefined data. It is used when data is not available
or cannot be determined.
NaN (Not-a-Number):
● NaN indicates an undefined or unrepresentable result of a calculation,
such as 0 / 0.
Inf (Infinity):
● Inf represents positive infinity. It's used to indicate values that are larger than
any finite number.
● It can result from operations like division by zero.
● Example: result <- 1 / 0 # Results in Inf
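The following sketch shows how these special values arise and how to test for them. The variable names are chosen only for illustration; is.na(), is.nan() and is.infinite() are base R functions.

```r
pos_inf <- 1 / 0     # positive infinity
neg_inf <- -1 / 0    # negative infinity
not_num <- 0 / 0     # undefined result: NaN
missing <- NA        # missing value

print(is.infinite(pos_inf))
print(is.infinite(neg_inf))
print(is.nan(not_num))

# NaN also counts as missing, but NA is not NaN.
print(is.na(not_num))
print(is.nan(missing))

# A single NA makes a summary NA unless it is removed explicitly.
values <- c(4, NA, 10)
print(mean(values))
print(mean(values, na.rm = TRUE))
```

Use these test functions rather than ==, because a comparison such as x == NA itself returns NA.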
Classes and Coercion
In R programming, "classes" and "coercion" are concepts related to data types and
type conversion. Let's explore these concepts:
Classes:
● A class describes the type of an object and its structure.
● Each object in R belongs to one or more classes that define its
behavior and available methods.
● Classes are essential for object-oriented programming (OOP) in R.
Common Classes in R:
● Numeric: Objects of class "numeric" represent real numbers.
● Factor: Objects of class "factor" represent categorical data with
predefined levels.
● Data Frame: Objects of class "data.frame" represent structured data
tables.
● List: Objects of class "list" can hold elements of different classes.
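To see these classes on real objects, the sketch below creates one object of each kind and inspects it with class(). All object names and sample values are made up for illustration.

```r
n  <- 3.14
f  <- factor(c("low", "high", "low"))
df <- data.frame(id = 1:2, name = c("a", "b"))
l  <- list(1, "a", TRUE)

print(class(n))    # numeric
print(class(f))    # factor
print(levels(f))   # the predefined levels, sorted alphabetically
print(class(df))   # data.frame
print(class(l))    # list

# inherits() tests whether an object belongs to a given class.
print(inherits(df, "data.frame"))
```

Functions such as print() and summary() look at an object's class to decide which method to run, which is why the same call behaves differently for a factor and a data frame.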
What is Coercion?
● Coercion refers to the automatic or manual conversion of objects from
one class to another.
● R will automatically coerce objects when performing operations
involving different classes to ensure compatibility.
Implicit and Explicit Coercion:
● Implicit Coercion: R performs implicit (automatic) coercion when
necessary to make operations work. For example, when adding a
numeric and an integer, the integer is implicitly coerced to a numeric.
● Explicit Coercion: You can explicitly coerce objects from one class to
another using functions like as.numeric(), as.character(), etc.
Example of Coercion:
● If you combine a numeric value and a character value with c(), R
automatically coerces the numeric value to character. Arithmetic is
different: adding a numeric vector and a character vector with + is an
error ("non-numeric argument to binary operator"), even if the character
vector contains only digits.
● Explicit coercion can be used to control the conversion process, for
instance, by using as.numeric() to convert a character vector to
numeric explicitly. Values that cannot be converted become NA, with a
warning.
x <- 5
y <- "10"
v <- c(x, y) # Implicit coercion of "x" to character: v is "5" "10"
# x + y would fail here: non-numeric argument to binary operator
y <- as.numeric(y) # Explicitly coerce "y" to numeric
result <- x + y # result is 15
Understanding classes and coercion is important for handling different data types
and ensuring that your data manipulations and calculations are performed correctly
in R.
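A few more cases of coercion are sketched below. The variable names are illustrative; tryCatch() is used only so that the failing addition does not stop the script.

```r
# Implicit coercion when combining values: the most general type wins.
mixed    <- c(1L, 2.5)    # integer and numeric -> numeric
combined <- c(5, "10")    # numeric and character -> character
print(class(mixed))
print(class(combined))

# Logical values are coerced to integers in arithmetic.
true_count <- TRUE + TRUE
print(true_count)

# Arithmetic on a character value is an error, not a coercion.
sum_result <- tryCatch(5 + "10", error = function(e) conditionMessage(e))
print(sum_result)

# Explicit coercion; text that is not a number becomes NA (with a warning).
converted <- suppressWarnings(as.numeric(c("10", "abc")))
print(converted)
```

Checking for NA after as.numeric() is a simple way to detect input that could not be converted.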
______________________________________________________________
BASIC PLOTTING
Plotting Functions:
● R provides a variety of plotting functions, including plot(), hist(),
barplot(), and more, to create different types of plots.
● The choice of function depends on the nature of your data and the type
of plot you want to generate.
Common Types of Basic Plots:
● Scatter Plots: Used to visualize the relationship between two
continuous variables.
● Histograms: Display the distribution of a single variable by dividing it
into bins.
● Bar Charts: Represent categorical or discrete data using bars or
columns.
Graphics Parameters:
● You can set global graphical parameters using functions like par() to
control the appearance of your plots.
In this example, we use the plot() function to create a scatter plot of two variables x
and y. We customize the plot with a title, axis labels, point style (pch), and color. This
is a basic example of how you can create and customize plots in R for data
exploration and presentation.
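The example described above is not reproduced in these notes. A sketch that matches the description (title, axis labels, point style and color) could look like this; the data values, label text and color are assumptions made for illustration.

```r
# Hypothetical sample data for the two variables.
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)

# Scatter plot with a title, axis labels, point style and color.
plot(x, y,
     main = "Scatter Plot Example",   # plot title
     xlab = "X values",               # x-axis label
     ylab = "Y values",               # y-axis label
     pch  = 19,                       # solid circle point style
     col  = "blue")                   # point color
```

When the script is run with Rscript instead of an interactive session, R writes the plot to a file named Rplots.pdf in the working directory.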