SSMDA Expt 7

This document introduces R programming, detailing its history, applications, and installation steps for R and RStudio. It covers basic operations, including data structures like vectors, lists, matrices, arrays, factors, and data frames, along with examples of their creation and usage. The conclusion emphasizes the importance of practice in harnessing R's capabilities for data analysis and engineering problem-solving.

Uploaded by

studybuddy060903
Copyright
© © All Rights Reserved

EXPERIMENT-7

AIM: Introduction to R Programming and implementation of basic operations in R.

Introduction to R programming
R is a programming language and free software environment developed by Ross Ihaka and Robert Gentleman in 1993. R
offers an extensive catalog of statistical and graphical methods, including machine learning algorithms,
linear regression, time series analysis, and statistical inference, to name a few. Most R libraries are written in R
itself, but C, C++ and Fortran code is preferred for computationally heavy tasks. R is trusted not only in
academia; many large companies also use the R programming language, including Uber, Google, Airbnb
and Facebook.

Data analysis with R is done in a series of steps: programming, transforming, discovering, modelling and
communicating the results.
Program: R is a clear and accessible programming tool
Transform: R is made up of a collection of libraries designed specifically for data science
Discover: Investigate the data, refine your hypotheses and analyze them
Model: R provides a wide array of tools to capture the right model for your data
Communicate: Integrate code, graphs, and outputs into a report with R Markdown or build Shiny apps to share with the world
What is R used for?
1. Statistical inference
2. Data analysis
3. Machine learning

Installation of R and RStudio on Windows:

RStudio Desktop is available from https://posit.co/download/rstudio-desktop/. R itself must be installed first, followed by RStudio, using the following steps.
1: Install R
Step 1: Click on download and install
Step 2: Download R for windows / macOS / Linux

Step 3: Click install R (base) for the first time

Step 4: Download R-4.3.2 and click on R-4.3.2-win.exe in the download folder


Step 5: Click OK

Step 6: Click Next

Step 7: Select Destination Location

Step 8: Select Components

Step 9: Choose No (accept defaults)

Step 10: Select Start Menu Folder

Step 11: Select Additional Tasks

Step 12: Installation will begin

Step 13: Click Finish to complete the installation

2: Install RStudio

Step 1: Click on download RStudio Desktop for Windows

Step 2: Run the RStudio-2023.12.0-369.exe file and click Next on the welcome window.

Step 3: Enter/browse the path to the installation folder and click Next to proceed.

Step 4: Select the folder for the start menu shortcut, or select "Do not create shortcuts", and then click Next.
Step 5: Wait for the installation process to complete.

Step 6: Finish the installation

Getting started with R


Installing Packages:-
The most common place to get packages from is CRAN. To install packages from CRAN you use
install.packages("package name"). For instance, if you want to install the ggplot2 package, which is a very
popular visualization package, you would type the following in the console:-
Syntax:-
# install package from CRAN
install.packages("ggplot2")
Loading Packages:-
Once the package is installed on your computer you can access the functions and resources it provides
in two different ways:
# load the package to use in the current R session
library(packagename)
# use a particular function without loading the whole package
packagename::functionname
Getting Help on Packages:-
For more direct help on packages that are installed on your computer you can use the help and
vignette functions. Here we can get help on the ggplot2 package with the following:
help(package = "ggplot2") # provides details regarding contents of a package
vignette(package = "ggplot2") # list vignettes available for a specific package
vignette("ggplot2-specs") # view specific vignette
vignette() # view all vignettes on your computer
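
The install-then-load steps above can be combined so that a package is only downloaded when it is missing. The following is a minimal sketch, not a standard recipe: ensure_package is a hypothetical helper name, and the built-in stats package is used so that nothing is actually downloaded. It relies on requireNamespace(), which reports whether a package can be loaded, and on the pkg::fun notation, which calls a function without attaching its package.

```r
# Hypothetical helper (not part of base R): install a CRAN package only
# if it is not already available, then report whether it can be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs an internet connection and a CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so nothing is downloaded here.
stats_available <- ensure_package("stats")
print(stats_available)

# Call one function through its namespace without using library().
middle <- stats::median(c(4, 8, 15, 16, 23))
print(middle)
```

Replacing "stats" with "ggplot2" would trigger a download on a machine where ggplot2 is not yet installed.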

Assignment Operators:-
The first operator you’ll run into is the assignment operator. The assignment operator is used
to assign a value. For instance we can assign the value 3 to the variable x using the <-
assignment operator.
# assignment
x <- 3
# the = operator also assigns at the top level
x = 3
Evaluation
We can then evaluate the variable by simply typing x at the command line which will return
the value of x.
# evaluation
x
## [1] 3
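
The two assignment operators behave the same at the top level but differ inside a function call. The sketch below is illustrative only (the variable names are arbitrary): inside a call, = names an argument and leaves existing variables untouched, while <- performs an assignment as a side effect.

```r
x <- 3   # the usual assignment operator
y = 3    # also an assignment when used at the top level
same <- identical(x, y)
print(same)

# Inside a call, `=` only names the argument `x` of mean();
# the variable x defined above keeps its value.
m <- mean(x = c(10, 20, 30))
print(x)
print(m)

# Inside a call, `<-` assigns v as a side effect and passes the value on.
total <- sum(v <- c(1, 2, 3))
print(v)
print(total)
```

This difference is one reason many R style guides prefer <- for assignment and reserve = for function arguments.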
Basic Arithmetic
At its most basic, R can be used as a calculator. When applying basic arithmetic, the PEMDAS order
of operations applies: parentheses first, followed by exponentiation, multiplication and division, and finally
addition and subtraction.
8+9/5^2
## [1] 8.36
8 + 9 / (5 ^ 2)
## [1] 8.36
8 + (9 / 5) ^ 2
## [1] 11.24
(8 + 9) / 5 ^ 2
## [1] 0.68
By default R will display seven digits but this can be changed using options().
1/7
## [1] 0.1428571
options(digits = 3)
1/7
## [1] 0.143
pi
## [1] 3.14
options(digits = 22)
pi
## [1] 3.141592653589793115998
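
Because options(digits = ...) changes the display precision for the whole session, it can be useful to restore the earlier setting afterwards, or to round a single value instead. The following sketch shows one way to do both; the variable names are chosen only for illustration.

```r
# options() returns the previous settings, so they can be restored later.
old <- options(digits = 3)
print(1 / 7)     # prints 0.143
options(old)     # back to the earlier display precision

# Alternatives that affect one value rather than the whole session.
rounded <- round(pi, 2)            # numeric, 2 decimal places
shown <- format(pi, digits = 3)    # character string, 3 significant digits
sig <- signif(123456, 2)           # numeric, 2 significant digits
print(rounded)
print(shown)
print(sig)
```

Note that options(digits = ...) and format() only change how a number is displayed, whereas round() and signif() change the stored value.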
We can also perform integer divide (%/%) and modulo (%%) functions. The integer divide function will give
the integer part of a fraction while the modulo will provide the remainder.
42 / 4 # regular division
## [1] 10.5
42 %/% 4 # integer division
## [1] 10
42 %% 4 # modulo (remainder)
## [1] 2
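
Integer division and modulo fit together: the quotient times the divisor plus the remainder gives back the original number. The short sketch below checks this for the example above and also shows the behaviour for a negative operand, where R rounds the quotient down (towards minus infinity).

```r
a <- 42
b <- 4
quotient <- a %/% b     # 10
remainder <- a %% b     # 2
rebuilt <- quotient * b + remainder
print(rebuilt)          # 42, the original value

# With a negative operand the quotient is rounded down,
# so the remainder takes the sign of the divisor.
neg_quotient <- (-7) %/% 2
neg_remainder <- (-7) %% 2
print(neg_quotient)     # -4
print(neg_remainder)    # 1
```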
R Objects:-
a) Vectors
b) Lists
c) Matrices
d) Arrays
e) Factors
f) Data Frames
a) Vectors
Vectors are R's basic one-dimensional data structure, comparable to arrays in many other languages, and hold
multiple values of the same type. A key point is that vector indexing in R starts from 1 and not from 0.
We can create numeric vectors and character vectors as well.

# R program to create Vectors

# Use the c() function to combine values into a vector.
# By default the type will be double
X <- c(61, 4, 21, 67, 89, 2)
cat('using c function', X, '\n')
# seq() creates a sequence of evenly spaced values.
# length.out defines the length of the vector.
Y <- seq(1, 10, length.out = 5)
cat('using seq() function', Y, '\n')
# Use ':' to create a vector of consecutive integers.
Z <- 2:7
cat('using colon', Z)
Output:
using c function 61 4 21 67 89 2
using seq() function 1 3.25 5.5 7.75 10
using colon 2 3 4 5 6 7
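
Since indexing starts at 1 and all elements share one type, a few further operations follow naturally. The sketch below reuses the vector X from the example above; the other variable names are only illustrative.

```r
X <- c(61, 4, 21, 67, 89, 2)

first <- X[1]            # indexing starts at 1, so this is 61
middle <- X[2:4]         # elements 2 to 4: 4 21 67
without_first <- X[-1]   # a negative index drops that element
large <- X[X > 50]       # logical indexing keeps 61 67 89
doubled <- X * 2         # arithmetic is applied element by element

# Mixing types forces every element to a common type (here character).
mixed <- c(1, "a", TRUE)
print(class(mixed))      # "character"
```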

b) Lists
A list in R is a generic object consisting of an ordered collection of objects. Lists are one-dimensional,
heterogeneous data structures. A list can hold vectors, matrices, character strings, functions,
and so on.

Creating a List
To create a list in R, use the list() function. In other words, a list is a generic vector
containing other objects. To illustrate, we build a list of employee details from three attributes:
the employee IDs, the employee names, and the number of employees.
# R program to create a List

# The first attribute is a numeric vector
# containing the employee IDs
empId = c(1, 2, 3, 4)
# The second attribute is a character vector
# containing the employee names
empName = c("Debi", "Sandeep", "Subham", "Shiba")
# The third attribute is the number of employees,
# a single numeric value
numberOfEmp = 4
# Combine these three objects of different types
# into one list holding the employee details
empList = list(empId, empName, numberOfEmp)
print(empList)
Output:
[[1]]
[1] 1 2 3 4
[[2]]
[1] "Debi" "Sandeep" "Subham" "Shiba"
[[3]]
[1] 4
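
List elements can also be given names, which makes them easier to retrieve than by position. The sketch below rebuilds the employee list with names; the element names id, name and count, and the added dept element, are assumptions made for illustration and do not appear in the example above.

```r
# Same data as above, but each element is named (names are illustrative).
empList <- list(
  id = c(1, 2, 3, 4),
  name = c("Debi", "Sandeep", "Subham", "Shiba"),
  count = 4
)

all_names <- empList$name              # $ extracts one element by name
second_name <- empList[["name"]][2]    # [[ ]] extracts, then [2] indexes the vector
sub_list <- empList["count"]           # single [ ] returns a list of length 1

# A new element can be added by assigning to a new name (hypothetical field).
empList$dept <- "Engineering"
print(names(empList))
```

The difference between [[ ]] and [ ] matters in practice: the first returns the stored object itself, the second returns a smaller list.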

c) Matrices
A matrix is a rectangular arrangement of numbers in rows and columns: rows run horizontally
and columns run vertically. In R, matrices are two-dimensional, homogeneous data structures.

To create a matrix in R, use the matrix() function. Its first argument is a vector
containing the elements; the nrow and ncol arguments give the number of rows
and columns of the matrix.
# R program to create a matrix
A = matrix(
  # Taking sequence of elements
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  # No of rows
  nrow = 3,
  # No of columns
  ncol = 3,
  # By default a matrix is filled column by column;
  # byrow = TRUE fills it row by row instead
  byrow = TRUE
)
# Naming rows
rownames(A) = c("a", "b", "c")
# Naming columns
colnames(A) = c("c", "d", "e")
cat("The 3x3 matrix:\n")
print(A)

Output:
The 3x3 matrix:
  c d e
a 1 2 3
b 4 5 6
c 7 8 9
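
To make the effect of byrow concrete, the sketch below builds the same values both ways and reads elements back by name and by position. The matrix B and the other variable names are introduced here only for comparison.

```r
# Same matrix as above, with row and column names supplied via dimnames.
A <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("c", "d", "e")))

# Without byrow = TRUE the values are filled column by column.
B <- matrix(1:9, nrow = 3, ncol = 3)

centre <- A["b", "d"]          # element in row "b", column "d": 5
first_row_A <- unname(A[1, ])  # 1 2 3
first_row_B <- B[1, ]          # 1 4 7
size <- dim(A)                 # 3 3

# Filling by row is the transpose of filling by column.
is_transpose <- all(t(A) == B)
print(is_transpose)
```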
d) Arrays
Arrays are data storage structures defined by a fixed number of dimensions, with their elements stored at
contiguous memory locations.

In R, one-dimensional arrays are called vectors, with the length being their only
dimension. Two-dimensional arrays are called matrices, consisting of a fixed number of rows and columns.
All elements of an R array have the same data type. A vector of values is supplied to the array() function,
which arranges it according to the requested dimensions.

Creating an Array
An R array can be created with the array() function. A vector of elements is passed to array()
along with the required dimensions.
Syntax:
array(data, dim = c(nrow, ncol, nmat), dimnames = names)
where
nrow: Number of rows
ncol: Number of columns
nmat: Number of matrices of dimensions nrow * ncol
dimnames: Optional list of names for each dimension. Default value = NULL.
Uni-Dimensional Array
A vector is a uni-dimensional array, which is specified by a single dimension, its length. A vector can be
created using the c() function. A list of values is passed to the c() function to create a vector.
vec1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
print (vec1)
# cat is used to concatenate
# strings and print it.
cat ("Length of vector : ", length(vec1))

Output:
[1] 1 2 3 4 5 6 7 8 9
Length of vector : 9
Multi-Dimensional Array
A multi-dimensional array is specified by the size of each of its dimensions, and all of its elements have the
same data type. It is created by passing the values and the dimensions to the array() function.
# arranges data from 2 to 13
# in two matrices of dimensions 2x3
arr = array(2:13, dim = c(2, 3, 2))
print(arr)
Output:
, , 1

     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    3    5    7

, , 2

     [,1] [,2] [,3]
[1,]    8   10   12
[2,]    9   11   13
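
Elements of a three-dimensional array are addressed with three indices: row, column, and matrix. The sketch below reuses the array from the example above and also applies a function to each matrix in turn; the variable names are illustrative.

```r
arr <- array(2:13, dim = c(2, 3, 2))

shape <- dim(arr)               # 2 3 2
one_value <- arr[1, 2, 2]       # row 1, column 2 of the second matrix: 10
first_matrix <- arr[, , 1]      # leaving an index empty selects everything

# apply() over the third dimension gives one sum per matrix.
layer_sums <- apply(arr, 3, sum)
print(layer_sums)               # 27 63
```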

e) Factors
Factors in R are data structures used to represent categorical data; the categories are stored as levels.

Factors are stored as integers with a corresponding label for every unique integer. Although factors may look
similar to character vectors, they are integers underneath, and care must be taken when using them as strings.
A factor accepts only a restricted number of distinct values. For example, a data field such as gender may
contain values only from female, male, or transgender.

In the above example, all the possible cases are known beforehand and are predefined. These distinct values
are known as levels. By default, the levels of a newly created factor are sorted alphabetically.
Arguments of factor() in R
x: The vector that needs to be converted into a factor.
levels: The set of distinct values that x may take.
labels: An optional character vector of labels for the levels.
exclude: The values to be excluded when forming the levels.
ordered: A logical flag that decides whether the levels are ordered.
nmax: An upper bound on the number of levels.

Creating a Factor in R Programming Language


The command used to create or modify a factor in R is factor(), with a vector as input. The two
steps to creating an R factor are:
Creating a vector
Converting the vector into a factor using the function factor()
Example: Let us create a factor gender with levels female, male and transgender.

# Creating a vector
x <- c("female", "male", "male", "female")
print(x)
# Converting the vector x into a factor
# named gender
gender <- factor(x)
print(gender)
# Creating a factor with levels defined by programmer
gender <- factor(c("female", "male", "male", "female"),
                 levels = c("female", "transgender", "male"))
gender
Output
[1] "female" "male"   "male"   "female"
[1] female male   male   female
Levels: female male
[1] female male   male   female
Levels: female transgender male
Further one can check the levels of a factor by using function levels().
Checking for a Factor in R
The function is.factor() checks whether a variable is a factor and returns TRUE if it is.
gender <- factor(c("female", "male", "male", "female"))
print(is.factor(gender))
Output
[1] TRUE
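
The point that factors are integers underneath can be seen directly. The sketch below reuses the gender factor with programmer-defined levels and then shows a common pitfall when a factor holds numbers written as text; the factor f is a made-up example.

```r
gender <- factor(c("female", "male", "male", "female"),
                 levels = c("female", "transgender", "male"))

lev <- levels(gender)         # "female" "transgender" "male"
codes <- as.integer(gender)   # position of each value in the levels: 1 3 3 1
counts <- table(gender)       # counts per level, including the unused one
print(counts)

# Pitfall: converting a factor straight to numeric returns the codes.
f <- factor(c("10", "20", "10"))
wrong <- as.numeric(f)                  # 1 2 1
right <- as.numeric(as.character(f))    # 10 20 10
print(wrong)
print(right)
```

Converting through as.character() first recovers the original values rather than the internal codes.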

f) Data Frames
Data frames in R are generic data objects used to store tabular data.
A data frame can be thought of as a matrix in which each column may have a different data type.
A data frame is made up of three principal components: the data, the rows, and the columns.

R Data Frames Structure
A data frame presents data in tabular form, which makes it easier to operate on and understand.
Create Dataframe in R Programming Language
To create an R data frame use data.frame() function and then pass each of the vectors you have created as
arguments to the function.
# R program to create dataframe
# creating a data frame
friend.data <- data.frame(
friend_id = c(1:5),
friend_name = c("Sachin", "Sourav",
"Dravid", "Sehwag",
"Dhoni"),
stringsAsFactors = FALSE
)
# print the data frame
print(friend.data)

Output:
  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni
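
Once a data frame exists, its columns and rows can be inspected and selected in much the same way as lists and matrices. The sketch below rebuilds friend.data from the example above; the derived column name_length is a hypothetical addition used only to show how a column is created.

```r
friend.data <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

str(friend.data)                          # compact summary of columns and types

name_col <- friend.data$friend_name       # one column as a vector
later_friends <- friend.data[friend.data$friend_id > 3, ]  # filter rows

# Add a derived column (hypothetical): number of characters in each name.
friend.data$name_length <- nchar(friend.data$friend_name)
print(dim(friend.data))                   # 5 rows, 3 columns
```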
Conclusion:
This experiment provided a basic introduction to R programming and its objects. By continuing to explore
and practice, you can unlock the potential of this powerful tool for solving engineering problems and
analysing data.
