Programming With R: Lecture #4
Programming with R
http://www.r-project.org/
http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf
Download R and RStudio
Download R:
http://cran.r-project.org/bin/
Download RStudio:
http://www.rstudio.com/ide/download/desktop
Installation
Installing R on Windows PC:
Installing R on Linux:
sudo apt-get install r-base-core
Installing RStudio:
Click on the version recommended for your system, or the latest Windows
version, and save the executable file. Run the .exe file and follow the
installation instructions.
Version
Get R version
R.Version()
• You can type your own program at the prompt line >.
Getting help from R console
help.start()
help(topic)
?topic
??topic
R command in integrated environment
How to use R for simple maths
> 3+5
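> 12 + 3 / 4 - 5 + 3 * 8
> (12 + 3 / 4 - 5) + 3 * 8
> pi * 2^3 - sqrt(4)
Note: R ignores spaces.
> factorial(4)
> log(2, 10)          ## logarithm of 2 to base 10
> log(2, base = 10)   ## the same, with the base named
> log10(2)            ## the same again
> log(2)              ## natural logarithm (base e)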
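The arithmetic and logarithm lines above can be checked with a short script. This is a minimal sketch using only base R; the variable names are illustrative and not part of the lecture.

```r
# Operator precedence: ^ binds tighter than * and /, which bind tighter than + and -.
no.parens   <- 12 + 3 / 4 - 5 + 3 * 8      # 12 + 0.75 - 5 + 24
with.parens <- (12 + 3 / 4 - 5) + 3 * 8    # same value: these parentheses change nothing
moved       <- (12 + 3) / 4 - 5 + 3 * 8    # parentheses around the sum do change the result

# Three equivalent ways to take a base-10 logarithm.
log.a <- log(2, 10)
log.b <- log(2, base = 10)
log.c <- log10(2)

print(c(no.parens, with.parens, moved))
print(all.equal(log.a, log.c))
```

Comparing floating-point results with all.equal() rather than == is the safer habit, since two mathematically equal expressions can differ in the last bits.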
How to store results of calculations for future use
> x = 3 + 5
> x
> y = 12 + 3 / 4 - 5 + 3 * 8
> y
> z = (12 + 3 / 4 - 5) + 3 * 8
> z
> A <- 6 + 8   ## the assignment operator is <- with no space between < and -
> a            ## Error: object 'a' not found, because R is case sensitive
> A
Identifiers naming
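Don't use underscores ( _ ) or hyphens ( - ) in identifiers. A hyphen is
not valid in a name because R reads it as subtraction; an underscore is
valid syntax but discouraged under this convention.
The preferred form for variable names is all lower-case letters
with words separated by dots (variable.name), but
variableName is also accepted.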
Examples:
avg.clicks GOOD
avgClicks OK
avg_Clicks BAD
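The naming rules above can be tried directly. A minimal sketch, assuming nothing beyond base R; the values assigned are made up for illustration.

```r
# Both accepted forms create ordinary variables.
avg.clicks <- 12.5   # preferred: lower case, words separated by dots
avgClicks  <- 12.5   # also accepted

# A hyphen cannot appear in a name: R parses avg-clicks as avg minus clicks.
avg    <- 10
clicks <- 4
difference <- avg-clicks   # subtraction, not a variable called "avg-clicks"
print(difference)

# make.names() turns an arbitrary string into a syntactically valid name.
fixed.name <- make.names("avg-clicks")
print(fixed.name)
```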
> data3
[1] 4 5 7 8 2 9 4 3
Scan command for making data
> d3 = scan(what = 'character')
1: mon
2: tue
3: wed thu
5:
> d3
[1] "mon" "tue" "wed" "thu"
> d3[2]
[1] "tue"
> d3[2]='mon'
> d3
[1] "mon" "mon" "wed" "thu"
> d3[6]='sat'
> d3
[1] "mon" "mon" "wed" "thu" NA    "sat"
> d3[2]='tue'
> d3[5] = 'fri'
> d3
[1] "mon" "tue" "wed" "thu" "fri" "sat"
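The same steps can be run without typing at the prompt. The sketch below assumes the input is supplied through scan()'s text argument instead of the keyboard; everything else follows the session above.

```r
# scan() normally reads from the keyboard; text = supplies the input as a string instead.
d3 <- scan(text = "mon tue wed thu", what = "character", quiet = TRUE)

d3[2] <- "mon"   # overwrite an existing element
d3[6] <- "sat"   # assigning past the end grows the vector; the skipped slot becomes NA
print(d3)

d3[5] <- "fri"   # fill the gap
print(length(d3))
```

Using text = keeps the example reproducible in a script, where interactive input is not available.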
Concept of working directory
> getwd()
[1] "C:/Users/DSamanta/R/Database"
> setwd('D:/Data Analytics/Project/Database')   ## use / (or \\) in paths; a single \ starts an escape sequence
> dir()             ## working directory listing
> ls()              ## workspace listing of objects
> rm('object')      ## remove the object named "object", if it exists
> rm(list = ls())   ## clear the whole workspace
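A minimal sketch of the working-directory and workspace commands that runs on any machine. It assumes tempdir() as a stand-in directory, and tmp.object is a hypothetical name created only so that it can be removed.

```r
old.wd <- getwd()    # remember where we started
setwd(tempdir())     # tempdir() is used only so the example runs anywhere

# file.path() joins path components with "/", which R accepts on every platform.
example.path <- file.path("D:", "Data Analytics", "Project", "Database")
print(example.path)

tmp.object <- 42     # hypothetical object
print(exists("tmp.object", inherits = FALSE))
rm(tmp.object)       # remove it from the workspace
print(exists("tmp.object", inherits = FALSE))

setwd(old.wd)        # restore the original working directory
```

Saving and restoring the working directory avoids leaving the session pointed at a different folder after the script finishes.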
Reading data from a data file
> setwd("D:/arpita/data analytics/my work") #Set the working directory to file location
> getwd()
[1] "D:/arpita/data analytics/my work"
> dir()
[1] "Arv.txt" "DiningAtSFO" "LatentView-DPL" "TC-10-Rec.csv" "TC.csv"
> rm(list = ls(all = TRUE))   # Clear the workspace
> data = read.csv('iris.csv', header = T, sep = ",")
> (data = read.table('iris.csv', header = T, sep = ','))   # alternative; the outer parentheses also print the result
> ls()
[1] "data"
> str(data)
'data.frame': 149 obs. of 5 variables:
$ X5.1 : num 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 5.4 ...
$ X3.5 : num 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 ...
$ X1.4 : num 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 ...
$ X0.2 : num 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 0.2 ...
$ Iris.setosa: Factor w/ 3 levels "Iris-setosa",..: 1 1 1 1 1 1 1 1 1 1 ...
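The column names X5.1, X3.5, ... and the count of 149 observations in the str() output suggest that the file has no header line, so header = T turned the first data row into column names. The sketch below reproduces that effect on a small temporary file; the three sample rows and the replacement column names are assumptions made for illustration, not the lecture's data file.

```r
# Three sample rows in the same layout as iris.csv, written WITHOUT a header line.
# The file is a temporary stand-in; the real iris.csv is not needed.
tmp.file <- tempfile(fileext = ".csv")
writeLines(c("5.1,3.5,1.4,0.2,Iris-setosa",
             "4.9,3.0,1.4,0.2,Iris-setosa",
             "4.7,3.2,1.3,0.2,Iris-setosa"), tmp.file)

with.header <- read.csv(tmp.file, header = TRUE)    # first row is consumed as column names
no.header   <- read.csv(tmp.file, header = FALSE)   # all rows kept, columns named V1..V5

print(names(with.header))
print(nrow(with.header))
print(nrow(no.header))

# Assumed column names, chosen here for illustration only.
names(no.header) <- c("sepal.length", "sepal.width", "petal.length",
                      "petal.width", "species")
unlink(tmp.file)
```

If a file has no header line, header = FALSE keeps every row as data, and names() can then assign meaningful column names.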
Accessing elements from a file
> data$X5.1
[1] 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7
> data$X5.1[7]=5.2
> data$X5.1
[1] 4.9 4.7 4.6 5.0 5.4 4.6 5.2 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7
# Note: this change has happened in the workspace only, not in the file.
How to make it permanent?
write.csv / write.table
> write.table(data, file = 'iris_mod.csv', row.names = FALSE, sep = ',')
If row.names is TRUE, R adds an ID column at the beginning of the file,
so row.names = FALSE is the suggested option.
> write.csv(data, file = 'iris_mod.csv', row.names = TRUE)   ## to test
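The effect of row.names can be seen by writing a data frame both ways and reading it back. A minimal sketch, assuming a tiny stand-in data frame and temporary files rather than iris_mod.csv:

```r
# Small stand-in data frame; the values are illustrative.
small.data <- data.frame(length = c(5.1, 4.9), species = c("setosa", "setosa"))

file.with    <- tempfile(fileext = ".csv")
file.without <- tempfile(fileext = ".csv")
write.csv(small.data, file = file.with,    row.names = TRUE)
write.csv(small.data, file = file.without, row.names = FALSE)

back.with    <- read.csv(file.with)
back.without <- read.csv(file.without)

print(names(back.with))      # includes an extra leading ID column
print(names(back.without))   # only the original columns

unlink(c(file.with, file.without))
```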
Different data items in R
Vector
Matrix
Data Frame
List
Vectors in R
>x=c(1,2,3,4,56)
>x
> x[2]
> x = c(3, 4, NA, 5)
>mean(x)
[1] NA
> mean(x, na.rm = T)   ## the argument is na.rm; it drops NA values before averaging
[1] 4
> x = c(3, 4, NULL, 5)
>mean(x)
[1] 4
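NA and NULL behave differently inside a vector, which explains the two results above. A minimal sketch in base R; the variable z is introduced here only for the comparison.

```r
x <- c(3, 4, NA, 5)

print(is.na(x))                # TRUE marks the missing entry
print(sum(is.na(x)))           # number of missing values
print(mean(x))                 # NA: one missing value makes the mean unknown
print(mean(x, na.rm = TRUE))   # missing values removed before averaging

# NULL is different: c() drops it, so it never becomes an element.
z <- c(3, 4, NULL, 5)
print(length(z))
```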
More on Vectors in R
>y = c(x,c(-1,5),x)
>length(x)
>length(y)
There are useful methods to create long vectors whose elements are in
arithmetic progression:
> x=1:20
>x
If the common difference is not 1 or -1, we can use the seq function:
> y=seq(2,5,0.3)
>y
[1] 2.0 2.3 2.6 2.9 3.2 3.5 3.8 4.1 4.4 4.7 5.0
> length(y)
[1] 11
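seq() can also be driven by the number of points wanted or by a negative step. A minimal sketch of these variants, with illustrative variable names:

```r
by.step   <- seq(2, 5, by = 0.3)          # fixed step, as in the example above
by.count  <- seq(0, 1, length.out = 5)    # fixed number of points, step computed by R
downwards <- seq(10, 1, by = -3)          # a negative step counts down
reversed  <- 5:1                          # the colon operator also counts down by 1

print(by.count)
print(downwards)
print(length(by.step))
```

Naming the by and length.out arguments makes the intent clear, since the third positional argument of seq() is the step.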
More on Vectors in R
> x = 1:5
> mean(x)
[1] 3
> x
[1] 1 2 3 4 5
> x^2
[1] 1 4 9 16 25
> x + 1
[1] 2 3 4 5 6
> 2*x
[1] 2 4 6 8 10
> exp(sqrt(x))
[1] 2.718282 4.113250 5.652234 7.389056 9.356469
It is very easy to add/subtract/multiply/divide two vectors entry by entry.
When the lengths differ, R recycles the shorter vector and warns if the
longer length is not a multiple of the shorter one.
> y = c(0,3,4,0)
> x + y
[1] 1 5 7 4 5
Warning message:
In x + y : longer object length is not a multiple of shorter object length
> y = c(0,3,4,0,9)
> x + y
[1] 1 5 7 4 14
> x = 1:6
> y = c(9,8)
> x + y
[1] 10 10 12 12 14 14
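The result of adding a length-6 and a length-2 vector comes from recycling the shorter one. The sketch below makes the recycling explicit with rep() and captures the warning for the mismatched case; the names implicit, explicit and msg are illustrative.

```r
x <- 1:6
y <- c(9, 8)

# R recycles the shorter vector: y is treated as 9 8 9 8 9 8.
implicit <- x + y
explicit <- x + rep(y, length.out = length(x))
print(implicit)
print(identical(implicit, explicit))

# When the longer length is not a multiple of the shorter one, R still recycles
# but issues a warning; here the warning text is captured instead of printed.
msg <- tryCatch(1:5 + c(0, 3, 4, 0), warning = function(w) conditionMessage(w))
print(msg)
```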
Matrices in R
All elements of a matrix have the same data type (mode): numeric, character, or logical.
a.matrix <- matrix(vector, nrow = r, ncol = c, byrow = FALSE,
                   dimnames = list(row.names.vector, col.names.vector))
## dimnames is an optional argument that provides labels for rows and columns.
> y <- matrix(1:20, nrow = 4, ncol = 5)
>A = matrix(c(1,2,3,4),nrow=2,byrow=T)
>A
>A = matrix(c(1,2,3,4),ncol=2)
>B = matrix(2:7,nrow=2)
>C = matrix(5:2,ncol=2)
>mr <- matrix(1:20, nrow = 5, ncol = 4, byrow = T)
>mc <- matrix(1:20, nrow = 5, ncol = 4)
>mr
>mc
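The difference between column-wise and row-wise filling, and the use of dimnames, is easier to see on a small matrix. A minimal sketch; the row and column labels are made up for illustration.

```r
by.col <- matrix(1:6, nrow = 2)                 # default: filled column by column
by.row <- matrix(1:6, nrow = 2, byrow = TRUE)   # filled row by row

# dimnames takes a list: row labels first, then column labels.
labelled <- matrix(1:6, nrow = 2, byrow = TRUE,
                   dimnames = list(c("r1", "r2"), c("a", "b", "c")))

print(by.col)
print(by.row)
print(labelled["r2", "a"])   # index by label instead of position
```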
More on matrices in R
>dim(B) #Dimension
>nrow(B)
>ncol(B)
>A+C
>A-C
>A%*%C #Matrix multiplication. What will the result be?
>A*C #Entry-wise multiplication
>t(A) #Transpose
>A[1,2]
>A[1,]
>B[1,c(2,3)]
>B[,-1]
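The operations above can be checked on the matrices A, B and C defined earlier. A minimal sketch; the names product and elemwise are introduced here so that the results are stored rather than only printed.

```r
A <- matrix(c(1, 2, 3, 4), ncol = 2)   # columns (1,2) and (3,4)
B <- matrix(2:7, nrow = 2)
C <- matrix(5:2, ncol = 2)             # columns (5,4) and (3,2)

product  <- A %*% C    # matrix product: rows of A times columns of C
elemwise <- A * C      # entry-by-entry product
print(product)
print(elemwise)

print(B[1, c(2, 3)])   # row 1, columns 2 and 3
print(B[, -1])         # every row, all columns except the first
print(dim(t(B)))       # transpose swaps the dimensions
```

A negative index removes the given row or column, so B[, -1] keeps columns 2 and 3.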
Lists in R
Vectors and matrices in R are two ways to work with a
collection of objects.
>names(x)
>x$name
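The two commands above presumably operate on a list with named components. A minimal sketch under that assumption; the component names and values are hypothetical and not from the lecture.

```r
# A list can hold components of different types and lengths.
x <- list(name = "iris", size = c(150, 5), complete = TRUE)

print(names(x))      # component names
print(x$name)        # access a component by name with $
print(x[["size"]])   # [[ ]] extracts a component; [ ] would return a sub-list

x$source <- "csv"    # assigning to a new name adds a component
print(length(x))
```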