Big Data - Lab 1

R is a statistical programming language used for statistical analysis and graphics. It lets users implement statistical procedures and develop statistical software, and it provides excellent graphics functionality, which makes it a good starting point for data analysis projects. The document covers getting started with R, including downloading and installing it, and gives an overview of basic R concepts such as objects, naming conventions, assignment, built-in functions, vectors, lists, data frames, and control statements.

Big Data

First Section
Lab Outcomes
1. What is R
2. Invoke the R environment and examine the R
workspace
3. R Basics
Introduction to R
What is R
▪ R is a free, open-source language and environment based on
the S language developed at Bell Labs
▪ Statistical Programming Language used to
develop statistical software
▪ Used by statisticians and data miners
▪ Many statistical functions are already built in
Why R

▪ Implement statistical procedures
▪ Provide excellent graphics functionality
▪ Excellent start for data analysis projects


Getting Started
• Where to get R?
• Go to www.r-project.org
• Downloads: CRAN
• Set your mirror: any mirror in the USA is fine.
• Select your platform, e.g. Windows.
• Download the latest R version.
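Once R is installed, a quick check from the R console can confirm which version is running. This is only a minimal sketch; the variable name installed_version is an illustrative choice, not part of the lab.

```r
# Store and print the version string of the running R installation
installed_version <- R.version.string
print(installed_version)

# Show which CRAN mirror (if any) is currently configured
getOption("repos")
```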
R Basics
▪ Objects
▪ Naming Convention
▪ Assignment
▪ Built-In Functions
▪ Example Objects: Vectors, Lists, Data Frames
▪ Control Statements
▪ Functions
▪ Workspace
Objects
• Objects should have names
• Object Types: vector, matrix … etc.
• Object Attributes
– mode: numeric, character, logical (boolean)
– length: number of elements in object
• Object Values
– assign a value
– create a blank object
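The attributes listed above can be inspected directly with mode() and length(). The sketch below uses made-up object names (heights, label, flag, blank) purely for illustration.

```r
# A numeric vector with three elements
heights <- c(150, 160, 170)
mode(heights)      # "numeric"
length(heights)    # 3

# Character and logical objects
label <- "lab1"
flag  <- TRUE
mode(label)        # "character"
mode(flag)         # "logical"

# A "blank" object: an empty numeric vector that can be filled later
blank <- numeric(0)
length(blank)      # 0
```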
Naming Convention
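• Must start with a letter (A-Z or a-z); a leading period is also
allowed if it is not followed by a digit
• Can contain letters, digits (0-9), periods “.”, and
underscores “_”, ex: Var1.1
• Case-sensitive
– mydata is different from MyData
• Can’t start with an underscore “_” or a digit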
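A short sketch of these naming rules in practice. The candidate names are hypothetical, and comparing a name with the output of make.names() is just one convenient way to test validity, assuming none of the candidates is a reserved word.

```r
Var1.1  <- 10         # letters, digits and a period
my_data <- 20         # underscores are fine after the first character

mydata <- 1
MyData <- 2
mydata == MyData      # FALSE: two separate objects holding different values

# make.names() returns a name unchanged only when it is syntactically valid
candidates <- c("Var1.1", "my_data", "_x", "1var")
is_valid <- make.names(candidates) == candidates
is_valid              # TRUE TRUE FALSE FALSE
```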
Assignment
“<-” is used to indicate assignment
> x <- 1
> y <- 3
> z <- 4
> x*y*z
[1] 12
Note: The type is determined automatically when a variable is
created with the "<-" operator
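To see the automatic typing mentioned in the note, class() can be called on freshly assigned variables. The variable names below are illustrative only.

```r
n <- 1          # numeric (double) by default
i <- 1L         # the L suffix makes an integer
s <- "one"      # character
b <- TRUE       # logical

class(n)        # "numeric"
class(i)        # "integer"
class(s)        # "character"
class(b)        # "logical"

# Reassigning a variable can change its type
n <- "now text"
class(n)        # "character"
```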
Built-In Functions
▪ Actions can be performed on objects using
functions
▪ Have arguments and options
▪ Provide a result
▪ Parentheses () are used to specify that a
function is being called.
Example for Functions
> Z <- rep(1, 10) ; Z
[1] 1 1 1 1 1 1 1 1 1 1
> Y <- seq(2, 6) ; Y
[1] 2 3 4 5 6
> W <- seq(4, 20, by=4) ; W
[1]  4  8 12 16 20
> x <- c(2, 0, 0, 4)
> x*4
[1]  8  0  0 16
> sqrt(x)
[1] 1.414214 0.000000 0.000000 2.000000
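Arguments can be passed by position or by name, and optional arguments change what a function returns. The following sketch uses only base R functions; the result variable names are made up for the example.

```r
# Positional and named arguments give the same result
by_position <- seq(4, 20, by = 4)
by_name     <- seq(from = 4, to = 20, by = 4)
identical(by_position, by_name)           # TRUE

# length.out fixes the number of elements instead of the step size
quarters <- seq(0, 1, length.out = 5)     # 0.00 0.25 0.50 0.75 1.00

# rep(): "times" repeats the whole pattern, "each" repeats every element
pattern_times <- rep(c(1, 2), times = 3)  # 1 2 1 2 1 2
pattern_each  <- rep(c(1, 2), each = 3)   # 1 1 1 2 2 2

# round() takes the number of decimal places as an option
rounded <- round(sqrt(2), digits = 2)     # 1.41
```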
Objects | Vectors
• An ordered series of values of a single type, most often numbers
• Created with:
– c() to concatenate elements or sub-vectors
– rep() to repeat elements or patterns
– seq() or m:n to generate sequences
• Example:
– X <- c(2,0,0,9)
– Y <- seq(2,5) # sequence of integers between 2 and 5
– Z <- rep(1,4) # repeat the number 1, 4 times
– X+Y+Z = ?
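Vector arithmetic works element by element. The sketch below shows this with a different set of values (a, b and ones are hypothetical names), so the exercise above is left open.

```r
a    <- c(1, 2, 3)     # concatenate elements
b    <- 4:6            # m:n shorthand, same values as seq(4, 6)
ones <- rep(1, 3)      # repeat the number 1 three times

# Element-wise sum: 1+4+1, 2+5+1, 3+6+1
total <- a + b + ones
total                  # 6 8 10

# c() also joins sub-vectors into one longer vector
joined <- c(a, b)
length(joined)         # 6
```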
Objects | Accessing Vectors
> x <- c(2,0,0,4)
> x[1] # Select the first element, equivalent to x[c(1)]
[1] 2
> x[-1] # Exclude the first element
[1] 0 0 4
> x[1] <- 3 ; x
[1] 3 0 0 4
> x[-1] = 5 ; x
[1] 3 5 5 5
> x<5
[1]  TRUE FALSE FALSE FALSE
> x[x<5] = 2 ; x # Edits elements meeting the condition
[1] 2 5 5 5
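The same indexing ideas extend to several positions at once. This is a small sketch on a hypothetical vector v; which() is a base R function that returns the positions where a condition holds.

```r
v <- c(10, 20, 30, 40)

first_and_third   <- v[c(1, 3)]    # 10 30
without_first_two <- v[-c(1, 2)]   # 30 40

# A logical condition selects the elements where it is TRUE
big       <- v[v > 15]             # 20 30 40
positions <- which(v > 15)         # 2 3 4

# Conditional replacement: every element above 25 becomes 0
v[v > 25] <- 0
v                                  # 10 20 0 0
```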
Objects | Data Frames
▪ A collection of vectors of equal length, one vector per column
▪ Most of the time, when data is loaded, it will be
organized in a data frame

Example:
> DF <- data.frame(h=c(150,160), w=c(65,72))
> DF
    h  w
1 150 65
2 160 72
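A few base R functions help inspect a data frame after it is created. The sketch below rebuilds the same DF so that it runs on its own; the variable names holding the results are illustrative.

```r
DF <- data.frame(h = c(150, 160), w = c(65, 72))

n_rows    <- nrow(DF)     # 2
n_cols    <- ncol(DF)     # 2
col_names <- names(DF)    # "h" "w"

# The $ operator pulls out one column as a plain vector
h_values <- DF$h          # 150 160
mean_w   <- mean(DF$w)    # 68.5

# str() prints a compact summary of the column types
str(DF)
```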
Objects | Accessing Data Frames
> DF[1]
    h
1 150
2 160
> DF[2]
   w
1 65
2 72

> DF[1,]
    h  w
1 150 65
> DF[2,]
    h  w
2 160 72

Q1: DF[1,2] ?
Q2: DF[-1,2] ?
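The bracket forms differ in what kind of object they return, which can be checked with is.data.frame() and is.numeric(). A minimal sketch, again rebuilding DF so that it is self-contained, and using cells other than the ones asked about in Q1 and Q2:

```r
DF <- data.frame(h = c(150, 160), w = c(65, 72))

one_column_df  <- DF[1]        # still a data frame with the single column h
one_column_vec <- DF[[1]]      # the column itself, as a numeric vector
second_row     <- DF[2, ]      # a one-row data frame
single_cell    <- DF[2, "h"]   # row 2 of column h: 160

is.data.frame(one_column_df)   # TRUE
is.numeric(one_column_vec)     # TRUE
is.data.frame(second_row)      # TRUE
```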
Thank You
