1mod References
Introduction, How to run R, R Sessions and Functions, Basic Math, Variables, Data Types, Vectors,
Conclusion, Advanced Data Structures, Data Frames, Lists, Matrices, Arrays, Classes
Introduction:
Features of R
As stated earlier, R is a programming language and software environment for statistical analysis,
graphics representation and reporting.
Why use R
• R is an open-source programming language and software environment for statistical computing and
graphics.
• R is an object-oriented programming environment, much more than most other statistical software
packages.
• R is a comprehensive statistical platform, offering all manner of data-analytic techniques – any type
of data analysis can be done in R.
• R functionality can be integrated into applications written in other languages, including C++, Java,
Python, PHP, SAS and SPSS.
• R runs on a wide array of platforms, including Windows, Unix and Mac OS X.
• R is extensible; it can be expanded by installing “packages”.
1. R is open source and free! R is free to download because it is licensed under the terms of the GNU
General Public License. You can look at the source to see what’s happening under the hood. What’s
more, many R packages are available under the same license, so you can use them, even in commercial
applications, without having to call your lawyer.
2. R is popular, and increasing in popularity. IEEE publishes a list of the most popular programming
languages each year. R was ranked 5th in 2016, up from 6th in 2015. It is a big deal for a
domain-specific language like R to be more popular than a general-purpose language like C#. This
shows the increasing interest not only in R as a programming language, but also in fields like Data
Science and Machine Learning, where R is commonly used.
3. R runs on all platforms. You can find distributions of R for all popular platforms: Windows, Linux
and Mac. R code that you write on one platform can usually be ported to another with little or no
change. Cross-platform interoperability is an important feature to have in today’s computing world;
even Microsoft is making its coveted .NET platform available on all platforms after realizing the
benefits of technology that runs on all systems.
4. Learning R will increase your chances of getting a job. According to the Data Science Salary Survey
conducted by O’Reilly Media in 2014, data scientists are paid a median of $98,000 worldwide.
Getting help in R
To get help on a specific topic, we can use the help() function along with the topic we want to search.
We can also use the ? operator for this.
> help(Syntax)
> ?Syntax
We also have the help.search() function to do a search-engine type of search. We could use
the ?? operator for this.
> help.search("histograms")
> ??"histograms"
Working with R session
Once we are inside the R session, we can directly execute R language commands by typing them line by
line. Pressing the Enter key ends the command and brings back the > prompt. In the example session
below, we assign the values 5 and 6 to two variables 'a' and 'b', and assign their sum to another
variable called 'c':
> a=5
> b=6
> c=a+b
> c
[1] 11
In an R session, typing a variable name prints its value on the screen.
Get help inside R session
To get help on any function of R, type help(function-name) at the R prompt. For example,
> help("if")
prints the help page for the "if" statement.
Exit the R session
To exit the R session, type quit() at the R prompt, and answer 'n' (no) when asked whether to save the
workspace image. This means we do not want to save the objects created in the current session.
From the R prompt, we can get the current working directory using the getwd() function:
> getwd()
[1] "/home/user"
Similarly, we can set the current working directory by calling the setwd() function with the path of
the new directory.
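A minimal sketch of changing and then restoring the working directory. It uses tempdir(), the session's temporary directory, only as a convenient target that is known to exist; the variable names old_dir and new_dir are illustrative and not part of the original notes.

```r
# Remember where we started so we can return later
old_dir <- getwd()

# Switch to the session's temporary directory (it always exists)
setwd(tempdir())
new_dir <- getwd()
print(new_dir)

# Go back to the original directory
setwd(old_dir)
print(getwd())
```

setwd() also returns the previous working directory invisibly, so the same round trip could be written as old_dir <- setwd(tempdir()).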
Comments:
Comments are explanatory text in your R program; the interpreter ignores them when executing the
program. A single-line comment starts with #, as follows:
# My first program in R Programming
R does not support multi-line comments, but you can use a trick such as the following:
if(FALSE) {
"This is a demo for multi-line comments and it should be put inside either single or double quotes"
}
myString <- "Hello, World!"
print(myString)
Although the block above is parsed by the R interpreter, its body is never evaluated because the
condition is FALSE, so it does not interfere with your actual program. Put such comments inside
either single or double quotes.
Reserved words in R programming are a set of words that have special meaning and cannot be used
as an identifier (variable name, function name etc.). The reserved words in R's parser are:
if, else, repeat, while, function, for, in, next, break, TRUE, FALSE, NULL, Inf, NaN, NA,
NA_integer_, NA_real_, NA_complex_, NA_character_
Inf stands for "Infinity", for example when 1 is divided by 0, whereas NaN stands for "Not a Number",
for example when 0 is divided by 0. NA stands for "Not Available" and is used to represent missing
values.
R is a case-sensitive language, which means that TRUE and True are not the same.
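The special values and the case-sensitivity rule above can be tried directly. This is a small sketch; the variable names (pos_inf, not_num, scores, total, Total) are made up for illustration.

```r
pos_inf <- 1 / 0           # a non-zero number divided by zero
not_num <- 0 / 0           # an undefined result
scores  <- c(10, NA, 30)   # NA marks a missing value

print(pos_inf)                      # Inf
print(not_num)                      # NaN
print(is.na(scores))                # FALSE  TRUE FALSE
print(mean(scores, na.rm = TRUE))   # 20, the missing value is skipped

# Case sensitivity: these are two different variables
total <- 1
Total <- 2
print(total == Total)               # FALSE
```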
Variables in R:
Variables are used to store data whose value can be changed as needed. The unique name given to a
variable (or to a function or other object) is called an identifier.
1. Identifiers can be a combination of letters, digits, period (.) and underscore (_).
2. An identifier must start with a letter or a period. If it starts with a period, it cannot be
followed by a digit.
3. Reserved words in R cannot be used as identifiers.
Valid identifiers in R:
Invalid identifiers in R:
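One way to check the naming rules is with base R's make.names(), which rewrites a string into a syntactically valid name; a name that comes back unchanged was already valid. The helper is_valid_name and the sample names below are illustrative assumptions, not a list taken from the original notes.

```r
# Hypothetical helper: a name is syntactically valid when make.names()
# does not need to change it
is_valid_name <- function(x) make.names(x) == x

# Names that follow the rules
valid <- c("total", "Sum", ".fine.with.dot", "this_is_acceptable", "Number5")

# Names that break them: illegal character, leading digit, leading
# underscore, reserved word, period followed by a digit
invalid <- c("tot@l", "5um", "_fine", "TRUE", ".0ne")

print(is_valid_name(valid))     # all TRUE
print(is_valid_name(invalid))   # all FALSE
```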
Constants in R
Constants, as the name suggests, are entities whose value cannot be altered. Basic types of constant
are numeric constants and character constants.
R DATA TYPES:
In contrast to other programming languages like C and Java, variables in R are not declared with
a data type. Variables are assigned R-objects, and the data type of the R-object becomes the data
type of the variable. There are many types of R-objects, including:
• Vectors
• Lists
• Matrices
• Arrays
• Factors
• Data Frames
The simplest of these objects is the vector object, and there are six data types of these atomic
vectors (logical, integer, numeric, complex, character and raw), also termed the six classes of
vectors. The other R-objects are built upon the atomic vectors.
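A short sketch of both points: a variable takes its type from the object assigned to it, and there are six atomic vector types. The names v, samples and atomic_types are illustrative.

```r
# The same variable name takes on the type of whatever object it holds
v <- 42L
print(class(v))   # "integer"
v <- "forty-two"
print(class(v))   # "character"

# One value of each of the six atomic vector types
samples <- list(TRUE, 2L, 3.5, 1 + 4i, "a", as.raw(255))
atomic_types <- vapply(samples, typeof, character(1))
print(atomic_types)
# "logical" "integer" "double" "complex" "character" "raw"
```

Note that typeof() reports "double" for what class() calls "numeric".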
Understanding basic data types in R
• To make the best of the R language, you'll need a strong understanding of the basic data
types and data structures and how to operate on those.
• These are very important to understand because they are the things you will manipulate on a
day-to-day basis in R. They are among the most common sources of frustration for beginners.
• Everything in R is an object. R has 5 basic atomic classes (a sixth, raw, holds raw bytes and is
rarely needed):
• logical (e.g., TRUE, FALSE)
• integer (e.g., 2L, as.integer(3))
• numeric (real or decimal) (e.g., 2, 2.0, pi)
• complex (e.g., 1 + 0i, 1 + 4i)
• character (e.g., "a", "swc")
R provides several functions for examining an object:
typeof() # what is it?
class() # what is it? (sorry)
storage.mode() # what is it? (very sorry)
length() # how long is it? What about two dimensional objects?
attributes() # does it have any metadata?
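As a rough illustration, here are the inspection functions applied to a plain numeric vector and to a small matrix (the objects x and m are made up for this sketch):

```r
x <- c(1.5, 2, 3)
print(typeof(x))         # "double"
print(class(x))          # "numeric"
print(storage.mode(x))   # "double"
print(length(x))         # 3
print(attributes(x))     # NULL, a plain vector carries no metadata

# For a two-dimensional object, length() counts every element
m <- matrix(1:6, nrow = 2)
print(length(m))         # 6
print(attributes(m))     # a list holding the dim attribute: 2 3
```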
R also has several data structures. These include
• vector
• list
• matrix
• data frame
• factors (we will avoid these, but they have their uses)
• tables
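A minimal sketch that creates one object of each structure listed above. All variable names and values are invented for illustration.

```r
v   <- c(1, 2, 3)                                   # vector
l   <- list(name = "a", n = 1L)                     # list, may mix types
m   <- matrix(1:4, nrow = 2)                        # matrix
df  <- data.frame(id = 1:2, label = c("a", "b"))    # data frame
f   <- factor(c("low", "high", "low"))              # factor
tab <- table(f)                                     # table of counts

print(is.list(l))         # TRUE
print(is.matrix(m))       # TRUE
print(is.data.frame(df))  # TRUE
print(levels(f))          # "high" "low", sorted alphabetically
print(tab)                # high: 1, low: 2
```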
Various examples:
Numbers typed without a suffix, such as 1 or 2.5, are numeric objects and are treated as double
precision real numbers. To explicitly create integers, add an L at the end:
x1 <- c(1L, 2L, 3L)
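The difference between the two storage types can be seen with a quick comparison. This is a sketch; x is an illustrative counterpart to x1 written without the L suffix.

```r
x  <- c(1, 2, 3)      # stored as double
x1 <- c(1L, 2L, 3L)   # stored as integer

print(typeof(x))          # "double"
print(typeof(x1))         # "integer"
print(all(x == x1))       # TRUE, the values compare equal
print(identical(x, x1))   # FALSE, the storage types differ
```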
Matrix
Matrices are a special kind of vector in R. They are not a separate type of object but simply a
vector with dimensions added to it.
Matrices have rows and columns.
m <- matrix(nrow = 2, ncol = 2)
m
dim(m)   # same information as attributes(m)
A matrix can also be created from an existing vector: assigning c(2, 5) to dim() of a 10-element
vector transforms it into a matrix with 2 rows and 5 columns.
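Both ways of building a matrix can be sketched as follows. The vector 1:10 is an assumed example input; the notes do not say which vector was used.

```r
# An empty 2 x 2 matrix: every cell is NA until values are assigned
m <- matrix(nrow = 2, ncol = 2)
print(m)
print(dim(m))          # 2 2
print(attributes(m))   # a list whose only element is dim

# Turn a 10-element vector into a 2 x 5 matrix by adding dimensions
v <- 1:10
dim(v) <- c(2, 5)
print(v)
print(v[2, 3])         # 6, because values fill the matrix column by column
```

Because R fills matrices column by column, the first column holds 1 and 2, the second 3 and 4, and so on.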
R FUNCTIONS: