1mod References

R is a programming language and environment widely used for statistical computing, data analytics, and scientific research, known for its expressive syntax and ease of use. It features a variety of data structures such as vectors, lists, matrices, and data frames, and offers extensive tools for data analysis and visualization. R is open-source, runs on multiple platforms, and is increasingly popular in fields like Data Science and Machine Learning.

Uploaded by

Pruthviraj

R Programming

Introduction, How to run R, R Sessions and Functions, Basic Math, Variables, Data Types, Vectors,
Conclusion, Advanced Data Structures, Data Frames, Lists, Matrices, Arrays, Classes

Introduction:

R is a programming language and environment commonly used in statistical computing, data
analytics and scientific research. It is one of the most popular languages used by statisticians, data
analysts, researchers and marketers to retrieve, clean, analyze, visualize and present data. Due to its
expressive syntax and easy-to-use interface, it has grown in popularity in recent years.

• R is a programming language and software environment for statistical analysis, graphics
representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the
University of Auckland, New Zealand, and is currently developed by the R
Development Core Team.
• The core of R is an interpreted computer language which allows branching and looping
as well as modular programming using functions.
• R allows integration with procedures written in the C, C++, .Net, Python or FORTRAN
languages for efficiency.
• R is freely available under the GNU General Public License, and precompiled binary
versions are provided for various operating systems like Linux, Windows and Mac. R is
free software distributed under a GNU-style copyleft, and an official part of the GNU
project called GNU S.

Features of R

As stated earlier, R is a programming language and software environment for statistical analysis,
graphics representation and reporting.

The following are the important features of R:

• R is a well-developed, simple and effective programming language which includes
conditionals, loops, user-defined recursive functions and input and output facilities.
• R has an effective data handling and storage facility.
• R provides a suite of operators for calculations on arrays, lists, vectors and matrices.
• R provides a large, coherent and integrated collection of tools for data analysis.
• R provides graphical facilities for data analysis and display, either directly on the computer
screen or printed on paper.

Why use R

• R is an open-source programming language and software environment for statistical computing and
graphics.

• R is an object-oriented programming environment, more so than most other statistical software
packages.

• R is a comprehensive statistical platform that offers a very wide range of data-analytic techniques;
almost any type of data analysis can be done in R.

• R has state-of-the-art graphics capabilities for visualizing complex data.

• R is a powerful platform for interactive data analysis and exploration.

• R makes it easy to get data from multiple sources into a usable form.

• R functionality can be integrated into applications written in other languages, including C++, Java,
Python, PHP, SAS and SPSS.

• R runs on a wide array of platforms, including Windows, Unix and Mac OS X.

• R is extensible; it can be expanded by installing “packages”.

Why use R for statistical computing and graphics?

1. R is open source and free. R is free to download because it is licensed under the terms of the GNU
General Public License. You can look at the source to see what is happening under the hood. In
addition, most R packages are available under the same license, so you can use them, even in
commercial applications, without having to call your lawyer.

2. R is popular and growing in popularity. IEEE publishes a list of the most popular programming
languages each year. R was ranked 5th in 2016, up from 6th in 2015. It is a big deal for a
domain-specific language like R to be more popular than a general-purpose language like C#. This
shows the increasing interest not only in R as a programming language, but also in fields like Data
Science and Machine Learning, where R is commonly used.

3. R runs on all platforms. You can find distributions of R for all popular platforms: Windows, Linux
and Mac. R code that you write on one platform can usually be ported to another with little or no
change. Cross-platform interoperability is an important feature in today’s computing world; even
Microsoft is making its coveted .NET platform available on all platforms after realizing the benefits of
technology that runs on all systems.

4. Learning R will increase your chances of getting a job. According to the Data Science Salary Survey
conducted by O’Reilly Media in 2014, data scientists are paid a median of $98,000 worldwide.

Getting help in R

To get help on a specific topic, we can use the help() function along with the topic we want to search.
We can also use the ? operator for this.

> help(Syntax)

> ?Syntax

We also have the help.search() function to do a search-engine type of search. We can use
the ?? operator for this.

> help.search("histograms")

> ??"histograms"

You must be itching to start learning R by now. Our collection of R tutorials will help
you learn R. Whether you are a beginner or an expert, each tutorial explains the relevant concepts
and syntax with easy-to-understand examples.

Working with R session

Once we are inside the R session, we can directly execute R language commands by typing them line
by line. Pressing the Enter key ends the command and brings back the > prompt. In the example
session below, we declare 2 variables 'a' and 'b' to have values 5 and 6 respectively, and assign their
sum to another variable called 'c':

> a=5

> b=6

> c=a+b

> c

The value of the variable 'c' is printed as

[1] 11

In an R session, typing a variable name prints its value on the screen.

Get help inside R session

To get help on any function of R, type help(function-name) at the R prompt. For example, if we need
help on the "if" logic, type

> help("if")

and the help text for the "if" statement is printed.

Exit the R session

To exit the R session, type quit() at the R prompt and answer 'n' (no) when asked whether to save
the workspace image. This means that we do not want to save the objects and command history of
the current session:

> quit()
Save workspace image? [y/n/c]: n

Getting and setting the current working directories

From the R prompt, we can get the current working directory using the getwd() function:

> getwd()
[1] "/home/user"

Similarly, we can set the current working directory by calling the setwd() function:

> setwd("/home/user/prog")

After this, "/home/user/prog" will be the working directory.
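The following is a small sketch of how getwd() and setwd() can be combined to change the working directory temporarily and then restore it. The use of tempdir() is only an assumption made so that the example runs on any machine; the variable names are illustrative.

```r
# Remember the directory we started in.
start_dir <- getwd()

# setwd() changes the working directory and invisibly returns the previous one.
# tempdir() is used here only because it is guaranteed to exist.
previous <- setwd(tempdir())
print(getwd())

# Restore the original working directory.
setwd(previous)
print(getwd())
```

Saving the value returned by setwd() is a convenient way to return to the original directory without typing its path again.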

Comments:

Comments are like helping text in your R program and they are ignored by the interpreter while
executing your actual program. A single-line comment is written using # at the beginning of the
statement as follows:

# My first program in R Programming

R does not support multi-line comments, but you can use a trick such as the following:

if(FALSE) {
   "This is a demo for multi-line comments and it should be put inside either single or double quotes"
}

myString <- "Hello, World!"
print(myString)

Although the text above is still read by the R parser, the body of if(FALSE) is never evaluated, so it
does not interfere with your actual program. You should put such comments inside either single or
double quotes.

Reserved words in R programming:

Reserved words in R programming are a set of words that have special meaning and cannot be used
as an identifier (variable name, function name etc.). The reserved words in R's parser include if, else,
repeat, while, function, for, next, break, TRUE, FALSE, NULL, Inf, NaN, NA, NA_integer_, NA_real_,
NA_complex_ and NA_character_; typing ?Reserved at the prompt shows the full list.
Inf stands for "Infinity", for example when 1 is divided by 0, whereas NaN stands for "Not a Number",
for example when 0 is divided by 0. NA stands for "Not Available" and is used to represent missing
values.

R is a case-sensitive language, which means that TRUE and True are not the same.
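As a brief illustration of the special values and of case sensitivity described above, the sketch below uses variable names chosen only for this example. The attempt to assign to TRUE is wrapped in tryCatch() so that the script keeps running after the error.

```r
# Inf, NaN and NA are three distinct special values.
inf_value     <- 1 / 0    # division of a non-zero number by zero
nan_value     <- 0 / 0    # undefined result
missing_value <- NA       # a missing value

print(is.infinite(inf_value))
print(is.nan(nan_value))
print(is.na(missing_value))

# Reserved words cannot be used as variable names: this assignment fails.
result <- tryCatch({
  eval(parse(text = "TRUE <- 1"))
  "assigned"
}, error = function(e) "error")
print(result)

# R is case sensitive: total and Total are two different variables.
total <- 10
Total <- 20
print(total == Total)
```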

Variables in R:

Variables are used to store data whose value can be changed as needed. The unique name
given to a variable (or to a function or other object) is called an identifier.

Rules for writing Identifiers in R

1. Identifiers can be a combination of letters, digits, period (.) and underscore (_).

2. An identifier must start with a letter or a period. If it starts with a period, the period cannot be
followed by a digit.

3. Reserved words in R cannot be used as identifiers.

Valid identifiers in R:

total, Sum, .fine.with.dot, this_is_acceptable, Number5

Invalid identifiers in R:

tot@l, 5um, _fine, TRUE, .0ne
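One way to check these rules programmatically is sketched below. It relies on the base function make.names(), which rewrites any string that is not a syntactically valid name; the helper is_valid_identifier() is a hypothetical name introduced only for this example.

```r
# A name is syntactically valid exactly when make.names() leaves it unchanged.
is_valid_identifier <- function(name) {
  make.names(name) == name
}

valid   <- c("total", "Sum", ".fine.with.dot", "this_is_acceptable", "Number5")
invalid <- c("tot@l", "5um", "_fine", "TRUE", ".0ne")

print(is_valid_identifier(valid))    # every element should be TRUE
print(is_valid_identifier(invalid))  # every element should be FALSE
```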

Constants in R

Constants, as the name suggests, are entities whose value cannot be altered. The basic types of
constants are numeric constants and character constants.

R DATA TYPES:

In contrast to other programming languages like C and Java, variables in R are not declared with a
data type. Instead, variables are assigned R objects, and the data type of the R object
becomes the data type of the variable. There are many types of R objects.

The frequently used ones are –

• Vectors
• Lists
• Matrices
• Arrays
• Factors
• Data Frames

The simplest of these objects is the atomic vector, which comes in six data types (logical, integer,
double, complex, character and raw), also termed the six classes of vectors. The other R objects are
built upon atomic vectors.
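The short sketch below illustrates that a variable simply takes the type of whatever object is assigned to it, and lists the six atomic vector types. The variable names are illustrative only.

```r
# The same variable can hold objects of different types over time.
value <- 42L
first_class <- class(value)     # "integer"

value <- "forty-two"
second_class <- class(value)    # "character"

print(c(first_class, second_class))

# One example value for each of the six atomic vector types.
examples <- list(TRUE, 1L, 1.5, 1i, "a", raw(1))
atomic_types <- sapply(examples, typeof)
print(atomic_types)
```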

Understanding basic data types in R

• To make the best of the R language, you'll need a strong understanding of the basic data
types and data structures and how to operate on them.
• This is very important to understand because these are the things you will manipulate on a
day-to-day basis in R. They are the most common source of frustration among beginners.
• Everything in R is an object. R has 5 basic atomic classes (a sixth, raw, is rarely needed):
   • logical (e.g., TRUE, FALSE)
   • integer (e.g., 2L, as.integer(3))
   • numeric (real or decimal) (e.g., 2, 2.0, pi)
   • complex (e.g., 1 + 0i, 1 + 4i)
   • character (e.g., "a", "swc")

R provides several functions to examine an object:

typeof() # what is it?
class() # what is it? (sorry)
storage.mode() # what is it? (very sorry)
length() # how long is it? What about two dimensional objects?
attributes() # does it have any metadata?
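As a small sketch of how these inspection functions behave, the example below applies them to a plain numeric vector and to a matrix. The object names are chosen only for illustration.

```r
# A plain numeric vector has no attributes.
x <- c(1.5, 2, 3)
print(typeof(x))        # internal storage type
print(class(x))         # class seen by the user
print(storage.mode(x))
print(length(x))
print(attributes(x))

# A matrix is a vector that carries a "dim" attribute.
m <- matrix(1:6, nrow = 2)
print(length(m))        # total number of elements
print(attributes(m))    # contains the dimensions
```

For a two-dimensional object, length() counts all elements, while the dimensions are stored as metadata and can be read with attributes() or dim().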

R also has many data structures.

These include

• vector
• list
• matrix
• data frame
• factors (we will avoid these, but they have their uses)
• tables

Various examples:

x <- c(1, 2, 3)

x is a numeric vector. These are the most common kind. They are numeric objects and are treated as
double precision real numbers. To explicitly create integers, add an L at the end.

x1 <- c(1L, 2L, 3L)
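The difference between the two vectors can be checked directly. The sketch below reuses the names x and x1 from the text; everything else is illustrative.

```r
x  <- c(1, 2, 3)      # stored as double precision numbers
x1 <- c(1L, 2L, 3L)   # stored as integers

print(typeof(x))
print(typeof(x1))

# The values compare as equal element by element ...
print(all(x == x1))

# ... but the objects are not identical, because their storage types differ.
print(identical(x, x1))

# Converting the doubles to integers makes them identical.
print(identical(as.integer(x), x1))
```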

Matrix

Matrices are a special kind of vector in R. They are not a separate class of object but simply a
vector with dimensions added on to it.

Matrices have rows and columns.

m <- matrix(nrow = 2, ncol = 2)
m
dim(m) # the same information as attributes(m)

Matrices are constructed columnwise.

m <- matrix(1:6, nrow=2, ncol =3)

Other ways to construct a matrix:

m <- 1:10
dim(m) <- c(2,5)

This takes a vector and transforms it into a matrix with 2 rows and 5 columns.

Another way is to bind columns or rows using cbind() and rbind().

x <- 1:3
y <- 10:12
cbind(x,y) # or rbind(x,y)
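The three construction methods can be compared in one short sketch. The names by_column and reshaped are illustrative; x and y follow the text.

```r
# 1. matrix() fills the values column by column.
by_column <- matrix(1:6, nrow = 2, ncol = 3)
print(by_column)
print(by_column[1, ])   # first row: every second value of 1:6

# 2. Adding a dim attribute turns a vector into a matrix.
reshaped <- 1:10
dim(reshaped) <- c(2, 5)
print(reshaped[2, 3])   # row 2, column 3

# 3. cbind() and rbind() bind vectors as columns or as rows.
x <- 1:3
y <- 10:12
print(dim(cbind(x, y)))  # 3 rows, 2 columns
print(dim(rbind(x, y)))  # 2 rows, 3 columns
```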


Multiplication table:
How to create vector in R:
How to access element of vector:
How to modify a matrix in R?
DATA FRAME:
How to access components of a DATAFRAME:
ARRAYS:

Manipulating Array Elements

As an array is made up of matrices in multiple dimensions, operations on the elements of an array are
carried out by accessing the elements of its matrices.

Calculations across Array Elements

We can do calculations across the elements in an array using the apply() function.
For example, consider an array with 3 rows, 4 columns and two tables (matrices).
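A minimal sketch of such an array is given below. The values 1 to 24 and the name arr are assumptions made only for illustration; they are not the data from the original example.

```r
# An array with 3 rows, 4 columns and 2 tables, filled with 1..24.
arr <- array(1:24, dim = c(3, 4, 2))

# Access one element: row 2, column 3 of the first table.
print(arr[2, 3, 1])

# Extract the whole second table as a matrix.
second_table <- arr[, , 2]
print(dim(second_table))

# apply() over margin 3 gives one sum per table.
table_sums <- apply(arr, 3, sum)
print(table_sums)

# apply() over margin 1 gives one sum per row, taken across both tables.
row_sums <- apply(arr, 1, sum)
print(row_sums)
```

The second argument of apply() is the margin: 1 refers to rows, 2 to columns and 3 to the tables of a three-dimensional array.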

R FUNCTIONS:
