
UNIT I R PROGRAMMING LANGUAGE


R PROGRAMMING UNIT I

R PROGRAMMING LANGUAGE – INTRODUCTION


R is an open-source programming language that is widely used for statistical computing and data analysis. R ships with a command-line interface and is available on widely used platforms such as Windows, Linux, and macOS. It remains under active development.
It was designed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Core Team. R is an implementation of the S programming language, combined with lexical scoping semantics inspired by Scheme. The project was conceived in 1992, with an initial version released in 1995 and a stable beta version in 2000.
The basic elements of R syntax include:
• Variables, which store data
• Comments, which are used to improve code readability
• Keywords, reserved words that have a special meaning for the interpreter
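The three syntax elements listed above can be sketched in a few lines. This is a minimal illustration only; the names radius, area, and size are made up for the example and do not come from the original notes.

```r
# A comment: R ignores everything after '#' on a line.
radius <- 2               # a variable that stores a number
area <- pi * radius^2     # 'pi' is a constant built into R

# 'if' and 'else' are keywords (reserved words); they cannot be used as variable names.
if (area > 10) {
  size <- "large"
} else {
  size <- "small"
}
print(size)
```

Typing ?Reserved at the R prompt shows the full set of reserved words.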
R first appeared in 1993 and includes facilities for linear regression, machine learning algorithms, statistical inference, time series analysis, and more.
R is a cross-platform programming language that runs on Windows, Macintosh, UNIX, and Linux. It is often described as a different implementation of the S language and environment, and it is considered highly extensible.
**********
WHAT IS R?
R is a language and environment for statistical computing and graphics. It is a GNU
project which is similar to the S language and environment which was developed at Bell
Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R
can be considered as a different implementation of S. There are some important differences, but
much code written for S runs unaltered under R.
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests,
time-series analysis, classification, clustering, …) and graphical techniques, and is highly
extensible. The S language is often the vehicle of choice for research in statistical methodology,
and R provides an Open Source route to participation in that activity.
One of R’s strengths is the ease with which well-designed publication-quality plots can be
produced, including mathematical symbols and formulae where needed. Great care has been
taken over the defaults for the minor design choices in graphics, but the user retains full control.
R is available as Free Software under the terms of the Free Software Foundation’s GNU General
Public License in source code form. It compiles and runs on a wide variety of UNIX platforms
and similar systems (including FreeBSD and Linux), Windows and MacOS.
The R environment
R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes
• an effective data handling and storage facility,
• a suite of operators for calculations on arrays, in particular matrices,
• a large, coherent, integrated collection of intermediate tools for data analysis,
• graphical facilities for data analysis and display either on-screen or on hardcopy, and
• a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
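A few of the facilities in this list can be tried directly at the console. The sketch below is illustrative only: the names m, factorial_rec, and row_totals are assumptions made for the example, not part of R itself.

```r
# A 2 x 2 matrix, filled column by column: rows are (1, 3) and (2, 4).
m <- matrix(1:4, nrow = 2)

m_scaled  <- m * 10          # element-wise arithmetic on the whole matrix
m_product <- m %*% t(m)      # matrix product of m with its transpose

# A user-defined recursive function that uses a conditional.
factorial_rec <- function(n) {
  if (n <= 1) 1 else n * factorial_rec(n - 1)
}

# A loop that stores the sum of each row.
row_totals <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_totals[i] <- sum(m[i, ])
}

print(m_product)
print(factorial_rec(5))
print(row_totals)
```

In practice the built-in rowSums() would replace the loop; the loop is shown only because the list mentions loops explicitly.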

DEEN COLLEGE OF ARTS AND SCIENCE Page 1



The term “environment” is intended to characterize it as a fully planned and coherent system,
rather than an incremental accretion of very specific and inflexible tools, as is frequently the case
with other data analysis software.
R, like S, is designed around a true computer language, and it allows users to add additional
functionality by defining new functions. Much of the system is itself written in the R dialect of S,
which makes it easy for users to follow the algorithmic choices made. For computationally-
intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users
can write C code to manipulate R objects directly.
Many users think of R as a statistics system. We prefer to think of it as an environment within
which statistical techniques are implemented. R can be extended (easily) via packages. There are
about eight packages supplied with the R distribution and many more are available through the
CRAN family of Internet sites covering a very wide range of modern statistics.
R has its own LaTeX-like documentation format, which is used to supply comprehensive
documentation, both on-line in a number of formats and in hardcopy.
**********
WHY USE R?
• It is a great resource for data analysis, data visualization, data science and machine learning
• It provides many statistical techniques (such as statistical tests, classification, clustering and data reduction)
• It is easy to draw graphs in R, such as pie charts, histograms, box plots, scatter plots, etc.
• It works on different platforms (Windows, Mac, Linux)
• It is open-source and free
• It has large community support
• It has many packages (libraries of functions) that can be used to solve different problems
R is one of the most popular languages in the world of Data Science. It is heavily used for analyzing both structured and unstructured data, which has made R a standard language for statistical operations. R offers various features that set it apart from other Data Science languages. This section explains why R is worth learning and how it benefits you in the domain of Data Science.


There are various reasons to learn R; the major ones are listed below.
1. Why is R important for Data Science?
R plays an important role in Data Science and offers the following benefits.
• No separate compilation step – R is an interpreted language, so code runs without being compiled first. The interpreter executes each statement as it is entered, which makes development easier.
• Many calculations are done with vectors – R is a vector language, so a function or operator can be applied to a whole vector without writing a loop. Vectorized code is concise and usually faster than an equivalent explicit loop in R.
• Statistical language – R is used in biology and genetics as well as in statistics. R is a Turing-complete language, so in principle any computation can be expressed in it.
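The point about vectors can be made concrete with a short comparison. This is a minimal sketch; the variable names are illustrative.

```r
v <- c(1, 2, 3, 4, 5)

# Vectorized: the operator is applied to every element at once.
doubled <- v * 2
squares <- v^2

# The same doubling written as an explicit loop, for comparison.
looped <- numeric(length(v))
for (i in seq_along(v)) {
  looped[i] <- v[i] * 2
}

print(doubled)
print(identical(doubled, looped))
```

Both approaches give the same values; the vectorized form is shorter and leaves the looping to R's internals.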
2. Why is R good for business?
R helps not only in technical fields but also in business.
• The major reason is that R is open-source, so it can be modified and redistributed according to the user’s needs. It is well suited to visualization and offers more capabilities than many other tools.
• For data-driven businesses, the shortage of Data Scientists is a major concern. Companies use R as a core platform and recruit trained R programmers.
3. R is a gateway to a lucrative career
R is used extensively in Data Science, a field that offers some of the highest-paying jobs today. Data Scientists who are proficient in R are reported to earn more than $117,000 (Rs 80,56,093) per year on average. If you want to enter the field of Data Science, R is well worth learning.
4. Open-source
R is an open-source language. It is maintained by a community of active users and is available free of charge. You can modify functions in R and create your own packages. Since R is issued under the GNU General Public License (GPL), it can be used, modified, and redistributed freely under the terms of that license.
5. Popularity
R has become one of the most popular programming languages in industry. Traditionally, R was used mostly in academia, but with the emergence of Data Science the need for R in industry became evident. R is used at Facebook for social network analysis and at Twitter for semantic analysis as well as visualizations.
6. Robust Visualization Library
R offers libraries such as ggplot2 and plotly that produce attractive graphical plots. R is widely recognized for its visualizations, which give it an edge over other Data Science programming languages.
7. With R, you can develop amazing Web-Apps
R provides you with the ability to build aesthetic web-applications. Using the R Shiny package,
you can develop interactive dashboards straight from the console of your R IDE. Using this, you
can embed your visualizations and enhance the storytelling of your data analysis through
aesthetic visualizations.


8. R enjoys a vast Community Support


R is supported by a vast community that maintains and updates it. If you run into trouble with your R code, you can get help from the community on sites like Stack Overflow. There are several communities around the world that organize bootcamps and R meetups.
9. A go-to language for Statistics and Data Science
R is a standard language for Statistics and Data Science. R was developed for statistics, by statisticians, and was in use even before the term “Data Science” became common. Statisticians and Data Scientists are often more familiar with R than with any other programming language. R facilitates various statistical operations through its thousands of packages.
10. R is being used in almost every industry
R is one of the most widely used programming languages for data analysis today. It is used in many industries, ranging from finance and banking to medicine and manufacturing. R is used for portfolio management and risk analytics in the finance and banking industries. It is used for drug discovery and genomic analysis in bioinformatics. R is also used to implement various statistical measures to optimize industrial processes.
**********
FEATURES OF R PROGRAMMING LANGUAGE
Statistical Features of R:
• Basic Statistics: The most common basic statistics are the mean, mode, and median, known collectively as “Measures of Central Tendency.” R makes it easy to compute them.
• Static graphics: R is rich in facilities for creating static graphics. It contains functionality for many plot types, including graphic maps, mosaic plots, biplots, and more.
• Probability distributions: Probability distributions play a vital role in statistics, and R handles many of them easily, such as the Binomial, Normal, and Chi-squared distributions.
• Data analysis: R provides a large, coherent and integrated collection of tools for data analysis.
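The statistical features above can be sketched with base R functions. The data in scores are made up for illustration. Note one assumption worth flagging: base R has no built-in function for the statistical mode (the function mode() reports an object's storage type instead), so the sketch derives it from a frequency table.

```r
scores <- c(2, 4, 4, 5, 7, 9, 4)   # illustrative data, not from the notes

mean_score   <- mean(scores)       # arithmetic mean
median_score <- median(scores)     # middle value of the sorted data

# Statistical mode: the most frequent value, taken from a frequency table.
counts <- table(scores)
most_common <- as.numeric(names(counts)[which.max(counts)])

# Probability distributions follow a d/p/q/r naming scheme.
p_two_heads <- dbinom(2, size = 4, prob = 0.5)  # P(exactly 2 heads in 4 fair tosses)
p_below_0   <- pnorm(0)                         # P(Z <= 0) for a standard normal

print(c(mean = mean_score, median = median_score, mode = most_common))
print(p_two_heads)
print(p_below_0)
```

The same prefix scheme applies to other distributions, for example dchisq() and pchisq() for the Chi-squared distribution.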
Programming Features of R:
• R Packages: One of the major features of R is its wide range of libraries. CRAN (Comprehensive R Archive Network) is a repository holding more than 10,000 packages.
• Distributed Computing: Distributed computing is a model in which components of a software system are shared among multiple computers to improve efficiency and performance. Two packages for distributed programming in R, ddR and multidplyr, were released in November 2015.
Programming in R:
R's syntax resembles that of other widely used languages, which makes it easier to learn and code in. Programs can be written in any of the widely used R environments, such as RStudio, Rattle, or Tinn-R. After writing the program, save the file with the extension .r. To run the program, use the following command on the command line:
Rscript file_name.r
Example:
# R program to print Welcome to GFG!
# Below line will print "Welcome to GFG!"
cat("Welcome to GFG!")
Output:
Welcome to GFG!
**********
R ADVANTAGES AND DISADVANTAGES
R is one of the most popular programming languages for statistical modeling and analysis. Like other programming languages, R has advantages and disadvantages. It is a continuously evolving language, so some of its drawbacks may fade with future updates. The main pros and cons of R are listed below.

Pros
1) Open Source
An open-source language is a language on which we can work without any need for a license or
a fee. R is an open-source language. We can contribute to the development of R by optimizing
our packages, developing new ones, and resolving issues.
2) Platform Independent
R is a platform-independent (cross-platform) programming language, which means its code can run on all major operating systems. R enables programmers to develop software for several competing platforms by writing a program only once. R runs on Windows, Linux, and Mac.
3) Machine Learning Operations
R allows us to perform various machine learning operations such as classification and regression. For this purpose, R provides packages for building models such as artificial neural networks.
4) Exemplary support for data wrangling
R allows us to perform data wrangling. Packages such as dplyr and readr can transform messy data into a structured form.
5) Quality plotting and graphing
R simplifies quality plotting and graphing. Libraries such as ggplot2 and plotly produce visually appealing graphs, which sets R apart from many other programming languages.
6) The array of packages
R has a rich set of packages. The CRAN repository holds over 10,000 packages, and the number keeps growing. These include packages for data science and machine learning.
7) Statistics
R is mainly known as the language of statistics. This is the main reason why R is preferred over other programming languages for developing statistical tools.
8) Continuously Growing
R is a constantly evolving programming language: new features and packages are released regularly.
Cons
1) Data Handling
In R, objects are stored in physical memory, so an entire data set normally has to fit in RAM at once. R is often reported to use more memory than Python for comparable tasks. This makes it a less than ideal option when dealing with Big Data.
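Since objects live in memory, it can help to check how much space an object occupies. The following is a minimal sketch using the base function object.size(); the object names are illustrative.

```r
small <- numeric(10)      # ten doubles
large <- numeric(1e6)     # one million doubles, 8 bytes each plus a small header

size_small <- as.numeric(object.size(small))   # size in bytes
size_large <- as.numeric(object.size(large))

print(size_small)
print(format(object.size(large), units = "MB"))
```

Checking sizes this way gives a rough idea of whether a data set will fit comfortably in the available memory.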
2) Basic Security
R lacks built-in security features of the kind found in many general-purpose languages such as Python. Because of this, there are restrictions on embedding R in a web application.
3) Complicated Language
R is a complicated language with a steep learning curve. People without prior programming experience may find it difficult to learn.
4) Weak Origin
Base R has limited support for dynamic or 3D graphics. The reason lies in its origin: R descends from the much older programming language S.
5) Lesser Speed
R can be much slower than other programming languages such as MATLAB and Python, and code in R packages can be slower than comparable code in other languages.
6) Spread Across Packages
In R, algorithms are spread across different packages. Programmers with no prior knowledge of the relevant packages may find it difficult to implement algorithms.
**********
GETTING STARTED WITH R AND RSTUDIO
Getting Started with RStudio
RStudio is an open-source tool for programming in R. RStudio is a flexible tool that helps you
create readable analyses, and keeps your code, images, comments, and plots together in one
place. It’s worth knowing about the capabilities of RStudio for data analysis and programming in
R.

Using RStudio for data analysis and programming in R provides many advantages. Here are a few examples of what RStudio provides:
• An intuitive interface that lets us keep track of saved objects, scripts, and figures
• A text editor with features like color-coded syntax that helps us write clean scripts
• Autocomplete features that save time
• Tools for creating documents containing a project’s code, notes, and visuals
• Dedicated Project folders to keep everything in one place
RStudio can also be used to program in other languages including SQL, Python, and Bash, to
name a few.
But before we can install RStudio, we’ll need to have a recent version of R installed on our
computer.
1. Install R
R is available to download from the official R website. Look for the download section of the web page.

The version of R to download depends on our operating system. Below, we include installation
instructions for Mac OS X, Windows, and Linux (Ubuntu).
MAC OS X
• Select the Download R for (Mac) OSX option.
• Look for the most up-to-date version of R (new versions are released frequently and appear toward the top of the page) and click the .pkg file to download.
• Open the .pkg file and follow the standard instructions for installing applications on MAC OS X.
• Drag and drop the R application into the Applications folder.
Windows
• Select the Download R for Windows option.
• Select base, since this is our first installation of R on our computer.
• Follow the standard instructions for installing programs for Windows. If we are asked to select Customize Startup or Accept Default Startup Options, choose the default options.
Linux/Ubuntu
• Select the Download R for Linux option.
• Select the Ubuntu option.
• Alternatively, select the Linux package management system relevant to you if you are not using Ubuntu.
RStudio is compatible with many versions of R (R version 3.0.1 or newer as of July, 2020).
Installing R separately from RStudio enables the user to select the version of R that fits their
needs.
2. Install RStudio
Now that R is installed, we can install RStudio. Navigate to the RStudio downloads page.
When we reach the RStudio downloads page, let’s click the “Download” button of the RStudio
Desktop Open Source License Free option:

Our operating system is usually detected automatically and so we can directly download the
correct version for our computer by clicking the “Download RStudio” button. If we want to
download RStudio for another operating system (other than the one we are running), navigate
down to the “All installers” section of the page.


3. First Look at RStudio


When we open RStudio for the first time, we’ll see a layout with several panes. By default the background color is white; the appearance of RStudio can be customized later.
When we open RStudio, R is launched as well. A common mistake by new users is to open R
instead of RStudio. To open RStudio, search for RStudio on the desktop, and pin the RStudio
icon to the preferred location (e.g. Desktop or toolbar).
4. The Console
Let’s start off by introducing some features of the Console. The Console is a tab in RStudio
where we can run R code.
Notice that the window pane where the console is located contains three
tabs: Console, Terminal and Jobs (this may vary depending on the version of RStudio in use).
We’ll focus on the Console for now.
When we open RStudio, the console contains information about the version of R we’re working
with. Scroll down, and try typing a few expressions like this one. Press the enter key to see the
result.
1+2
As we can see, we can use the console to test code immediately. When we type an expression like 1 + 2 and hit the enter key, the console prints the result, [1] 3.


We can store the output of this command as a variable. Here, we’ve named our variable result:
result <- 1 + 2
The <- is called the assignment operator. This operator assigns values to variables. The command
above is translated into a sentence as:
> The result variable gets the value of one plus two.
One nice feature of RStudio is the keyboard shortcut for typing the assignment operator <-:
• Mac OS X: Option + -
• Windows/Linux: Alt + -
We highly recommend that you memorize this keyboard shortcut because it saves a lot of time
in the long run!
When we type result into the console and hit enter, we see the stored value of 3:
> result <- 1 + 2
> result
[1] 3
When we create a variable in RStudio, it saves it as an object in the R global environment.
We’ll discuss the environment and how to view objects stored in the environment in the next
section.
5. The Global Environment
We can think of the global environment as our workspace. During a programming session in R,
any variables we define, or data we import and save in a dataframe, are stored in our global
environment. In RStudio, we can see the objects in our global environment in
the Environment tab at the top right of the interface:


We’ll see any objects we created, such as result, under values in the Environment tab. Notice that
the value, 3, stored in the variable is displayed.
Sometimes, having too many named objects in the global environment creates confusion. Maybe
we’d like to remove all or some of the objects. To remove all objects, click the broom icon at the
top of the window:

To remove selected objects from the workspace, select the Grid view from the dropdown menu:

Here we can check the boxes of the objects we’d like to remove and use the broom icon to clear
them from our Global Environment.
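The Environment tab has console equivalents in base R. The sketch below is hedged: the object scratch is a made-up example, and the last comment mentions, without running it, the call that clears the whole workspace.

```r
result  <- 1 + 2
scratch <- "temporary value"     # hypothetical object we will remove again

objects_before <- ls()           # names of the objects in the current environment
rm(scratch)                      # remove one object by name
still_there <- exists("scratch") # FALSE once the object is gone

print(objects_before)
print(still_there)

# rm(list = ls()) would remove every object, much like the broom icon.
```

Using ls() and rm() makes the clean-up reproducible inside a script, whereas the broom icon only works interactively.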
**********

R COMMAND PROMPT
Once the R environment is set up, start the R command prompt by typing the following command at your system's command prompt:
$ R
This launches the R interpreter and displays a prompt > where you can start typing your program as follows:
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
Here the first statement defines a string variable myString and assigns it the string "Hello, World!"; the next statement uses print() to print the value stored in myString.
Using the R command line
The basic way to interact with R is through the command line interface. In RStudio, this
command line interaction occurs in the command console. R is an interpreted programming
language. This means that R will interpret each line of code as it is entered and, if it is valid, R
will execute it, returning the result in the command console. This is a more direct interaction than
a compiled programming language, where you edit the code, compile it, run the executable, and
receive the output result. The immediate feedback of an interpreted interface makes R relatively
easy to learn and work with. Simply enter your code, press the ENTER key, and get the result.
A short example exercise will help demonstrate the R interpretive command line interface in the
RStudio Command console.
Type: 45 + 56 then press ENTER
The result 101 is returned to the command console
Type: x <- 34 then press ENTER
Type: y <-16 then press ENTER
Type: x - y then press ENTER
The result 18 is returned to the command console
Type: y/x then press ENTER
The result 0.4705882 is returned to the command console
Type: x - z then press ENTER
The result Error: object 'z' not found is returned to the command console. R gave you this
error message because you do not have an object named z.
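The same exercise can be written so that the missing object does not stop execution. This is a sketch only: undefined_object is a deliberately undefined, made-up name, and tryCatch() is used here to capture the error message rather than let it halt the code.

```r
x <- 34
y <- 16

difference <- x - y      # 18
ratio      <- y / x      # about 0.4705882

# Check whether an object exists before using it.
has_object <- exists("undefined_object")

# Capture the error R raises when an undefined object is used.
msg <- tryCatch(
  x - undefined_object,
  error = function(e) conditionMessage(e)
)

print(difference)
print(ratio)
print(has_object)
print(msg)
```

At the interactive console the plain error message is usually all that is needed; tryCatch() becomes useful once code runs unattended in a script.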
One last example:
Create an object v containing a vector of numbers
Type: v <- c(1, 2, 3, 4, 5, 6) then press ENTER [the function c( ) will be explained later when
we explore the R functions]
You can view the contents of v
Type: v
The result 1 2 3 4 5 6 is returned to the command console
Start to type the function name mean( )
As you type each letter, RStudio begins to suggest available objects and functions from your
session environment. Any object active in the Environment panel or any function in the active
packages in the Packages panel will be recommended by RStudio for your use. This RStudio
recommendation system makes command line interaction easier. You do not have to remember
all of the active objects or functions. You can pick the one you want to use from the
recommendation list.
Once you choose the mean( ) function, R inserts it on the command line and places your input
cursor inside the function parentheses so you can fill in the function arguments.
To finish our example
Type: v as an argument to the mean( ) function and press ENTER
The result 3.5 is returned to the command console
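As a quick check of this result, the mean can be recomputed by hand from the sum and the number of elements. The sketch also confirms that c() builds a vector rather than a list.

```r
v <- c(1, 2, 3, 4, 5, 6)

avg    <- mean(v)               # built-in mean
manual <- sum(v) / length(v)    # 21 / 6, the same calculation by hand

print(avg)
print(manual)
print(is.vector(v))   # TRUE: c() creates an atomic vector
print(is.list(v))     # FALSE: it is not a list
```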
You can also copy and paste code from earlier in your R session and run it again. Simply highlight the line of code and copy it to the clipboard [use Ctrl+C in Windows or ⌘+C in Mac]. Go to the command prompt [you can simply press the Down Arrow on your keyboard and your cursor will jump to the command prompt]. Paste the code at the command prompt [use Ctrl+V in Windows or ⌘+V in Mac], then press ENTER.
You can now interact with R using the command line interface. As you develop your skills using
R, you may want to save your work for a later session or run a block of code to initialize your
session environment. This can be easily done with an R script. The next topic will introduce you
to R scripts and how they can expand your capabilities.
**********


R SCRIPT FILE
Usually, you will write your programs in script files and then execute those scripts at the command prompt with Rscript, the command-line front end to the R interpreter. Start by writing the following code in a text file called test.R:
# My first program in R Programming
myString <- "Hello, World!"
print(myString)
Save the above code in the file test.R and execute it at the Linux command prompt as shown below. The syntax is the same on Windows and other systems.
$ Rscript test.R
When we run the above program, it produces the following result.
[1] "Hello, World!"
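A script can also be run from inside an R session with source(), which is a different route from the Rscript command shown above. The sketch below writes the same three lines to a temporary file (an assumption made so the example is self-contained) and then runs them.

```r
# Write the example script to a temporary file.
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "# My first program in R Programming",
  "myString <- \"Hello, World!\"",
  "print(myString)"
), script_path)

# Run the script in the current session; its print() output appears in the console.
source(script_path)

# Clean up the temporary file.
unlink(script_path)
```

Rscript starts a fresh R process for each run, while source() executes the file in the session that is already open, so objects such as myString remain available afterwards.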
R scripts
While entering and running your code at the R command line is effective and simple, this technique has its limitations. Each time you want to execute a set of commands, you have to re-enter them at the command line. Complex commands are prone to typographical errors, which means they must be re-entered correctly. Repeating a set of operations requires re-entering the whole code stream. Fortunately, R and RStudio provide a way to mitigate these issues: R scripts.
A script is simply a text file containing a set of commands and comments. The script can be
saved and used later to re-execute the saved commands. The script can also be edited so you can
execute a modified version of the commands.
Creating an R script
It is easy to create a new script in RStudio. You can open a new empty script by clicking
the New File icon in the upper left of the main RStudio toolbar. This icon looks like a white
square with a white plus sign in a green circle. Clicking the icon opens the New File Menu.
Click the R Script menu option and the script editor will open with an empty script.

Figure 1 - RStudio New Script Menu


Once the new script opens in the Script Editor panel, the script is ready for text entry, and your
RStudio session will look like this.


Figure 2 - RStudio with Script Editor Panel


Here is an easy example to familiarize you with the Script Editor interface. Type the following code into your new script [later topics will explain what the specific code components do].
# this is my first R script
# do some things
x = 34
y = 16
z = x + y # addition
w = y/x # division
# display the results
x
y
z
w
# change x
x = "some text"
# display the results
x
y
z
w


Figure 3 - R Script Example


There, you now have your first R script. Notice how the editor places a number in front of each
line of code. The line numbers can be helpful as you work with your code. Before proceeding on
to executing this code, it would be a good idea to learn how to save your script.
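One detail of the example script is worth checking before running it: changing x to text does not change z or w, because they were computed from the earlier numeric value. A condensed sketch of the same steps, written with the <- assignment operator:

```r
x <- 34
y <- 16
z <- x + y     # 50, computed while x is still numeric
w <- y / x     # about 0.4705882

x <- "some text"   # x now holds a character string

print(class(x))    # the type of x has changed
print(z)           # z keeps the value computed earlier
print(w)
```

R stores the result of an expression, not the expression itself, so later changes to x are not propagated to z and w.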
Saving an R script
You can save your script by clicking on the Save icon at the top of the Script Editor panel.
When you do that, a Save File dialog will open.

Figure 4 - Save File Dialog


The default script name is Untitled.R. The Untitled part is highlighted. You will save this script
as First script.R. Start typing First script. RStudio overwrites the highlighted default name with
your new name, leaving the .R file extension. The Save File dialog should now look like this.

Figure 5 - Save First script.R


Notice that RStudio will save your script to your current working folder. An earlier topic in this
learning infrastructure explained how to set your default working folder, so that will not be
addressed here. Press the Save button and your script is saved to your working folder. Notice that
the name in the file tab at the top of the Script Editor panel now shows your saved script file
name.
Be aware that, while it is not necessary to use an .R file extension for your R scripts, it does make it easier for RStudio to work with them if you use this file extension.
That is how you save your script files to your working folder.
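To see where a script will be saved, the working folder can be queried from the console. This is a small sketch; the file name matches the example above, and whether the file exists depends on your own session.

```r
working_dir <- getwd()                                   # current working folder
script_path <- file.path(working_dir, "First script.R")  # full path to the saved script

already_saved <- file.exists(script_path)   # TRUE only if the script has been saved there

print(script_path)
print(already_saved)
```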
Opening an R script
Opening a saved R script is easy to do. Click on the Open an existing file icon in the RStudio
toolbar. A Choose file dialog will open.

Figure 6 - RStudio Open Script Dialog


Select the R script you want to open [this is one place where the .R file extension comes in
handy] and click the Open button. Your script will open in the Script Editor panel with the script
name in an editor tab.


Working through an example may be helpful. We will use the script you created above [First
script.R] for this exercise. First, you will need to close the script. You can close this script by
clicking the X in the right side of the editor tab where the script name appears. Since you only
had one script open, when you close First script.R, the Script Editor panel disappears.

Now, click on the Open an existing file icon in the RStudio toolbar. The Choose file dialog will
open. Select First script.R and then press the Open button in the dialog. Your script is now
open in the Script Editor panel and ready to use.
Executing code in an R script
You can run the code in your R script easily. The Run button in the Script Editor panel toolbar
will run either the current line of code or any block of selected code. You can use your First
script.R code to gain familiarity with this functionality.
Place the cursor anywhere in line 3 of your script [x = 34]. Now press the Run button in the
Script Editor panel toolbar. Three things happen: 1) the code is transferred to the command
console, 2) the code is executed, and 3) the cursor moves to the next line in your script. Press
the Run button three more times. RStudio executes lines 4, 5, and 6 of your script.
Now you will run a set of code commands all at once. Highlight lines 8, 9, 10, and 11 in the
script.

Figure 7 - Highlighted Script Code


Highlighting works much as it does in a word processor. Click the left mouse button at the
beginning of the text you want to highlight, hold the button down, drag the cursor to the end of
the text, and release the button. With those four lines of code highlighted, click the editor Run
button. All four lines of code are executed in the command console. That is all it takes to run
script code in RStudio.
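The Run button sends code to the console a line or a selection at a time. As an aside, a whole
script file can also be run from the console with source(). The sketch below is only an
illustration: it writes a throwaway two-line script to a temporary file (script_path is a
hypothetical name) that stands in for a saved file such as First script.R.

```r
# Write a small two-line script to a temporary file (stand-in for a saved .R script).
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 34", "y <- x + 8"), script_path)

# source() runs every line of the script, from top to bottom.
source(script_path)

# The variables created by the script are now available in the workspace.
print(y)
```

For a saved script, the path passed to source() would be the location of that file instead of the temporary one.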
Comments in an R script [documenting your code]
Before finishing this topic, there is one final concept you should understand. It is always a good
idea to place comments in your code. They will help you understand what your code is meant to
do. This will become helpful when you reopen code you wrote weeks ago and are trying to work
with again. The saying, "Real programmers do not document their code. If it was hard to write, it
should be hard to understand" is meant to be a dark joke, not a coding style guide.

Figure 8 - R Script Example [with comments]


A comment in R code begins with the # symbol. Your code in First script.R contains several
examples of comments. Lines 1, 2, 7, 12, and 14 in the image above are all comment lines. Any
line of text that starts with # will be treated as a comment and will be ignored during code
execution. Lines 5 and 6 in this image contain comments at the end. All text after the # is treated
as a comment and is ignored during execution.
Notice how the RStudio editor shows these comments colored green. The green color helps you
focus on the code and not get confused by the comments.

Besides using comments to help make your R code more easily understood, you can use
the # symbol to ignore lines of code while you are developing your code stream. Simply place
a # in front of any line that you want to ignore. R will treat those lines as comments and ignore
them. When you want to include those lines again in the code execution, remove the # symbols
and the code is executable again. This technique allows you to change what code you execute
without having to retype deleted code.
**********


COMMENTS
Comments are plain-language notes written in a program to explain what it does or what a piece
of code is supposed to do. They carry information for the programmer and have nothing to do with
the logic of the code. They are completely ignored by the R interpreter and are therefore never
reflected in the output.
How does the interpreter know whether a given line is a comment or a statement?
The answer is simple. Most languages use a symbol to mark a comment, and when the interpreter
encounters that symbol it treats the text that follows as a comment rather than as code.
Comments are generally used for the following purposes:
• Code Readability
• Explanation of the code or Metadata of the project
• Prevent execution of code
• To include resources
Types of Comments
Programming languages generally support three types of comments:
Single-line Comments - comments that need only one line.
Multi-line Comments - comments that span more than one line.
Documentation Comments - comments written to provide quick reference documentation.
Note: R doesn't support multi-line or documentation comments. It supports only single-line
comments, which begin with the '#' symbol.
Comments in R
As stated in the Note provided above, currently R doesn’t have support for Multi-line comments
and documentation comments. R provides its users with single-lined comments in order to add
information about the code.
Single-Line Comments in R
Single-line comments are comments that require only one line. They are usually drafted to
explain what a single line of code does or what it is supposed to produce so that it can help
someone referring to the source code.
Just as in Python, any text that starts with "#" is a comment in R.
Syntax:
# comment statement
Example 1:

# geeksforgeeks

The above code produces no output when executed, because R treats the line as a comment and the
interpreter ignores it.
Example 2:

# R program to add two numbers


# Assigning values to variables
a <- 9
b <- 4
# Printing sum
print(a + b)

Output:
[1] 13
Commenting Multiple Lines
As stated earlier, R doesn't support multi-line comments. To make commenting easier, however,
RStudio can comment out several single lines at once. There are two ways to do this in RStudio:
First way: Select the lines you want to comment with the cursor, then press the key
combination "Ctrl + Shift + C" to comment or uncomment the selected lines.
Second way: Use the GUI. Select the lines you want to comment with the cursor and click
"Code" in the menu bar. In the menu that opens, choose "Comment/Uncomment Lines", which comments
or uncomments the selected lines.


This makes the process of commenting a block of code easier and faster than adding # before
each line one at a time.
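Outside RStudio, a commonly used alternative for skipping a block is to wrap it in if (FALSE) { }.
This is not a comment: the block must still be valid R code because it is parsed, but it is never
executed. A minimal sketch, with illustrative variable names:

```r
x <- 1

# The block below is parsed but never executed, because the condition is FALSE.
if (FALSE) {
  x <- x + 100
  print("this line never runs")
}

# x still holds its original value.
print(x)
```

Changing FALSE to TRUE re-enables the whole block in a single edit.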
**********
HANDLING PACKAGES IN R
R packages are collections of R functions, compiled code and sample data. They are stored
under a directory called "library" in the R environment. By default, R installs a set of packages
during installation. More packages can be added later, when they are needed for some specific
purpose. When we start the R console, only the default packages are available. Other
packages which are already installed have to be loaded explicitly before an R program can use
them.
All the packages available in R language are listed at R Packages.
Below is a list of commands to be used to check, verify and use the R packages.
Check Available R Packages
Get library locations containing R packages
.libPaths()


When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.
[2] "C:/Program Files/R/R-3.2.2/library"
Get the list of all the packages installed
library()
When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.
Packages in library ‘C:/Program Files/R/R-3.2.2/library’:
base The R Base Package
boot Bootstrap Functions (Originally by Angelo Canty for S)
class Functions for Classification
cluster "Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.
codetools Code Analysis Tools for R
compiler The R Compiler Package
datasets The R Datasets Package
foreign Read Data Stored by 'Minitab', 'S', 'SAS','SPSS', 'Stata', 'Systat', 'Weka', 'dBase', ...
graphics The R Graphics Package
grDevices The R Graphics Devices and Support for Colours and Fonts
grid The Grid Graphics Package
KernSmooth Functions for Kernel Smoothing Supporting Wand & Jones (1995)
lattice Trellis Graphics for R
MASS Support Functions and Datasets for Venables and Ripley's MASS
Matrix Sparse and Dense Matrix Classes and Methods
methods Formal Methods and Classes
mgcv Mixed GAM Computation Vehicle with GCV/AIC/REML Smoothness Estimation
nlme Linear and Nonlinear Mixed Effects Models
nnet Feed-Forward Neural Networks and Multinomial Log-Linear Models
parallel Support for Parallel computation in R
rpart Recursive Partitioning and Regression Trees
spatial Functions for Kriging and Point Pattern Analysis
splines Regression Spline Functions and Classes
stats The R Stats Package
stats4 Statistical Functions using S4 Classes
survival Survival Analysis
tcltk Tcl/Tk Interface
tools Tools for Package Development
utils The R Utils Package
Get all packages currently loaded in the R environment
search()
When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
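The commands above list everything at once. To check for one particular package, the same
information can be tested programmatically. The sketch below uses installed.packages() and
search(); the package name "stats" is chosen only because it ships with R.

```r
# Names of all installed packages, taken from the row names of the matrix
# returned by installed.packages().
installed <- rownames(installed.packages())

# Is a given package installed?
stats_installed <- "stats" %in% installed
print(stats_installed)

# Is it currently attached to the search path?
stats_attached <- "package:stats" %in% search()
print(stats_attached)
```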
Install a New Package
There are two ways to add new R packages. One is installing directly from the CRAN repository
and the other is downloading the package to your local system and installing it manually.

Install directly from CRAN


The following command gets the package directly from the CRAN repository and installs it in
the R environment. You may be prompted to choose a mirror. Choose the one nearest to your
location.
install.packages("Package Name")
# Install the package named "XML".
install.packages("XML")
Install package manually
Go to the link R Packages to download the package needed. Save the package as a .zip file in a
suitable location in the local system.
Now you can run the following command to install this package in the R environment.
install.packages(file_name_with_path, repos = NULL, type = "source")
# Install the package named "XML"
install.packages("E:/XML_3.98-1.3.zip", repos = NULL, type = "source")
Load Package to Library
Before a package can be used in the code, it must be loaded into the current R environment. This
also applies to a package that was installed earlier but is not yet available in the current
environment.
A package is loaded using the following command:
library("package Name", lib.loc = "path to library")
# Load the package named "XML"
library("XML")
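Loading a package that is not installed stops the script with an error. One way to guard against
that is to test for the package first with requireNamespace(). The helper below is a hypothetical
sketch (load_if_installed is not part of R), and the second package name is deliberately made up
to show the failure path.

```r
# Attach a package only if it is installed; otherwise print a hint and
# return FALSE instead of stopping with an error.
load_if_installed <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    library(pkg, character.only = TRUE)  # pkg holds the name as a string
    return(TRUE)
  }
  message("Package '", pkg, "' is not installed; install.packages() can add it.")
  FALSE
}

ok_stats <- load_if_installed("stats")               # ships with R
ok_fake  <- load_if_installed("notARealPackage123")  # hypothetical name
print(c(ok_stats, ok_fake))
```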
**********
PACKAGE DESCRIPTION
Description
Parses and returns the ‘DESCRIPTION’ file of a package as a "packageDescription".
Utility functions return (transformed) parts of that.
Usage
packageDescription(pkg, lib.loc = NULL, fields = NULL,
drop = TRUE, encoding = "")
packageVersion(pkg, lib.loc = NULL)
packageDate(pkg, lib.loc = NULL,
date.fields = c("Date", "Packaged", "Date/Publication", "Built"),
tryFormats = c("%Y-%m-%d", "%Y/%m/%d", "%D", "%m/%d/%y"),
desc = packageDescription(pkg, lib.loc=lib.loc, fields=date.fields))
asDateBuilt(built)
Arguments
pkg          a character string with the package name.
lib.loc      a character vector of directory names of R libraries, or NULL. The default value
             of NULL corresponds to all libraries currently known. If the default is used, the
             loaded packages and namespaces are searched before the libraries.
fields       a character vector giving the tags of fields to return (if other fields occur in
             the file they are ignored).
drop         If TRUE and the length of fields is 1, then a single character string with the
             value of the respective field is returned instead of an object of
             class "packageDescription".
encoding     If there is an Encoding field, to what encoding should re-encoding be attempted?
             If NA, no re-encoding. The other values are as used by iconv, so the
             default "" indicates the encoding of the current locale.
date.fields  character vector of field tags to be tried. The first for which as.Date(.) is not
             NA will be returned. (Partly experimental, see Note.)
tryFormats   date formats to try, see as.Date.character().
desc         optionally, a named list with components named from date.fields; where the
             default is fine, a complete packageDescription() may be specified as well.
built        for asDateBuilt(), a character string as from packageDescription(*, fields="Built").
Details
A package will not be ‘found’ unless it has a ‘DESCRIPTION’ file which contains a
valid Version field. Different warnings are given when no package directory is found and when
there is a suitable directory but no valid ‘DESCRIPTION’ file.
An attached environment named to look like a package (e.g., package:utils2) will be ignored.
packageVersion() is a convenience shortcut, allowing things like if (packageVersion("MASS") <
"7.3") { do.things } .
For packageDate(), if desc is valid, both pkg and lib.loc are not made use of.
Value
If a ‘DESCRIPTION’ file for the given package is found and can successfully be
read, packageDescription returns an object of class "packageDescription", which is a named list
with the values of the (given) fields as elements and the tags as names, unless drop = TRUE.
If parsing the ‘DESCRIPTION’ file was not successful, it returns a named list of NAs with the
field tags as names if fields is not null, and NA otherwise.
packageVersion() returns a (length-one) object of class "package_version".
packageDate() will return a "Date" object from as.Date() or NA.
asDateBuilt(built) returns a "Date" object or signals an error if built is invalid.
Note
The default behavior of packageDate(), notably for date.fields, is somewhat experimental and
may change.
Examples
packageDescription("stats")
packageDescription("stats", fields = c("Package", "Version"))
packageDescription("stats", fields = "Version")
packageDescription("stats", fields = "Version", drop = FALSE)
if(requireNamespace("MASS") && packageVersion("MASS") < "7.3.29")
message("you need to update 'MASS'")
pu <- packageDate("utils")
str(pu)
stopifnot(identical(pu, packageDate(desc = packageDescription("utils"))),
identical(pu, packageDate("stats"))) # as "utils" and "stats" are
# both 'base R' and "Built" at same time
**********

HELP()
The help() function in R is used to get help on any given R function passed to it.
Syntax
help(function name)
Parameter value
The help() function takes a function name as its argument, which can be the name of any R
function.
Return value
The help() function opens the official documentation page of the function passed to it.
Example
In the code below, we’ll use the help function to get help on the following R functions:
eval() function
dump() function
# implementing the help() function
help(eval)

Explanation
In the code above, we use the help() function to provide the official documentation page for the R
function eval().
Now, let’s use the help() function to get help on the dump() function. In other words, we use
the help() function to provide the official documentation page for the R function dump.
# implementing the help() function
help(dump)

Explanation
In the code above, we use the help() function to provide help on the R function dump() by simply
returning the R official documentation page for the dump() function.
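The topic can also be passed to help() as a character string, and the result can be stored in a
variable. The short sketch below shows that form; the variable name h is illustrative, and the
page is only displayed when the stored result is printed.

```r
# help() accepts the topic as a name or as a character string.
h <- help("dump")

# The result is an object describing the matching help files; printing it
# (for example by typing h at the console) displays the page.
print(class(h))

# Shorthand with the same effect as help(dump) when typed at the console:
# ?dump
```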
**********
FIND PACKAGES
Description
Find the paths to one or more packages.
Usage
find.package(package, lib.loc = NULL, quiet = FALSE,
verbose = getOption("verbose"))
path.package(package, quiet = FALSE)
packageNotFoundError(package, lib.loc, call = NULL)
Arguments
package character vector: the names of packages.
lib.loc a character vector describing the location of R library trees to search through, or NULL.
The default value of NULL corresponds to checking the loaded namespace, then all
libraries currently known in .libPaths().

quiet logical. Should this not give warnings or an error if the package is not found?
verbose a logical. If TRUE, additional diagnostics are printed, notably when a package is found
more than once.
call call expression.
Details
find.package returns path to the locations where the given packages are found.
If lib.loc is NULL, then loaded namespaces are searched before the libraries. If a package is
found more than once, the first match is used. Unless quiet = TRUE a warning will be given
about the named packages which are not found, and an error if none are. If verbose is true,
warnings about packages found more than once are given. For a package to be returned it must
contain either a ‘Meta’ subdirectory or a ‘DESCRIPTION’ file containing a valid version field,
but it need not be installed (it could be a source package if lib.loc was set suitably).
find.package is not usually the right tool to find out if a package is available for use: the only
way to do that is to use require to try to load it. It need not be installed for the correct platform, it
might have a version requirement not met by the running version of R, there might be
dependencies which are not available, ....
path.package returns the paths from which the named packages were loaded, or if none were
named, for all currently attached packages. Unless quiet = TRUE it will warn if some of the
packages named are not attached, and give an error if none are.
packageNotFoundError creates an error condition object of class packageNotFoundError for
signaling errors. The condition object contains the fields package and lib.loc.
Value
A character vector of paths of package directories.
Examples
try(find.package("knitr"))
## will not give an error, maybe a warning about *all* locations it is found:
find.package("kitty", quiet=TRUE, verbose=TRUE)
## Find all .libPaths() entries a package is found:
findPkgAll <- function(pkg)
unlist(lapply(.libPaths(), function(lib)
find.package(pkg, lib, quiet=TRUE, verbose=FALSE)))

findPkgAll("MASS")
findPkgAll("knitr")
**********
LIBRARY()
library(package, help, pos = 2, lib.loc = NULL,
character.only = FALSE, logical.return = FALSE,
warn.conflicts, quietly = FALSE,
verbose = getOption("verbose"),
mask.ok, exclude, include.only,
attach.required = missing(include.only))
conflictRules(pkg, mask.ok = NULL, exclude = NULL)
Arguments
package, help


the name of a package, given as a name or literal character string, or a character string,
depending on whether character.only is FALSE (default) or TRUE.
pos
the position on the search list at which to attach the loaded namespace. Can also be the name of a
position on the current search list as given by search().
lib.loc
a character vector describing the location of R library trees to search through, or NULL. The
default value of NULL corresponds to all libraries currently known to .libPaths(). Non-existent
library trees are silently ignored.
character.only
a logical indicating whether package or help can be assumed to be character strings.
logical.return
logical. If it is TRUE, FALSE or TRUE is returned to indicate success.
warn.conflicts
logical. If TRUE, warnings are printed about conflicts from attaching the new package. A
conflict is a function masking a function, or a non-function masking a non-function. The default
is TRUE unless specified as FALSE in the conflicts.policy option.
verbose
a logical. If TRUE, additional diagnostics are printed.
quietly
a logical. If TRUE, no message confirming package attaching is printed, and most often, no
errors/warnings are printed if package attaching fails.
pkg
character string naming a package.
mask.ok
character vector of names of objects that can mask objects on the search path without signaling
an error when strict conflict checking is enabled
exclude,include.only
character vector of names of objects to exclude or include in the attached frame. Only one of
these arguments may be used in a call to library or require.
attach.required
logical specifying whether required packages listed in the Depends clause of
the DESCRIPTION file should be attached automatically.
Value
Normally library returns (invisibly) the list of attached packages,
but TRUE or FALSE if logical.return is TRUE. When called as library() it returns an object of
class "libraryIQR", and for library(help=), one of class "packageInfo".
Conflicts
Handling of conflicts depends on the setting of the conflicts.policy option. If this option is not
set, then conflicts result in warning messages if the argument warn.conflicts is TRUE. If the
option is set to the character string "strict", then all unresolved conflicts signal errors. Conflicts
can be resolved using the mask.ok, exclude, and include.only arguments to library and require.
Defaults for mask.ok and exclude can be specified using conflictRules.
If the conflicts.policy option is set to the string "depends.ok" then conflicts resulting from
attaching declared dependencies will not produce errors, but other conflicts will. This is likely to
be the best setting for most users wanting some additional protection against unexpected
conflicts.

The policy can be tuned further by specifying the conflicts.policy option as a named list with the
following fields:
error:
logical; if TRUE treat unresolved conflicts as errors.
warn:
logical; unless FALSE issue a warning message when conflicts are found.
generics.ok:
logical; if TRUE ignore conflicts created by defining S4 generics for functions on the search
path.
depends.ok:
logical; if TRUE do not treat conflicts with required packages as errors.
can.mask:
character vector of names of packages that are allowed to be masked. These would typically be
base packages attached by default.
Example
library() # list all available packages
library(lib.loc = .Library) # list all packages in the default library
library(help = splines) # documentation on package 'splines'
library(splines) # attach package 'splines'
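When the package name is held in a variable, the character.only argument described above is
needed, and logical.return makes the success of the call testable. A small sketch, using the
splines package because it ships with R (the variable names are illustrative):

```r
# The package name is stored as a string, so character.only = TRUE is required.
pkg <- "splines"

# logical.return = TRUE makes library() return TRUE or FALSE instead of
# the list of attached packages.
ok <- library(pkg, character.only = TRUE, logical.return = TRUE)
print(ok)

# After a successful call the package appears on the search path.
print("package:splines" %in% search())
```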
**********
TAKING INPUT FROM USER IN R PROGRAMMING
Developers often need to interact with users, either to get data or to provide some sort of
result. Many programs today use a dialog box to ask the user for input. As in other programming
languages, it is also possible to take input from the user in R. There are two methods for
doing so:

Using readline() method
Using scan() method
Using readline() method
In R, the readline() method takes input as a string. If the user enters an integer, it is still
read as a string: entering 255 yields "255". The value must therefore be converted to the type
the program needs; in this case, the string "255" is converted to the integer 255. R provides
several functions for converting the input to the desired data type:
as.integer(n); —> convert to integer
as.numeric(n); —> convert to numeric type (float, double etc)
as.complex(n); —> convert to complex number (i.e 3+2i)
as.Date(n) —> convert to date …, etc
Syntax:
var = readline();
var = as.integer(var);
Note that one can use "<-" instead of "=".

Example:

# R program to illustrate
# taking input from the user
# taking input using readline()
# this command will prompt you
# to input a desired value
var = readline();
# convert the inputted value to integer
var = as.integer(var);
# print the value
print(var)

Output:
255
[1] 255
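If the user types something that is not a number, as.integer() returns NA with a warning. The
sketch below adds a simple check for that case. The helper to_integer is a hypothetical name, and
the fixed string "255" stands in for typed input when the code is not run interactively.

```r
# Convert text to an integer, stopping with a clear error if it is not numeric.
to_integer <- function(txt) {
  value <- suppressWarnings(as.integer(txt))
  if (is.na(value)) {
    stop("'", txt, "' is not a number")
  }
  value
}

# readline() only waits for input in an interactive session, so a fixed
# string stands in for the user's answer when the script runs in batch mode.
raw <- if (interactive()) readline(prompt = "Enter any number : ") else "255"

var <- to_integer(raw)
print(var)
```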
A message can also be shown in the console window to tell the user what to enter. To do this,
pass the argument named prompt to the readline() function. The prompt argument is optional;
when it is omitted, no message is shown.
Syntax:
var1 = readline(prompt = "Enter any number : ");
or,
var1 = readline("Enter any number : ");
Example:

# R program to illustrate
# taking input from the user
# taking input with showing the message
var = readline(prompt = "Enter any number : ");
# convert the inputted value to an integer
var = as.integer(var);
# print the value
print(var)

Output:

Enter any number : 255


[1] 255

Taking multiple inputs in R


Taking multiple inputs in R works the same way as taking a single input: call readline() once
for each value. The calls can also be wrapped in braces so that they are evaluated together as
one block.

Syntax:
var1 = readline("Enter 1st number : ");
var2 = readline("Enter 2nd number : ");
var3 = readline("Enter 3rd number : ");
var4 = readline("Enter 4th number : ");
or,
{
var1 = readline("Enter 1st number : ");
var2 = readline("Enter 2nd number : ");
var3 = readline("Enter 3rd number : ");
var4 = readline("Enter 4th number : ");
}
Example:

# R program to illustrate
# taking input from the user
# taking multiple inputs
# using braces
{
var1 = readline("Enter 1st number : ");
var2 = readline("Enter 2nd number : ");
var3 = readline("Enter 3rd number : ");
var4 = readline("Enter 4th number : ");
}
# converting each value
var1 = as.integer(var1);
var2 = as.integer(var2);
var3 = as.integer(var3);
var4 = as.integer(var4);
# print the sum of the 4 number
print(var1 + var2 + var3 + var4)

Output:

Enter 1st number : 12


Enter 2nd number : 13
Enter 3rd number : 14
Enter 4th number : 15
[1] 54
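An alternative to four separate prompts is to read all the numbers on one line and split the
text. This is a different technique from the one shown above, sketched here with strsplit(); the
fixed string stands in for what readline() would return.

```r
# Stand-in for: line <- readline(prompt = "Enter 4 numbers : ")
line <- "12 13 14 15"

# Split the text on runs of whitespace and convert each piece to an integer.
parts <- strsplit(trimws(line), "[[:space:]]+")[[1]]
nums <- as.integer(parts)

# Sum of the four numbers.
total <- sum(nums)
print(total)
```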

Taking String and Character input in R

Taking string input works the same way as taking an integer, except that no conversion is
needed, because readline() always returns its input as a string. R has no separate type for a
single character: a single character is simply a string of length one, of type character.
Calling as.character() on the input is therefore optional and leaves the value unchanged.

Syntax:
string:
var1 = readline(prompt = "Enter your name : ");
character:
var1 = readline(prompt = "Enter any character : ");
var1 = as.character(var1)
Example:

# R program to illustrate
# taking input from the user
# string input
var1 = readline(prompt = "Enter your name : ");
# character input
var2 = readline(prompt = "Enter any character : ");
# convert to character
var2 = as.character(var2)
# printing values
print(var1)
print(var2)

Output:
Enter your name : GeeksforGeeks
Enter any character : G
[1] "GeeksforGeeks"
[1] "G"
Using scan() method
Another way to take user input in R is the scan() method. This method reads input from the
console. It is handy when several values need to be entered quickly, for a mathematical
calculation or for a dataset. It reads data into a vector or list, and it can also read input
from a file.
Syntax:
x = scan()
scan() keeps reading input until it is told to stop. To end the input, press the Enter key
twice on the console (that is, enter an empty line).
Example:
This is a simple way to take input with scan(): some integers are entered as input and the
values are then printed on the console.

# R program to illustrate
# taking input from the user
# taking input using scan()
x = scan()
# print the inputted values
print(x)


Output:

1: 1 2 3 4 5 6
7: 7 8 9 4 5 6
13:
Read 12 items
[1] 1 2 3 4 5 6 7 8 9 4 5 6
Explanation:
A total of 12 integers are entered on 2 lines. When control reaches the 3rd line, pressing the
Enter key again on the empty line ends the input.
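When no console is available, for example in a batch script, the text argument of scan() can
supply the same values as a string. A minimal sketch using the twelve numbers from the example
above:

```r
# The text argument makes scan() read from a string instead of the console;
# quiet = TRUE suppresses the "Read 12 items" message.
x <- scan(text = "1 2 3 4 5 6 7 8 9 4 5 6", quiet = TRUE)

print(length(x))  # number of values read
print(sum(x))     # their total
```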

Taking double, string, character type values using scan() method


To read double, string, or character input, specify the type of the input in the scan() method.
The argument what specifies the data type of the values to read.
Syntax:
x = scan(what = double()) # for double
x = scan(what = " ") # for string
x = scan(what = character()) # for character
Example:

# R program to illustrate
# taking input from the user
# double input using scan()
d = scan(what = double())
# string input using 'scan()'
s = scan(what = " ")
# character input using 'scan()'
c = scan(what = character())
# print the inputted values
print(d) # double
print(s) # string
print(c) # character

Output:
1: 123.321 523.458 632.147
4: 741.25 855.36
6:
Read 5 items
1: geeksfor geeks gfg
4: c++ R java python
8:
Read 7 items
1: g e e k s f o
8: r g e e k s
14:


Read 13 items
[1] 123.321 523.458 632.147 741.250 855.360
[1] "geeksfor" "geeks" "gfg" "c++" "R" "java" "python"
[1] "g" "e" "e" "k" "s" "f" "o" "r" "g" "e" "e" "k" "s"
Explanation:
Here, the count of double items is 5, the count of string items is 7, and the count of character items is 13.
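To see what type each form of what produces, the values can be inspected with typeof(). The
sketch below feeds scan() from strings through its text argument so that it runs without console
input; the sample values are shortened versions of those above.

```r
# Read doubles, strings, and single characters from text instead of the console.
d  <- scan(text = "123.321 523.458 632.147", what = double(), quiet = TRUE)
s  <- scan(text = "geeksfor geeks gfg", what = " ", quiet = TRUE)
ch <- scan(text = "g e e k s", what = character(), quiet = TRUE)

# what = " " and what = character() both produce character vectors.
print(typeof(d))
print(typeof(s))
print(typeof(ch))
```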
Read File data using scan() method
Reading a file with scan() works like normal console input; the only difference is that the
file name and the data type must be passed to the scan() method.
Syntax:
x = scan("fileDouble.txt", what = double()) # for double
x = scan("fileString.txt", what = " ") # for string
x = scan("fileChar.txt", what = character()) # for character
Example:

# R program to illustrate
# taking input from the user
# string file input using scan()
s = scan("fileString.txt", what = " ")
# double file input using scan()
d = scan("fileDouble.txt", what = double())
# character file input using scan()
c = scan("fileChar.txt", what = character())
# print the inputted values
print(s) # string
print(d) # double
print(c) # character

Output:
Read 7 items
Read 5 items
Read 13 items
[1] "geek" "for" "geeks" "gfg" "c++" "java" "python"
[1] 123.321 523.458 632.147 741.250 855.360
[1] "g" "e" "e" "k" "s" "f" "o" "r" "g" "e" "e" "k" "s"
Save the data file in the current working directory, which is often the folder where the program
is saved. Otherwise, the full path of the file must be given inside the scan() method.
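The sketch below shows the full-path case end to end. It creates its own data file in a temporary
location (data_file is a hypothetical name) so that it does not depend on fileDouble.txt existing,
then reads it back with scan().

```r
# Create a small data file at a known full path.
data_file <- tempfile(fileext = ".txt")
writeLines(c("123.321 523.458 632.147", "741.25 855.36"), data_file)

# Pass the full path to scan(); a bare file name would be looked up
# in the current working directory, which getwd() reports.
d <- scan(data_file, what = double(), quiet = TRUE)
print(d)
print(getwd())
```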
**********


PRINTING OUTPUT OF AN R PROGRAM


R offers several ways to print output. The most common is the print() function. In addition,
when R code is entered at the console line by line, the value of an expression is printed
automatically, with no need to call any function. To see this, select the output variable and
press the Run button. Example:

# select 'x' and then press 'run' button


# it will print 'GeeksforGeeks' on the console
x <- "GeeksforGeeks"
x

Output:
[1] "GeeksforGeeks"
Print output using print() function
The print() function is the most common way to print output in R, and it is very simple to use.
Syntax: print("any string") or, print(variable)
Example:

# R program to illustrate
# printing output of an R program
# print string
print("GFG")
# print variable
# it will print 'GeeksforGeeks' on the console
x <- "GeeksforGeeks"
print(x)

Output:
[1] "GFG"
[1] "GeeksforGeeks"
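print() also accepts a few optional arguments that change how the value is shown. The sketch
below illustrates two of them, quote for character values and digits for numbers:

```r
x <- "GeeksforGeeks"

# quote = FALSE prints the string without the surrounding double quotes.
print(x, quote = FALSE)

# digits sets the number of significant digits shown for a number.
print(pi, digits = 3)
```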
Print output using paste() function inside print() function
R provides the paste() method for combining strings and variables into a single string. The
result is passed to the print() function to print it. paste() converts its arguments to
character strings. One can also use the paste0() method.
Note: The difference between paste() and paste0() is the separator. paste() separates its
arguments with a single space by default (sep = " "), whereas paste0() uses no separator and
behaves like paste() with sep = "".
Syntax: print(paste("any string", variable)) or, print(paste0(variable, "any string"))
Example:

# R program to illustrate
# printing output of an R program
x <- "GeeksforGeeks"


# using paste inside print()


print(paste(x, "is best (paste inside print())"))
# using paste0 inside print()
print(paste0(x, "is best (paste0 inside print())"))

Output:
[1] "GeeksforGeeks is best (paste inside print())"
[1] "GeeksforGeeksis best (paste0 inside print())"
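The missing space in the second line of output comes from paste0() adding no separator. The
sketch below shows how to get the same text from both functions, and also illustrates the sep and
collapse arguments of paste(); the variable names are illustrative.

```r
x <- "GeeksforGeeks"

# paste() inserts a space; with paste0() the space must be part of the string.
with_paste  <- paste(x, "is best")
with_paste0 <- paste0(x, " is best")
print(with_paste == with_paste0)

# sep changes the separator placed between the arguments.
dashed <- paste("R", "programming", sep = "-")
print(dashed)

# collapse joins the elements of one vector into a single string.
joined <- paste(c("a", "b", "c"), collapse = ", ")
print(joined)
```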
Print output using sprintf() function
sprintf() is a wrapper for the C library function of the same name. It formats values and
strings together in the style of the C language. The function returns a character vector
containing a formatted combination of strings and variables.
Syntax: sprintf("any string %d", variable) or, sprintf("any string %s", variable) or, sprintf("any
string %f", variable) etc.
Example:

# R program to illustrate
# printing output of an R program
x = "GeeksforGeeks" # string
x1 = 255 # integer
x2 = 23.14 # float
# string print
sprintf("%s is best", x)
# integer print
sprintf("%d is integer", x1)
# float print
sprintf("%f is float", x2)

Output:
> sprintf("%s is best", x)
[1] "GeeksforGeeks is best"
> sprintf("%d is integer", x1)
[1] "255 is integer"
> sprintf("%f is float", x2)
[1] "23.140000 is float"
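A short sketch of a few further sprintf() features may help here: several placeholders in one format string, field widths, and vectorised input. The variable names are assumptions made for this illustration only.

```r
name  <- "GeeksforGeeks"
count <- 255L     # the L suffix makes this a true integer
ratio <- 23.14

# several placeholders are filled in order from the remaining arguments
s1 <- sprintf("%s has %d items with ratio %.1f", name, count, ratio)

# %5d pads the number with spaces to a width of 5 characters
s2 <- sprintf("[%5d]", count)

# %05.1f pads with zeros to width 5 and keeps 1 decimal place
s3 <- sprintf("%05.1f", 3.14159)

# sprintf() is vectorised: one result string per element
s4 <- sprintf("item %d", 1:3)

print(s1)  # [1] "GeeksforGeeks has 255 items with ratio 23.1"
print(s2)  # [1] "[  255]"
print(s3)  # [1] "003.1"
print(s4)  # [1] "item 1" "item 2" "item 3"
```

Because sprintf() only returns the string, inside a script or function the result is usually passed to print() or cat() to make it visible.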
Print output using cat() function
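Another way to print output in R is the cat() function. Unlike print(), cat() writes its
arguments without the [1] index prefix or surrounding quotes, and it does not end the line
unless "\n" is included. cat() converts its arguments to character strings and joins them with
a space. This is useful for printing output in user-defined functions.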
Syntax: cat(“any string”) or, cat(“any string”, variable)

Example:
R

# R program to illustrate
# printing output of an R program
# print string with variable
# "\n" for new line
x = "GeeksforGeeks"
cat(x, "is best\n")
# print normal string
cat("This is R language")

Output:
GeeksforGeeks is best
This is R language
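The differences between print() and cat() can be seen side by side in the sketch below. It is only an illustration; the vector x and the captured string are hypothetical.

```r
x <- c("a", "b", "c")

print(x)              # [1] "a" "b" "c"
cat(x, "\n")          # elements separated by spaces, no quotes, no [1]
cat(x, sep = ", ")    # sep changes the separator; no newline is added
cat("\n")

# cat() interprets escape sequences, print() shows them escaped
print("line1\nline2")
cat("line1\nline2\n")

# cat() returns NULL invisibly, so it is used only for its side effect
res <- cat("")

# capture.output() collects printed text, one element per output line
out <- capture.output(cat("value: ", 42, "\n", sep = ""))
print(out)            # [1] "value: 42"
```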
Print output using message() function
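Another way to print something in R is the message() function. It is not intended for regular
output but for simple diagnostic messages that are neither warnings nor errors, and it writes to
the standard error stream rather than standard output. It can still be used for ordinary
printing. Unlike cat(), message() joins its arguments without a separator and ends the line
automatically.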
Syntax: message(“any string”) or, message(“any string”, variable)
Example:
R

# R program to illustrate
# printing output of an R program
x = "GeeksforGeeks"
# print string with variable
# (message() adds no space between arguments, so the space is written explicitly)
message(x, " is best")
# print normal string
message("This is R language")

Output:
GeeksforGeeks is best
This is R language
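A minimal sketch of how message() is typically used for diagnostics follows. The helper function square_with_log() is hypothetical and exists only for this illustration.

```r
# Hypothetical helper: reports progress with message(), returns a value
square_with_log <- function(n) {
  message("squaring ", n)   # diagnostic text, sent to standard error
  n^2                       # the actual result
}

r1 <- square_with_log(4)                     # shows "squaring 4"
r2 <- suppressMessages(square_with_log(5))   # same result, no message

print(r1)  # [1] 16
print(r2)  # [1] 25
```

Because the diagnostic goes through message() rather than print() or cat(), a caller can silence it with suppressMessages() without losing the returned value.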
Write output to a file
To write the value of a variable to a file, R provides the write.table() function. It converts
its first argument to a data frame if necessary and writes it to the file named by the file
argument.
Syntax: write.table(variable, file = "file1.txt") or, write.table("any string", file = "file1.txt")
Example:
R

# R program to illustrate
# printing output of an R program
x = "GeeksforGeeks"
# write variable
write.table(x, file = "my_data1.txt")

# write normal string
write.table("GFG is best", file = "my_data2.txt")

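Output: the files my_data1.txt and my_data2.txt are created in the current working directory,
each holding the value that was written.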
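To check what write.table() actually stores, the file can be read back. The sketch below is an illustration under stated assumptions: the data frame df and its column names are hypothetical, and a temporary file is used so that the working directory is not touched.

```r
# temporary file path, so nothing is left in the working directory
path <- tempfile(fileext = ".txt")

# hypothetical data used only for this illustration
df <- data.frame(name = c("a", "b"), score = c(1.5, 2))

# write a plain tab-separated file without row names or quotes
write.table(df, file = path, sep = "\t", row.names = FALSE, quote = FALSE)

# read the raw lines back to inspect the file contents
file_lines <- readLines(path)
print(file_lines)

# read the file back into a data frame
back <- read.table(path, header = TRUE, sep = "\t")
print(back)

# for a single plain string, writeLines() is a simpler alternative
writeLines("GFG is best", path)
plain <- readLines(path)
```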

**********
FORMAT NUMBER OF DECIMAL PLACES IN R
This section discusses how to format numbers to n decimal places in the R programming
language. In R, the decimal separator is the . symbol.
Method 1: format() function
The format() function can be used to format decimal values by rounding them and displaying
only a specific number of digits after the decimal point.
Syntax:
format(round(value, n), nsmall = n)
Parameters:
The call has two parts.
round(value, n): rounds the input number to n decimal places.
nsmall argument: the minimum number of digits to display after the decimal point, so that
trailing zeros are kept in the result.
Result:
Formatted Decimal number.
Example:
R

# define a variable and initialize it to a decimal number
a=12.4556785
# display decimal places upto 3
print(format(round(a, 3), nsmall = 3))
# display decimal places upto 4
print(format(round(a, 4), nsmall = 4))

# display decimal places upto 0
print(format(round(a, 0), nsmall = 0))
# display decimal places upto 1
print(format(round(a, 1), nsmall = 1))

Output:
[1] "12.456"
[1] "12.4557"
[1] "12"
[1] "12.5"
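The reason for combining round() with nsmall can be sketched as follows: round() returns a number and drops trailing zeros, while format() returns a character string that keeps them. The values below are hypothetical and chosen only for illustration.

```r
b <- 2.5

r <- round(b, 2)                       # still a number: 2.5
f <- format(round(b, 2), nsmall = 2)   # a string with trailing zero: "2.50"

print(r)  # [1] 2.5
print(f)  # [1] "2.50"

# format() is vectorised; all elements get the same number of decimals
# and are padded to a common width
v <- format(c(1, 10.5, 100.25), nsmall = 2)
print(v)
```

Since the result of format() is a character string, it is meant for display and should not be used in further arithmetic.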

Method 2: Using sprintf() function
Using the sprintf() function, we can specify the number of decimal places in the format string
applied to the variable.
Syntax: sprintf(variable, fmt = '%.nf')
Parameters:
variable – input decimal value
fmt – the format string, given as "%.nf", where n specifies the number of decimal places to be
displayed.
Result:
formatted decimal number
Example 1:
R

# decimal number
a=14.6788
# format upto 4 places
print( sprintf(a, fmt = '%.4f') )
# format upto 8 places
print( sprintf(a, fmt = '%.8f') )
# format upto 1 place
print( sprintf(a, fmt = '%.1f') )
# format upto 0 places
print( sprintf(a, fmt = '%.0f') )

Output:
[1] "14.6788"
[1] "14.67880000"
[1] "14.7"
[1] "15"
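When the number of decimal places is itself held in a variable, the format string can be built with paste0(). The helper fmt_decimals() below is a hypothetical name introduced only for this sketch.

```r
# Hypothetical helper: format x with n decimal places
fmt_decimals <- function(x, n) {
  # builds a format string such as "%.2f" from the value of n
  sprintf(paste0("%.", n, "f"), x)
}

one <- fmt_decimals(14.6788, 2)
many <- fmt_decimals(c(1, 2.5), 2)   # works on vectors as well

print(one)   # [1] "14.68"
print(many)  # [1] "1.00" "2.50"
```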

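Method 3: Using options() function
The options() function sets global options for the R session. Its digits option controls how
many significant digits are shown when numbers are printed; the stored value is not changed.
Syntax:
options(digits = n)
Where digits is the total number of significant digits to be displayed, including the digits
before the decimal point.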

Example:
a=1.24325454666
options(digits=4)
Printing a after this displays 1.243
Example 1:
R

# decimal number
a=14.67885350938953809580
# show 4 significant digits
options(digits=4)
print(a)
# show 8 significant digits
options(digits=8)
print(a)
# show 3 significant digits
options(digits=3)
print(a)

Output:
[1] 14.68
[1] 14.678854
[1] 14.7
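Because options(digits = n) changes a global setting, it affects every number printed afterwards. A minimal sketch of restoring the previous setting, and of limiting digits for a single call instead, is shown below; the variable names are assumptions for this illustration.

```r
a <- 14.67885350938953809580

# options() returns the previous settings, which can be restored later
old <- options(digits = 4)
s4 <- capture.output(print(a))   # text printed under digits = 4
options(old)                     # restore the earlier setting

# the digits argument of print() affects only this one call
s3 <- capture.output(print(a, digits = 3))

print(s4)  # [1] "[1] 14.68"
print(s3)  # [1] "[1] 14.7"

# the stored value keeps its full precision; signif() really rounds it
rounded <- signif(a, 4)
```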

**********
SPECIAL VALUES : NA, Inf, -Inf
R supports several special values that represent missing, undefined, or infinite data, and it is
important to understand how these values behave when doing data pre-processing and data munging.
In general, R supports:
• NULL
• NA
• NaN
• Inf / -Inf
• NULL is an object that is returned when an expression or function results in an undefined value.
In R, NULL (capital letters) is a reserved word and can also be the product of importing data
with an unknown data type.
• NA is a logical constant of length 1 and is an indicator for a missing value. NA (capital
letters) is a reserved word; it can be coerced to any other vector type (except raw) and can also
appear when importing data. NA and "NA" (a character string) are not interchangeable. NA stands
for Not Available.
• NaN stands for Not a Number. It is a numeric constant of length 1 and applies to numerical
values, as well as to the real and imaginary parts of complex values, but not to the values of an
integer vector. NaN is a reserved word.
• Inf and -Inf stand for positive and negative infinity. They result from storing a number too
large to represent or from dividing a non-zero number by zero. Inf is a reserved word and is, in
most cases, the product of computations in R and therefore very rarely a product of data import.
An infinite value is still a number and is not missing.

All four special values have accompanying logical test functions in base R, each returning
TRUE or FALSE: is.null(), is.na(), is.nan(), is.infinite().
The following code gives a general understanding of all these values:
#reading documentation on all data types:
?NULL
?NA
?NaN
?Inf
#populating variables
a <- "NA"
b <- "NULL"
c <- NULL
d <- NA
e <- NaN
f <- Inf
### Check if variables are same?
identical(a,d)
# [1] FALSE
# NA and NaN are not identical
identical(d,e)
# [1] FALSE
###checking length of data types
length(c)
# [1] 0
length(d)
# [1] 1
length(e)
# [1] 1
length(f)
# [1] 1
###checking data types
str(c); class(c);
#NULL
#[1] "NULL"
str(d); class(d);
#logi NA
#[1] "logical"
str(e); class(e);
#num NaN
#[1] "numeric"
str(f); class(f);
#num Inf
#[1] "numeric"
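The test functions named above can also be applied to a whole vector. The sketch below is a minimal illustration with a hypothetical vector v; it also shows how the special values arise from arithmetic.

```r
# hypothetical vector mixing ordinary and special values
v <- c(1, NA, NaN, Inf, -Inf)

is.na(v)        # FALSE  TRUE  TRUE FALSE FALSE  (NaN also counts as NA)
is.nan(v)       # FALSE FALSE  TRUE FALSE FALSE
is.infinite(v)  # FALSE FALSE FALSE  TRUE  TRUE
is.finite(v)    #  TRUE FALSE FALSE FALSE FALSE

# how the values arise in computations
pos <- 1 / 0     # Inf
neg <- -1 / 0    # -Inf
und <- 0 / 0     # NaN

# comparing with NA gives NA, so is.na() must be used instead of ==
cmp <- NA == NA

# many functions skip missing values when na.rm = TRUE is given
m <- mean(c(1, NA, 3), na.rm = TRUE)   # 2

# NULL has length 0 and disappears when combined into a vector
n <- length(c(1, NULL, 2))             # 2
```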
**********
