It Workshop Lab File
Mehnaaz Ansari
07701182023
INDEX
LAB NAME PROGRAM DATE REMARK
LAB1  Introduction to R Language.  Date: 13/01/24
  CONSOLE
  a) Perform the following operations on R console:
     1. Sample Program
     2. Addition
     3. Subtraction
     4. Multiplication
     5. Division
  EDITOR
  b) Perform the following operations on R editor:
     1. Sample Program
     2. Addition
     3. Subtraction
     4. Multiplication
     5. Division
LAB2  Implement basic functionality of R.  Date: 16/01/24
  a) Create two vectors of equal length and perform:
     1. Addition
     2. Subtraction
     3. Multiplication
     4. Division
  b) Create two vectors of equal length and perform:
     1. Addition
     2. Subtraction
     3. Multiplication
     4. Division
  c) Create two vectors of unequal length and perform:
     1. Addition
     2. Subtraction
     3. Multiplication
     4. Division
LAB3  Implement the concept of data frame.  Date: 23/01/24
  1. Create a data frame for the employee consisting of attributes: employee code,
     employee name and the salary of the employee.
  2. Perform all the operations, i.e., class, structure, summary of data frame, extract
     specific rows and columns from data frame, add a row and column.
LAB4  Perform computation of matrices in R.  Date: 30/01/24
  1. Create two 2x3 matrices column wise and perform the addition, subtraction,
     multiplication and division.
  2. Create two 2x3 matrices row wise and perform the addition, subtraction,
     multiplication and division.
  3. Create two 2x3 matrices row wise
LAB 1
INTRODUCTION TO R
1. R is an open-source programming language used mainly for statistical computing and data
analysis. It is available on widely used platforms such as Windows, Linux and macOS. It
generally comes with a command-line interface and provides a vast collection of packages for
performing tasks.
2. R is an interpreted language that supports both procedural and object-oriented
programming.
• R is widely used for machine learning, statistics, and data analysis. Objects, functions,
and packages can easily be created in R.
• It is a platform-independent language, which means it runs on all major operating systems.
• It is free and open source, so anyone can install it in any organization without
purchasing a license.
• R is not only a statistics package; it can also integrate with other languages (C, C++).
Thus, you can easily interact with many data sources and statistical packages.
• R has a vast community of users, and it is growing day by day.
• R is currently one of the most requested programming languages in the data science job
market.
Statistical Features Of R:
• Basic Statistics: The most common basic statistics are the mean, mode and median,
known collectively as "measures of central tendency". R makes it easy to compute
measures of central tendency.
• Static graphics: R is rich with facilities for creating and developing static
graphics. R contains functionality for many plot types, including graphic maps, mosaic plots,
biplots, and more.
• Probability distributions: Probability distributions play a vital role in statistics, and
R can easily handle various types of probability distributions such as the Binomial
Distribution, Normal Distribution, Chi-squared Distribution and many more.
• Data analysis: R provides a large, coherent and integrated collection of tools for data
analysis.
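The statistical features above can be tried directly. The following is a minimal sketch with made-up sample values. The helper stat_mode() is a hypothetical name introduced here, because base R has no built-in function for the statistical mode (the built-in mode() reports the storage type instead).

```r
# Illustrative sample values (not taken from any dataset)
x <- c(2, 4, 4, 5, 7, 9)

mean(x)     # arithmetic mean
median(x)   # middle value of the sorted data

# Hypothetical helper: tabulate the values and return the most frequent one
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[which.max(counts)])
}
stat_mode(x)

# Probability distributions: d* gives densities, p* cumulative
# probabilities, q* quantiles
dbinom(3, size = 10, prob = 0.5)   # P(X = 3) for Binomial(10, 0.5)
pnorm(1.96)                        # P(Z <= 1.96) for the standard normal
qchisq(0.95, df = 2)               # 95th percentile of Chi-squared, 2 df
```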
Programming Features Of R:
• R Packages: One of the major features of R is the wide availability of libraries. R has
CRAN (Comprehensive R Archive Network), a repository holding more than 10,000
packages.
• Distributed Computing: Distributed computing is a model in which components of a
software system are shared among multiple computers to improve efficiency and
performance. Two packages used for distributed programming in R, ddR and multidplyr,
were released in November 2015.
CONSOLE:
A) Perform the following operations on R console:
1. Sample Program
2. Addition
3. Subtraction
4. Division
5. Multiplication
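The console operations listed above might look like the following when typed at the R prompt. This is a sketch with arbitrary operands, not the original session output.

```r
# Each expression is evaluated and its result printed as soon as Enter is pressed
print("Hello, World!")   # sample program
10 + 5                   # addition
10 - 5                   # subtraction
10 / 5                   # division
10 * 5                   # multiplication
```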
EDITOR:
B) Perform the following operations on R editor:
1. Addition
2. Subtraction
3. Multiplication
4. Division
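In the editor, the same operations are usually written as a script and run as a whole. A minimal sketch follows, assuming arbitrary values for a and b. Unlike the console, a sourced script does not print bare expressions automatically, so the results are shown with cat().

```r
# Script written in the R editor and executed with Run or source()
a <- 12
b <- 4

cat("Addition:", a + b, "\n")
cat("Subtraction:", a - b, "\n")
cat("Multiplication:", a * b, "\n")
cat("Division:", a / b, "\n")
```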
LAB 2
IMPLEMENT BASIC FUNCTIONALITY OF R
Numeric Datatype
Decimal values are called numerics in R. Numeric is the default data type for numbers in R.
If you assign a decimal value to a variable x, x will be of numeric type.
Integer Datatype
R supports an integer data type for whole numbers. You can create a value of integer type,
or convert a value into one, using the as.integer() function. You can also append the
capital 'L' suffix to a number to denote that it is of the integer data type.
Logical Datatype
R has a logical data type that takes a value of either TRUE or FALSE. A logical value is
often created via a comparison between variables.
Complex Datatype
R supports a complex data type for complex numbers. The complex data type stores
numbers with an imaginary component.
Character Datatype
R supports a character data type, which stores character values or strings. Strings in R
can contain letters, numbers, and symbols. The easiest way to denote that a value is of
character type in R is to wrap the value inside single or double quotes.
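The five data types described above can be checked with class(). The following is a small sketch. The variable names and values are illustrative.

```r
x    <- 10.5              # numeric is the default type for numbers
i    <- 5L                # the L suffix creates an integer
j    <- as.integer(3.9)   # conversion drops the fractional part, giving 3
flag <- x > i             # a comparison produces a logical value
z    <- 2 + 3i            # complex number with an imaginary component
s    <- "R workshop"      # character string in double quotes

class(x)      # "numeric"
class(i)      # "integer"
class(flag)   # "logical"
class(z)      # "complex"
class(s)      # "character"
```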
Addition
Subtraction
Multiplication
Division
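The four operations listed above, applied to two vectors of equal length, might look like this. The vectors v1 and v2 hold illustrative values.

```r
v1 <- c(10, 20, 30, 40)
v2 <- c(1, 2, 3, 4)

v1 + v2   # 11 22 33 44
v1 - v2   # 9 18 27 36
v1 * v2   # 10 40 90 160
v1 / v2   # 10 10 10 10
```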
R VECTORS
Numeric vector
Decimal values are known as the numeric data type in R. If we assign a decimal value to
a variable d, then d will be of numeric type. A vector which contains
numeric elements is known as a numeric vector.
Integer vector
A numeric value with no fractional part can be stored as integer data, which R reports as
class "integer". R integers are 32-bit (4 bytes). There are two ways to assign an
integer value to a variable: using the as.integer() function or appending L to the value.
Character vector
A character vector stores text strings. In R, there are two ways to create a character
value: using the as.character() function, or typing the string
between double quotes ("") or single quotes ('').
Logical vector
The logical data type has only two values, TRUE and FALSE. These values indicate
whether a condition is satisfied. A vector which contains Boolean values is known as a
logical vector.
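The four kinds of vector described above can be created with c() and inspected with typeof(). The following is a brief sketch with illustrative values.

```r
num_vec <- c(1.5, 2.25, 3)                  # numeric vector
int_vec <- c(1L, 2L, 3L)                    # integer vector (L suffix)
chr_vec <- c("a", 'b', as.character(10))    # character vector
lgl_vec <- num_vec > 2                      # logical vector from a condition

typeof(num_vec)   # "double"
typeof(int_vec)   # "integer"
typeof(chr_vec)   # "character"
typeof(lgl_vec)   # "logical"
```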
Vector Operation
In R, various operations can be performed on vectors. We can add, subtract,
multiply or divide two or more vectors. R plays an important role in data science,
where these operations are required for data manipulation. The following types of
operation are performed on vectors.
1) Combining vectors
The c() function is used not only to create a vector but also to combine two
vectors. Combining two or more vectors forms a new vector which contains all the
elements of each vector. Let us see an example of how the c() function combines vectors.
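A minimal sketch of combining with c(), using two small illustrative vectors:

```r
a <- c(1, 2, 3)
b <- c(4, 5)

combined <- c(a, b)   # elements of a followed by elements of b
combined              # 1 2 3 4 5
length(combined)      # 5
```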
2) Arithmetic operations
We can perform all the arithmetic operations on vectors. The arithmetic operations are
performed member-by-member on vectors. We can add, subtract, multiply, or divide two
vectors. Let us see an example to understand how arithmetic operations are performed on
vectors.
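The member-by-member rule also applies when the two vectors differ in length: R recycles the shorter vector until it matches the longer one. The sketch below uses illustrative values. R warns if the longer length is not a multiple of the shorter one.

```r
long  <- c(1, 2, 3, 4, 5, 6)
short <- c(10, 20)          # recycled as 10 20 10 20 10 20

long + short   # 11 22 13 24 15 26
long * short   # 10 40 30 80 50 120
```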
3) Logical Index
With the help of a logical index vector, we can form a new vector from a given vector.
The logical index vector has the same length as the original vector. Its members are TRUE
where the corresponding members of the original vector are to be included in the slice, and
FALSE otherwise. Let us see an example to understand how a new vector is formed with the
help of a logical index vector.
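A short sketch of logical indexing on an illustrative vector v. The index can be written out by hand or produced by a comparison.

```r
v    <- c(10, 20, 30, 40, 50)
keep <- c(TRUE, FALSE, TRUE, FALSE, TRUE)   # same length as v

v[keep]     # 10 30 50
v[v > 25]   # 30 40 50  (the comparison builds the logical index)
```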
4) Numeric Index
In R, we specify the index between square brackets [ ] to select an element by position.
Indexing starts at 1. If the index is negative, R returns all the values except the one at
that position. For example, specifying [-3] returns every element except the third.
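Positive and negative numeric indexes can be compared on a small illustrative vector:

```r
v <- c(10, 20, 30, 40, 50)

v[2]    # 20           (second element)
v[-3]   # 10 20 40 50  (everything except the third element)
```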
5) Duplicate Index
An index vector allows duplicate values, which means we can access one element twice in
one operation. Let us see an example to understand how a duplicate index works.
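A minimal sketch of a duplicate index, using an illustrative character vector:

```r
fruits <- c("apple", "banana", "cherry")

fruits[c(2, 2, 3)]   # "banana" "banana" "cherry"
```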
6) Range Indexes
A range index is used to slice a vector to form a new vector. For slicing, we use the
colon (:) operator. Range indexes are very helpful when working with a large vector. Let us
see an example to understand how slicing is done with the help of the colon operator to
form a new vector.
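A short sketch of slicing with the colon operator on an illustrative vector:

```r
v <- c(10, 20, 30, 40, 50)

2:4      # the colon operator generates 2 3 4
v[2:4]   # 20 30 40
```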
LAB 3
IMPLEMENT THE CONCEPT OF DATAFRAME IN R
Data frames in R are generic data objects used to store tabular data. A data frame
resembles a matrix, except that each column can hold a different data type. A data frame is
made up of three principal components: the data, the rows, and the columns. To create a
data frame in R, use the data.frame() function and pass each of the vectors you have
created as arguments to the function.
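The employee data frame described in the index could be built and inspected as sketched below. The column names and all values are made up for illustration and are not the records used in the original lab.

```r
# Illustrative employee records (hypothetical values)
emp <- data.frame(
  emp_code = c(101, 102, 103),
  emp_name = c("Asha", "Ravi", "Meera"),
  salary   = c(30000, 45000, 52000)
)

class(emp)     # "data.frame"
str(emp)       # structure: column names, types and first values
summary(emp)   # per-column summary statistics

emp[2, ]                          # extract the second row
emp[, c("emp_name", "salary")]    # extract two columns by name
emp[emp$salary > 40000, ]         # rows that satisfy a condition
```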
Use Of Bind
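Adding a row and a column with the bind functions might look like the following sketch. rbind() appends rows and cbind() appends columns. The data frame and the dept column are hypothetical examples.

```r
emp <- data.frame(
  emp_code = c(101, 102),
  emp_name = c("Asha", "Ravi"),
  salary   = c(30000, 45000)
)

# Add a row: the new row must have the same column names
new_row <- data.frame(emp_code = 103, emp_name = "Meera", salary = 52000)
emp <- rbind(emp, new_row)

# Add a column: one value per existing row
emp <- cbind(emp, dept = c("HR", "IT", "Finance"))

dim(emp)   # 3 rows, 4 columns
```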
LAB 4
PERFORM COMPUTATIONS OF MATRICES IN R.
1. Create two 2x3 matrices column wise and perform addition,
subtraction, multiplication and division.
Addition
Subtraction
Multiplication
Division
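A sketch of the column-wise case with illustrative entries. matrix() fills column by column by default, and the four arithmetic operators work element by element, so * here is not the matrix product.

```r
A <- matrix(1:6,  nrow = 2, ncol = 3)   # columns filled first: 1 3 5 / 2 4 6
B <- matrix(7:12, nrow = 2, ncol = 3)   # 7 9 11 / 8 10 12

A + B   # element-wise addition
A - B   # element-wise subtraction
A * B   # element-wise multiplication
A / B   # element-wise division
```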
2. Create two 2x3 matrices row wise and perform the addition,
subtraction, multiplication and division.
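For the row-wise case, passing byrow = TRUE makes matrix() fill each row before moving to the next. The entries below are illustrative.

```r
C <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # 1 2 3 / 4 5 6
D <- matrix(6:1, nrow = 2, ncol = 3, byrow = TRUE)   # 6 5 4 / 3 2 1

C + D   # every entry is 7
C - D
C * D   # element-wise, not the matrix product
C / D
```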
Scalar Multiplication:
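Scalar multiplication multiplies every entry of the matrix by the same number. A minimal sketch with an illustrative matrix:

```r
M <- matrix(1:6, nrow = 2, byrow = TRUE)   # 1 2 3 / 4 5 6

3 * M   # 3 6 9 / 12 15 18
```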
Multiplying Matrices:
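True matrix multiplication uses the %*% operator and requires the number of columns of the first matrix to equal the number of rows of the second. The sketch below multiplies an illustrative 2x3 matrix by a 3x2 matrix.

```r
P <- matrix(1:6, nrow = 2, byrow = TRUE)   # 2x3: 1 2 3 / 4 5 6
Q <- matrix(1:6, nrow = 3, byrow = TRUE)   # 3x2: 1 2 / 3 4 / 5 6

P %*% Q   # 2x2 result: 22 28 / 49 64
```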
LAB 5
CONTROL STRUCTURES IN R
In R programming, there are 8 types of control statements as follows:
1. if condition
2. if-else condition
3. for loop
4. nested loops
5. while loop
6. repeat and break statement
7. return statement
8. next statement
• If-condition
• If-else condition
• for loop
• While loop
• return statement
• next, break
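The control statements listed above can be sketched in one short script. The values and the square() function are illustrative and are not the original lab programs.

```r
x <- 7

# if: the body runs only when the condition is TRUE
if (x > 5) print("x is greater than 5")

# if-else: choose between two branches
if (x %% 2 == 0) {
  print("even")
} else {
  print("odd")
}

# for loop with next: skip the iteration where i is 3
for (i in 1:5) {
  if (i == 3) next
  print(i)
}

# nested loops: the inner loop runs fully for each outer iteration
for (i in 1:2) {
  for (j in 1:2) {
    cat(i, "x", j, "=", i * j, "\n")
  }
}

# while loop: repeats as long as the condition holds
count <- 1
while (count <= 3) {
  print(count)
  count <- count + 1
}

# repeat with break: loops until break is reached
n <- 0
repeat {
  n <- n + 1
  if (n == 4) break
}

# return statement: hands a value back from a function
square <- function(v) {
  return(v^2)
}
square(4)   # 16
```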
LAB 6
Importing Data in R
First, let's consider a data set that we can use for the demonstration. For this
demonstration, we will use two versions of a single dataset, one in .csv form and another
in .txt form.
The sep argument of the reading function specifies how the fields of the dataset are
separated; in this case we pass sep = "," as an argument.
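A self-contained sketch of importing both formats follows. It first writes two small temporary files so the example can run anywhere. The file contents and column names are placeholders, not the dataset used in the lab.

```r
# Create two small demo files (placeholder contents)
csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")
writeLines(c("id,name,score", "1,A,90", "2,B,85"), csv_path)
writeLines(c("id\tname\tscore", "1\tA\t90", "2\tB\t85"), txt_path)

# read.csv() already defaults to sep = ","; it is spelled out for clarity
from_csv <- read.csv(csv_path, header = TRUE, sep = ",")

# read.table() needs the separator stated for a tab-separated file
from_txt <- read.table(txt_path, header = TRUE, sep = "\t")

head(from_csv)
head(from_txt)
```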
When a program terminates, its data is lost. Storing the data in a file preserves it
even after the program terminates. Entering a large amount of data by hand takes a
lot of time; if a file contains all the data, its contents can be accessed easily
using a few commands in R. A file can also be moved from one computer to another
without any changes. Such files can be stored in various formats: a .txt
(tab-separated values) file, a tabular .csv (comma-separated values) file, or a
location on the internet or cloud. R provides very easy methods to export data to
those files.
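Exporting to the two file formats mentioned above might look like this sketch. The data frame and the file names under tempdir() are illustrative.

```r
df <- data.frame(id = 1:3, value = c(2.5, 3.7, 1.2))

csv_out <- file.path(tempdir(), "export_demo.csv")
txt_out <- file.path(tempdir(), "export_demo.txt")

# Comma-separated export; row.names = FALSE omits the row-number column
write.csv(df, csv_out, row.names = FALSE)

# Tab-separated export
write.table(df, txt_out, sep = "\t", row.names = FALSE, quote = FALSE)

# Reading the file back confirms the round trip
read.csv(csv_out)
```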
LAB 7
1. Bar Plot
A)
B)
C)
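The original bar plots are not reproduced here. A minimal sketch with base R's barplot() and made-up values could look like this:

```r
# Illustrative values only
marks <- c(Maths = 78, Physics = 65, Chemistry = 82)

# Vertical bar plot with a title, axis labels and a fill colour
barplot(marks,
        main = "Marks by subject",
        xlab = "Subject",
        ylab = "Marks",
        col  = "steelblue")

# Horizontal variant of the same plot
barplot(marks, horiz = TRUE, main = "Marks by subject (horizontal)")
```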
2. Histogram
A)
B)
C)
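A histogram groups numeric values into bins and plots the count in each bin. The sketch below uses base R's hist() with made-up scores. The breaks argument sets the bin edges, and the returned object holds the bin counts.

```r
# Illustrative values only
scores <- c(45, 52, 58, 61, 63, 67, 70, 72, 75, 78, 81, 88, 93)

h <- hist(scores,
          breaks = seq(40, 100, by = 10),   # bin edges 40, 50, ..., 100
          main   = "Distribution of scores",
          xlab   = "Score",
          col    = "lightgreen")

h$counts   # number of values falling in each bin
```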
3. Pie chart
A)
B)
C)
D)
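A pie chart shows each category as a share of the total. The sketch below uses base R's pie() with made-up values and adds percentage labels.

```r
# Illustrative values only
budget <- c(Rent = 50, Food = 30, Travel = 20)

# Percentage of the total for each slice
pct <- round(100 * budget / sum(budget))

pie(budget,
    labels = paste0(names(budget), " (", pct, "%)"),
    main   = "Monthly budget",
    col    = rainbow(length(budget)))
```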
LAB 8
Apply some advanced visualization techniques in R to analyse
the data
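The techniques used in the original lab are not recorded in this file. As an assumption, the sketch below shows three common ones on R's built-in mtcars dataset: a grouped box plot, a scatter plot with a fitted regression line, and a scatter-plot matrix.

```r
# Box plot: distribution of mileage for each cylinder count
boxplot(mpg ~ cyl, data = mtcars,
        main = "Mileage by number of cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon")

# Scatter plot with a linear regression line
plot(mtcars$wt, mtcars$mpg,
     main = "Mileage against weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon", pch = 19)
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, col = "red")

# Scatter-plot matrix of several variables at once
pairs(mtcars[, c("mpg", "wt", "hp", "disp")])
```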
LAB 9
Mini Project
(COVID19 CASES IN CHINA IN FEB 2020)
Introduction:
The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, has led to
unprecedented global health challenges since its emergence in late 2019. China, as the initial
epicenter of the outbreak, has been at the forefront of battling and managing the spread of the
virus. This dataset aims to provide a comprehensive overview of COVID-19 cases across
various regions in China, offering insights into the geographical distribution, temporal trends,
and impact of control measures implemented by authorities.
I found this dataset on Kaggle.
Confirmed Cases:
A. Bar graph
B. Histogram
C. Pie chart
Deaths:
A. Bar graph
B. Histogram
C. Pie Chart
Recovered cases:
A. Bar graph
B. Histogram
C. Pie Chart
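The bar graph, histogram and pie chart for each measure above could be produced by one helper, sketched below. The function name plot_covid_measure, the column names "Province", "Confirmed", "Deaths" and "Recovered", and the file name in the usage comment are all assumptions, since the actual dataset layout is not shown in this file.

```r
# Hypothetical helper: draws the three charts for one column of the dataset
plot_covid_measure <- function(data, measure, region_col = "Province") {
  values <- data[[measure]]
  names(values) <- data[[region_col]]

  # A. Bar graph: one bar per region (las = 2 rotates the labels)
  barplot(values, las = 2,
          main = paste(measure, "by region"), ylab = measure)

  # B. Histogram: how the counts are distributed across regions
  hist(values,
       main = paste("Distribution of", measure), xlab = measure)

  # C. Pie chart: share of each region (pie() needs positive values)
  positive <- values[!is.na(values) & values > 0]
  pie(positive, main = paste("Share of", measure))

  invisible(values)
}

# Usage, assuming a CSV file with the column names above:
# covid <- read.csv("covid_china_feb2020.csv")   # hypothetical file name
# plot_covid_measure(covid, "Confirmed")
# plot_covid_measure(covid, "Deaths")
# plot_covid_measure(covid, "Recovered")
```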
Observation:
The dataset encompasses data collected from diverse regions within China, spanning from the
onset of the pandemic to the present. It includes information such as the number of confirmed
cases, recoveries, fatalities, testing rates, and demographic characteristics of affected
individuals. Analysis of the data reveals spatial variations in the intensity and trajectory of the
outbreak, with some regions experiencing higher infection rates and mortality rates compared
to others. Temporal analysis highlights the evolution of the pandemic over time, including the
emergence of new variants, the efficacy of vaccination campaigns, and the effectiveness of
containment measures such as lockdowns and social distancing protocols.
Conclusion:
In conclusion, the dataset provides valuable insights into the dynamics of COVID-19
transmission and control efforts in different regions of China. By analyzing the data,
policymakers, researchers, and public health officials can better understand the factors
influencing the spread of the virus and develop targeted strategies to mitigate its impact.
Moreover, the dataset serves as a resource for future studies aimed at improving pandemic
preparedness, response mechanisms, and public health interventions to combat emerging
infectious diseases.