1a Data Sorting

Uploaded by

trivenic606

Data Manipulation

• Data Sorting
• Finding and Removing Duplicate Records
• Cleaning data
• Recording data
• Merging data

Data Sorting
• order( )
• sort( )
• dplyr (package)

R provides several ways to sort data in ascending or descending order. Data analysts and data
scientists use order(), sort(), and packages such as dplyr, depending on the structure of the
data.

order() returns the index positions that put a vector in ascending or descending order. Using
those positions as a subscript sorts a vector, and using them as row subscripts sorts a matrix or
a dataframe, as shown later in this tutorial.

The syntax of order() is

order(..., na.last = TRUE, decreasing = FALSE, method = c("auto", "shell", "radix"))

• ...: one or more vectors of equal length to sort by, such as the columns of a dataframe; later
vectors break ties in earlier ones.

• decreasing: boolean value; TRUE sorts in descending order, FALSE (the default) sorts in
ascending order.

• na.last: boolean value; TRUE (the default) puts NA indices last, FALSE puts NA indices first.

• method: sorting method to be used. The "quick" method is available in sort() but not in order().
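As a small illustration of passing more than one sort key, the sketch below orders by one vector and breaks ties with a second. The vectors score and name are made up for this example and do not come from the tutorial's data.

```r
# Hypothetical data: two parallel vectors of equal length
score <- c(2, 1, 2, 1)
name  <- c("b", "a", "a", "b")

# Sort by score; ties in score are broken by name
idx <- order(score, name)
idx          # 2 4 3 1
score[idx]   # 1 1 2 2
name[idx]    # "a" "b" "a" "b"
```

The first key decides the order wherever it can; the second key is consulted only for elements whose first key is equal.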

order() in R

An example of order() in action.

The code below creates a variable y, which holds a vector of numbers. order(y) returns the index
positions of the numbers in ascending order of value.

y = c(4,12,6,7,2,9,5)

order(y)

The above code gives the following output:

5 1 7 3 4 6 2

Here order() returns the index of each number in ascending order of value. The smallest number,
2, is at index 5, so 5 comes first; the next smallest, 4, is at index 1, so 1 comes second; the
rest follow the same pattern.

y = c(4,12,6,7,2,9,5)

y[order(y)]

The above code gives the following output:

2 4 5 6 7 9 12

Here the index vector from order(y) is used as a subscript, so the actual values are printed in
ascending order: y[order(y)] picks each element of y in the order given by the indices.
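One reason to keep the index vector rather than call sort() directly is that the same indices can reorder a second, parallel vector. The sketch below assumes a made-up vector id that labels each element of y; it is not part of the original example.

```r
y  <- c(4, 12, 6, 7, 2, 9, 5)
id <- c("a", "b", "c", "d", "e", "f", "g")   # hypothetical label for each value

idx <- order(y)

# Indexing with order() gives the same result as sort()
identical(y[idx], sort(y))   # TRUE

# The same index vector reorders the parallel vector
id[idx]                      # "e" "a" "g" "c" "d" "f" "b"
```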

Sorting vector using different parameters in order()

Let's look at an example where the data contain the symbol NA (not available).

x <- c(8,2,4,1,-4,NA,46,8,9,5,3)

order(x,na.last = TRUE)

The above code gives the following output:

5 4 2 11 3 10 1 8 9 7 6

Here order() again returns the indices in ascending order of value. Because na.last=TRUE, the
index of the NA value, 6, is placed last.

order(x,na.last=FALSE)

The above code gives the following output:

6 5 4 2 11 3 10 1 8 9 7

Here order() again returns the indices in ascending order of value. Because na.last=FALSE, the
index of the NA value, 6, is placed first.

order(x,decreasing=TRUE,na.last=TRUE)

The above code gives the following output:

7 9 1 8 10 3 11 2 4 5 6

Here order() returns the indices in descending order of value because of decreasing=TRUE: 46,
the largest value, is at index 7, so 7 comes first, and the remaining indices follow in
decreasing order of value. Because na.last=TRUE, the index of the NA value, 6, is placed last.

order(x,decreasing=FALSE,na.last=FALSE)

The above code gives the following output:

6 5 4 2 11 3 10 1 8 9 7

Here the index of the NA value, 6, is placed first because of na.last=FALSE. The remaining
indices follow in ascending order of value because of decreasing=FALSE: -4, the smallest value,
is at index 5, so 5 comes next, and the other values are arranged increasingly.
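Besides TRUE and FALSE, na.last also accepts NA, which drops the missing values from the result instead of placing them first or last. A minimal sketch with the same vector x:

```r
x <- c(8, 2, 4, 1, -4, NA, 46, 8, 9, 5, 3)

# na.last = NA removes the position of the NA value from the result
idx <- order(x, na.last = NA)
idx        # 5 4 2 11 3 10 1 8 9 7
x[idx]     # -4 1 2 3 4 5 8 8 9 46

# sort() drops NA by default; na.last = TRUE keeps it at the end
sort(x)
sort(x, na.last = TRUE)
```

Note the different defaults: order() keeps the NA index (last) unless told otherwise, while sort() removes NA values unless na.last is set.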

Sorting a dataframe by using order()

Let's create a dataframe with a population of 10. The variable gender holds the values 'male'
and 'female'; sample() draws 10 values from them, and replace = TRUE samples with replacement,
so the same value can be drawn more than once. Similarly, age takes values from 25 to 75 and
degree takes values from c("MA", "ME", "BE", "BSCS"), both also sampled with replacement.

Task: To sort the given data in the ascending order based on the given population's age.

Note: The sample data shown may differ from what you get on your local machine because sample()
draws new random values each time the code runs.

population = 10

gender=sample(c("male","female"),population,replace=TRUE)

age = sample(25:75, population, replace=TRUE)

degree = sample(c("MA","ME","BE","BSCS"), population, replace=TRUE)

(final.data = data.frame(gender=gender, age=age, degree=degree))

The above code gives the following output, which shows a newly created dataframe.

   gender age degree
1    male  40     MA
2  female  57   BSCS
3    male  66     BE
4  female  61   BSCS
5  female  48     MA
6    male  25     MA
7  female  49     BE
8    male  52     ME
9  female  57     MA
10 female  35     MA
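The dataframe above changes on every run. One way to get the same random sample each time is to call set.seed() before sample(). The sketch below wraps the construction in a helper, make_data(), which is a hypothetical name introduced only for this illustration; the seed value 42 is arbitrary.

```r
# Hypothetical helper: builds the sample dataframe from a fixed seed
make_data <- function(seed, population = 10) {
  set.seed(seed)   # fix the state of the random number generator
  data.frame(
    gender = sample(c("male", "female"), population, replace = TRUE),
    age    = sample(25:75, population, replace = TRUE),
    degree = sample(c("MA", "ME", "BE", "BSCS"), population, replace = TRUE)
  )
}

first  <- make_data(42)
second <- make_data(42)
identical(first, second)   # TRUE: same seed, same dataframe
```

The values drawn for a given seed can still differ between R versions, so a seed makes a run repeatable on one setup rather than guaranteeing the exact table shown in the text.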

Let's sort the dataframe in the ascending order by using order() based on the variable age.

order(final.data$age)

The above code gives the following output:

6 10 1 5 7 8 2 9 4 3

Age 25 is at index 6, followed by age 35 at index 10, and so on: the indices are listed in
ascending order of age.

The code below uses order() on the age variable inside [ ] as the row subscript, so whole rows
are arranged in ascending order of age and the gender and degree information is printed along
with it.

final.data[order(final.data$age),]

The above code gives the following output:

   gender age degree
6    male  25     MA
10 female  35     MA
1    male  40     MA
5  female  48     MA
7  female  49     BE
8    male  52     ME
2  female  57   BSCS
9  female  57     MA
4  female  61   BSCS
3    male  66     BE

The output above shows the rows arranged in ascending order of age, each with its corresponding
gender and degree information.
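The same approach extends to descending order and to more than one sort key. The sketch below types in the sample data from the text by hand so the results are fixed. The result names by_age_desc, by_degree_age and by_degree_desc are made up for this illustration.

```r
# The sample data shown in the text, entered by hand so the result is fixed
final.data <- data.frame(
  gender = c("male", "female", "male", "female", "female",
             "male", "female", "male", "female", "female"),
  age    = c(40, 57, 66, 61, 48, 25, 49, 52, 57, 35),
  degree = c("MA", "BSCS", "BE", "BSCS", "MA", "MA", "BE", "ME", "MA", "MA")
)

# Oldest first: negate the numeric key
by_age_desc <- order(-final.data$age)
by_age_desc                       # 3 4 2 9 8 7 5 1 10 6

# Degree A to Z, then oldest first within each degree
by_degree_age <- order(final.data$degree, -final.data$age)
final.data[by_degree_age, ]

# A character key cannot be negated; with method = "radix",
# decreasing accepts one value per key instead
by_degree_desc <- order(final.data$degree, final.data$age,
                        decreasing = c(TRUE, FALSE), method = "radix")
final.data[by_degree_desc, ]
```

Negating a key works only for numeric columns; the per-key decreasing vector is one way to mix directions when a key is text.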

Sorting a vector

x <- c(6,7,1,2,5,9,8)
x
[1] 6 7 1 2 5 9 8

sort(x)
[1] 1 2 5 6 7 8 9

rank(x)
[1] 4 5 1 2 3 7 6

order(x)
[1] 3 4 5 1 2 7 6

x[order(x)]
[1] 1 2 5 6 7 8 9

x[order(-x)]
[1] 9 8 7 6 5 2 1

x[order(rank(x))]
[1] 1 2 5 6 7 8 9

x[order(rank(-x))]
[1] 9 8 7 6 5 2 1
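The outputs above hint at how rank() and order() relate: order() says which element goes in each sorted position, while rank() says which sorted position each element gets. A short sketch of that relationship, using the same x:

```r
x <- c(6, 7, 1, 2, 5, 9, 8)

# Without ties, rank(x) is the inverse permutation of order(x)
order(order(x))                    # 4 5 1 2 3 7 6
all(order(order(x)) == rank(x))    # TRUE

# sort() can produce descending order directly
sort(x, decreasing = TRUE)         # 9 8 7 6 5 2 1

# With ties, rank() averages the tied positions by default
rank(c(10, 20, 10))                # 1.5 3.0 1.5
```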

To sort a data frame in R, use the order() function. By default, sorting is ascending. Prepend
a minus sign to a numeric sorting variable to sort it in descending order. Here are some examples.

mtcars
dim(mtcars)
head(mtcars)
mtcars1=tail(mtcars)
attach(mtcars1)
newdata<-mtcars1[order(mpg),]          # ascending order based on mpg
newdata

newdata<-mtcars1[order(-mpg),]         # descending order based on mpg

newdata

newdata<-mtcars1[order(hp),]           # ascending order based on hp

newdata

newdata<-mtcars1[order(gear,carb),]    # ascending order based on gear, then carb

newdata

newdata<-mtcars1[order(gear,-carb),]   # ascending on gear, descending on carb

detach(mtcars1)
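attach() changes the search path for the whole session, and a forgotten detach() can leave column names masking other variables. A common alternative is with(), which evaluates an expression inside the data frame only for that one call. A minimal sketch (the result names by_mpg and by_gear_carb are made up here):

```r
mtcars1 <- tail(mtcars)   # last six rows of the built-in mtcars data

# with() evaluates order() inside mtcars1 without changing the search path
by_mpg       <- mtcars1[with(mtcars1, order(mpg)), ]
by_gear_carb <- mtcars1[with(mtcars1, order(gear, -carb)), ]

by_mpg$mpg           # mpg values in increasing order
by_gear_carb[, c("gear", "carb")]
```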

Without attach(mtcars1)

We need to write the dataset name followed by the $ symbol before each attribute name.

newdata<-mtcars1[order(mtcars1$mpg),]
newdata<-mtcars1[order(mtcars1$mpg, mtcars1$cyl),]
newdata<-mtcars1[order(mtcars1$mpg, -mtcars1$cyl),]
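The dplyr package mentioned at the start offers arrange() for the same task, with desc() marking a key as descending. The sketch below assumes dplyr is installed and skips the example otherwise; the result names are made up for this illustration.

```r
# Runs only if the dplyr package is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  mtcars1 <- tail(mtcars)

  # arrange() sorts rows; desc() sorts a key in descending order
  by_mpg      <- dplyr::arrange(mtcars1, mpg)
  by_mpg_desc <- dplyr::arrange(mtcars1, dplyr::desc(mpg))
  by_two_keys <- dplyr::arrange(mtcars1, gear, dplyr::desc(carb))

  print(by_mpg_desc)
}
```

Unlike the minus sign used with order(), desc() also works on character columns.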
