
R-tutorial-3

Çağatay Ünal

2024-10-17
Lecture 3

As in the last lesson, we are going to download some data and then manage, edit, and visualise it.
Downloading

We are going to use World Bank data again, this time from the World Development Indicators database. Our website is:
https://databank.worldbank.org/
After finding the World Development Indicators database, select the three countries we are interested in: Germany, Italy, and Turkiye. Then continue with the series. There are 1488 different indicators/series. For our economics purposes, we will use “Final consumption expenditure (% of GDP)”.
Mini Sum

Final consumption expenditure (formerly total consumption) is the sum of household final consumption expenditure (private consumption) and general government final consumption expenditure (general government consumption). (worldbank.data)
Downloading 2

We will download this data as a CSV document. The download arrives as a zip file: open it and move the CSV file into your working directory.
How do we find the working directory? Open RStudio, go to Tools, then Global Options. There you will see the default working directory option, and you can change where you want your work to be saved.
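You can also check and change the working directory from the console instead of the menu. This is only a small sketch: the folder name `r-tutorial-3` is a made-up example, and it is created inside a temporary folder so nothing on your computer is touched.

```r
# Where is R working right now?
current_dir <- getwd()
print(current_dir)

# Hypothetical project folder (created in a temporary location for this demo)
project_dir <- file.path(tempdir(), "r-tutorial-3")
dir.create(project_dir, showWarnings = FALSE)

# setwd() switches the working directory and returns the previous one
old_dir <- setwd(project_dir)
print(getwd())

# Files placed in this folder can now be read by name only
print(list.files())

# Go back to where we started
setwd(old_dir)
```

In your own session you would pass the folder that holds the downloaded CSV to `setwd()`.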
Data Cleaning 1

Now we are going to import our data. You have basically two options. The first is to import it from the Files tab, which sits in the lower-right pane in the default RStudio layout: find the file you downloaded and select it.
The second option is the Import Dataset button in the upper-right pane. Click it, choose “From Text (readr)”, find your data, and import it.
Data Cleaning 2

A quick but important warning: “always” give your data a short, meaningful name while you are importing it.
After you have imported it, copy the latest code from the console, like this:

library(readr)
data <- read_csv("afe4aa7c-584a-4bcc-b540-ba55a787153d_Seri
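If you want to try the import step without the real download, here is a small sketch. Everything in it is an assumption for practice only: the file name `consumption_demo.csv`, the column layout, and the numbers are all made up and are not World Bank values.

```r
library(readr)

# Write a tiny practice CSV into a temporary folder (made-up values)
csv_path <- file.path(tempdir(), "consumption_demo.csv")
writeLines(c(
  "Country Name,Country Code,2000 [YR2000],2001 [YR2001]",
  "Germany,DEU,76.1,76.5",
  "Italy,ITA,78.2,78.0"
), csv_path)

# Import it under a short, meaningful name
consumption_raw <- read_csv(csv_path, show_col_types = FALSE)

print(consumption_raw)
print(names(consumption_raw))
```

Note that `read_csv()` keeps the column names exactly as they are in the file, including the spaces and brackets.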
Data Cleaning 3

And please “always” remember to load the packages we installed before; R needs them loaded in every new session.

# if necessary, install first:

# install.packages("tibble")

library(tibble)
library(ggplot2)
library(tidyr)
Data Cleaning 4

We will transpose this data, because we “always” need the variables as columns and the observations as rows.
If you are not familiar with the transpose of a matrix, search for “Transpose of a matrix” on Google and you will remember the process from high school. It is very fundamental.
And now we need to transpose it:

data_t <- t(data)

View(data_t)
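To see what `t()` does, here is a minimal sketch on a toy table. The table `wide` and its numbers are made up for illustration.

```r
# Toy wide table: one row per country, one column per year (made-up values)
wide <- data.frame(
  Country = c("Germany", "Italy"),
  Y2000 = c(76.1, 78.2),
  Y2001 = c(76.5, 78.0)
)

# Transpose: rows become columns and columns become rows
wide_t <- t(wide)

print(dim(wide))    # 2 rows, 3 columns
print(dim(wide_t))  # 3 rows, 2 columns
print(wide_t)

# The result is a matrix, and because one column held text,
# every cell is now stored as text
print(is.character(wide_t))
```

This is why the later steps convert the result back to a data frame and then to numbers.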
Data Cleaning 5

Now we remove every column that contains an NA value from our data.

data_clean <- data_t[, colSums(is.na(data_t)) == 0]

View(data_clean)
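The same filter can be tried on a toy matrix to see how it works. The matrix `m` below is made up; its third column plays the role of a column that is mostly empty.

```r
# Toy transposed matrix (made-up values); matrix() fills column by column
m <- matrix(
  c("Germany", "76.1", "76.5",
    "Italy", "78.2", "78.0",
    "Last Updated", NA, NA),
  nrow = 3,
  dimnames = list(c("Country", "Y2000", "Y2001"), NULL)
)

# is.na() marks missing cells, colSums() counts them per column
na_per_column <- colSums(is.na(m))
print(na_per_column)

# Keep only the columns with zero missing cells
m_clean <- m[, na_per_column == 0]
print(m_clean)
```

If only one column could survive the filter, adding `drop = FALSE` inside the brackets keeps the result a matrix instead of a plain vector.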
Data Cleaning 6

As you can see, we do not need the first 4 rows. They are irrelevant, apart from the first one, which we reuse as the column names.
The first line of code takes the first row and uses it as the column names.

colnames(data_clean) <- data_clean[1, ]

And now we are going to remove our first four rows from the data.

data_clean <- data_clean[-(1:4), ]

View(data_clean)
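Here is a small sketch of the same two steps on a toy matrix. It is an assumption that the first row holds the country names; the toy data has only two label rows, so it drops two rows instead of four.

```r
# Toy transposed matrix (made-up values), one column per country
m <- matrix(
  c("Germany", "DEU", "76.1", "76.5",
    "Italy", "ITA", "78.2", "78.0"),
  nrow = 4,
  dimnames = list(
    c("Country Name", "Country Code", "2000 [YR2000]", "2001 [YR2001]"),
    NULL
  )
)

# Step 1: use the first row as the column names
colnames(m) <- m[1, ]

# Step 2: drop the label rows (two here, four in the tutorial data)
m <- m[-(1:2), ]

print(m)
```

The order matters: if the rows are removed first, the names in the first row are lost.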
Data Cleaning 7

We need to convert our data to a data frame. At the moment it is just a matrix of text, so R cannot treat the entries as a table of variables. Most of the tools we use next work on data frames, so “always” convert your data to a data frame before processing it.

data_clean <- data.frame(data_clean)

Data Cleaning 8

There is something wrong with the first column: the years do not appear as a real column, because they are stored as row names. This often happens when you transpose your data. To turn them into a column:
data_clean <- rownames_to_column(data_clean, var = "Years")

View(data_clean)
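A minimal sketch of these two steps on a toy matrix, assuming the tibble package is installed. The matrix `m` and its values are made up.

```r
library(tibble)

# Toy matrix with the year labels stored as row names (made-up values)
m <- matrix(
  c("76.1", "76.5", "78.2", "78.0"),
  nrow = 2,
  dimnames = list(c("2000 [YR2000]", "2001 [YR2001]"), c("Germany", "Italy"))
)

# Convert to a data frame; the years are still row names, not a column
df <- data.frame(m)
print(names(df))
print(rownames(df))

# Move the row names into a real column called "Years"
df <- rownames_to_column(df, var = "Years")
print(df)
```

After this step the years can be cleaned and plotted like any other column.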
Data Cleaning 9

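What is gsub? It is basically a find-and-replace function: it replaces every match of a pattern in a character vector with the replacement text.
The basic command is:
# gsub(pattern, replacement, x, ignore.case = FALSE, fixed = FALSE)

And we are going to use it to remove the “[YR....]” suffix from the year labels: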

data_clean$Years <- gsub("\\s*\\[YR[0-9]{4}\\]", "", data_clean$Years)

View(data_clean)
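You can test the pattern on a few sample labels before running it on the real column. The vector `labels` below is made up for this check.

```r
# Sample year labels; the last one has no suffix at all
labels <- c("2000 [YR2000]", "2001 [YR2001]", "2002")

# \\s* matches optional spaces, \\[YR[0-9]{4}\\] matches "[YR" + four digits + "]"
years <- gsub("\\s*\\[YR[0-9]{4}\\]", "", labels)

print(years)  # "2000" "2001" "2002"
```

Labels that do not match the pattern are returned unchanged, so the call is safe to run more than once.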
Data Cleaning 10

To do some math or visualisation, we need to convert all the columns to numeric.
data_clean[] <- lapply(data_clean, function(x) as.numeric(as.character(x)))

View(data_clean)
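Here is a small sketch of the conversion on a toy data frame. The values are made up, and the `".."` cell is an assumption about how a missing value might appear as text in the file.

```r
# Toy data frame where every column is still text (made-up values)
df <- data.frame(
  Years = c("2000", "2001"),
  Germany = c("76.1", ".."),
  stringsAsFactors = FALSE
)

# Convert every column to numeric; text that is not a number becomes NA.
# suppressWarnings() only hides the "NAs introduced by coercion" message here.
df[] <- lapply(df, function(x) suppressWarnings(as.numeric(as.character(x))))

str(df)
print(sum(is.na(df)))  # number of cells that could not be converted
```

Writing `df[] <-` instead of `df <-` keeps the object a data frame with the same column names.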
Data Cleaning 11

What is pivot_longer?
It reshapes data from a “wide” format to a “long” format.
# pivot_longer(data, cols, names_to = "name", values_to = "value")

In our example:
data_long <- pivot_longer(data_clean, cols = -Years, names_to = "variable", values_to = "value")

View(data_long)
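To see the difference between wide and long, here is a minimal sketch on a toy table, assuming the tidyr package is installed. The numbers are made up.

```r
library(tidyr)

# Wide format: one column per country (made-up values)
wide <- data.frame(
  Years = c(2000, 2001),
  Germany = c(76.1, 76.5),
  Italy = c(78.2, 78.0)
)

# Long format: one row per year and country
long <- pivot_longer(
  wide,
  cols = -Years,          # reshape every column except Years
  names_to = "variable",  # old column names go here
  values_to = "value"     # cell values go here
)

print(long)
```

The 2 x 2 block of values becomes 4 rows, which is the shape ggplot needs to draw one line per country.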
Data Visualisation 1
ggplot(data_long, aes(x = Years, y = value, color = variable)) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(linewidth = 1) +
  labs(title = "Data Visualization",
       x = "Year",
       y = "Value",
       color = "Variable") +
  theme_minimal()

[Figure: line chart titled "Data Visualization". x-axis: Year; y-axis: Value (ticks from 70.0 to 80.0); legend "Variable": Germany, Italy, Turkiye.]
