Operating Systems
Topic: Introduction to Operating Systems
Endadul Hoque
Acknowledgement
• Youjip Won (Hanyang University)
• OSTEP book – by Remzi and Andrea Arpaci-Dusseau
(University of Wisconsin)
A Sample program
#include <stdio.h> #include <stdio.h>
#define MAX 100
main()
main() {
{ int x;
int x;
for (x = 1; x <= 100; x++) {
x = 1; printf("%d\n", x);
while (x <= MAX) { }
printf("%d\n", x);
}
x++;
}
}
A Sample program
Some things you need to know about C:
#include <stdio.h>
(you probably can figure out most of it without
an explanation!)
#define MAX 100 •main() is the name of a function. It is very
main()
much the same as the main method in a Java
{ class.
int x; •printf() is also the name of a function, similar
to Java’s system.out.print()
x = 1;
while (x <= MAX) { •This program can use printf() function,
printf("%d\n", x); because of the line #include <stdio.h>, like
x++;
Java’s import java.io.*;
}
} •#define MAX 100 essentially defines a constant
(also called macro in C)
C Program: Source to execution
• The standard way to generate and
eventually execute a program is much the
same for all high-level languages
Source Intermediate Machine
code Compiler code Assembler code
(High-level (Assembly (often,
language) language) exec)
C Program: Source to execution
• Once we have a machine code
machine → linking and loading → program execution
code (program) (program)
C Program: Source to execution
• Without going into too much detail, Java
programs do not exactly go through this
set of steps. Instead, Java programs are
“executed” by an interpreter
Java → compiler → Java → Java interpreter
source (program) "byte codes" (program)
code
C Program: Source to execution
• For C programs, compilation has two
conceptually separate steps
C language → C preprocessor → compiler → assembly → …
source (program) (program) language
code
What happens when a program runs?
• A running program executes instructions.
1. The processor fetches an instruction from memory.
2. Decode: Figure out which instruction this is
3. Execute: i.e., add two numbers, access memory, check a
condition, jump to function, and so forth.
4. The processor moves on to the next instruction and so on.
Von Neumann model of computing
Von-Neumann Computing
https://www.javatpoint.com/von-neumann-model
https://computersciencewiki.org/index.php/The_mac
hine_instruction_cycle
An Operating System (OS)
• Responsible for
– Making it easy to run programs
– Allowing programs to share memory
– Enabling programs to interact with devices
OS is in-charge of making sure the system
operates correctly and efficiently.
How does OS ensure this?
One program is easy, but
we’ll have many programs.
What about then?
Magic is there’s no magic!
VIRTUALIZATION
Virtualization
• The OS takes a physical resource and transforms
it into a virtual form of itself.
– Physical resource: Processor, Memory, Disk …
• The virtual form is more general, powerful and
easy-to-use.
• Sometimes, we refer to the OS as a virtual
machine.
System Calls
• System call allows users to tell the OS what to do.
– The OS provides some interface (APIs, standard library).
– A typical OS exports a few hundred system calls.
• Run programs
• Access memory
• Access devices
https://www3.nd.edu/~pbui/teaching/cse.30341.fa19/ https://developer.ibm.com/articles/l-linux-kernel/
System Calls
• System call allows users to tell the OS what to do.
– The OS provides some interface (APIs, standard library).
– A typical OS exports a few hundred system calls.
• Run programs
OS •hides the details and complexity
Access memory away
from
• Access devicesthe user applications
https://www3.nd.edu/~pbui/teaching/cse.30341.fa19/ https://developer.ibm.com/articles/l-linux-kernel/
The OS as a resource manager
• The OS manage resources such as CPU,
memory and disk.
• The OS allows
– Many programs to run → Sharing the CPU
– Many programs to concurrently access their own
instructions and data → Sharing memory
– Many programs to access devices → Sharing disks
Virtualizing the CPU
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <sys/time.h>
4 #include <assert.h>
5 #include "common.h"
6
7 int
8 main(int argc, char *argv[])
9 {
10 if (argc != 2) {
11 fprintf(stderr, "usage: cpu <string>\n");
12 exit(1);
13 }
14 char *str = argv[1];
15 while (1) {
16 Spin(1); // Repeatedly checks the time and
returns once it has run for a second
17 printf("%s\n", str);
18 }
19 return 0;
20 }
Simple Example(cpu.c): Code That Loops and Prints
Virtualizing the CPU
• Execution result 1
prompt> gcc -o cpu cpu.c -Wall
prompt> ./cpu "A"
A
A
A
ˆC
prompt>
Run forever; Only by pressing “Control-c” we can halt the program
Virtualizing the CPU
• Execution result 2
prompt> ./cpu A & ; ./cpu B & ; ./cpu C & ; ./cpu D &
[1] 7353
[2] 7354
[3] 7355
[4] 7356
A
B
D
C
A Even though we have only
B one processor, all four of
D programs seem to be
C running at the same time!
A
...
Virtualizing the CPU
• The system has a very large number of virtual
CPUs.
– Turning a single CPU into a seemingly infinite number
of CPUs.
– Allowing many programs to seemingly run at once
→ Virtualizing the CPU
Virtualizing Memory
• The physical memory is an array of bytes.
• A program keeps all of its data structures in
memory.
– Read memory (load):
• Specify an address to be able to access the data
– Write memory (store):
• Specify the data to be written to the given address
Virtualizing Memory
• A program that Accesses Memory (mem.c)
1 #include <unistd.h>
2 #include <stdio.h>
3 #include <stdlib.h>
4 #include "common.h"
5
6 int
7 main(int argc, char *argv[])
8 {
9 int *p = malloc(sizeof(int)); // a1: allocate some memory
10 assert(p != NULL);
11 printf("(%d) address of p: %08x\n",
12 getpid(), (unsigned) p); // a2: print out the
address of the memory
13 *p = 0; // a3: put zero into the first slot of the memory
14 while (1) {
15 Spin(1);
16 *p = *p + 1;
17 printf("(%d) p: %d\n", getpid(), *p); // a4
18 }
19 return 0;
20 }
Virtualizing Memory
• The output of the program mem.c
prompt> ./mem
(2134) memory address of p: 00200000
(2134) p: 1
(2134) p: 2
(2134) p: 3
(2134) p: 4
(2134) p: 5
ˆC
– The newly allocated memory is at address 00200000.
– It updates the value and prints out the result.
Virtualizing Memory
• Running prompt> ./mem &; ./mem &
[1] 24113
mem.c [2] 24114
multiple times (24113) memory address of p: 00200000
(24114) memory address of p: 00200000
(24113) p: 1
(24114) p: 1
(24114) p: 2
(24113) p: 2
(24113) p: 3
(24114) p: 3
...
• As if each running program has its own private memory.
– Each running program has allocated memory at the same
address.
– Each seems to be updating the value at 00200000
independently.
Virtualizing Memory
• Each process accesses its own private virtual
address space.
– The OS maps address space onto the physical
memory.
– A memory reference within one running program
does not affect the address space of other processes.
– Physical memory is a shared resource, managed by
the OS.
CONCURRENCY
The Problem of Concurrency
• Where does it stem from?
• The OS is juggling many things at once, first
running one process, then another, and so forth.
• Modern multi-threaded programs also exhibit
the concurrency problem.
The Problem of Concurrency
• A multi-threaded program (thread.c)
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include "common.h"
4
5 volatile int counter = 0;
6 int loops;
7
8 void *worker(void *arg) {
9 int i;
10 for (i = 0; i < loops; i++) {
11 counter++;
12 }
13 return NULL;
14 }
15 ...
The Problem of Concurrency
• A multi-threaded program (thread.c)
1 16 #include
int <stdio.h>
2 17 #include
main(int <stdlib.h>
argc, char *argv[])
3 18 #include
{ "common.h"
4 19 if (argc != 2) {
5 20 volatile int counter = 0;
fprintf(stderr, "usage: threads <value>\n");
6 21 int loops; exit(1);
7 22 }
8 23 loops = atoi(argv[1]);
void *worker(void *arg) {
24 pthread_t p1, p2;
9 25 int i;
printf("Initial value : %d\n", counter);
1026 for (i = 0; i < loops; i++) {
1127 counter++;
Pthread_create(&p1, NULL, worker, NULL);
1228 } Pthread_create(&p2, NULL, worker, NULL);
1329 return NULL;
Pthread_join(p1, NULL);
1430 } Pthread_join(p2, NULL);
1531
... printf("Final value : %d\n", counter);
32 return 0;
33 }
The Problem of Concurrency
• A multi-threaded program (thread.c)
1 16 #include
int <stdio.h>
2 17 #include
main(int <stdlib.h>
argc, char *argv[])
3 18 The main"common.h"
#include
{ program creates two threads.
4 19
5 20
• Thread: runsfprintf(stderr,
within= the
if (argc != 2) {
volatile int counter 0;
same memory space. Each
"usage: threads <value>\n");
6 21 int thread
loops; startsexit(1);
running in a function called worker().
7 22 }
• worker(): increments a counter
8 23 loops = atoi(argv[1]);
void *worker(void *arg) {
24 pthread_t p1, p2;
9 25 int i;
printf("Initial value : %d\n", counter);
1026 for (i = 0; i < loops; i++) {
1127 counter++;
Pthread_create(&p1, NULL, worker, NULL);
1228 } Pthread_create(&p2, NULL, worker, NULL);
1329 return NULL;
Pthread_join(p1, NULL);
1430 } Pthread_join(p2, NULL);
1531
... printf("Final value : %d\n", counter);
32 return 0;
33 }
The Problem of Concurrency
• loops determines how many times each of the two workers will
increment the shared counter in a loop.
– loops: 1000.
prompt> gcc -o thread thread.c -Wall -pthread
prompt> ./thread 1000
Initial value : 0
Final value : 2000
– loops: 100000.
prompt> ./thread 100000
Initial value : 0
Final value : 143012 // huh??
prompt> ./thread 100000
Initial value : 0
Final value : 137298 // what the??
The Problem of Concurrency
• Why is this happening?
• Increment a shared counter → take three instructions.
1. Load the value of the counter from memory into register.
2. Increment it
3. Store it back into memory
• These three instructions do not execute
atomically. → Problem of concurrency happen.
PERSISTENCE
Persistence
• Devices such as DRAM store values in a volatile.
• Hardware and software are needed to store data
persistently.
– Hardware: I/O device such as a hard drive, solid-state
drives(SSDs)
– Software:
• File system manages the disk.
• File system is responsible for storing any files the user creates.
Persistence
• Create a file (/tmp/file) that contains the string “hello world”
1 #include <stdio.h>
2 #include <unistd.h>
3 #include <assert.h>
4 #include <fcntl.h>
5 #include <sys/types.h>
6
7 int
8 main(int argc, char *argv[])
9 {
10 int fd = open("/tmp/file", O_WRONLY | O_CREAT
| O_TRUNC, S_IRWXU);
11 assert(fd > -1);
12 int rc = write(fd, "hello world\n", 13);
13 assert(rc == 13);
14 close(fd);
15 return 0;
16 }
open(), write(), and close() system calls are routed to the part of
OS called the file system, which handles the requests
Persistence
• What OS does in order to write to disk?
– Figure out where on disk this new data will reside
– Issue I/O requests to the underlying storage device
• File system handles system crashes during write.
– Journaling or copy-on-write
– Carefully ordering writes to disk
DESIGN GOALS OF AN OS
Design Goals
• Build up abstraction
– Make the system convenient and easy to use.
• Provide high performance
– Minimize the overhead of the OS.
– OS must strive to provide virtualization without excessive
overhead.
• Protection between applications
– Isolation: Bad behavior of one does not harm other and the OS
itself.
Design Goals
• High degree of reliability
– The OS must also run non-stop.
• Other issues
– Energy-efficiency
– Security
– Mobility
Reading Material
• Chapter 2 of OSTEP book – by Remzi and Andrea
Arpaci-Dusseau (University of Wisconsin)
http://pages.cs.wisc.edu/~remzi/OSTEP/intro.pdf
Questions?