04 PSysCalls (Function)
POSIX
ESSENTIALS
Part I - SysCalls 1
●
Introduction
●
Files and directories
●
Processes
●
Inter Process Communication:
➔
Pipe
➔
Signal
➔
Socket
Introduction
POSIX also includes some standard C library functions
[Figure: overlap between the C standard library and POSIX; examples shown: printf(), fopen(), open()]
Portable programs
C language interface
INTERRUPT
Header files
POSIX defines a series of header files to #include:
<assert.h> <complex.h> <ctype.h> <dirent.h> <dlfcn.h> <errno.h>
<fcntl.h> <fenv.h> <float.h> <inttypes.h> <iso646.h> <limits.h>
<locale.h> <math.h> <pthread.h> <setjmp.h> <signal.h> <stdarg.h>
<stdbool.h> <stddef.h> <stdint.h> <stdio.h> <stdlib.h> <string.h>
<sys/stat.h> <tgmath.h> <time.h> <unistd.h> <utime.h> <wchar.h>
<wctype.h>
Frequently used header files:
errno.h grp.h math.h netdb.h pthread.h signal.h stdio.h stdlib.h
strings.h time.h unistd.h types.h stat.h ….
In general and error management
●
Most system calls return -1 in case of error
●
In this case they assign a specific error code to
the global variable
extern int errno;
Some system calls accept a variable number of arguments (only the first
ones may be mandatory – see man pages)
Errno.h
●
Header file errno.h holds the symbolic standard names of error
codes
●
More than 100 error codes, only a few are system dependent:
run-time diagnostics can be very accurate!
● Consult man errno for the list and details
●
Examples of frequently occurring error codes:
EPERM    1  Operation not permitted (e.g. not owner)
ENOENT   2  No such file or directory
ESRCH    3  No such process
EINTR    4  Interrupted system call
EIO      5  Input/output error
ECHILD  10  No child processes
EACCES  13  Permission denied
EBUSY   16  Device or resource busy
EEXIST  17  File exists
EPIPE   32  Broken pipe
perror
void perror (const char *str )
prints on standard error the message associated with the current value of errno, prefixed by str
str: prefix of the error message
str is used to identify the code point in which the error occurred:
…..
fd=open("nonexist.txt", O_RDONLY);
if (fd==-1) perror ("main");
…..
--> main: No such file or directory
man
●
A man page (short for manual page) is a form of software
documentation usually found on a Unix or Unix-like operating system.
Topics covered include computer programs (including library and
system calls), formal standards and conventions, and even abstract
concepts. A user may invoke a man page by issuing the man
command.
●
By default, man typically uses a terminal pager program such as
more or less to display its output.
●
To read a manual page for a Unix command, a user can type:
man [section] command_name
●
The section is optional.
man sections
●
Section Description
●
1 General commands
●
2 System calls
●
3 Library functions, covering in particular the C standard library
●
4 Special files (usually devices, those found in /dev) and drivers
●
5 File formats and conventions
●
6 Games and screensavers
●
7 Miscellanea
●
8 System administration commands and daemons
man example
Files primitives
• Creating files, directories, special files
• Open / Close files
• File access
• File and record locking
• Creating and destroying a link
• Reading file attributes
• Changing file attributes
• Changing the current directory
• Redirection and pipeline
the concept of file in Posix
A Posix file is much more than a collection of data on a disk.
opening a file
To use a file you must first open it with
opening / closing
int fd;
...
fd=open(pathname, ...);
...
read(fd, ...);
...
write(fd,...);
...
close(fd);
Note:
A file can be opened more than once, so that
multiple file descriptors can point to the same file
open
int open (const char *pathname, int flag [,…]);
• opens (or creates) the file pathname,
with policy defined by flag
• returns a file descriptor for further use (or -1 on error)
flag: O_RDONLY read-only
O_WRONLY write-only
close
Note:
When a process exits, all its files are
closed by an implicit close.
read
write
example
Copy standard input into standard output
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define BUFFSIZE 8192
int main(void)
{
    ssize_t n;
    char buf[BUFFSIZE];
    /* read returns 0 at end of file and -1 on error */
    while ((n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)
        if (write(STDOUT_FILENO, buf, n) != n)
            perror("main");
    if (n < 0)
        perror("main");
    exit(0);
}
example
Note:
• the correct file descriptors are:
STDIN_FILENO and STDOUT_FILENO
they are defined in <unistd.h>
0 and 1 can be used instead
• standard input and standard output do not need open/close
example
example
BUFFSIZE   User CPU (s)   System CPU (s)   Clock time (s)   # loops
1 23.8 397.9 423.4 1468802
2 12.3 202.0 215.2 734401
4 6.1 100.6 107.2 367201
8 3.0 50.7 54.0 183601
16 1.5 25.3 27.0 91801
32 0.7 12.8 13.7 45901
64 0.3 6.6 7.0 22950
128 0.2 3.3 3.6 11475
256 0.1 1.8 1.9 5738
512 0.0 1.0 1.1 2869
1024 0.0 0.6 0.6 1435
2048 0.0 0.4 0.4 718
4096 0.0 0.4 0.4 359
8192 0.0 0.3 0.3 180
16384 0.0 0.3 0.3 90
32768 0.0 0.3 0.3 45
65536 0.0 0.3 0.3 23
131072 0.0 0.3 0.3 12
example
Note:
The kernel makes heavy use of caching to limit physical I/O
[Figure: MEMORY and DISC; write(..., buf, ...) fills the cache in memory, which corresponds to clusters on disc]
flushing
●
A series of POSIX system calls provide I/O flushing, that is,
synchronizing a file's in-core state with the storage device
●
Different functions are available (most popular: sync, fflush,
fsync and others).
●
The simplest one is
lseek
off_t lseek(int fildes,off_t offset, int whence);
●
moves the current read/write position of the file fildes
by offset bytes, starting from the position specified
by whence (SEEK_SET, SEEK_CUR or SEEK_END)
Note:
lseek does not carry out I/O operations
link
rename
int rename (const char *old, const char *new);
unlink
directories
int mkdir(const char *path, mode_t mode);
Creates a directory
int rmdir(const char *path);
Removes a directory
directories
●
Portable directory structure, common syscalls:
struct dirent {
ino_t d_ino; /*i-number*/
char d_name[NAME_MAX+1]; /*filename*/
};
file status
file status
Header file stat.h:
file attributes
int chmod(const char *path, mode_t mode);
Changes permission of the file path as specified in mode
dup
dup
file system administration
POSIX does not include system calls for file system administration
concurrent access to files
●
Files (entities referred to by a file descriptor) are shareable
resources
●
No mutual exclusion is provided
●
Two (or more) processes can concurrently read from / write to
the same file
●
Race conditions are possible (data jamming)
●
Some special files used for IPC intrinsically provide mutual
exclusion (see e.g. pipes in the following)
●
System calls for mutual exclusion (called locking) are available
file and record locking
●
An entire file can be locked as well as some part of it
(record)
●
A lock can prevent simultaneous read/write by different
processes
●
A lock is advisory if it holds only for groups of processes
participating in the locking
●
A lock is mandatory if any process is excluded when one
process locks the file/record (discouraged in POSIX)
●
Locks must be set and freed like Dijkstra’s semaphores: a
lock is in principle a P operation, unlocking is V
file and record locking
●
locking / unlocking in POSIX is complex
●
several system calls are available
fcntl(), flock(), lockf()
●
The simplest system call is:
int flock(int fd, int operation);
file and record locking
POSIX file system calls 1/2
POSIX file system calls 2/2
file descriptor vs. streams
●
C language has a standard library for file management based
on the type FILE
● FILE is a type of a “file pointer” (or stream) variable
FILE * fp;
which is not a file descriptor.
●
A stream can be opened, closed, read and written by specific
C standard library functions (they all are prefixed by “f”):
fopen(), fclose(), fread(), fwrite(),
fscanf(), fprintf()
file descriptor vs. streams
●
a file descriptor can be derived from a stream with the
function fileno
int fileno( FILE * fp );
●
a file pointer can be derived from a file descriptor with the
function fdopen
FILE * fdopen( int fd, const char * mode );
process management
●
A process image file is composed of 4 main
parts
u-area
data
stack
text
process creation
●
A generic POSIX process is created by a parent
process by the fork system call
[Figure: fork() turns the father's image (u-area, data, stack, text) into two images, father and son (child), each with u-area, data, stack and text]
The son has the same execution environment as the father: u-area (program counter,
open files, working directory, ...), data and stack
fork
to the child: 0
to the parent: PID of the child
in case of error: -1
fork
pid = fork();
if (pid<0) {/*fork failed */ }
else if(pid>0) {/*father’s code */}
else {/*son’s code*/}
[Figure: father and son images side by side, each with u-area, data, stack and text]
exec
int exec (pathname, arguments)
execve
exec family
●
There are several exec system calls with small
differences
int execlp ( const char *filename, const char *arg0, ... ); (the "p" variants look up filename along PATH)
wait, waitpid
_exit, exit
●
_exit is the POSIX system call, exit is the standard C call
●
exit first performs the C library cleanup (atexit handlers, flushing of stdio buffers) and then calls _exit
times
clock_t times (struct tms *buf);
the sleep family
●
in several situations a process may need to suspend its execution for a
certain amount of time
●
the sleep family of system calls is available for this purpose
●
a sleep action moves the process to the waiting state for a
specified time interval; after that, the process goes to the ready
state
●
the family is composed of three system calls with different time
scales: sleep (seconds), usleep (microseconds) and
nanosleep (nanoseconds)
●
very short intervals may use busy waiting instead of
rescheduling (no context switch)
the sleep family
The sleep family
int nanosleep(const struct timespec *req,
struct timespec *rem);
struct timespec {
time_t tv_sec; /* seconds */
long tv_nsec; /* nanoseconds */
};
●
Suspends the execution of the caller until either:
1) at least the time specified in *req has elapsed, or
2) a signal (see signal section) has interrupted the caller.
● If the call is interrupted, nanosleep returns -1, sets errno to EINTR, and writes the
remaining time into the structure pointed to by rem. The value of *rem can then be
used to call nanosleep again and complete the specified pause.
● nanosleep is the most complete and reliable system call of the family.
Example 1: how the shell works
write(1, PROMPT, strlen(PROMPT)); //STD_OUT
while ((n=read(0,buffer,80)) != 0) { //STD_IN
/* command line processing, parse filename */
if ((pid=fork()) == 0) {
shell and redirection
●
The shell language includes redirection, background and pipeline commands
● Let “$” be the prompt message from the shell
● $ command > filename means: command redirects its STD_OUT onto
filename (writes on filename)
● $ command < filename means: command takes its STD_IN from
filename (reads from filename)
● $ command & means: command is executed in background (the shell returns
immediately to prompt without waiting command completion)
● $ command_1 | command_2 means: STD_OUT of command_1 is redirected
onto STD_IN of command_2 (a pipeline)
● >, <, |, & can be composed in several intuitive ways
● Any file descriptor n can be redirected with the syntax n> or n< (> and < as
above are shortcuts for 1> and 0<, respectively)
● STD_ERR can hence be redirected with 2< and 2>
shell and redirection
Examples using generic processes
Assume that all processes read by default from STD_IN, write on
STD_OUT and write errors on STD_ERR
● $ proc1> file
●
proc1 redirects its normal output to file
● $ proc1 | proc2
●
the output of proc1 is sent as input to proc2 (pipeline)
● $ proc1 < file
●
proc1 reads its input from a file
● $ proc1 < file1 > file2
●
proc1 reads from file1 and writes on file2
● $ proc1 2> logfile.txt | proc2 > file &
● $
●
proc1 sends its output to proc2; proc2 writes into file; error messages
from proc1 are written into a logfile; the whole command is executed in
background, returning to the shell as soon as execution starts
shell and redirection
The same examples using system processes
● $ man fork > textfile
●
writes a man page into textfile
● $ ls /usr/bin | grep icon
●
lists the directory /usr/bin and sends the output to the program grep (which
by default reads from STD_IN); grep shows the matching lines on the screen
● $ sort < textfile.txt
●
the program sort reads the text to sort from textfile.txt instead of STD_IN
● $ sort < textfile.txt > textfile_sorted.txt
●
same as above, but sort writes into a file
● $ ls -R / 2>/dev/null | grep '\.txt$' > result.txt &
●
lists recursively the entire file system, sends error messages to a dummy
output, sends the regular output to grep which filters a certain pattern, and writes
results into a text file; since execution takes many seconds, the whole pipeline
is executed in background so that the shell regains control immediately
Example 1: how the shell works
We can complete the previous example 1 with redirection and
background execution (pipeline will follow in part II)
if ((pid=fork()) == 0) {
if (exec(file,args)==-1) exit(1);
procid=wait(&status);
Example 2: how the shell works
write(1, PROMPT, strlen(PROMPT));
while ((n = read(0, buffer, 80)) != 0) {
    /* command line processing: parse filename, redirection on
       file f, and background_execution */
    if ((pid = fork()) == 0) {
        if (output_redirection) {
            /* redirects file descriptor 1 onto the file descriptor of file f */
            fd = creat("f", ...);
            close(1);
            dup(fd);
            close(fd);
        }
        if (exec(file, args) == -1) exit(1);
    }
    /* if "&" command, does not wait */
    if (!background_execution) {
        procid = wait(&status);
        if (status != 0) write(2, "cmd not found", 13);
    }
    write(1, PROMPT, strlen(PROMPT));
}