Cgroups
Source
...
A cgroup is a logical grouping of processes that can be used for resource management
in the kernel. Once a cgroup has been created, processes can be migrated in and out
of the cgroup via a pseudo-filesystem API (details can be found in the kernel source
file Documentation/cgroups/cgroups.txt).
Modern:
$ cat /sys/fs/cgroup/cgroup.controllers
blkio
memory
devices
freezer
net_cls
perf_event
net_prio
hugetlb
pids
rdma
misc
$
FYI, the lssubsys utility is part of the cgroup-tools package on Debian/Ubuntu, and
the libcgroup-tools package on SUSE.
>>
The CPU controller mechanism allows a system manager to control the percentage of
CPU time given to a cgroup. The CPU controller can be used both to guarantee that a
cgroup gets a minimum percentage of CPU time on the system, regardless of
other load on the system, and also to set an upper limit on the amount of CPU time
used by a cgroup, so that a rogue process can't consume all of the available CPU
time.
CPU scheduling is first of all done at the cgroup level, and then across the processes
within each cgroup. As with some other controllers, CPU cgroups can be nested, so
that the percentage of CPU time allocated to a top-level cgroup can be further
subdivided across cgroups under that top-level cgroup.
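As a (hedged) sketch of the "upper limit" case with the v1 cpu controller - assuming it is
mounted at /sys/fs/cgroup/cpu, and using an illustrative cgroup name, PID and values:
$ sudo mkdir /sys/fs/cgroup/cpu/capped
$ echo 100000 | sudo tee /sys/fs/cgroup/cpu/capped/cpu.cfs_period_us
$ echo 20000 | sudo tee /sys/fs/cgroup/cpu/capped/cpu.cfs_quota_us
  << at most 20 ms of runtime per 100 ms period, i.e. ~20% of one CPU >>
$ echo 1234 | sudo tee /sys/fs/cgroup/cpu/capped/tasks
  << move (an illustrative) PID 1234 into the cgroup >>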
The memory controller mechanism can be used to limit the amount of memory that a
process uses. If a rogue process runs over the limit set by the controller, the kernel
will page out that process, rather than some other process on the system.
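Similarly, a minimal (hedged) sketch with the v1 memory controller, assuming it is mounted at
/sys/fs/cgroup/memory (the cgroup name, PID and the 256 MB limit are illustrative):
$ sudo mkdir /sys/fs/cgroup/memory/memcap
$ echo 256M | sudo tee /sys/fs/cgroup/memory/memcap/memory.limit_in_bytes
$ echo 1234 | sudo tee /sys/fs/cgroup/memory/memcap/tasks
  << place (an illustrative) PID 1234 under the limit >>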
...
Resources
See <kernel-src-tree>/Documentation/cgroups
$ ls git-kernel/Documentation/cgroups/ |col
00-INDEX
blkio-controller.txt
cgroups.txt
cpuacct.txt
cpusets.txt
devices.txt
freezer-subsystem.txt
hugetlb.txt
memcg_test.txt
memory.txt
net_cls.txt
net_prio.txt
unified-hierarchy.txt
$
https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
man systemd.resource-control
[Note: this material below is taken from the public RHEL 7 Resource Guides; thus, it pertains
specifically to RHEL 7].
The control groups, abbreviated as cgroups in this guide, are a Linux kernel feature that allows you
to allocate resources — such as CPU time, system memory, network bandwidth, or combinations of
these resources — among hierarchically ordered groups of processes running on a system.
By using cgroups, system administrators gain fine-grained control over allocating, prioritizing,
denying, managing, and monitoring system resources. Hardware resources can be smartly divided
up among applications and users, increasing overall efficiency.
Control Groups provide a way to hierarchically group and label processes, and to apply resource
limits to them. Traditionally, all processes received a similar amount of system resources that the
administrator could modulate with the process niceness value. With this approach, applications that
involved a large number of processes got more resources than applications with few processes,
regardless of the relative importance of these applications.
Red Hat Enterprise Linux 7 moves the resource management settings from the process level to the
application level by binding the system of cgroup hierarchies with the systemd unit tree. Therefore,
you can manage system resources with systemctl commands, or by modifying systemd unit files.
In previous versions of Red Hat Enterprise Linux, system administrators built custom cgroup
hierarchies with use of the cgconfig command from the libcgroup package. This package is now
deprecated and it is not recommended to use it since it can easily create conflicts with the default
cgroup hierarchy. However, libcgroup is still available to cover for certain specific cases, where
systemd is not yet applicable, most notably for using the net-prio subsystem. See Chapter 3, Using
libcgroup Tools.
The aforementioned tools provide a high-level interface to interact with cgroup controllers (also
known as subsystems) in Linux kernel. The main cgroup controllers for resource management are
cpu, memory and blkio, see Available Controllers in Red Hat Enterprise Linux 7 for the list of
controllers enabled by default. For detailed description of resource controllers and their
configurable parameters, refer to Controller-Specific Kernel Documentation.
<<
Cgroup Terminology:
• A *cgroup* associates a set of tasks with a set of parameters for one or more subsystems.
• A *subsystem* is a module that makes use of the task grouping facilities provided by
cgroups to treat groups of tasks in particular ways.
• A subsystem is also called a cgroup resource *controller*.
Eg. cpu, cpuset, memory, blkio, net_cls, freezer, etc.
• A *hierarchy* is a set of cgroups arranged in a tree, such that every task in the system is in
exactly one of the cgroups in the hierarchy, and a set of subsystems (or resource controllers);
each subsystem has system-specific state attached to each cgroup in the hierarchy. Each
hierarchy has an instance of the cgroup virtual filesystem associated with it (IOW, it's
mounted into the root filesystem via the “cgroup” VFS type).
>>
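As the last terminology bullet notes, each hierarchy has its own cgroup-type mount; on a typical
systemd-based (v1) system these can be listed with:
$ mount -t cgroup        << one mount (line) per hierarchy, showing the controllers attached to it >>
$ cat /proc/cgroups      << the controllers known to the kernel, with their hierarchy IDs >>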
...
A resource controller, also called a cgroup subsystem, represents a single resource, such as CPU
time or memory. The Linux kernel provides a range of resource controllers, that are mounted
automatically by systemd. Find the list of currently mounted resource controllers in
/proc/cgroups, or use the lssubsys monitoring tool. In Red Hat Enterprise Linux 7, systemd mounts
the following controllers by default (among others):
• memory — sets limits on memory use by tasks in a cgroup, and generates automatic reports
on memory resources used by those tasks;
• net_cls — tags network packets with a class identifier (classid) that allows the Linux
traffic controller (the tc command) to identify packets originating from a particular cgroup
task;
• perf_event — enables monitoring cgroups with the perf tool;
• hugetlb — allows the use of large (huge) virtual memory pages, and the enforcement of resource
limits on these pages.
The Linux Kernel exposes a wide range of tunable parameters for resource controllers that can be
configured with systemd. See the kernel documentation (list of references in Controller-Specific
Kernel Documentation) for detailed description of these parameters.
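For example (a hedged sketch using the RHEL 7-era resource-control properties; httpd.service is
just an illustrative unit), a unit's CPU and memory controllers can be tuned like so:
~]# systemctl set-property httpd.service CPUShares=512 MemoryLimit=1G
~]# systemctl set-property --runtime httpd.service CPUShares=512
  << --runtime makes the change non-persistent (until the next boot) >>
See systemd.resource-control(5) for the full list of properties.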
<<
Eg. On an Ubuntu 16.04.2 LTS desktop class system:
# cat /etc/issue
Ubuntu 16.04.2 LTS \n \l
#
# cat /proc/cgroups
#subsys_name hierarchy num_cgroups enabled
cpuset 5 1 1
cpu 6 75 1
cpuacct 6 75 1
blkio 11 73 1
memory 8 95 1
devices 2 73 1
freezer 3 8 1
net_cls 9 1 1
perf_event 7 1 1
net_prio 9 1 1
hugetlb 4 1 1
pids 10 74 1
#
>>
...
...
cpu.shares - The default value is 1024. This gives any process in this cgroup 1024 out of 1024
"CPU shares". In other words if you lower this value it will limit the process. For example: if I set
this value to 512 then the process will receive a maximum of 50% of the CPU if and only if another
process is also requesting CPU time (ignoring any nice and realtime values you may have set). It
still has the option to consume 100% of the idle CPU time.
cpu.cfs_period_us - The default value is 100000, i.e. 100 ms: the length, in microseconds, of the
period over which the CFS scheduler accounts the cgroup's CPU time. It does little on its own.
cpu.cfs_quota_us - The default value is -1, which means no limit is enforced. Any other value is in
microseconds; it may even exceed cpu.cfs_period_us, which grants the cgroup more than one
CPU's worth of runtime per period on a multicore system.
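At the filesystem level, the 2% limit used in the example below boils down to something like this
(the cgroup name 'cpulimited' is only illustrative):
$ echo 50000 | sudo tee /sys/fs/cgroup/cpu/cpulimited/cpu.cfs_period_us
$ echo 1000  | sudo tee /sys/fs/cgroup/cpu/cpulimited/cpu.cfs_quota_us
  << 1,000 us of runtime per 50,000 us period => ~2% of one CPU >>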
Example
In this first example I have set cpu.shares = 100 for the matho-primes process, which gives the process a
weight of 100 relative to the default of 1024 shares.
As you can see this has not throttled my process. Because the system has CPU time to spare the process still
consumes all that it can.
In this next example I set cpu.cfs_period_us = 50000 and cpu.cfs_quota_us = 1000 for the same process.
This has had the desired effect. For every 50,000 µs time slice, the process is only allowed to use 1,000 µs
(2%) and is paused until the next time slice is available. This is true regardless of the current system demand.
(Note: the process can still receive less than its 2% allotted time if the system is heavily loaded or a higher
priority process demands the time.)
I can check the amount of throttling that has been done at any time:
$ cat cpu.stat
nr_periods 336
nr_throttled 334
throttled_time 16181179709
To launch the process within the cgroup I used the following command:
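(The command itself is not reproduced here; with libcgroup's cgexec, and assuming the cgroup was
named 'cpulimited', it would have been along these lines - the cgroup name and the program
arguments are assumptions, shown only for illustration:)
$ cgexec -g cpu:cpulimited ./matho-primes 0 9999999999 > /dev/null &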
Summary
Once running, the cgroup can be edited by the owner of that group, or the process can be moved to a different
cgroup. This can be handy, for example, with a cron job that gives a process more CPU time at certain times of the day.
Notes:
• The current (April 2015) Linux cgroup documentation doesn't mention the cpu subsystem at all and has
introduced the cpusets subsystem, but the two do not do the same job. It is not clear if this type of throttling
capability has been removed altogether or moved to a different area of the kernel.
• There are, of course, bugs.
• Just because you have a process in one cgroup, doesn't mean it cannot also be in another.
• For different subsystems (cpu, memory, freezer, etc.), a process can be placed in different cgroups.
• Multiple processes can and do share cgroups, but only when told to do so.
• Child processes remain inside the cgroups unless moved out of them, and only if they are allowed to be
moved (set by policy).
• cgroups are hierarchical and you can create sub-cgroups to limit certain processes further. A sub-cgroup
cannot receive more resources than its parent cgroup.
• There may be a slight performance penalty depending on your choice for period and quota. You will only really
need to worry about this on highly optimised or incredibly large systems.
• cgroups do not offer virtualisation or jailing of a process, though they can be used alongside these systems,
and are in many circumstances (such as Android and LXC).
• You can still set the nice and realtime values to give processes certain priorities. This will not affect the
maximum CPU time allowed by the cgroup and the CPU time will be shared in a complex manner between all
processes, as one would expect from a decent operating system.
• The cpu.cfs* values of course refer to the CFS scheduler, the cpu.rt* values refer to the realtime scheduler. It
is unlikely you will want to change the realtime values unless you want fine-grained control over realtime
processes.
Also see:
Linux cgroups: limit CPU usage in absolute values which do not depend on CPU speed
“... There are at least three ways in which you can control how much CPU time a process
gets:
...”
nice value:   -20          0          +19
             (best)    (default)    (worst)
Sample Runs:
Notice the output CPU %age from htop.
<< restrict each instance to run on CPU #1 only (available CPUs: [0-3])
>>
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
[1] 23752
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
[2] 23753
Notice how they together take up 100% load on that CPU, and equally share CPU bandwidth
between them.
1.2 Restrict primegen to 1 CPU, run 2 instances but use (re)nice to lower the priority of one of
them:
<< restrict each instance to run on CPU #1 only (available CPUs: [0-3])
>>
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
[1] 23945
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
[2] 23946
$ renice -n 10 -p 23945 << restrict cpu bandwidth of PID 23945 via the
(re)nice ! >>
23945 (process ID) old priority 0, new priority 10 << 10 is worse! >>
$
Notice how now they together take up 100% load on that CPU, but do not equally share the CPU
bandwidth between them – the instance with the higher nice value (i.e. lower priority) is relegated
to a (much) smaller CPU share.
Source
“ ... The cpulimit tool curbs the CPU usage of a process by pausing the process at different
intervals to keep it under the defined ceiling. It does this by sending SIGSTOP and SIGCONT
signals to the process. It does not change the nice value of the process, instead it monitors
and controls the real-world CPU usage.
cpulimit is useful when you want to ensure that a process doesn't use more than a
certain portion of the CPU. The disadvantage over nice is that the
process can't use all of the available CPU time when the system is idle.
This can also cause issues with interactive shells that detect or
otherwise depend on SIGSTOP/SIGCONT. For example, you may place a job
in the foreground, only to see it immediately stopped and restarted in
the background. (See also <http://bugs.debian.org/558763>.) ...”
$ cpulimit
Error: You must specify a target process
CPUlimit version 2.1
Usage: cpulimit TARGET [OPTIONS...] [-- PROGRAM]
TARGET must be exactly one of these:
-p, --pid=N pid of the process
-e, --exe=FILE name of the executable program file
The -e option only works when
cpulimit is run with admin rights.
-P, --path=PATH absolute path name of the
executable program file
OPTIONS
-b --background run in background
-c --cpu=N override the detection of CPUs on the machine.
2.1 Restrict primegen to 1 CPU (via taskset), run one instance, wrapped around cpulimit forcing
50% max CPU bandwidth:
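(The actual command was not captured; a typical sequence, using cpulimit's -p/--limit/-b options,
might look like this - the 50% value maps to the limit stated above:)
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
$ cpulimit -p 25132 --limit=50 -b
  << cap PID 25132 (seen in the output below) at ~50% CPU; cpulimit runs in the background >>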
$ pkill primegen
$ Process 25132 dead!
2.2 Run three instances of primegen on 1 CPU (via taskset), restricting them to 30%, 60% and 10%
CPU bandwidth respectively:
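(Again, the launch commands are not shown; one plausible way is to attach cpulimit to each PID.
Which PID received which limit is an assumption; only the PIDs themselves appear in the output below:)
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
$ cpulimit -p 25340 --limit=30 -b
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
$ cpulimit -p 25343 --limit=60 -b
$ taskset 02 ./primegen 1 1000000000 >/dev/null &
$ cpulimit -p 25346 --limit=10 -b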
$ pkill primegen
$ Process 25340 dead!
Process 25343 dead!
Process 25346 dead!
Source
“... Control groups (cgroups) are a Linux kernel feature that allows you to specify how
the kernel should allocate specific resources to a group of processes. With cgroups
you can specify how much CPU time, system memory, network bandwidth, or
combinations of these resources can be used by the processes residing in a certain
group.
The advantage of control groups over nice or cpulimit is that the limits are
applied to a set of processes, rather than to just one.
Also, nice or cpulimit only limit the CPU usage of a process, whereas cgroups can
limit other process resources. ...”
Source
“Control groups can be used << accessed >> in multiple ways:
$ ls /sys/fs/cgroup/
blkio/ cpuacct@ cpuset/ freezer/ memory/ net_cls,net_prio/
perf_event/ systemd/
cpu@ cpu,cpuacct/ devices/ hugetlb/ net_cls@ net_prio@
pids/
$
$ cgcreate -g cpu:/cgcpu1_more
cgcreate: can't create cgroup /cgcpu1_more: Cgroup, operation not
allowed
$ sudo cgcreate -g cpu:/cgcpu1_more
$ ls /sys/fs/cgroup/cpu,cpuacct/
cgcpu1_more/ cgroup.sane_behavior cpuacct.usage_percpu
cpu.shares notify_on_release tasks
cgroup.clone_children cpuacct.stat cpu.cfs_period_us
cpu.stat release_agent user.slice/
cgroup.procs cpuacct.usage cpu.cfs_quota_us
init.scope/ system.slice/
$ cat /sys/fs/cgroup/cpu,cpuacct/cgcpu1_more/cpu.shares
1024 << Leave its ‘cpu share’ at the default (1024) => it can use 100%
cpu bandwidth >>
$
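(The second cgroup used below, cgcpu2_less, is created the same way; giving it a lower cpu.shares
value - the 256 here is purely illustrative - gives it a proportionally smaller slice whenever the
CPU is contended:)
$ sudo cgcreate -g cpu:/cgcpu2_less
$ echo 256 | sudo tee /sys/fs/cgroup/cpu,cpuacct/cgcpu2_less/cpu.shares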
3.1: execute our ‘primegen’ app within the first CPU Cgroup
$ sudo cgexec -g cpu:cgcpu1_more ./primegen 1 1000000000 >/dev/null &
[1] 5404
If you run htop you will see that the process is taking all of the available CPU time.
This is because when a single process is running, it uses as much CPU as necessary,
regardless of which cgroup it is placed in. The CPU limitation only comes into effect when
two or more processes compete for CPU resources.
<< Note: FAQ- One can move one or more processes into a given cgroup via the cgclassify(1) command. >>
3.2: Execute the ‘primegen’ app 2 instances: one within the first ‘more’ CPU Cgroup, 2nd within
the ‘less’ cgroup
$ sudo /bin/bash
Password: xxx
#
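(The launch commands are not shown above; judging by the job list when they are killed later, they
were along these lines, run from the root shell:)
# cgexec -g cpu:cgcpu1_more ./primegen 1 1000000000 > /dev/null &
# cgexec -g cpu:cgcpu2_less ./primegen 500 1000000000 > /dev/null &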
You will notice that the first process consumes pretty much 100% of the CPU bandwidth; the second
instance consumes a lot less (it varies, and only when it gets a chance to run).
3.3 Now we run a third instance of the process - within the ‘less’ cgroup:
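(Again reconstructed from the job list at the end:)
# cgexec -g cpu:cgcpu2_less ./primegen 600 1000000000 > /dev/null &
<< a fourth instance, ./primegen 700 ..., is evidently added the same way before the three-way
split described below >>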
Interestingly, the latter two processes will now split their available CPU bandwidth within their
cgroup!
A further splitting of CPU occurs now between the 3 processes in the ‘less’ cgroup, each getting
approximately 15%-17% CPU bandwidth.
The first process in the ‘more’ cgroup happily continues to consume 100%.
# kill %1 %2 %3 %4
#
[1] Terminated cgexec -g cpu:cgcpu1_more ./primegen 1 1000000000 >
/dev/null
[2] Terminated cgexec -g cpu:cgcpu2_less ./primegen 500 1000000000
> /dev/null
[3]- Terminated cgexec -g cpu:cgcpu2_less ./primegen 600 1000000000
> /dev/null
[4]+ Terminated cgexec -g cpu:cgcpu2_less ./primegen 700 1000000000
> /dev/null
#
<<
See the Cgroups v2 CPU example here:
https://github.com/PacktPublishing/Linux-Kernel-Programming/tree/master/ch11/
cgroups_v2_cpu_eg
>>
CGManager
One would expect some management utilities for administrators and even users to manage cgroups.
There are several: a well-accepted one from the LXC project (?) is Cgroup Manager (cgm). From
the website:
“CGManager is a central privileged daemon that manages all your cgroups for you
through a simple D-Bus API. It's designed to work with nested LXC containers
as well as accepting unprivileged requests including resolving user namespaces UIDs/GIDs. ...”
$ man -k cgm
cgm (1) - a client script for cgmanager
cgmanager (8) - a daemon to manage cgroups
cgproxy (8) - a proxy for cgmanager
$
<< RHEL 7 uses systemd for cgroup management (and not libcgroup, which was the preferred
interface up to RHEL 6). >>
“This guide focuses on utilities provided by systemd that are preferred as a way of cgroup
management and will be supported in the future. Previous versions of Red Hat Enterprise Linux
used the libcgroup package for the same purpose. This package is still available to assure backward
compatibility (see Warning), but it will not be supported in the future versions of Red Hat
Enterprise Linux.”
...
Because cgroups “live” in a pseudo-filesystem (of type cgroup), no cgroups exist when the system
first boots up. The cgroups have to be mounted and set up each time.
By default, when a Linux system boots up, all processes belong to a control group called the “root
hierarchy”, with equal resource sharing and prioritization.
The user (or more likely, scripts) can then create additional cgroups or hierarchies under the root
cgroup, partitioning processes and performing resource sharing according to the project's goals.
Ref:
http://www.janoszen.com/2013/02/06/limiting-linux-processes-cgroups-explained/
...
Example:
On a QEMU-emulated ARM-Linux:
First mount a pseudo-filesystem of type tmpfs, which will act as the root or base point for the cgroup
hierarchy.
CPUset
We've booted the system as a dual-core ARM Cortex-A9 (by using the “-smp 2,sockets=1” QEMU
parameter) :
processor : 1
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 474.31
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc09
CPU revision : 0
ARM # cd /cgroup
/cgroup/
|-- voip/
|-- cpuset_cg/
Thus we now create and mount folders appropriately, which will in effect become the cgroup
“hierarchy”.
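(The commands are not reproduced here; a minimal sketch of the usual sequence, with names as per
the tree above, would be:)
ARM # mount -t tmpfs cgroup_root /cgroup
ARM # mkdir /cgroup/cpuset_cg
ARM # mount -t cgroup -o cpuset cpuset_cg /cgroup/cpuset_cg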
ARM # cd cpuset_cg/
ARM # cat tasks
1
2
3
4
5
6
7
8
9
10
11
12
206
209
211
219
230
331
332
341
347
432
461
576
592
651
653
658
659
665
741
ARM #
We can change both these: the tasks in the CG and the CPU affinity of those tasks, simply by
echo'ing different values into the (virtual) files.
ARM # ls cpuset_cg/
cgroup.clone_children cpuset.memory_pressure_enabled
cgroup.procs cpuset.memory_spread_page
cgroup.sane_behavior cpuset.memory_spread_slab
cpuset.cpu_exclusive cpuset.mems
cpuset.cpus cpuset.sched_load_balance
cpuset.mem_exclusive cpuset.sched_relax_domain_level
cpuset.mem_hardwall notify_on_release
cpuset.memory_migrate release_agent
cpuset.memory_pressure tasks
ARM #
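For example (the child cgroup name and values are illustrative), working within /cgroup/cpuset_cg:
ARM # mkdir grp1
ARM # echo 1 > grp1/cpuset.cpus    << confine this group's tasks to CPU #1 >>
ARM # echo 0 > grp1/cpuset.mems    << must be set before tasks can be attached >>
ARM # echo 331 > grp1/tasks        << move PID 331 (seen in the tasks list above) into the group >>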
...
Another example:
ARM # pwd
/sys/fs/cgroup
ARM #
Create dir and mount the “cpu” subsystem (resource controller) filesystem
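(A sketch of the step just described; the directory name follows the usual convention:)
ARM # mkdir /sys/fs/cgroup/cpu
ARM # mount -t cgroup -o cpu cpu /sys/fs/cgroup/cpu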
CPU shares:
Note:
Automatic process grouping (a.k.a. "the patch that does wonders")
The most impacting feature in this release is the so-called "patch that does wonders", a patch that
changes substantially how the process scheduler assigns shares of CPU time to each process. With
this feature the system will group all processes with the same session ID as a single scheduling
entity.
Example: Let's imagine a system with six CPU-hungry processes, with the first four sharing the
same session ID and the remaining two each running in its own separate session.
The session ID is a property of processes in Unix systems (you can see it with commands like ps -
eo session,pid,cmd). It is inherited by forked child processes, which can start a new session using
setsid(2). The bash shell uses setsid(2) every time it is started, which means you can run a "make -j
20" inside a shell in your desktop and not notice it while you browse the web.
This feature is implemented on top of group scheduling (merged in 2.6.24). You can disable it in
/proc/sys/kernel/sched_autogroup_enabled
Code: (commit)
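To see the grouping and the knob in action (a minimal sketch, using the commands and path
mentioned above):
$ ps -eo session,pid,cmd | head                       << session IDs define the automatic groups >>
$ cat /proc/sys/kernel/sched_autogroup_enabled        << 1 = autogrouping on >>
# echo 0 > /proc/sys/kernel/sched_autogroup_enabled   << disable it (as root) >>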
<<
Src: https://wiki.archlinux.org/index.php/Cgroups
Note that the memory limit applies to RAM use only -- once tasks hit this limit, they will begin to
swap. But it won't affect the performance of other processes significantly.
Similarly you can change the CPU priority ("shares") of this group. By default all groups have 1024
shares. A group with 100 shares will get a ~10% portion of the CPU time:
# echo 100 > /sys/fs/cgroup/cpu/groupname/foo/cpu.shares
...
>>
FAQ: how do you find out which cgroup(s) a given process belongs to?
# ps -A|grep qemu
5002 pts/17 00:18:18 qemu-system-arm
# cat /proc/5002/cgroup
10:hugetlb:/user.slice/user-1000.slice/session-c2.scope
9:net_cls,net_prio:/user.slice/user-1000.slice/session-c2.scope
8:devices:/user.slice/user-1000.slice/session-c2.scope
7:freezer:/user.slice/user-1000.slice/session-c2.scope
6:perf_event:/user.slice/user-1000.slice/session-c2.scope
5:cpuset:/user.slice/user-1000.slice/session-c2.scope
4:memory:/user.slice/user-1000.slice/session-c2.scope
3:cpu,cpuacct:/user.slice/user-1000.slice/session-c2.scope
2:blkio:/user.slice/user-1000.slice/session-c2.scope
1:name=systemd:/user.slice/user-1000.slice/session-c2.scope
#
4:memory:/user.slice/user-1000.slice/session-c2.scope
# ls /sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope
cgroup.clone_children memory.kmem.limit_in_bytes
memory.kmem.tcp.usage_in_bytes memory.oom_control
memory.use_hierarchy
cgroup.event_control memory.kmem.max_usage_in_bytes
memory.kmem.usage_in_bytes memory.pressure_level
notify_on_release
cgroup.procs memory.kmem.slabinfo
memory.limit_in_bytes memory.soft_limit_in_bytes tasks
memory.failcnt memory.kmem.tcp.failcnt
memory.max_usage_in_bytes memory.stat
memory.force_empty memory.kmem.tcp.limit_in_bytes
memory.move_charge_at_immigrate memory.swappiness
memory.kmem.failcnt memory.kmem.tcp.max_usage_in_bytes
memory.numa_stat memory.usage_in_bytes
#
From Documentation/cgroups/memory.txt
...
Brief summary of control files.
Run a script that recursively prints the contents of readable files under a given starting folder:
# xplore_fs.sh /sys/fs/cgroup/memory/user.slice/user-1000.slice/session-
c2.scope/
===================== SUMMARY LIST of Files ==========================
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.pressure_level
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.kmem.max_usage_in_bytes
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.use_hierarchy
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.swappiness
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/tasks
...
-----------------------------------------------------------------------
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/ :
<dir>
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.pressure_level : cat: /sys/fs/cgroup/memory/user.slice/user-
1000.slice/session-c2.scope/memory.pressure_level: Invalid argument
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.kmem.max_usage_in_bytes : 0
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.use_hierarchy : 1
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.swappiness : 60
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
tasks : 933
1054
1064
1065
1066
...
<< memory.limit_in_bytes # set/show limit of memory usage >>
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.limit_in_bytes : 9223372036854771712 (9007199254740988 KB)
(8796093022207.00 MB)(8589934591.00 GB) << 8 EB ! >>
...
...
<< memory.usage_in_bytes # show current usage for memory >>
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.usage_in_bytes : 4593033216 (4485384 KB)(4380.00 MB)( 4.00 GB)
...
<< memory.numa_stat # show the number of memory usage per numa node >>
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.numa_stat : total=1121292 N0=1121292
file=278154 N0=278154
anon=843138 N0=843138
unevictable=0 N0=0
hierarchical_total=1121292 N0=1121292
hierarchical_file=278154 N0=278154
hierarchical_anon=843138 N0=843138
hierarchical_unevictable=0 N0=0
...
>>
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.soft_limit_in_bytes : 9223372036854771712 (9007199254740988 KB)
(8796093022207.00 MB)(8589934591.00 GB) << 8 EB >>
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-c2.scope/
memory.kmem.tcp.failcnt : 0
#
Resource:
Hands on Linux sandbox with namespaces and cgroups
Rule 1
A single hierarchy can have one or more subsystems attached to it (for example, the cpu and
memory subsystems can both be attached to a single hierarchy).
Rule 2
Any single subsystem (such as cpu) cannot be attached to more than one hierarchy if one
of those hierarchies has a different subsystem attached to it already.
As a consequence, the cpu subsystem can never be attached to two different hierarchies if
one of those hierarchies already has the memory subsystem attached to it. However, a single
subsystem can be attached to two hierarchies if both of those hierarchies have only that
subsystem attached.
Figure 1.2. Rule 2—The numbered bullets represent a time sequence in which the
subsystems are attached.
Rule 3
Each time a new hierarchy is created on the system, all tasks on the system are initially
members of the default cgroup of that hierarchy, which is known as the root cgroup. For
any single hierarchy you create, each task on the system can be a member of exactly
one cgroup in that hierarchy. A single task may be in multiple cgroups, as long as each of
those cgroups is in a different hierarchy. As soon as a task becomes a member of a second
cgroup in the same hierarchy, it is removed from the first cgroup in that hierarchy. At no
time is a task ever in two different cgroups in the same hierarchy.
When the first hierarchy is created, every task on the system is a member of at least one
cgroup: the root cgroup. When using cgroups, therefore, every system task is always in at
least one cgroup.
Rule 4
Any process (task) on the system which forks itself creates a child task. A child task
automatically inherits the cgroup membership of its parent but can be moved to different
cgroups as needed. Once forked, the parent and child processes are completely
independent.
Figure 1.4. Rule 4—The numbered bullets represent a time sequence in which the task
forks.
• Because a task can belong to only a single cgroup in any one hierarchy, there is only
one way that a task can be limited or affected by any single subsystem. This is
logical: a feature, not a limitation.
• You can group several subsystems together so that they affect all tasks in a single
hierarchy. Because cgroups in that hierarchy have different parameters set, those
tasks will be affected differently.
• The design allows for simple cgroup usage, such as setting a few parameters for
specific tasks in a single hierarchy, such as one with just
the cpu and memory subsystems attached.
• The design also allows for highly specific configuration: each task (process) on a
system could be a member of each hierarchy, each of which has a single attached
subsystem. Such a configuration would give the system administrator absolute
control over all parameters for every single task.
Ref:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
Resource_Management_Guide/sec-Creating_a_Hierarchy_and_Attaching_Subsystems.html
...
Example 2.3. Using the mount command to attach subsystems
In this example, a directory named /cgroup/cpu_and_mem already exists, which will serve as
the mount point for the hierarchy that you create. Attach the cpu, cpuset and memory
subsystems to a hierarchy named cpu_and_mem, and mount the cpu_and_mem hierarchy on
/cgroup/cpu_and_mem:
~]# mount -t cgroup -o cpu,cpuset,memory cpu_and_mem /cgroup/cpu_and_mem
You can list all available subsystems along with their current mount points (i.e. where the hierarchy
they are attached to is mounted) with the lssubsys [3] command:
~]# lssubsys -am
cpu,cpuset,memory /cgroup/cpu_and_mem
net_cls
ns
cpuacct
devices
freezer
blkio
The net_cls, ns, cpuacct, devices, freezer and blkio subsystems are as yet unattached
to any hierarchy, as illustrated by the lack of a corresponding mount point.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/
Resource_Management_Guide/chap-Introduction_to_Control_Groups.html#sec-
What_are_Control_Groups
systemd-cgls(1)