3 min read | by Jordi Prats
When running containers, the host and the container are isolated by default: the container cannot access the host's resources. When you run a Pod with the privileged flag, you effectively disable this isolation, which makes the process equivalent to one running as root on the host.
Running a privileged Pod means that its containers can access the host's devices and kernel capabilities. This might be needed in some scenarios, such as running GPU-enabled containers in a Kubernetes cluster, so that the GPU can be accessed directly from the container.
We can test it out using the following Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-privileged
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        securityContext:
          privileged: true
Now, if we list the files in /dev, we will see all the devices that the host can see:
$ kubectl --context arm64 exec -it test-privileged-8468dbb5c7-6m7mk -- ls /dev
autofs loop6 rpivid-hevcmem tty20 tty49 vcs3
bsg loop7 rpivid-intcmem tty21 tty5 vcs4
btrfs-control mapper rpivid-vp9mem tty22 tty50 vcs5
bus mem sda tty23 tty51 vcs6
cachefiles mqueue sda1 tty24 tty52 vcs7
cec0 net sda2 tty25 tty53 vcsa
cec1 null sda3 tty26 tty54 vcsa1
cpu_dma_latency port sdb tty27 tty55 vcsa2
cuse ppp sg0 tty28 tty56 vcsa3
dma_heap ptmx sg1 tty29 tty57 vcsa4
dri pts sg2 tty3 tty58 vcsa5
fd ram0 shm tty30 tty59 vcsa6
full ram1 snd tty31 tty6 vcsa7
fuse ram10 stderr tty32 tty60 vcsu
gpiochip0 ram11 stdin tty33 tty61 vcsu1
gpiochip1 ram12 stdout tty34 tty62 vcsu2
gpiomem ram13 termination-log tty35 tty63 vcsu3
hwrng ram14 tty tty36 tty7 vcsu4
i2c-11 ram15 tty0 tty37 tty8 vcsu5
i2c-12 ram2 tty1 tty38 tty9 vcsu6
input ram3 tty10 tty39 ttyAMA0 vcsu7
kmsg ram4 tty11 tty4 ttyprintk vga_arbiter
kvm ram5 tty12 tty40 uhid vhci
longhorn ram6 tty13 tty41 uinput vhost-net
loop-control ram7 tty14 tty42 urandom watchdog
loop0 ram8 tty15 tty43 vc-mem watchdog0
loop1 ram9 tty16 tty44 vchiq zero
loop2 random tty17 tty45 vcio
loop3 raw tty18 tty46 vcs
loop4 rfkill tty19 tty47 vcs1
loop5 rpivid-h264mem tty2 tty48 vcs2
On a non-privileged Pod we wouldn't be able to see all the devices:
$ kubectl exec -it pet2cattle-7f9775bbd8-klkfd -c pet2cattle -- ls /dev
fd ptmx stderr tty
full pts stdin urandom
mqueue random stdout zero
null shm termination-log
So, as soon as we have access to all the devices, including the disks, we can do whatever we want to the Kubernetes node: we can mount the relevant filesystems and write any change we like.
We must bear in mind that setting privileged to true is shorthand for granting all privileges. If we only need some extra capabilities, we can configure permissions in a more granular way. You can check the SecurityContext reference for a comprehensive list.
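As a rough sketch of the granular approach, the container below drops every capability and adds back only one instead of setting the privileged flag. The Pod name and the choice of NET_ADMIN are assumptions made for illustration; which capabilities are actually needed depends on the workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-capabilities   # hypothetical name, not from the original example
spec:
  containers:
  - name: nginx
    image: nginx:latest
    securityContext:
      # No privileged flag: the container keeps the default isolation
      privileged: false
      capabilities:
        # Start from no capabilities at all...
        drop:
        - ALL
        # ...and add back only what the workload needs (NET_ADMIN is an assumed example)
        add:
        - NET_ADMIN
```

With a configuration along these lines, listing /dev inside the container should still show only the reduced set of devices seen on the non-privileged Pod, since adding a capability does not expose the host's devices.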
Posted on 22/12/2021