Namespaces and nsenter

Namespaces in Linux

What is a namespace?

A namespace wraps a global system resource (process IDs, user IDs, etc.) so that the processes inside it see their own isolated instance of that resource. Processes within a namespace can only see the resources in that namespace, not those in other namespaces. This makes them appear as if they are the only ones using that resource.

How many namespaces are there?

There are 8 Linux namespaces:

Namespace                          Isolates
Cgroup                             Control group hierarchy
Inter-process Communication (IPC)  System V IPC and POSIX message queues
Mount                              Mount points
Network                            Network stacks
PID                                Process IDs
Time                               Boot and monotonic clocks
User                               User and group IDs
UTS                                Hostname and NIS domain name
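
On a running Linux system, each process exposes the namespaces it belongs to as symbolic links under /proc/PID/ns, with targets such as net:[4026531840]. The sketch below turns those link targets into a two-column listing. The function name ns_table is made up for this example.

```shell
# ns_table: read namespace link targets ("type:[id]", one per line) on stdin
# and print "type id" pairs. The function name is illustrative only.
ns_table() {
  sed -n 's/^\([a-z_]*\):\[\([0-9]*\)]$/\1 \2/p'
}

# Usage on a Linux host (lists the namespaces of the current shell):
#   for link in /proc/self/ns/*; do readlink "$link"; done | ns_table
```

The directory also contains entries such as pid_for_children, so the listing can have more rows than the table above.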

Namespaces in Docker

Containers rely on the following Linux namespaces to isolate their resources:

PID namespace

Processes inside a container share a unique set of process IDs (PIDs). This means:

  • Each container has its own PID 1 process.
  • Containers can't see processes on the host system.
  • Containers can't see processes in other containers.
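
From the host, one way to see how a host PID maps to a PID inside the container is the NSpid line in /proc/PID/status (available since Linux 4.1). It lists the PID of the process in each nested PID namespace, outermost first. A minimal sketch, assuming the helper name inner_pid (not a standard tool):

```shell
# inner_pid: read /proc/PID/status-formatted text on stdin and print the PID
# the process has in its innermost PID namespace (the last NSpid value).
# The function name is illustrative only.
inner_pid() {
  awk '$1 == "NSpid:" { print $NF }'
}

# Usage on the host, where CONTAINER_PID is the container's PID as seen
# from the host:
#   inner_pid < "/proc/$CONTAINER_PID/status"
```

For the main process of a container, this prints 1.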

Network namespace

Each container will have its own virtualized network stack, so that:

  • Each container will have its own loopback interface (localhost).
  • IP addresses and port numbers won't conflict between containers or with the host system.

Mount namespace

It provides isolation of the list of mounts seen by each container.

IPC namespace

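It isolates System V IPC objects and POSIX message queues, so processes in one container can't access shared memory segments, semaphores, or message queues created in another container.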

UTS namespace

It allows each container to have its own hostname and NIS domain name.

Cgroup namespace

It hides the identity of the control group of which the process is a member.

User namespace

It isolates security-related resources, such as user IDs and group IDs. Docker rootless mode uses this to map users and groups in the container to different users and groups on the host system. The main benefits of this approach are:

  • Users don't need to be root on the host system to run containers.
  • Processes running as root in the container are not root on the host system.
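
The kernel records this mapping in /proc/PID/uid_map, where each line has three fields: the first ID inside the namespace, the first ID outside it, and the length of the range. The sketch below translates a container UID to the UID outside the namespace. The function name to_host_uid and the sample mapping in the usage note are hypothetical.

```shell
# to_host_uid UID: translate a UID inside a user namespace to the UID outside it,
# using uid_map-formatted lines ("inside_start outside_start length") on stdin.
# Exits with status 1 if the UID is not covered by any range.
to_host_uid() {
  awk -v uid="$1" '
    uid >= $1 && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
    END { if (!found) exit 1 }
  '
}

# Usage, where CONTAINER_PID is the container's PID as seen from the host:
#   to_host_uid 0 < "/proc/$CONTAINER_PID/uid_map"
```

With a hypothetical mapping of "0 1000 1" and "1 100000 65536", UID 0 in the container maps to UID 1000 outside it, and UID 1000 maps to 100999.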

How can I list the namespaces of a container?

You can use the lsns command to list the namespaces of a container:

CONTAINER_PID=$(docker inspect CONTAINER_NAME -f '{{ .State.Pid }}')
sudo lsns -p "$CONTAINER_PID"

lsns usage

Description: lsns lists namespaces.

Usage: lsns [OPTIONS]

Example: lsns -p 1234

This example shows all namespaces of the process with PID 1234.

Common options:

  • -p PID: show all namespaces of the given PID.
  • -t TYPE: only show namespaces of the given type(s).

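In a default Docker setup, the output shows every namespace the container uses. The time and user namespaces are reported against PID 1, which means they are shared with the host; the others are private to the container: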

NS         TYPE   NPROCS   PID USER COMMAND
4026531834 time      385     1 root /usr/lib/systemd/systemd --system [TRUNCATED]
4026531837 user      356     1 root /usr/lib/systemd/systemd --system [TRUNCATED]
4026532836 mnt         1 38910 root /app/myapp
4026532839 uts         1 38910 root /app/myapp
4026532840 ipc         1 38910 root /app/myapp
4026532842 pid         2 38910 root /app/myapp
4026533240 cgroup      1 38910 root /app/myapp
4026533246 net         1 38910 root /app/myapp
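
To pick out the shared namespaces from a listing like this one, a rough heuristic is to keep the rows whose PID column is 1. This sketch assumes the default lsns column layout shown above and that lsns was run with sudo; the function name host_shared_types is made up.

```shell
# host_shared_types: read default lsns output on stdin and print the TYPE of
# every namespace whose lowest PID is 1, i.e. one that the host's init process
# also belongs to. Heuristic only; assumes the column order
# NS TYPE NPROCS PID USER COMMAND.
host_shared_types() {
  awk 'NR > 1 && $4 == 1 { print $2 }'
}

# Usage:
#   sudo lsns -p "$CONTAINER_PID" | host_shared_types
```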

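In a rootless Docker setup, the container runs in a user namespace owned by your unprivileged user instead of the host's user namespace. The time namespace keeps the same ID as in the default setup (4026531834), which suggests it is still shared with the host: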

NS         TYPE   NPROCS   PID USER   COMMAND
4026531834 time       16   661 lab    /lib/systemd/systemd --user
4026532238 mnt         1  1386 296603 /app/myapp
4026532240 uts         1  1386 296603 /app/myapp
4026532241 ipc         1  1386 296603 /app/myapp
4026532242 pid         1  1386 296603 /app/myapp
4026532243 cgroup      1  1386 296603 /app/myapp
4026532244 net         1  1386 296603 /app/myapp
4026532355 user       10   778 lab    /proc/self/exe [TRUNCATED] /usr/bin/dockerd-rootless.sh

Troubleshooting

We will explore different methods to troubleshoot Docker containers using the same challenge as a reference. As we progress, the difficulty will increase by introducing new constraints to the challenge.

Challenge

Description: A Docker container named c1 is running.

Task: List all listening TCP ports (IPv4) inside c1.

Initial constraint: you can't modify the container image by installing new packages, copying files from the host, or downloading files from the internet.

Using docker port

docker port lists port mappings for the container:

docker port c1

This is not a valid solution: it only shows the ports published to the host, not every port that a process is listening on inside the container.

Using docker inspect

docker inspect shows the container's configuration:

docker inspect c1 | grep -i port

This is not a valid solution as it will only show exposed ports, not all listening ports inside the container.

Using docker exec

docker exec executes a command in a running container:

docker exec c1 ss -tln

State  Recv-Q Send-Q  Local Address:Port  Peer Address:Port
LISTEN 0      4096    0.0.0.0:80          0.0.0.0:*

The output shows the listening port in the container: 80.

Using docker exec with limited commands

Additional constraint: commands such as ss, netstat, lsof, lsfd, and fuser are not available inside the container.

If the command you want to use is not available inside the container, you'll get an error like this:

OCI runtime exec failed: exec failed: unable to start container process: exec: "ss": executable file not found in $PATH

If you can use commands that read the content of a file, like cat, you can show the content of /proc/net/tcp inside the container. This file shows the TCP socket table for IPv4. The listening ports are the ones with the value 0A (LISTEN state) in the st column:

docker exec c1 cat /proc/net/tcp
sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
 0: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 285153 1 0000000000000000 100 0 0 10 0

The local_address column uses the format IP:PORT, both encoded in hexadecimal. The IP is 0.0.0.0, but the port (0050) needs to be converted to decimal format:

printf '%d\n' 0x0050

The output shows the port in decimal format: 80.
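
The manual steps above can be combined into a small filter. This is a sketch, not a replacement for ss: the function names hex_to_ipv4 and listening_sockets are made up, and the address decoding assumes a little-endian host (x86, most ARM systems), where the four bytes of the address appear in reverse order.

```shell
# hex_to_ipv4 HEX: decode an 8-digit IPv4 address as printed in /proc/net/tcp.
# Assumes a little-endian host, where the bytes appear reversed.
hex_to_ipv4() {
  # Reorder the four bytes and prefix each with 0x so printf reads them as hex.
  # shellcheck disable=SC2046  # word splitting is intended here
  printf '%d.%d.%d.%d\n' $(printf '%s\n' "$1" |
    sed 's/^\(..\)\(..\)\(..\)\(..\)$/0x\4 0x\3 0x\2 0x\1/')
}

# listening_sockets: read /proc/net/tcp-formatted text on stdin and print
# "IP:PORT" in decimal for every socket in the LISTEN state (st == 0A).
listening_sockets() {
  awk 'NR > 1 && $4 == "0A" { split($2, a, ":"); print a[1], a[2] }' |
  while read -r ip port; do
    printf '%s:%d\n' "$(hex_to_ipv4 "$ip")" "0x$port"
  done
}

# Usage:
#   docker exec c1 cat /proc/net/tcp | listening_sockets
```

Only cat runs inside the container; the decoding happens on the host, so it still works when the image ships almost no tools.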

Using a helper container

Additional constraint: c1 is a minimal container (distroless or scratch), and it doesn't have any of those commands available inside it.

You can run a helper container that has those commands available and uses the same network namespace as c1:

docker run --rm --network container:c1 alpine:latest netstat -tln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN

That makes the helper container see the same listening ports as c1 because they share the same network namespace.

If you want to verify this, change the command run by the helper container so that it doesn't exit immediately:

docker run --name helper -d --network container:c1 alpine:latest sleep infinity

If you use lsns on both containers and filter by the network namespace, you'll see that they share the same network namespace:

C1_PID=$(docker inspect c1 -f '{{ .State.Pid }}')
sudo lsns -p "$C1_PID" -t net
NS         TYPE NPROCS   PID USER NETNSID NSFS                           COMMAND
4026533246 net       2 38910 root       4 /run/docker/netns/37f33686e313 /app/myapp

HELPER_PID=$(docker inspect helper -f '{{ .State.Pid }}')
sudo lsns -p "$HELPER_PID" -t net
NS         TYPE NPROCS   PID USER NETNSID NSFS                           COMMAND
4026533246 net       2 38910 root       4 /run/docker/netns/37f33686e313 /app/myapp
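
The same check can be scripted without lsns by comparing the namespace links under /proc directly: two processes share a namespace when /proc/PID/ns/TYPE points to the same target for both. A minimal sketch, in which the function name same_ns and the PROC_ROOT override are assumptions made for this example:

```shell
# same_ns TYPE PID_A PID_B: succeed if both processes are in the same namespace
# of the given type (net, pid, mnt, ...). Returns 2 if a link can't be read.
# PROC_ROOT is an illustrative override for the /proc mount point; leave it
# unset on a normal host.
same_ns() {
  local root="${PROC_ROOT:-/proc}" a b
  a=$(readlink "$root/$2/ns/$1") || return 2
  b=$(readlink "$root/$3/ns/$1") || return 2
  [ "$a" = "$b" ]
}

# Usage (from a root shell, since the links belong to other users' processes):
#   same_ns net "$C1_PID" "$HELPER_PID" && echo "shared network namespace"
```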

Using nsenter

Additional constraint: you can't start new containers.

You can use nsenter to run commands inside the container using binaries available on the host:

C1_PID=$(docker inspect c1 -f '{{ .State.Pid }}')
sudo nsenter -t "$C1_PID" -n ss -tln

nsenter usage

Description: nsenter runs commands in different namespaces.

Usage: nsenter [OPTIONS] COMMAND

Example: nsenter -t 1234 -U -n ss -tupln

This example runs the ss -tupln command in the user and network namespaces of the process with PID 1234.

Common options:

  • -t PID: to target the process with the given PID.
  • -a: to enter all namespaces.
  • -n: to enter the network namespace.
  • -U: to enter the user namespace.
  • -m: to enter the mount namespace.
  • -p: to enter the PID namespace.
  • -i: to enter the IPC namespace.
  • -u: to enter the UTS namespace.
  • -T: to enter the time namespace.
  • -C: to enter the Cgroup namespace.

The nsenter command above enters the container's network namespace and runs the host's ss binary there, so ss -tln reports the same sockets you would see from inside the container.

State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       511     0.0.0.0:80          0.0.0.0:*
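
The nsenter options listed earlier correspond one-to-one to the TYPE names that lsns prints. If you script around both tools, a small lookup like the following can translate between them. The function name nsenter_flag is made up for this sketch.

```shell
# nsenter_flag TYPE: print the nsenter option that enters the namespace type
# named as in the lsns TYPE column. Returns 1 for an unknown type.
nsenter_flag() {
  case $1 in
    mnt)    printf '%s\n' -m ;;
    uts)    printf '%s\n' -u ;;
    ipc)    printf '%s\n' -i ;;
    net)    printf '%s\n' -n ;;
    pid)    printf '%s\n' -p ;;
    user)   printf '%s\n' -U ;;
    cgroup) printf '%s\n' -C ;;
    time)   printf '%s\n' -T ;;
    *)      return 1 ;;
  esac
}

# Usage:
#   sudo nsenter -t "$C1_PID" "$(nsenter_flag net)" ss -tln
```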

Using nsenter in a rootless Docker setup

Additional constraints:

  • Docker is running in rootless mode.
  • Your user can't use sudo.

You can't use sudo to run the nsenter command as in the previous example, but you don't need it. In rootless mode, the container's user namespace is owned by your own user, so you can enter it with -U. Once inside it, you hold the privileges needed to enter the network namespace with -n and run the ss -tln command from the host:

C1_PID=$(docker inspect c1 -f '{{ .State.Pid }}')
nsenter -t $C1_PID -U -n ss -tln
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     *:80                *:*
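
If you are not sure which of the two nsenter variants applies, you can check whether the daemon runs in rootless mode first. The sketch below assumes that docker info reports rootless mode as name=rootless in its SecurityOptions field, as recent Docker releases do; verify this against your version. The function name is_rootless is made up.

```shell
# is_rootless: succeed if the security options text on stdin mentions
# rootless mode. Assumes the "name=rootless" marker used by docker info.
is_rootless() {
  grep -q 'name=rootless'
}

# Usage:
#   if docker info -f '{{ .SecurityOptions }}' | is_rootless; then
#     nsenter -t "$C1_PID" -U -n ss -tln
#   else
#     sudo nsenter -t "$C1_PID" -n ss -tln
#   fi
```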
