Pākiki Blog

The latest insights, vulnerabilities, research, and release information from the Pākiki team.

Follow us on social media:

Docker Networking Vulnerability

Published on by Karl
Categories: Security

Background

Docker v28 was released on 20/02/2025 to very little fanfare despite fixing a networking weakness that had been documented since May 2023. The vulnerability would allow attackers on the local network to access services even when they were bound to localhost. The warning below was present on multiple pages of the official documentation (for those who read documentation), with additional details at moby#24510.

Docker Documentation Warning

What Did It Mean?

L2 segments: Networks which route directly at Layer 2 (Data Link Layer) of the OSI Model. In practice, this means the host is accessible by its MAC Address (either directly or via a L2 Router).

Ports published to localhost: Container ports which were bound to the “private” loopback interface of the host. This is intended that only users on the machine itself could access the exposed services (compared to services intentionally shared with other network-based users).

Together, this typically meant that attackers on the same subnet or switch could access services that you published to the loopback (or any other) interface.

Who Was Affected?

This primarily affected systems which used the localhost bind as a security control (assuming that only trusted parties could access the local interface). Examples of this would include developers writing and testing their applications at a coffee shop, workstations running applications published via Docker Hub, or on-premise servers running dockerised solutions (either internally managed or vendor bundled platforms).

Consider a hypothetical web application that has both user facing (all interfaces) and administrative (localhost) web portals. An attacker could leverage this weakness to access the administrative portal from another machine on its subnet.

What Caused It?

The Docker Daemon is an orchestration layer for managing isolation tools built into the Linux Kernel (namespaces, control groups) and host networking (iptables). For this issue, we will focus on the networking side.

Firstly, Docker enables a networking flag called ip_forward. This flag is typically used for routers (e.g. forwarding packets from your home Wi-Fi out to the internet). It instructs the networking stack to accept packets destined for other machines and to forward them on according to its own routing rules. Docker uses this to assist routing in and out of container networks.

Secondly, Docker creates DNAT (Destination Network Address Translation) rules for each exposed port (regardless of which interface it was bound to). This tells the network stack to rewrite the target IP of packets to match the container’s private IP (otherwise the container would ignore the request due to a mismatched IP).

The combination of these settings meant that packets targeted at the host’s network interface (MAC address) using the container’s address (IP) would be forwarded into the container regardless of the bind address.

Docker runs a network proxy service (docker-proxy) per binding to listen on the requested bind and forward packets on; however, we fully bypass this service by using the native host routing stack.

How They Fixed It

The fix ended up be reasonably simple, they now add additional iptables rules for each bind that will drop any packets destined for the bound container/port if it did not originate from the expected network interface (using the “raw” table “OUTPUT” chain - for those interested in the finer details).

Let’s Exploit It

(Note: This section is Linux specific, although detection will work cross-platform).

Detection

Before we attempt to exploit the vulnerability, we should confirm that the target is actually vulnerable to it. The network scanning tool nmap has a script that does this. As it requires a target address, we will use the default Docker network for scanning (172.17.0.0./16) where 172.17.x.1 maps to the Docker host itself.

sudo nmap -sn -n --script ip-forwarding --script-args target=172.17.0.1 192.168.0.0/24 (replacing 192.168.0.0/24 with your intended range).

This requires root (sudo) access because it leverages raw sockets.

The Basics

Ethernet frames contain a link layer address (MAC) and payload. For TCP/IP packets, the payload then includes a network address (IP). This allows us to create packets with mismatched MAC and IP combinations to influence how the host (MAC address) handles the packet.

Under normal operation, the following protocols are involved in managing these pairs:

So a new client will conduct a DHCP request, the DHCP server will provide an IP and suggested routing information which will then be stored in the workstation’s “routing table”. Any network requests on the same subnet will conduct an ARP request so that the packet can be sent directly to the target MAC address (either directly on a switch, or via Layer 2 routing on a router).

To exploit the Docker weakness, we can either inject raw packets into the network or overwrite the supplied routing table rules. Of these options, leveraging the system’s existing network stack is likely to be more predictable and reliable.

The Naive Approach

The most direct approach to rewriting packets is to replace the default routing rule (gateway) to be the IP of the targeted host. This causes all traffic to be routed to the targeted system’s MAC address, but also means you’ll likely lose internet connectivity because it affects all traffic.

An example command for configuring this would be:

sudo ip route add default via {TARGET_IP}

This can be partially mitigated by specifying an IP range instead of “default”; however, it still affects the entire host’s routing and there must be a better way.

The Improved Approach

Similar to Docker, we can create an isolated network environment using Network Namespaces. This means that we can retain normal operation on the host, while rewriting packets within the namespace. An example command flow for configuring this would be:

sudo ip netns add my_namespace
sudo ip link add link {UPSTREAM_INTERFACE_NAME} name my_dev netns my_namespace type macvlan mode private
sudo ip netns exec my_namespace bash
	dhclient -1 my_dev
	ip route change default via {TARGET_IP}

This creates a new namespace (my_namespace), creates a new virtual reference to the existing network interface ({UPSTREAM_INTERFACE_NAME} - e.g. “ens33”) and places it inside our new namespace. We need to “copy” the existing interface because each interface can only exist in a single namespace at a time. We use the “macvlan” type because it provides the required access to Layer 3 routing (MAC Addresses).

“dhclient” (not necessarily installed by default on all linux distributions) provides an IP via DHCP for our new virtual interface and then we can resort to the previous approach of modifying the (now isolated) routing table to redirect traffic.

This means the host retains normal operation AND we can use modified routes from within the namespace, but it has the following downsides:

The approach solved problems of host routing contamination, but we can still do better!

The Optimised Approach

While Network Namespaces provide networking isolation from the host, they also introduced a few overheads. We can avoid these overheads by using conditional routing instead.

One approach would be to allocate multiple IP addresses to the same network interface and conditionally route traffic based on which one we bind to. This has a similar problem to the previous option where we need to consume a range of IP addresses from the environment. Additionally, many tools do not provide the option to select the source binding port.

A better approach is to use “cgroups” (control groups) - a Linux Kernel isolation feature (also used by Docker) to add limits, prioritisation, and accounting on resources such as cpu, disk, and memory. We can use the following process to create new conditionally routed environments:

While this seems more complicated than the other options, it provides us the ability to create shell environments with custom routing rules that don’t affect the main host, do not consume IPs from the environment, and can be scaled to many concurrent instances. They are also universally supported by tools due to hooking Kernel level routing.

The following script automates this entire process, either running a given command or dropping down to a user-level shell - both options cleaning up the created rules on close:

Conclusion

Although the patch has been released, it doesn’t necessarily mean that it has been applied to all environments. Some solutions may be months away from having the patch fully propagated into their build cycle.

We now also have a generic script for creating custom routing environments without significant host overheads. This tool can be used to test other systems (routers or otherwise) which have ip_forwarding enabled.

Insecure By Default

While Docker has patched the “warning” from the original screenshot, the “important” remains unresolved as it is an intentional design decision. Let’s also quickly address this while we are thinking about it!

Background

As indicated in the sidenote… “Publishing container ports is insecure by default”. Docker intentionally publishes ports on 0.0.0.0 (all interfaces) by default. This means any services published without specifying an IP to bind to will be automatically publicly accessible.

Mitigation

You can override this behaviour by explicitly setting a bind IP (e.g. 127.0.0.1) on every launch as per the documentation.

Alternatively, you can make it “secure by default” by changing the ip property in the dockerd configuration file to “127.0.0.1” (or your preferred default bind address).

The behaviour can be reverted back on a case-by-case basis by manually binding to 0.0.0.0.

Also… patch to v28, because this mitigation would still be undermined without it!