Rootless docker cannot pull images with user xattrs on directories - lsetxattr operation not supported #47962

Open
BrianHVB opened this issue Jun 12, 2024 · 6 comments
Labels
area/rootless, kind/bug, status/confirmed, version/25.0

Comments


BrianHVB commented Jun 12, 2024

Description

Error: failed to register layer: lsetxattr user.overlay.origin /etc: operation not supported

Starting with v25, docker no longer silently drops extended attributes when unpacking layers; it fails instead (see the v25.0 release notes and PR #45464, "Fail unpacking images with xattrs to filesystems without xattr support").

Images built with podman and overlayfs set the user.overlay.origin extended attribute on (some? all?) layers.

Rootless docker cannot run lsetxattr even if the underlying filesystem supports the operation.

As a result, running docker pull on podman-built images fails with the error failed to register layer: lsetxattr user.overlay.origin /etc: operation not supported.

Images built with Podman + VFS do not set extended attributes and can be pulled with rootless docker.
Rootful docker can run lsetxattr and pulls the affected images successfully.
Rootless docker with fuse-overlayfs can set extended attributes and pulls the affected images successfully, but running fuse-overlayfs on newer kernels is no longer recommended, or even mentioned, in the rootless docker instructions.

Some related issues (and fixes)

Tagging @thaJeztah since it looks like you have been involved in the extended attribute discussions and fixes. Can rootless docker fall back to the pre-v25 behavior of ignoring (but maybe warning about) failures to set extended attributes?

Reproduce

  1. Install Docker Engine and configure Docker rootless as outlined in https://docs.docker.com/engine/security/rootless/
  2. docker pull bhpiq/podman-docker-issue-test:latest
  3. Note the error failed to register layer: lsetxattr user.overlay.origin /etc: operation not supported

Expected behavior

docker pull should still proceed with unpacking and registering the layer, even if it cannot set extended attributes

docker version

Client: Docker Engine - Community
 Version:           26.1.4
 API version:       1.45
 Go version:        go1.21.11
 Git commit:        5650f9b
 Built:             Wed Jun  5 11:29:22 2024
 OS/Arch:           linux/amd64
 Context:           rootless

Server: Docker Engine - Community
 Engine:
  Version:          26.1.4
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       de5c9cf
  Built:            Wed Jun  5 11:29:22 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.33
  GitCommit:        d2d58213f83a351ca8f528a95fbd145f5654e957
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
 rootlesskit:
  Version:          2.0.2
  ApiVersion:       1.1.1
  NetworkDriver:    slirp4netns
  PortDriver:       builtin
  StateDir:         /run/user/1000/dockerd-rootless
 slirp4netns:
  Version:          1.2.0
  GitCommit:        656041d45cfca7a4176f6b7eed9e4fe6c11e8383

docker info

Client: Docker Engine - Community
 Version:    26.1.4
 Context:    rootless
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 26.1.4
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d2d58213f83a351ca8f528a95fbd145f5654e957
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  rootless
  cgroupns
 Kernel Version: 6.1.0-13-cloud-amd64
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 1.934GiB
 Name: i-011c974e5754e30c1
 ID: f2e0510d-4812-46b6-982a-ae600792d6c7
 Docker Root Dir: /home/admin/.local/share/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

Podman System Info from the build environment

host:
  arch: amd64
  buildahVersion: 1.28.2
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon_2.1.6+ds1-1_amd64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: unknown'
  cpuUtilization:
    idlePercent: 98.8
    systemPercent: 0.3
    userPercent: 0.9
  cpus: 1
  distribution:
    codename: bookworm
    distribution: debian
    version: "12"
  eventLogger: journald
  hostname: i-011c974e5754e30c1
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.1.0-13-cloud-amd64
  linkmode: dynamic
  logDriver: journald
  memFree: 285814784
  memTotal: 2076667904
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun_1.8.1-1+deb12u1_amd64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.1
      commit: f8a096be060b22ccd3d5f3ebe44108517fbf6c30
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns_1.2.0-1_amd64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 1h 32m 49.00s (Approximately 0.04 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/admin/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/admin/.local/share/containers/storage
  graphRootAllocated: 52591984640
  graphRootUsed: 2092654592
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 2
  runRoot: /run/user/1000/containers
  volumePath: /home/admin/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.19.8
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.1

MVP configuration for building the test image

Debian 12 base install (cloud image)

 
Install Podman

sudo apt-get install containers-storage podman
systemctl --user start dbus

 
Confirm Podman is using the native (non-FUSE) overlay storage driver

podman system info | grep -A 10 graph
# graphDriverName: overlay
# graphStatus -> Native Overlay Diff: "true"
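
The same check can be scripted. The following is a minimal sketch, assuming that `podman info --format json` exposes the same `store.graphDriverName` and `store.graphStatus` keys as the YAML output shown above; the function names are hypothetical and chosen for illustration only.

```python
import json
import subprocess


def uses_native_overlay(info: dict) -> bool:
    """Return True if a parsed `podman info` document reports native overlay."""
    store = info.get("store", {})
    status = store.get("graphStatus", {})
    # Podman reports this flag as the string "true"/"false", not a boolean.
    native = str(status.get("Native Overlay Diff", "")).lower() == "true"
    return store.get("graphDriverName") == "overlay" and native


def podman_info() -> dict:
    """Run `podman info --format json` and parse it (requires podman on PATH)."""
    result = subprocess.run(
        ["podman", "info", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `uses_native_overlay(podman_info())` on the build host should return True for the configuration described in this report.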

 
Create a minimal Dockerfile

mkdir -p ~/container_test && cd ~/container_test
cat <<EOF > Dockerfile
FROM debian:12

RUN echo test > /tmp/test.txt
EOF

 
Build and push

podman build -t bhpiq/podman-docker-issue-test:latest -f ./Dockerfile .
podman login docker.io
podman push bhpiq/podman-docker-issue-test:latest
BrianHVB added the kind/bug and status/0-triage labels Jun 12, 2024
Author

BrianHVB commented Jun 12, 2024

It looks like there may have been a buildah PR that removes the setting of these extended attributes - containers/storage#1847

But those attributes are still present in any images built by the standard podman installs on Debian, Ubuntu, and other distributions. And even when the fix does eventually roll down, those attributes will still be present in any existing images.

AkihiroSuda added the area/rootless label Jun 13, 2024
Contributor

corhere commented Jun 25, 2024

Can rootless docker fall back to the pre-v25 behavior of ignoring (but maybe warning about) failures to set extended attributes?

No. That way lies madness, and subtly-broken containers. There is no way for dockerd to know whether or not any particular extended file attribute is critical for the proper functioning of a container. That is especially true for user xattrs, since the user namespace is reserved for userspace processes. Only the image author knows for sure whether an extended attribute is necessary for the proper functioning of a container. They can express that the image does not require a particular extended attribute by excluding it from the image. (Hence why we are more permissive about applying extended attributes when building images compared to extracting them.)
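
As an illustration of what "excluding it from the image" could look like for an image author, here is a minimal sketch that copies an uncompressed layer tarball while dropping `user.overlay.*` xattr records. It assumes the xattrs are stored as `SCHILY.xattr.*` pax records, and the function name `strip_xattrs` is hypothetical. It is not a full-fidelity layer rewriter: the rewritten layer has a different digest, so the image manifest and config would need to be regenerated by the authoring tool.

```python
import tarfile
from typing import BinaryIO

XATTR_PREFIX = "SCHILY.xattr."


def strip_xattrs(src: BinaryIO, dst: BinaryIO, name_prefix: str = "user.overlay.") -> int:
    """Copy an uncompressed tar from src to dst, dropping matching xattr records.

    Returns the number of xattr records that were removed.
    """
    removed = 0
    with tarfile.open(fileobj=src, mode="r:") as tin:
        with tarfile.open(fileobj=dst, mode="w", format=tarfile.PAX_FORMAT) as tout:
            for member in tin:
                kept = {}
                for key, value in member.pax_headers.items():
                    if key.startswith(XATTR_PREFIX + name_prefix):
                        removed += 1
                    else:
                        kept[key] = value
                member.pax_headers = kept
                # Only regular files carry a data payload that must be copied.
                data = tin.extractfile(member) if member.isreg() else None
                tout.addfile(member, data)
    return removed
```

Other xattrs are left untouched, on the reasoning above that only the image author can decide which attributes are safe to drop.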


User extended attributes are allowed to be set by unprivileged users (subject to permissions checks) so rootless dockerd should have been able to successfully set the user.overlay.origin xattr when extracting the image. If there was a bug in dockerd, I would have expected the error to be operation not permitted. The operation not supported error suggests that the image failed to extract because the underlying filesystem the image was being extracted into does not support (user) extended attributes. If that is the case, such a configuration is incompatible with the overlay2 storage driver in general, as whiteout files will fail to be created. There exists a draft PR to better detect such issues early:


The presence of a user.overlay.origin xattr on any file in any image layer is a bug in the image authoring tool. That xattr is the -o userxattr analogue of the trusted.overlay.origin xattr, which is used by the "index" feature of overlayfs.

With the “index” feature, on the first time mount, an NFS file handle of the lower layer root directory, along with the UUID of the lower filesystem, are encoded and stored in the “trusted.overlay.origin” extended attribute on the upper layer root directory. On subsequent mount attempts, the lower root directory file handle and lower filesystem UUID are compared to the stored origin in upper root directory. On failure to verify the lower root origin, mount will fail with ESTALE.

The user.overlay.origin xattr value encodes the UUID of the underlying filesystem of the machine used to build the image.

It is quite a common practice to copy overlay layers to a different directory tree on the same or different underlying filesystem, and even to a different machine. With the “index” feature, trying to mount the copied layers will fail the verification of the lower root file handle.

The existence of such an xattr in a container image therefore makes the image inherently non-portable. While the overlay filesystem's index feature is forcefully disabled as of #37993, that is subject to change:

Author

BrianHVB commented Jun 25, 2024

@corhere - Thanks for the detailed explanation, especially as to where user.overlay.origin is being set. I agree that ultimately this is caused by an improperly configured build tool (buildah). I believe that the setting of that attribute has been removed in source (containers/storage#1847). However, that doesn't change the fact that any existing images built with older and current (as of June 2024) versions of buildah/Podman using the overlay driver cannot be pulled or used by rootless Docker >= v25.0.

Had this incompatibility been known at the time the "stop silently ignoring xattr errors" PR was made, would the change still have been introduced to v25? Perhaps the intersection of "built with Podman/Buildah/Overlay2" and "run with rootless Docker" is small, but the potential scope of images that suddenly went from working to broken seems large.

 

No. That way lies madness, and subtly-broken containers.

I agree that the old behavior of silently ignoring extended attribute errors can lead to confusing breakages of images, but I don't think issuing a warning that directly names the failure along with one or more specific extended attributes is subtle.

While I understand the distinction between building images and pulling/consuming them, it doesn't seem like enough to warrant inconsistent handling of failures to apply extended attributes.

 

The operation not supported error suggests that the image failed to extract because the underlying filesystem the image was being extracted into does not support (user) extended attributes.

I don't believe that is the case, as I was able to successfully pull the image via rootful Docker running on the same system with the same storage driver as rootless Docker.

Contributor

corhere commented Jun 25, 2024

I don't believe that is the case, as I was able to successfully pull the image via rootful Docker running on the same system with the same storage driver as rootless Docker.

Was it on the same filesystem, though? Rootful Docker defaults to /var/lib/docker, while your rootless setup (according to your docker info) uses /home/admin/.local/share/docker.
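
One quick way to answer that question is to compare the device IDs of the two data roots. This is only a sketch: the helper name is hypothetical, and comparing `st_dev` is a heuristic (for example, btrfs subvolumes of a single filesystem report different device IDs).

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Example, using the two data roots mentioned in this thread:
# same_filesystem("/var/lib/docker", "/home/admin/.local/share/docker")
```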

@BrianHVB
Author

Was it on the same filesystem, though?

Yes. This was tested on a bare bones Debian install with only a single partition/file system.

Contributor

corhere commented Jun 25, 2024

I can reproduce the issue in a docker:dind-rootless container on Docker Desktop (mac) 4.31.0, which rules out multiple-filesystem shenanigans as the culprit.

There is nothing terribly unusual about the image. The topmost layer is fairly straightforward. The only thing of note is the user.overlay.origin xattrs set on some directories, as we expected.

etc
	SCHILY.xattr.user.overlay.origin=''

etc/hosts

run
	SCHILY.xattr.user.overlay.origin=''

tmp
	SCHILY.xattr.user.overlay.origin=''

tmp/test.txt
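
A listing like the one above can be produced from an exported layer tarball with a few lines of Python. This is a minimal sketch, assuming the xattrs are stored as `SCHILY.xattr.*` pax records as shown; the function name is hypothetical, and the values are omitted because they are generally binary.

```python
import tarfile
from typing import BinaryIO, Dict, List

XATTR_PREFIX = "SCHILY.xattr."


def list_layer_xattrs(layer: BinaryIO) -> Dict[str, List[str]]:
    """Map each tar member that carries xattrs to its sorted xattr names."""
    found: Dict[str, List[str]] = {}
    # "r:*" lets tarfile detect gzip/bzip2/xz compression transparently.
    with tarfile.open(fileobj=layer, mode="r:*") as tar:
        for member in tar:
            names = sorted(
                key[len(XATTR_PREFIX):]
                for key in member.pax_headers
                if key.startswith(XATTR_PREFIX)
            )
            if names:
                found[member.name] = names
    return found
```

Members without xattrs (such as etc/hosts and tmp/test.txt above) are simply absent from the result.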

I have no trouble setting user extended attributes using the setfattr command inside effectively the same rootlesskit environment that dockerd runs in, so there's something weird going on. I suspect that the provenance of the xattr is a red herring: any image with any user xattr might fail to extract in rootless Docker, in which case the correct course of action would be to fix the underlying cause so that user xattrs are successfully applied under rootless Docker.
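
To test that suspicion outside of dockerd, a probe along the following lines could be run in the environment under test. It is a Linux-only sketch, not what dockerd does internally: the function name and the `user.probe.test` attribute name are made up for illustration, and `base_dir` is whatever directory you want to probe (for example, the Docker Root Dir).

```python
import errno
import os
import tempfile
from typing import Dict


def probe_user_xattr(base_dir: str, name: str = "user.probe.test") -> Dict[str, str]:
    """Try to set a user xattr on a fresh file and a fresh directory.

    Returns "ok" or the symbolic errno name for each kind of object.
    """
    results: Dict[str, str] = {}
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        file_path = os.path.join(tmp, "file")
        dir_path = os.path.join(tmp, "dir")
        open(file_path, "w").close()
        os.mkdir(dir_path)
        for kind, path in (("file", file_path), ("directory", dir_path)):
            try:
                # follow_symlinks=False selects the lsetxattr(2) variant.
                os.setxattr(path, name, b"1", follow_symlinks=False)
                results[kind] = "ok"
            except OSError as exc:
                results[kind] = errno.errorcode.get(exc.errno, str(exc.errno))
    return results
```

A result of EPERM would correspond to "operation not permitted", while ENOTSUP or EOPNOTSUPP (which share a value on Linux) would correspond to the "operation not supported" message seen in this issue.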

corhere changed the title from "Rootless docker cannot pull images built with Podman + Overlay - lsetxattr operation not supported" to "Rootless docker cannot pull images with user xattrs on directories - lsetxattr operation not supported" on Jun 25, 2024

4 participants