[Bug] create snapshot: missing parent "..." bucket: not found #680

@fidencio

Problem Description

After running kata-containers with the nydus snapshotter for around a week, we've seen our node become unusable due to the following issue:

  Normal   Scheduled               6s    default-scheduler  Successfully assigned default/kata-quay-io-mongodb-mongodb-community-server-sha256-8b737338 to snapshotters-ci-1
  Warning  FailedCreatePodSandBox  6s    kubelet            Failed to create pod sandbox: rpc error: code = NotFound desc = failed to create containerd container: create snapshot: missing parent "k8s.io/2/sha256:961e93cda9dd918dbe26aca24cccd6c5db05176850d2c53476d881df5d0d4816" bucket: not found
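
For anyone triaging similar events, here is a minimal sketch that pulls the parent snapshot key out of the error text so it can be compared against what the snapshotter actually lists. It assumes the quoted key follows a `<namespace>/<id>/<name>` layout, which is my reading of how containerd names snapshots in the backing snapshotter and is not confirmed in this issue; `ParentKey` and `parse_missing_parent` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ParentKey(NamedTuple):
    namespace: str    # e.g. "k8s.io"
    snapshot_id: int  # numeric id in the middle of the key (assumed layout)
    name: str         # e.g. "sha256:<chain id>"


# Matches: missing parent "<namespace>/<id>/<name>"
_MISSING_PARENT = re.compile(
    r'missing parent "(?P<ns>[^/"]+)/(?P<id>\d+)/(?P<name>[^"]+)"'
)


def parse_missing_parent(message: str) -> Optional[ParentKey]:
    """Return the parent key quoted in a 'missing parent' error, if any."""
    match = _MISSING_PARENT.search(message)
    if match is None:
        return None
    return ParentKey(match["ns"], int(match["id"]), match["name"])
```

The extracted key can then be looked up by hand in the output of `ctr -n k8s.io snapshots --snapshotter nydus ls` to see whether the parent is really gone.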

This can be reproduced easily with containerd 1.7, 2.0, 2.1, and 2.2.

The nydus configuration used is:

version = 1

# Snapshotter's own home directory where it stores and creates necessary resources
root = "/var/lib/containerd-nydus"

# The snapshotter's GRPC server socket, containerd will connect to plugin on this socket
address = "/run/containerd-nydus/containerd-nydus-grpc.sock"

[daemon]
# Enable proxy mode
fs_driver = "proxy"

[snapshot]
# Insert Kata volume information to `Mount.Options`
enable_kata_volume = true

Containerd's configuration has:

[plugins."io.containerd.cri.v1.images"]
disable_snapshot_annotations = false

...

[proxy_plugins.nydus]
type = "snapshot"
address = "/run/nydus-snapshotter/containerd-nydus-grpc.sock"

@imeoer, I'd appreciate any hints on why this happens and how we could avoid it.

Expected Behavior

We would never face the issue described above.

Actual Behavior

The issue described above.

How to reproduce

I have a CI job that runs every hour and does the following:
1. Deploy nydus
2. Pull one specific image using overlayfs
3. Pull the same specific image using nydus-snapshotter
4. Uninstall nydus
5. Repeat all the steps

After about a week of running, it started failing.

Environment Details

  • Nydus-snapshotter version: v0.15.2
  • Nydus version: N/A
  • Container runtime: containerd
  • Operating System: Ubuntu 24.04
  • Kernel version: 6.8.0-71-generic

Additional Information

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels: bug