Performance regression for image pull with concurrent container creation #274

@shuochen0311

Description

What happened in your environment?

We run multiple containers on the same node with overlaybd as the container snapshotter, all of which lazily pull their rootfs contents. In production, we found a large gap between P95 and P50 image pull latency (20s vs 10s). After checking the logs, we noticed an interesting coincidence.

For image pulls with unexpectedly high latency:

Oct 02 06:04:28 [Event] Start to pull image for container executor: image harbor-xxxxx
Oct 02 06:04:37  [Event] Finish pulling image for container executor: image harbor-xxxx

Container creation events are happening inside containerd during the same window:

Oct 02 06:04:29 ip-10-1-162-245 containerd[387]: time="2023-10-02T06:04:29.671653617Z" level=info msg="CreateContainer within sandbox \"e0d9308c3259dc01251575ad5c27d2efdbdaf00b7c267f06a7ab15ed6d827e23\""
Oct 02 06:04:29 ip-10-1-162-245 containerd[387]: time="2023-10-02T06:04:29.672341423Z" level=info msg="StartContainer for \"3a6b0dce5e9168993ccd0c3213929af87e4765304774773188e1830631e2ff39\""
Oct 02 06:04:29 ip-10-1-162-245 containerd[387]: time="2023-10-02T06:04:29.672417656Z" level=info msg="container start request for xxxx"
Oct 02 06:04:29 ip-10-1-162-245 containerd[387]: time="2023-10-02T06:04:29.837229175Z" level=info msg="StartContainer for \"3a6b0dce5e9168993ccd0c3213929af87e4765304774773188e1830631e2ff39\" returns successfully"

We suspect that the container creation events (which include part of the container rootfs construction process) interfere with image pulling and increase lazy-pull latency.
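To check this correlation across more samples, a small script could flag containerd `CreateContainer` / `StartContainer` log lines whose timestamps fall inside a pull window. The sketch below is only illustrative: the function names are hypothetical, the pull window must be supplied by the caller (the pull events above carry no year, so 2023 is assumed from the containerd timestamps), and both log sources are assumed to share the same UTC clock.

```python
import re
from datetime import datetime, timezone

# Timestamp inside containerd's time="..." field (nanosecond precision, UTC).
TIME_RE = re.compile(r'time="(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z"')
# Only the CRI messages that appear in the log excerpt above.
EVENT_RE = re.compile(r'msg="(CreateContainer|StartContainer)\b')


def parse_containerd_time(line):
    """Return the line's timestamp as an aware datetime, or None if absent."""
    m = TIME_RE.search(line)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    frac = m.group(2) or ""
    # datetime only holds microseconds, so truncate the nanosecond fraction.
    micros = int(frac[:6].ljust(6, "0")) if frac else 0
    return base.replace(microsecond=micros, tzinfo=timezone.utc)


def overlapping_events(pull_start, pull_end, containerd_lines):
    """Return (timestamp, line) pairs for create/start events inside the pull window."""
    hits = []
    for line in containerd_lines:
        if EVENT_RE.search(line) is None:
            continue
        ts = parse_containerd_time(line)
        if ts is not None and pull_start <= ts <= pull_end:
            hits.append((ts, line))
    return hits
```

Called with a window of 06:04:28 to 06:04:37 UTC on 2023-10-02 and the four containerd lines above, this would report the `CreateContainer` line and both `StartContainer` lines as overlapping the pull. It only shows that the events coincide in time, not that one causes the other.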

We are looking for insight from upstream into the potential cause of this performance regression.

What did you expect to happen?

No response

How can we reproduce it?

Use overlaybd as the snapshotter, and overlap container creation with a container image download on the same node.

What is the version of your Overlaybd?

0.6.17

What is your OS environment?

Ubuntu 20.04

Are you willing to submit PRs to fix it?

  • Yes, I am willing to fix it.
