Docker rate limiting causes e2e-tests to fail #975

Closed
Tracked by #976
sharnoff opened this issue Jun 19, 2024 · 0 comments · Fixed by #978
Problem

We sometimes get rate limited by Docker Hub in the e2e tests. When this happens, image pulls fail, and so the entire e2e test job fails.

As a recent example, I saw a couple of cases where deploying the components failed with:

Waiting for daemon set "neonvm-device-plugin" rollout to finish: 0 of 3 updated pods are available...
Error: The action 'deploy components' has timed out after 3 minutes.

and when looking at the events, we see:

LAST SEEN   TYPE      REASON           OBJECT                           MESSAGE
2m50s       Normal    Scheduled        pod/neonvm-device-plugin-blskl   Successfully assigned neonvm-system/neonvm-device-plugin-blskl to k3d-neonvm-agent-0
2m49s       Normal    AddedInterface   pod/neonvm-device-plugin-blskl   Add eth0 [10.0.0.154/32] from cilium
75s         Normal    Pulling          pod/neonvm-device-plugin-blskl   Pulling image "squat/generic-device-plugin"
72s         Warning   Failed           pod/neonvm-device-plugin-blskl   Failed to pull image "squat/generic-device-plugin": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/squat/generic-device-plugin:latest": failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/squat/generic-device-plugin/manifests/sha256:ba6f0b4cf6c858d6ad29ba4d32e4da11638abbc7d96436bf04f582a97b2b8821: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
72s         Warning   Failed           pod/neonvm-device-plugin-blskl   Error: ErrImagePull
57s         Warning   Failed           pod/neonvm-device-plugin-blskl   Error: ImagePullBackOff
45s         Normal    BackOff          pod/neonvm-device-plugin-blskl   Back-off pulling image "squat/generic-device-plugin"

ref
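
To tell this failure apart from other image pull errors, the events can be scanned for the rate-limit message. The following is a minimal sketch, not part of the existing CI: the function name `rate_limited_images` is hypothetical, the marker strings are copied from the event message quoted above, and the sample lines are abbreviated versions of those events.

```python
import re

# Substrings copied from the event message quoted in this issue. Other
# registries may word their rate-limit errors differently (assumption).
RATE_LIMIT_MARKERS = ("429 Too Many Requests", "toomanyrequests")

# Captures the image name in messages like: Failed to pull image "repo/name": ...
FAILED_PULL_RE = re.compile(r'Failed to pull image "([^"]+)"')


def rate_limited_images(event_lines):
    """Return a sorted list of images whose pull failed with a rate-limit error."""
    images = set()
    for line in event_lines:
        match = FAILED_PULL_RE.search(line)
        if match and any(marker in line for marker in RATE_LIMIT_MARKERS):
            images.add(match.group(1))
    return sorted(images)


if __name__ == "__main__":
    # Abbreviated versions of the events shown above.
    sample = [
        'Normal Pulling pod/x Pulling image "squat/generic-device-plugin"',
        'Warning Failed pod/x Failed to pull image "squat/generic-device-plugin": '
        "429 Too Many Requests - Server message: toomanyrequests",
        "Warning Failed pod/x Error: ErrImagePull",
    ]
    print(rate_limited_images(sample))
```

A check like this could be run over the collected events when the deploy step times out, so that the job log states the cause directly.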

Potential solutions

Maybe we can specify credentials for Docker Hub with k3d's registries configuration file? https://k3d.io/v5.6.0/usage/registries/

We might also want to look into implementing this for kind, but that's lower priority because we aren't regularly using it in CI.
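
The k3d idea could look roughly like the sketch below, which renders a registries configuration with Docker Hub credentials taken from the environment. Several things here are assumptions rather than anything decided in this issue: the environment variable names `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` are hypothetical, the helper `render_registries_yaml` is made up for illustration, and the host key (`docker.io` here) may need to be `registry-1.docker.io` depending on the k3s version, so it should be checked against the linked page.

```python
import json
import os


def render_registries_yaml(username, token, host="docker.io"):
    """Render a registries config with basic-auth credentials for one host.

    json.dumps is used only to quote the values; for typical credentials a
    JSON string is also a valid YAML double-quoted string.
    """
    return (
        "configs:\n"
        f"  {json.dumps(host)}:\n"
        "    auth:\n"
        f"      username: {json.dumps(username)}\n"
        f"      password: {json.dumps(token)}\n"
    )


if __name__ == "__main__":
    # Hypothetical CI secret names; the fallbacks are placeholders only.
    print(
        render_registries_yaml(
            os.environ.get("DOCKERHUB_USERNAME", "example-user"),
            os.environ.get("DOCKERHUB_TOKEN", "example-token"),
        ),
        end="",
    )
```

The output would be saved to a file and passed to `k3d cluster create` with `--registry-config`, as described on the linked page. Authenticated pulls still have a limit, only a higher one, so this reduces the failures rather than ruling them out.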

sharnoff added the a/ci (Area: related to continuous integration) label on Jun 19, 2024
sharnoff self-assigned this on Jun 19, 2024