HTTP 429 (Too Many Requests) errors are not retried despite being marked as temporary #2111

@knqyf263

Description

HTTP 429 responses are not retried by default in the retry transport, even though TooManyRequestsErrorCode is included in temporaryErrorCodes. The OCI Distribution Spec defines 429 as a standard error response: https://github.com/distribution/distribution/blob/da404778edd3faa665e48ca3bb791b6144f3355e/docs/content/spec/api.md#base

Current Behavior

  1. The defaultRetryStatusCodes (408, 500, 502, 503, 504, 499, 522) does not include 429:

    var defaultRetryStatusCodes = []int{
        http.StatusRequestTimeout,
        http.StatusInternalServerError,
        http.StatusBadGateway,
        http.StatusServiceUnavailable,
        http.StatusGatewayTimeout,
        499, // nginx-specific, client closed request
        522, // Cloudflare-specific, connection timeout
    }

  2. TooManyRequestsErrorCode is included in temporaryErrorCodes:

    var temporaryErrorCodes = map[ErrorCode]struct{}{
        BlobUploadInvalidErrorCode: {},
        TooManyRequestsErrorCode:   {},
        UnknownErrorCode:           {},
        UnavailableErrorCode:       {},
    }

  3. The retry logic in retryTransport.RoundTrip only converts responses to transport.Error if the status code is in t.codes:

    for _, code := range t.codes {
        if out.StatusCode == code {
            return retryError(out)
        }
    }

Problem

Since 429 is not in defaultRetryStatusCodes, the response is never converted to a transport.Error via retryError(). This means:

  • The Temporary() method is never called
  • The check for TooManyRequestsErrorCode in temporaryErrorCodes is never reached from retryTransport
  • 429 responses are not retried
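
To make the gating concrete, here is a minimal standalone sketch of the behavior described above. It is not the library's code: reachesRetryError and withTooManyRequests are hypothetical helpers written for illustration, and only the status code list is copied from the snippet quoted in this issue.

```go
package main

import "net/http"

// defaultRetryStatusCodes mirrors the list quoted in this issue.
var defaultRetryStatusCodes = []int{
	http.StatusRequestTimeout,
	http.StatusInternalServerError,
	http.StatusBadGateway,
	http.StatusServiceUnavailable,
	http.StatusGatewayTimeout,
	499, // nginx-specific, client closed request
	522, // Cloudflare-specific, connection timeout
}

// reachesRetryError is a hypothetical helper that models the loop in
// retryTransport.RoundTrip: only a status code present in codes would be
// handed to retryError, and so become eligible for the Temporary() check.
func reachesRetryError(status int, codes []int) bool {
	for _, code := range codes {
		if status == code {
			return true
		}
	}
	return false
}

// withTooManyRequests is a hypothetical helper that returns a copy of codes
// extended with 429, without modifying the slice it was given.
func withTooManyRequests(codes []int) []int {
	out := make([]int, 0, len(codes)+1)
	out = append(out, codes...)
	return append(out, http.StatusTooManyRequests)
}
```

With the default list, reachesRetryError(429, defaultRetryStatusCodes) is false, so a 429 response would be returned to the caller as-is. Passing the extended list from withTooManyRequests makes it true, which is the effect a caller would presumably be after when supplying a custom list via WithRetryStatusCodes().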

Question

Is this behavior intentional? It seems reasonable to retry rate-limited requests with appropriate backoff.

If this is intentional (perhaps because rate limiting requires different handling than other temporary errors), should users be expected to explicitly configure retry behavior for rate limiting using WithRetryStatusCodes()?

Please let me know if I'm missing something in my understanding of the retry logic.
