Retry GCS on 503 errors
Open, Normal, Public


Google Cloud Storage (GCS) sometimes returns 503 Service Temporarily Unavailable errors. We might want to retry once or twice after such a reply, to see whether the service has come back up. However, we should consider this carefully, given that:

  • we probably want to pause for a second or two between retries, to give GCS time to come up and service us again, and
  • during this delay the Pillar process is sleeping, blocking others from performing requests on it.

If we do this the wrong way, a hiccup at GCS could cause all Pillar processes to hang in a retry loop, effectively DoSing ourselves.
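
To make the trade-off concrete, here is a minimal sketch of a bounded retry, not a proposal for the final implementation. It assumes a hypothetical ServiceUnavailable exception standing in for whatever the GCS client actually raises on a 503, and a hypothetical call_with_retry() helper; neither exists in Pillar. The number of retries and the total time spent sleeping are both capped, so an outage can delay a request by a few seconds at most instead of looping indefinitely.

```python
import random
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for the error the GCS client raises on a 503."""


def call_with_retry(func, max_retries=2, base_delay=1.0, max_total_delay=3.0,
                    sleep=time.sleep):
    """Call func(), retrying at most max_retries times on ServiceUnavailable.

    The total time spent sleeping never exceeds max_total_delay seconds, so a
    GCS outage can only hold up the calling process for a bounded period.
    """
    slept = 0.0
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ServiceUnavailable:
            if attempt == max_retries:
                raise  # Out of retries; let the caller handle the 503.
            # Base delay plus up to 50% random jitter, so processes do not
            # all retry at the same moment.
            delay = base_delay * (1 + random.random() / 2)
            delay = min(delay, max_total_delay - slept)
            if delay <= 0:
                raise  # Delay budget used up; give up instead of hanging.
            sleep(delay)
            slept += delay
```

Note that the process is still blocked while it sleeps, so this only limits the damage; it does not address the second point in the list above. The default values are placeholders, not measured numbers.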

Probably related to T48956.


