Fixing 504 Errors on the GKE Load Balancer: How BackendConfig Solved Our 30-Second Timeout Problem

Introduction

We first noticed the problem during normal API traffic. Everything worked at first, but after the application had been running for a while we realised that every API request needing more than 30 seconds to process was failing with 503 and 504 Gateway Timeout errors.

At first we thought it was an application issue, but the logs didn’t show failures and the pods were still running, which confused us. After some digging, we realized it wasn’t the app at all: it was the default timeout of GKE’s HTTP(S) Load Balancer. By default, the load balancer cuts off requests after 30 seconds, no matter what’s happening inside your service.

That discovery led us to something we hadn’t used before: BackendConfig. With it, you can adjust timeouts, define custom health check paths, and control how the load balancer communicates with your pods.

In this post, we’ll walk through what BackendConfig is, why you need it, and how to use it with your services in GKE.

What is BackendConfig?

BackendConfig is a Kubernetes custom resource, defined by a Custom Resource Definition (CRD) on GKE clusters, that lets you adjust load balancer settings for services behind an Ingress.

The GKE Ingress controller (GLBC) looks at the BackendConfig linked to your service and updates the corresponding GCP backend service and health check.

If you don’t create a BackendConfig, GKE automatically applies defaults.

What Happens Without BackendConfig?

When you expose a service with an Ingress, GKE provisions a backend service with default settings:

  • Health check type: HTTP
  • Health check path: /
  • Health check interval: 15s
  • Health check timeout: 15s
  • Healthy threshold: 1
  • Unhealthy threshold: 2
  • Backend request timeout: 30s

These defaults work for many workloads, but they’re not always enough — especially for APIs with long-running requests or apps that need custom health endpoints.
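To make the defaults concrete, here is a sketch of what they would look like if you wrote them out by hand as a BackendConfig (the resource is covered in detail in the sections that follow). The values are simply the ones listed above; the name and namespace are hypothetical, and you would not normally need to create this object at all.

```yaml
# Sketch only: the defaults listed above, written out explicitly.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: explicit-defaults      # hypothetical name
  namespace: my-namespace      # hypothetical namespace
spec:
  timeoutSec: 30               # backend request timeout
  healthCheck:
    type: HTTP
    requestPath: /
    checkIntervalSec: 15
    timeoutSec: 15             # per-probe timeout, not the request timeout
    healthyThreshold: 1
    unhealthyThreshold: 2
```

Seeing the two timeoutSec fields side by side is useful: the top-level one is what cut off our long requests, while the nested one only affects health check probes.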

Why Would You Need It?

You should consider creating a BackendConfig if:

  • Your requests often take longer than 30s → increase timeoutSec.
  • Your app responds to health checks on a custom endpoint like /test instead of /.
  • You want stricter or looser health thresholds.
  • You need features like Cloud Armor, session affinity, or CDN policies.

Without BackendConfig, the load balancer will keep probing /. If your app doesn’t return HTTP 200 on that path, the load balancer may report your service as unhealthy even though it’s running fine.

How BackendConfig Works in Practice

Here’s a minimal example:

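apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backend-config
  namespace: my-namespace      # must match the namespace of the Service that references it
spec:
  timeoutSec: 60               # backend request timeout (default is 30s)
  healthCheck:
    requestPath: /test         # path the load balancer probes
    checkIntervalSec: 10       # seconds between probes
    timeoutSec: 5              # seconds to wait for each probe response
    healthyThreshold: 2        # consecutive successes before marking healthy
    unhealthyThreshold: 2      # consecutive failures before marking unhealthy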

This increases the backend timeout to 60 seconds and updates the health check path to /test.
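The policy features mentioned earlier (Cloud Armor, session affinity, CDN) live in the same resource. The following is a rough sketch rather than something we ran in production: the BackendConfig name and the Cloud Armor policy name are hypothetical, and the security policy is assumed to already exist in the project. Cloud CDN only applies to external load balancers.

```yaml
# Sketch only: policy-related fields on a BackendConfig.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backend-config-policies   # hypothetical name
  namespace: my-namespace
spec:
  timeoutSec: 60
  securityPolicy:
    name: my-armor-policy            # hypothetical; an existing Cloud Armor policy
  sessionAffinity:
    affinityType: "CLIENT_IP"        # route a given client IP to the same backend
  cdn:
    enabled: true                    # external load balancers only
```

In practice you would enable only the fields you actually need; each one changes the GCP backend service that the Ingress controller manages.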

Attaching BackendConfig to Services

To link BackendConfig with a Service, you add an annotation. The annotation maps a service port to the BackendConfig name:

apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: my-namespace
  annotations:
    cloud.google.com/backend-config: '{"ports": {"8003":"my-backend-config"}}'
spec:
  type: NodePort
  ports:
    - port: 8003
      targetPort: 8003
  selector:
    app: my-app

Now, whenever the Ingress points to this Service on port 8003, it uses the BackendConfig settings.
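For completeness, this is roughly what an Ingress pointing at that Service could look like. It is a minimal sketch: the Ingress name is hypothetical, and it sends all traffic to my-service through a default backend instead of host or path rules.

```yaml
# Sketch only: an Ingress whose default backend is my-service on port 8003.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress             # hypothetical name
  namespace: my-namespace
spec:
  defaultBackend:
    service:
      name: my-service
      port:
        number: 8003           # must match the port key in the backend-config annotation
```

The port number here is the Service port, which is the same value used as the key in the annotation's "ports" map.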

Example: Timeout + Health Check

Let’s combine everything:

BackendConfig:

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: common-60s-test-hc
  namespace: aut-prod-main
spec:
  timeoutSec: 60
  healthCheck:
    requestPath: /test
    checkIntervalSec: 10
    timeoutSec: 5
    healthyThreshold: 2
    unhealthyThreshold: 2

Service:

apiVersion: v1
kind: Service
metadata:
  name: aut-prod-main-llm-layoutocr-service
  namespace: aut-prod-main
  annotations:
    cloud.google.com/backend-config: '{"ports":{"8003":"common-60s-test-hc"}}'
spec:
  type: NodePort
  ports:
    - port: 8003
      targetPort: 8003
  selector:
    app: llm-layoutocr

This setup gives the load balancer a 60-second timeout and directs health checks to /test.
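The health check only passes if the pods behind the Service answer /test with HTTP 200 on the serving port. The workload itself isn't shown in this post, so the Deployment below is an assumed sketch: the Deployment name, container name, image, and replica count are placeholders, and the readiness probe is set to the same path so that Kubernetes and the load balancer agree on what "healthy" means.

```yaml
# Sketch only: a workload matching the Service selector above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-layoutocr                # hypothetical name
  namespace: aut-prod-main
spec:
  replicas: 2                        # placeholder value
  selector:
    matchLabels:
      app: llm-layoutocr
  template:
    metadata:
      labels:
        app: llm-layoutocr           # matches the Service selector
    spec:
      containers:
        - name: app                  # hypothetical container name
          image: example.invalid/llm-layoutocr:latest   # placeholder image
          ports:
            - containerPort: 8003
          readinessProbe:
            httpGet:
              path: /test            # same path the load balancer probes
              port: 8003
            periodSeconds: 10
            timeoutSeconds: 5
```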

NodePort vs NEG Behavior

When you don’t use a NEG, GKE maps your service to a NodePort. The load balancer probes the NodePort on each node, which often requires firewall rules to allow health check traffic.

If you annotate your service with:

cloud.google.com/neg: '{"ingress": true}'

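then GKE creates a Network Endpoint Group (NEG). In that case, the load balancer talks directly to pods on their serving port (e.g., 8003). Because probes reach the pods themselves, NEGs provide more accurate health checks, and they don’t depend on firewall rules for the NodePort range (health check traffic must still be allowed to reach the pod port).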

BackendConfig works with both models. What differs is where health checks land: the node IP and NodePort without NEGs, the pod IP and serving port with NEGs.
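Putting the two annotations together, a NEG-backed Service could look like the sketch below. It reuses the names from the earlier minimal example; the switch to ClusterIP is our assumption here, based on the fact that container-native load balancing doesn't need a NodePort.

```yaml
# Sketch only: the earlier Service with NEGs enabled.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: my-namespace
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/backend-config: '{"ports": {"8003":"my-backend-config"}}'
spec:
  type: ClusterIP              # assumption: NodePort is not required with NEGs
  ports:
    - port: 8003
      targetPort: 8003         # health checks reach the pod on this port
  selector:
    app: my-app
```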


Frequently Asked Questions

Q: What happens if I don’t create BackendConfig?
A: GKE uses defaults (30s timeout, health check on /). For many apps, that’s enough.

Q: Can I reuse one BackendConfig across multiple services?
A: Yes, as long as the services are in the same namespace as the BackendConfig and the health check path and timeout make sense for all of them.

Q: Do defaults differ for internal vs external load balancers?
A: Defaults are similar, but internal load balancers are regional, while external ones are global. Both can use BackendConfig.
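On the reuse question above, here is a sketch of two Services sharing one BackendConfig. The Service names, selectors, and the second port are hypothetical. It uses the "default" key of the annotation, which applies the BackendConfig to every port of the Service instead of mapping ports one by one.

```yaml
# Sketch only: two Services in one namespace sharing a BackendConfig.
apiVersion: v1
kind: Service
metadata:
  name: service-a              # hypothetical name
  namespace: my-namespace
  annotations:
    cloud.google.com/backend-config: '{"default": "my-backend-config"}'
spec:
  type: NodePort
  ports:
    - port: 8003
      targetPort: 8003
  selector:
    app: app-a                 # hypothetical label
---
apiVersion: v1
kind: Service
metadata:
  name: service-b              # hypothetical name
  namespace: my-namespace
  annotations:
    cloud.google.com/backend-config: '{"default": "my-backend-config"}'
spec:
  type: NodePort
  ports:
    - port: 8080               # hypothetical port
      targetPort: 8080
  selector:
    app: app-b                 # hypothetical label
```

Both apps would need to answer the shared health check path for this to work, which is the main constraint when reusing a BackendConfig.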

Conclusion

BackendConfig is your tool for fine-tuning how GKE’s load balancer communicates with your workloads.

  • Without it, you get defaults: / health checks and a 30s timeout.
  • With it, you can set custom paths like /test, extend timeouts, and add policies.
  • Attach it through simple annotations on Services, and the Ingress controller does the rest.

For production clusters, it’s good practice to create one or two common BackendConfigs per namespace (e.g., with /test and a 60s timeout) and reuse them across services.

This keeps your load balancer behavior predictable and your apps easier to manage.