Introduction
When applications run on Google Kubernetes Engine (GKE), they often need to call Google Cloud services like Cloud Storage, Pub/Sub, or BigQuery. The traditional pattern was to place JSON service account keys in Kubernetes Secrets and mount them in pods. That approach creates security risks and operational overhead.
GKE Workload Identity Federation (WIF) solves this by giving pods short‑lived tokens tied to their Kubernetes identity. No long‑lived keys, no files to rotate, and enhanced security for your Google Cloud workloads.
This comprehensive guide explains Workload Identity Federation in depth, shows both approaches available in GKE, and provides step‑by‑step commands to implement it yourself. We'll also include information about how KubeNine Consulting can help streamline your GKE security implementation.
What Is Workload Identity Federation?
In GKE, every pod can run as a Kubernetes Service Account (KSA). With WIF, GKE’s metadata server issues a short‑lived token for that KSA. Google Cloud IAM treats the KSA as a trusted identity and, if it has the right role, returns an access token for the target API. Tokens expire quickly and are refreshed automatically by client libraries.
Why teams pick WIF
- No JSON keys in the cluster
- Short‑lived credentials by default
- Clear permission model that maps to your KSAs and namespaces
- Works with Google Cloud client libraries with no code changes
Two Ways to Use WIF in GKE
A) Direct Workload Identity Federation (Recommended)
With Direct WIF, the KSA itself is an IAM principal. You grant IAM roles straight to the KSA using a special principal string.
Principal format
principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/NAMESPACE/sa/KSA_NAME
What this means
- No Google Service Account (GSA) is required
- Fewer moving parts to manage
- Access is easy to read and audit: “KSA X in namespace Y can do Z”
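Because one typo in the principal string breaks access, it can help to build the string from its four parts instead of typing it by hand. The following is a minimal sketch; the function name `wif_principal` and the sample project values are made up for illustration.

```shell
# Build the Direct WIF principal string for a KSA.
# Arguments: PROJECT_NUMBER PROJECT_ID NAMESPACE KSA_NAME (placeholders).
wif_principal() {
  if [ "$#" -ne 4 ]; then
    echo "usage: wif_principal PROJECT_NUMBER PROJECT_ID NAMESPACE KSA_NAME" >&2
    return 1
  fi
  printf 'principal://iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s.svc.id.goog/subject/ns/%s/sa/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Example with made-up project values.
wif_principal 123456789012 my-project app-ns app-sa
```

The project number (not the project ID) goes in the first slot; it can be looked up with `gcloud projects describe PROJECT_ID --format="value(projectNumber)"`.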
B) KSA ↔ GSA Mapping (Compatibility Path)
Before Direct WIF, the common pattern linked a KSA to a Google Service Account (GSA). The pod runs as the KSA, which then impersonates the GSA to get tokens.
How it works
- Create a GSA
- Create a KSA
- Annotate the KSA with the GSA email
- Grant IAM roles to the GSA
This still works, but it adds an extra identity to manage and an extra hop. Use it if you have a special case or existing policies tied to a GSA.
Prerequisites
- A GKE cluster (Autopilot or Standard)
- gcloud and kubectl configured for your project
- Your user has rights to update clusters and set IAM bindings (for example, project editor + IAM role admin)
- APIs commonly needed: IAM Service Account Credentials API, Kubernetes Engine API (usually already on)
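If those APIs are not yet enabled, one way to turn them on is `gcloud services enable`. The sketch below only prints the command so it can be reviewed first; `enable_apis_cmd` is a hypothetical helper name and `my-project` is a placeholder project ID.

```shell
# Print (not run) the command that enables the two APIs listed above.
# $1 = PROJECT_ID (placeholder).
enable_apis_cmd() {
  project_id="$1"
  printf 'gcloud services enable iamcredentials.googleapis.com container.googleapis.com --project=%s\n' \
    "$project_id"
}

# Review the output, then pipe it to sh if it looks right:
#   enable_apis_cmd my-project | sh
enable_apis_cmd my-project
```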
Prepare the Cluster for WIF
Autopilot
Autopilot clusters come with WIF ready. You can skip cluster updates and move to the identity and IAM steps.
Standard
Set the workload pool on the cluster:
gcloud container clusters update CLUSTER_NAME \
--location=LOCATION \
--workload-pool=PROJECT_ID.svc.id.goog
For existing node pools, switch to the metadata server:
gcloud container node-pools update NODEPOOL_NAME \
--cluster=CLUSTER_NAME \
--location=LOCATION \
--workload-metadata=GKE_METADATA
Tip: New node pools created after you set --workload-pool use the right metadata mode by default. Older pools may need the update command above.
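To confirm that both updates took effect, the reported settings can be compared with what WIF expects. This is a rough sketch: `wif_ready` is a hypothetical helper, and the values passed to it here are hard-coded samples rather than live cluster output.

```shell
# Compare reported settings against what WIF expects.
# $1 = workload pool reported by the cluster
# $2 = metadata mode reported by the node pool
# $3 = PROJECT_ID
wif_ready() {
  pool="$1"; mode="$2"; project_id="$3"
  if [ "$pool" != "${project_id}.svc.id.goog" ]; then
    echo "workload pool is '${pool}', expected '${project_id}.svc.id.goog'"
    return 1
  fi
  if [ "$mode" != "GKE_METADATA" ]; then
    echo "node pool metadata mode is '${mode}', expected 'GKE_METADATA'"
    return 1
  fi
  echo "cluster and node pool look ready for WIF"
}

# In a real check, the first two values would come from:
#   gcloud container clusters describe CLUSTER_NAME --location=LOCATION \
#     --format="value(workloadIdentityConfig.workloadPool)"
#   gcloud container node-pools describe NODEPOOL_NAME --cluster=CLUSTER_NAME \
#     --location=LOCATION --format="value(config.workloadMetadataConfig.mode)"
wif_ready "my-project.svc.id.goog" "GKE_METADATA" "my-project"
```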
Direct WIF: Grant a Bucket Read Role to a KSA (End‑to‑End)
Let’s give a pod read access to a single Cloud Storage bucket using Direct WIF.
1) Create a namespace and KSA
kubectl create namespace app-ns
kubectl create serviceaccount app-sa -n app-ns
2) Bind an IAM role to the KSA principal
We’ll grant read‑only object access to bucket my-app-bucket.
gcloud storage buckets add-iam-policy-binding gs://my-app-bucket \
--role=roles/storage.objectViewer \
--member=principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/app-ns/sa/app-sa \
--condition=None
Choosing the right role
- Read objects: roles/storage.objectViewer
- Manage objects (create/delete): roles/storage.objectAdmin
- Full admin on bucket + IAM: roles/storage.admin (use with care)
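The role choices above can be captured in a small lookup so that scripts ask for an access level rather than a raw role name. This is only an illustrative sketch; the helper `storage_role_for` and the level names `read`, `manage`, and `admin` are made up here.

```shell
# Map an access level to one of the Cloud Storage roles listed above.
# $1 = read | manage | admin (hypothetical level names).
storage_role_for() {
  case "$1" in
    read)   echo "roles/storage.objectViewer" ;;
    manage) echo "roles/storage.objectAdmin" ;;
    admin)  echo "roles/storage.admin" ;;
    *)      echo "unknown access level: $1" >&2; return 1 ;;
  esac
}

# Start with the least privileged level.
storage_role_for read
```

Rejecting unknown levels keeps a typo from silently falling through to a broader role.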
3) Run your pod with the KSA
In your manifest, set the service account on the pod template. You don’t need to paste a full deployment. This line is the part that matters:
spec:
  serviceAccountName: app-sa
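For a quick test, a throwaway Pod is often enough. The sketch below renders a minimal manifest that runs as the KSA. The pod name `wif-test`, the helper `render_test_pod`, and the `google/cloud-sdk:slim` image are assumptions chosen for illustration, not something this guide prescribes.

```shell
# Render a minimal test Pod that runs as the given KSA.
# $1 = namespace, $2 = KSA name.
render_test_pod() {
  namespace="$1"; ksa="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: wif-test
  namespace: ${namespace}
spec:
  serviceAccountName: ${ksa}
  containers:
  - name: cli
    image: google/cloud-sdk:slim
    command: ["sleep", "infinity"]
EOF
}

# Print the manifest; pipe it to "kubectl apply -f -" to create the Pod.
render_test_pod app-ns app-sa
```

Once the Pod is running, `kubectl exec -it wif-test -n app-ns -- /bin/bash` opens a shell for testing.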
4) Test from inside the pod
Start a container that includes the Google Cloud CLI (or your application image with its client libraries), open a shell, and call Cloud Storage. The CLI and the libraries fetch tokens from the metadata server automatically, so no key files are needed.
Example (listing objects):
gcloud storage objects list gs://my-app-bucket
If the IAM binding is correct and your cluster/node pools are configured for WIF, the command returns the bucket contents.
KSA ↔ GSA Mapping: The Compatibility Approach
If you must use a GSA (policy reasons or an API quirk), use this path.
1) Create a Google Service Account (GSA)
gcloud iam service-accounts create app-gsa \
--display-name="App GSA"
2) Grant the GSA access to your resource
gcloud storage buckets add-iam-policy-binding gs://my-app-bucket \
--member=serviceAccount:app-gsa@PROJECT_ID.iam.gserviceaccount.com \
--role=roles/storage.objectViewer
3) Create a KSA and link it to the GSA
kubectl create namespace app-ns || true
kubectl create serviceaccount app-sa -n app-ns || true
kubectl annotate serviceaccount \
--namespace app-ns \
app-sa iam.gke.io/gcp-service-account=app-gsa@PROJECT_ID.iam.gserviceaccount.com --overwrite
4) Allow the KSA to impersonate the GSA
Add a binding to the GSA’s IAM policy that grants roles/iam.workloadIdentityUser to the KSA principal from your workload pool:
gcloud iam service-accounts add-iam-policy-binding \
app-gsa@PROJECT_ID.iam.gserviceaccount.com \
--role=roles/iam.workloadIdentityUser \
--member=principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/app-ns/sa/app-sa
5) Run your pod with the same KSA
Again, the important manifest line is:
spec:
  serviceAccountName: app-sa
Now the pod runs as app-sa, which impersonates app-gsa when calling Google Cloud.
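A mismatch between the KSA annotation and the GSA email is an easy mistake on this path. The sketch below derives the expected email and compares it with an annotation value; `gsa_email` and `annotation_matches` are hypothetical helpers, and the annotation value is a hard-coded sample.

```shell
# Build the GSA email from its short name and the project ID.
gsa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Succeed only if the annotation value equals the expected GSA email.
# $1 = annotation value, $2 = GSA short name, $3 = PROJECT_ID.
annotation_matches() {
  [ "$1" = "$(gsa_email "$2" "$3")" ]
}

# In a real check, the annotation value would come from:
#   kubectl get serviceaccount app-sa -n app-ns \
#     -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'
if annotation_matches "app-gsa@my-project.iam.gserviceaccount.com" app-gsa my-project; then
  echo "annotation matches the GSA"
else
  echo "annotation does not match the GSA"
fi
```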
How WIF Works Under the Hood
- Identity on the node: The node’s metadata server brokers requests from pods.
- Token for the pod: The pod presents its KSA identity to the GKE metadata server.
- Exchange: The metadata server issues a short‑lived token that represents the KSA.
- IAM check: Google Cloud IAM checks the binding you created (either direct KSA principal or the GSA path).
- Access token: If allowed, IAM returns an access token for the target API.
- Client libraries: Your code uses standard Google Cloud libraries; they request and refresh tokens when needed.
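The flow above can be reproduced by hand to see what the client libraries do for you. In this sketch a hard-coded sample response stands in for the metadata server, and `extract_access_token` is a hypothetical helper that pulls the token out of the JSON with sed; the token value is fake.

```shell
# Pull the access_token field out of a token response read from stdin.
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response shape; inside a pod the real one would come from:
#   curl -s -H "Metadata-Flavor: Google" \
#     http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token
sample='{"access_token":"ya29.fake-token","expires_in":3599,"token_type":"Bearer"}'
token="$(printf '%s\n' "$sample" | extract_access_token)"

# The token is sent as a Bearer header to the target API, for example:
#   curl -H "Authorization: Bearer ${token}" \
#     "https://storage.googleapis.com/storage/v1/b/my-app-bucket/o"
echo "Authorization: Bearer ${token}"
```

Client libraries perform the same request and refresh the token before `expires_in` runs out, which is why application code rarely needs any of this.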
Common Pitfalls and Quick Fixes
- Cluster updated, but node pools are not: Run the node-pools update command with --workload-metadata=GKE_METADATA for legacy pools.
- Wrong principal string: Confirm PROJECT_NUMBER, PROJECT_ID, namespace, and KSA name. One typo breaks access.
- Binding at the wrong scope: If you bind at the project level, the pod may have broader access than intended. Prefer resource‑level roles (one bucket, one topic, one dataset).
- Testing with the wrong pod: Double‑check that the pod you’re testing is actually running with serviceAccountName: app-sa.
- Missing APIs: If certain API calls fail, check that the related Google Cloud API is enabled for your project.
- Using old images: Some very old base images might lack up‑to‑date client libraries. Use maintained images or install the needed SDKs in your image.
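A wrong principal string is the hardest of these to spot by eye, so a shape check before binding can save time. The sketch below uses a simplified pattern and a hypothetical helper `is_valid_principal`; it catches a project ID placed where the project number belongs, but it does not prove that the project, namespace, or KSA exist.

```shell
# Succeed if $1 has the shape of a Direct WIF principal string.
# Simplified pattern: numeric project number, a pool ending in
# .svc.id.goog, and non-empty namespace and KSA segments.
is_valid_principal() {
  printf '%s\n' "$1" | grep -Eq \
    '^principal://iam\.googleapis\.com/projects/[0-9]+/locations/global/workloadIdentityPools/[^/]+\.svc\.id\.goog/subject/ns/[^/]+/sa/[^/]+$'
}

# Made-up example value.
candidate="principal://iam.googleapis.com/projects/123456789012/locations/global/workloadIdentityPools/my-project.svc.id.goog/subject/ns/app-ns/sa/app-sa"
if is_valid_principal "$candidate"; then
  echo "principal string looks well-formed"
else
  echo "principal string looks malformed"
fi
```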
Tips for a Clean Permission Model
- One KSA per microservice: Name them by purpose, like payments-sa, reports-sa.
- Split by environment: Use different namespaces and KSAs across dev, stage, and prod.
- Favor resource‑level grants: Bind roles on a single bucket, topic, or dataset instead of the entire project.
- Use minimal roles: Start with read‑only and add rights only when you see a concrete need.
- Document the principal strings: Keep a small registry (even a README) of KSA ↔ roles.
Verifying and Troubleshooting in a Pod
1. Check token is present
curl -H "Metadata-Flavor: Google" \
http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token
You should see a JSON blob with an access_token and expires_in.
2. Check who you are
gcloud auth list
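The active account should be the identity served by the metadata server: with Direct WIF this is typically the workload pool name (PROJECT_ID.svc.id.goog), and with the KSA ↔ GSA path it is the GSA email.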
3. Check IAM policy on the resource
gcloud storage buckets get-iam-policy gs://my-app-bucket
Confirm the principal string or service account member is present with the role you expect.
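The last check can be scripted by searching the policy output for the exact member string. This is a rough sketch: `policy_has_member` is a hypothetical helper, the policy below is a hand-written sample rather than real output, and the check only confirms that the member appears somewhere in the policy, not which role it is bound to.

```shell
# Succeed if the policy text on stdin contains the exact member string.
# $1 = member string to look for (matched literally, not as a regex).
policy_has_member() {
  grep -F -q -- "$1"
}

member="principal://iam.googleapis.com/projects/123456789012/locations/global/workloadIdentityPools/my-project.svc.id.goog/subject/ns/app-ns/sa/app-sa"

# Hand-written sample; the real text would come from:
#   gcloud storage buckets get-iam-policy gs://my-app-bucket --format=json
sample_policy="{\"bindings\":[{\"members\":[\"${member}\"],\"role\":\"roles/storage.objectViewer\"}]}"

if printf '%s\n' "$sample_policy" | policy_has_member "$member"; then
  echo "member found in policy"
else
  echo "member missing from policy"
fi
```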
FAQ
Do I need to change my code?
Usually not. Standard Google Cloud client libraries fetch tokens from the metadata server.
Can one KSA access several resources?
Yes. Bind multiple roles across resources to the same KSA principal, but keep the set small and focused.
Does this work with jobs, CronJobs, and stateful apps?
Yes. Anything that runs as a pod can use the same KSA and receive tokens.
What if an API doesn’t accept the KSA principal?
Use the KSA ↔ GSA approach for that case, or open an internal ticket to revisit the design.
Conclusion
At KubeNine Consulting, we specialize in implementing secure, scalable Kubernetes solutions on Google Cloud Platform. Our team of experienced DevOps engineers and cloud architects has successfully deployed Workload Identity Federation across numerous GKE clusters, delivering enhanced security and simplified access management for Google Cloud services.