Your pip install Just Backdoored Your Kubernetes Cluster

A single pip install can compromise your entire Kubernetes cluster. Learn how modern supply chain attacks exploit dependencies, steal credentials, and spread across CI/CD pipelines.

Vikas Yadav

10 Apr 2026 • 12 min read

In March 2026, a Python package downloaded 3.4 million times per day was backdoored. For three hours, anyone who ran pip install litellm got more than an AI proxy library — they got a three-stage credential stealer that harvested SSH keys, cloud tokens, Kubernetes secrets, and crypto wallets. The malicious version then deployed privileged pods across every node in their Kubernetes clusters and installed a persistent backdoor.

The attacker didn't find a bug in LiteLLM's code. They didn't even target LiteLLM directly. They compromised a security scanner that LiteLLM's CI pipeline depended on, stole a PyPI publishing token, and published two backdoored versions to the registry.

This is a software supply chain attack: instead of breaking into your application, attackers compromise the tools, libraries, and build systems you already trust. The malicious code arrives through npm install, pip install, or docker pull — commands you run every day without a second thought.

In this post, we'll break down how these attacks work, walk through two major incidents, and build a defense strategy you can start implementing today.

How Supply Chain Attacks Work: The Kill Chain

Every supply chain attack follows a similar pattern. Understanding it helps you spot the weak points in your own development workflow.

Stage 1: Entry Point

Attackers need a way in. The most common entry points:

Stolen maintainer credentials — Phishing a package maintainer for their npm or PyPI token, or stealing tokens from CI/CD environments. The TeamPCP group stole a PyPI publishing token by compromising a GitHub Action that ran in LiteLLM's CI pipeline.
Compromised CI tooling — Instead of targeting the package directly, attackers poison the tools that build it. A compromised GitHub Action or security scanner becomes a Trojan horse inside thousands of CI pipelines.
Typosquatting — Publishing packages with names that look like popular ones: python-dateutill instead of python-dateutil, crossenv instead of cross-env. Developers who mistype an install command get the malicious version.
Social engineering — Approaching maintainers of dormant but widely-used packages, offering to "help maintain" the project, then injecting malicious code once they have commit access. The xz Utils backdoor is a textbook example: an attacker spent two years building trust with the maintainer before introducing a sophisticated backdoor into a core Linux compression library.
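
Typosquats of the kind described above can often be caught with a simple similarity check before installing. The following is a minimal sketch, not a complete defense: the `KNOWN_PACKAGES` list, the function name, and the 0.85 threshold are all illustrative assumptions, and a real check would draw the list from your own dependency inventory.

```python
from difflib import SequenceMatcher

# Hypothetical list of packages you actually intend to use.
KNOWN_PACKAGES = ("cross-env", "litellm", "python-dateutil", "requests")


def possible_typosquat(name, known=KNOWN_PACKAGES, threshold=0.85):
    """Return the known package that `name` closely resembles, or None.

    An exact match is treated as legitimate and returns None.
    """
    candidate = name.lower().replace("_", "-")
    if candidate in known:
        return None

    def similarity(other):
        return SequenceMatcher(None, candidate, other).ratio()

    best = max(known, key=similarity)
    return best if similarity(best) >= threshold else None
```

Calling `possible_typosquat("python-dateutill")` returns `"python-dateutil"`, which is the cue to stop and re-read the install command. String similarity only flags near-miss names; it says nothing about whether a correctly spelled package is trustworthy.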

Stage 2: Payload Injection

Once inside, attackers need their code to execute without the developer explicitly calling it:

npm lifecycle scripts — postinstall and preinstall hooks run automatically during npm install. The Shai-Hulud worm used this to execute on every developer machine and CI runner that installed a compromised package.
Python setup hooks — setup.py can run arbitrary code during pip install. Attackers inject credential stealers into the install process itself.
.pth files — A lesser-known Python trick. Files ending in .pth placed in site-packages execute automatically every time the Python interpreter starts — not just when the library is imported. The LiteLLM backdoor used litellm_init.pth to run on every Python process.
Obfuscated helper files — Innocuous-sounding files like setup_bun.js or bun_environment.js that get dropped during install and contain the real payload.
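
The .pth mechanism is easy to audit because Python's `site` module only executes lines in a .pth file that begin with `import` followed by a space or tab; every other non-comment line is treated as a path. The sketch below lists such lines in your site-packages directories. The function names are illustrative, and some legitimate tools (editable installs, for example) also ship executable .pth lines, so treat the output as a review list rather than a verdict.

```python
import site
from pathlib import Path


def executable_pth_lines(text):
    """Return the lines of a .pth file that the `site` module would execute."""
    return [
        line.rstrip()
        for line in text.splitlines()
        if line.startswith(("import ", "import\t"))
    ]


def scan_site_packages(dirs=None):
    """Map each .pth file that contains executable lines to those lines."""
    if dirs is None:
        dirs = site.getsitepackages() + [site.getusersitepackages()]
    findings = {}
    for directory in dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # e.g. the user site directory may not exist
        for pth in root.glob("*.pth"):
            lines = executable_pth_lines(pth.read_text(errors="replace"))
            if lines:
                findings[str(pth)] = lines
    return findings
```

Running `scan_site_packages()` with no arguments inspects the current interpreter's own site-packages directories; pass an explicit list to check another environment.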

Stage 3: Propagation

The most dangerous attacks don't stop at one package:

Automated registry publishing — Using a stolen npm or PyPI token to authenticate as the compromised developer, find their other packages, and publish backdoored versions of those too.
Worm-like self-replication — The Shai-Hulud worm read its own source code to inject itself into new packages, spreading without any command-and-control server.
Dependency tree cascading — A single compromised package deep in a dependency tree can affect thousands of projects that depend on it transitively.
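
To make the cascading effect concrete, here is a small sketch that computes every package affected by one compromised dependency. The graph is made-up example data and the function name is an assumption; in practice the edges would come from a lockfile or SBOM.

```python
from collections import deque


def transitive_dependents(depends_on, compromised):
    """Return every package that directly or indirectly depends on `compromised`."""
    # Invert the graph: package -> packages that depend on it directly.
    dependents = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(pkg)

    affected = set()
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


# Hypothetical dependency graph for illustration only.
graph = {
    "my-app": {"web-framework"},
    "web-framework": {"http-client"},
    "http-client": {"tiny-color-lib"},
    "cli-tool": {"tiny-color-lib"},
    "unrelated": set(),
}
```

Here `transitive_dependents(graph, "tiny-color-lib")` returns `my-app`, `web-framework`, `http-client`, and `cli-tool`, even though only two of them list the compromised package directly.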

Stage 4: Exfiltration and Persistence

Once running, the payload goes to work:

Credential harvesting — SSH keys, cloud tokens, .env files, Kubernetes service account tokens, and API keys get encrypted and exfiltrated to attacker-controlled servers or dumped into public GitHub repos.
Persistent backdoors — systemd services that survive reboots and poll for additional payloads on a schedule.
Kubernetes lateral movement — Deploying privileged pods across every node in a cluster, turning a single compromised developer machine into full cluster access.

Case Study 1: Shai-Hulud — The Self-Replicating npm Worm

The Shai-Hulud attack was the first self-replicating worm to spread through the npm ecosystem. It hit in two waves, the second more sophisticated than the first.

Wave 1: September 2025

The first wave began when an attacker compromised the npm accounts of several popular package maintainers. Packages like ngx-bootstrap, ng2-file-upload, and @ctrl/tinycolor — collectively downloaded millions of times per week — received malicious updates.

The compromised versions contained a postinstall hook that pulled an obfuscated bundle.js file. When developers ran npm install, this script executed automatically and:

Harvested npm authentication tokens from the local .npmrc file
Collected GitHub Personal Access Tokens (PATs) and cloud service API keys from environment variables and config files
Exfiltrated everything via webhooks

Here's where it got interesting. Using the stolen npm tokens, the malware authenticated to the npm registry as the compromised developer. It then enumerated every other package that developer maintained, injected itself into each one, and published new, backdoored versions. No human intervention needed.

One compromised developer account could infect dozens of packages within minutes, and each of those packages' users could spread the infection further.

Wave 2: November 2025 (Shai-Hulud 2.0)

Ten weeks later, a second wave appeared — more sophisticated and harder to stop.

The key changes:

Preinstall instead of postinstall — The malware shifted to the preinstall lifecycle hook, meaning it executed before the package finished installing. Even failed installations triggered the payload.
No C2 server needed — The worm read its own source code to propagate, eliminating the need for an external command-and-control server. This made it harder to disrupt by taking down infrastructure.
Public credential dumping — Instead of exfiltrating to hidden servers, the malware created public GitHub repositories named "Shai-Hulud" under victim accounts and committed stolen secrets there in plain sight.

The injected files, setup_bun.js and bun_environment.js, were designed to look like legitimate tooling configuration. A developer reviewing their package.json might not think twice about a preinstall script that referenced Bun.
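
Because install-time hooks are declared in each package's package.json, they can be listed before they are trusted. The sketch below reports the `preinstall`, `install`, and `postinstall` scripts found under a `node_modules` directory. The function names are illustrative assumptions, and the walk also picks up nested package.json files such as test fixtures, so expect some noise.

```python
import json
from pathlib import Path

# Hooks that npm runs automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_text):
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    if not isinstance(manifest, dict):
        return {}
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


def scan_node_modules(root="node_modules"):
    """Map each manifest path under `root` to its install-time hooks."""
    findings = {}
    for manifest in Path(root).rglob("package.json"):
        try:
            hooks = install_hooks(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest
        if hooks:
            findings[str(manifest)] = hooks
    return findings
```

A manifest whose `preinstall` entry runs an unfamiliar helper script is exactly the pattern worth a second look.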

The Damage

The numbers are staggering: per Wiz and Truesec reporting, the worm compromised 796 packages that together account for 20 million weekly downloads.

The attack was significant enough that CISA issued a formal advisory and Microsoft published detection and defense guidance.

The Lesson

One compromised developer token cascaded into hundreds of poisoned packages within hours. The npm ecosystem's trust model — anyone can publish, lifecycle scripts run automatically by default — made exponential propagation possible. The worm didn't need a zero-day. It just needed one set of stolen credentials and the default npm configuration.

Case Study 2: TeamPCP — From Security Scanner to AI Library Backdoor

If Shai-Hulud showed how fast a worm can spread, the TeamPCP campaign showed how deep a supply chain attack can go. The attackers didn't target LiteLLM directly — they compromised the security tools in its CI/CD pipeline and worked their way up.

Day 1: Poisoning the Scanner (March 19, 2026)

TeamPCP began by targeting Aqua Security's Trivy — one of the most popular open-source vulnerability scanners. But they didn't publish a fake package. Instead, they overwrote 76 of 77 version tags in the aquasecurity/trivy-action GitHub repository and all 7 tags in aquasecurity/setup-trivy.

```yaml
# What this means:
# Before the attack, this pinned to a safe version:
uses: aquasecurity/trivy-action@0.69.0
# After the attack, the same tag pointed to malicious code.
# The version number didn't change. The content behind it did.
```

This is the critical detail: developers who thought they were pinning to a specific version of Trivy were actually running attacker-controlled code. Git tags are mutable — anyone with write access can point them to a different commit.

The compromised Trivy action contained a credential-harvesting payload that exfiltrated environment variables from every CI/CD pipeline that used it. The irony was perfect: the tool organizations ran to find vulnerabilities was itself the vulnerability.

Day 5: Lateral Expansion (March 23, 2026)

TeamPCP used the same playbook against Checkmarx, compromising their OpenVSX extensions (cx-dev-assist, ast-results) and the KICS GitHub Action. Each compromised tool gave them access to more CI/CD environments and more credentials.

Day 6: The Prize (March 24, 2026)

LiteLLM — an AI proxy library that unifies access to OpenAI, Anthropic, Cohere, and dozens of other LLM providers — ran Trivy as part of its CI/CD pipeline. The build pulled Trivy without a pinned commit SHA, relying on a version tag instead.

The compromised Trivy action exfiltrated the maintainer's PYPI_PUBLISH token from the GitHub Actions runner environment. With that token, TeamPCP published two backdoored versions of LiteLLM to PyPI: 1.82.7 and 1.82.8.

The Payload

The backdoored LiteLLM versions contained litellm_init.pth, a file that Python automatically processes every time the interpreter starts. Not when you import LiteLLM. Every time Python runs. Period.

The payload executed in three stages:

Stage 1: Harvested SSH keys, cloud tokens (AWS, GCP, Azure), Kubernetes secrets, .env files, and crypto wallets. Exfiltrated to models.litellm.cloud.
Stage 2: Deployed privileged pods to every Kubernetes node if cluster access existed. One pip install could compromise an entire cluster.
Stage 3: Installed a systemd backdoor (~/.config/sysmon/sysmon.py) that polled checkmarx[.]zone/raw every 50 minutes. Survived reboots and LiteLLM uninstalls.
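
The indicators named above (the litellm_init.pth file, the sysmon.py persistence script, and the two backdoored version numbers) can be checked on a machine with a few lines of Python. This is a rough triage sketch under the assumption that the file names and paths are exactly as described in this post; the function names are illustrative, and a clean result does not prove a host was never exposed.

```python
from importlib import metadata
from pathlib import Path

BACKDOORED_VERSIONS = {"1.82.7", "1.82.8"}  # versions named in this post


def find_litellm_indicators(site_packages, home):
    """Return human-readable findings for the file indicators described above."""
    findings = []
    for directory in site_packages:
        pth = Path(directory) / "litellm_init.pth"
        if pth.is_file():
            findings.append(f"startup hook present: {pth}")
    backdoor = Path(home) / ".config" / "sysmon" / "sysmon.py"
    if backdoor.is_file():
        findings.append(f"persistence script present: {backdoor}")
    return findings


def installed_backdoored_version():
    """Return the installed litellm version if it is a known-bad one, else None."""
    try:
        version = metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None
    return version if version in BACKDOORED_VERSIONS else None
```

A typical call would pass `site.getsitepackages()` and `Path.home()`. Any finding means the credentials on that machine should be treated as compromised, not just that the package should be removed.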

The Damage of the LiteLLM Backdoor

The chained CI/CD attack exposed 500,000 corporate identities.

The Lesson

Your dependencies have dependencies. LiteLLM's code didn't have a vulnerability — its CI tool did. Two specific failures made this possible:

Unpinned GitHub Actions — Using a version tag instead of a commit SHA meant the "same version" of Trivy could silently become malicious
Overly broad CI tokens — The PYPI_PUBLISH token was accessible to the entire CI pipeline, not just the publish step

Fix those two things, and this attack chain breaks.

The same supply chain risks also apply to containers. Every time you write FROM python:3.12-slim in a Dockerfile, you're trusting that the image hasn't been tampered with, the same way you trust npm install and pip install. Base images on Docker Hub can be poisoned, typosquatted (think pythonn/slim vs python/slim), or silently updated with malicious layers.

Defense-in-Depth: A Layered Prevention Strategy

No single tool stops supply chain attacks. You need defenses at every layer, from the developer's laptop to production runtime. Here's how to build them.

[Figure: software supply chain defense layers]

Layer 1: Developer Workstation

This is the first line of defense — and the one most developers skip.

Use lockfiles and commit them. package-lock.json, yarn.lock, poetry.lock, and Pipfile.lock pin exact versions and integrity hashes. If an attacker publishes a backdoored version, the lockfile prevents it from being pulled in automatically. But only if the lockfile is committed to your repo and enforced in CI (npm ci instead of npm install).
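
The integrity hashes in package-lock.json use the Subresource Integrity format: an algorithm name, a hyphen, and the base64-encoded digest of the package tarball. The sketch below shows roughly what the check amounts to; the function names are illustrative, and npm performs this verification itself when it installs from a lockfile.

```python
import base64
import hashlib
import hmac


def sri_digest(data, algorithm="sha512"):
    """Compute a Subresource Integrity string such as 'sha512-<base64>'."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_lockfile(data, integrity):
    """Check tarball bytes against a lockfile `integrity` value.

    The field may hold several space-separated hashes; any match passes.
    """
    for entry in integrity.split():
        algorithm = entry.split("-", 1)[0]
        if hmac.compare_digest(sri_digest(data, algorithm), entry):
            return True
    return False
```

If an attacker republishes different bytes under the same version number, the digest changes and the comparison fails, which is why an enforced lockfile blocks a silently swapped tarball.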

Disable automatic lifecycle scripts. This single command would have prevented Shai-Hulud from executing on any developer machine:

```bash
npm config set ignore-scripts true
```

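After setting this, you can selectively run scripts for trusted packages with npm rebuild <package> --ignore-scripts=false. On current npm versions the explicit flag is needed because the ignore-scripts setting otherwise applies to npm rebuild as well. Yes, it adds friction. That friction is the point.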

Review new dependencies before installing. Before running npm install some-package or pip install some-package, check: How many downloads does it have? When was it last published? Who maintains it? Does the GitHub repo look legitimate? Tools like Socket.dev automate this, but even a quick manual check can catch obvious typosquats.

Layer 2: CI/CD Pipeline

This is where the TeamPCP attack succeeded — and where two changes would have stopped it.

Pin GitHub Actions by commit SHA, not tag. Tags are mutable. TeamPCP overwrote 76 of 77 Trivy version tags to point to malicious code. Commit SHAs are immutable:


```yaml
# Vulnerable: tag can be silently rewritten
- uses: aquasecurity/trivy-action@0.69.0

# Safe: immutable reference to a specific commit
- uses: aquasecurity/trivy-action@7b7aa264d83dc58691451798927ecb2643d3ef9c
```

This is the single highest-impact change you can make today. GitHub's Dependabot can help you manage SHA updates.
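
Finding the references that still need pinning is straightforward to automate. The following sketch scans workflow text for `uses:` lines whose ref is not a full 40-character commit SHA. The regular expressions and function name are assumptions for illustration; the scan ignores local actions (`./path`) and does not understand Docker image references.

```python
import re

# Matches `uses: owner/repo[/path]@ref` lines in a workflow file.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text):
    """Return `action@ref` strings whose ref is not a full commit SHA."""
    return [
        f"{action}@{ref}"
        for action, ref in USES_RE.findall(workflow_text)
        if not FULL_SHA_RE.fullmatch(ref)
    ]


# Example workflow fragment using references that appear in this post.
sample = """
steps:
  - uses: aquasecurity/trivy-action@0.69.0
  - uses: aquasecurity/trivy-action@7b7aa264d83dc58691451798927ecb2643d3ef9c
  - uses: pypa/gh-action-pypi-publish@release/v1
"""
```

For the sample above, `unpinned_actions(sample)` reports the tag reference and the branch reference and leaves the SHA-pinned line alone. Running it over every file in `.github/workflows/` gives a concrete to-do list.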

Scope CI tokens to minimum permissions. The LiteLLM compromise happened because a PYPI_PUBLISH token was accessible to a build step that only needed to run a security scan. Structure your CI pipeline so publishing tokens are only available in the publish job:

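```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    # No access to PYPI_PUBLISH: only test credentials
    steps:
      - run: pytest

  publish:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: pypi  # Token only available in this environment
    steps:
      # Shown by branch for brevity; pin to a commit SHA in real use
      - uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_PUBLISH }}
```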

Use ephemeral runners. Don't reuse CI environments. Each job should start from a clean image so that secrets from previous runs can't be harvested. GitHub's hosted runners do this by default, but self-hosted runners need explicit configuration — see our guide on self-hosted GitHub Actions runners on Kubernetes.

Run audit tools as CI gates. Make npm audit and pip-audit part of your CI pipeline so known vulnerabilities block merges:

```bash
# npm
npm audit --audit-level=high

# pip
pip-audit --strict --desc
```

Layer 3: Registry and Dependency Management

Put a gate between the public registry and your developers.

Use a private registry or proxy. Tools like Artifactory, Nexus, or AWS CodeArtifact act as a proxy between your team and public registries. You can configure allow-lists, block packages flagged by security scans, and cache known-good versions. If a malicious version hits PyPI, your proxy can quarantine it before any developer pulls it.

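Enable provenance attestation. npm provenance and PyPI Trusted Publishers tie a published package to the source repo and CI workflow that produced it. npm provenance uses SLSA-backed attestation to cryptographically prove that a package was built from a specific source repo and commit. PyPI Trusted Publishers replace long-lived API tokens with short-lived OIDC credentials issued only to a specific GitHub repo and workflow, so there is no static publishing token sitting in CI to steal. If TeamPCP had published LiteLLM 1.82.8 from their own machine instead of the official CI pipeline, provenance verification would have flagged it immediately. Enable it for your own packages: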

```bash
# npm: publish with provenance
npm publish --provenance

# PyPI: configure Trusted Publishers in your project settings
# Links your PyPI project to a specific GitHub repo + workflow
```

Generate and track SBOMs. A Software Bill of Materials tells you exactly what's in your dependency tree. When a vulnerability is announced, you need to answer "are we affected?" in minutes, not days. Tools like Syft and CycloneDX generate SBOMs from your lockfiles and container images.
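
Answering "are we affected?" from an SBOM can be as simple as filtering its component list. The sketch below assumes a CycloneDX JSON document, which stores packages in a top-level `components` array with `name` and `version` fields. The function name and sample data are illustrative, and nested components are ignored for brevity.

```python
import json


def affected_components(sbom_json, package, bad_versions):
    """Return 'name@version' for SBOM components matching a known-bad release."""
    sbom = json.loads(sbom_json)
    return [
        f"{c['name']}@{c['version']}"
        for c in sbom.get("components", [])
        if c.get("name", "").lower() == package.lower()
        and c.get("version") in bad_versions
    ]


# Hypothetical SBOM fragment for illustration only.
sample_sbom = json.dumps({
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "litellm", "version": "1.82.8"},
        {"type": "library", "name": "requests", "version": "2.32.0"},
    ],
})
```

Calling `affected_components(sample_sbom, "litellm", {"1.82.7", "1.82.8"})` flags the one matching component. Run across the SBOMs of every deployed image, the same query turns a days-long audit into a loop.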

Monitor dependencies in real time. Socket.dev analyzes package behavior at install time, detecting things like network requests, filesystem access, and shell execution that legitimate packages don't normally do. Snyk provides continuous monitoring of your dependency tree against known vulnerability databases.

Layer 4: Runtime and Detection

If a malicious dependency makes it past all previous layers, you need to detect it in production.

Monitor outbound network connections. The LiteLLM payload exfiltrated data to models.litellm.cloud and polled checkmarx[.]zone/raw for additional payloads. Network policies in Kubernetes can restrict where your pods are allowed to connect: the built-in NetworkPolicy resource filters egress by IP range, namespace, and pod selector, and CNI plugins such as Cilium add domain-based rules. If your application only needs to reach your own API and a database, everything else should be blocked. Learn more about managing Kubernetes secrets securely and keeping your attack surface minimal.

Use read-only file systems. The LiteLLM backdoor installed a systemd service and wrote files to ~/.config/sysmon/. A read-only root filesystem in your container prevents this:

```yaml
securityContext:
  readOnlyRootFilesystem: true
```

Watch for suspicious process spawns. Packages should not be creating systemd services, cron jobs, or spawning shell processes. Runtime security tools like Falco can alert on unexpected process execution inside containers.

Have an incident response plan. When a compromised dependency is announced, you need to answer four questions fast:

Which of our systems installed the affected version?
When were they exposed?
What credentials might be compromised?
How do we rotate those credentials?

Your SBOM and container image registry make questions 1 and 2 answerable. Your secrets management practices determine how quickly you can handle 3 and 4.
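
For the question of what might be compromised, a quick inventory of credential files on an exposed machine gives a starting rotation list. The sketch below checks common default locations for the credential types this post mentions. The path table and function name are assumptions, it covers only file-based credentials, and environment variables and secret stores need a separate review.

```python
from pathlib import Path

# Common default locations, relative to the home directory.
CREDENTIAL_PATHS = {
    "SSH keys": ".ssh",
    "AWS credentials": ".aws/credentials",
    "GCP credentials": ".config/gcloud",
    "Azure credentials": ".azure",
    "Kubernetes config": ".kube/config",
    "npm token": ".npmrc",
    "PyPI token": ".pypirc",
    "Docker registry auth": ".docker/config.json",
}


def credentials_to_rotate(home):
    """List credential types that have files present under `home`."""
    base = Path(home)
    return sorted(
        label
        for label, relative in CREDENTIAL_PATHS.items()
        if (base / relative).exists()
    )
```

Calling `credentials_to_rotate(Path.home())` on an affected workstation or runner yields the list of credential types to rotate first.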

Start Today

Software supply chain attacks aren't hypothetical. In the past six months, a self-replicating npm worm compromised 796 packages affecting 20 million weekly downloads, and a chained CI/CD attack turned a security scanner into a backdoor that exposed 500,000 corporate identities.

Your code is only as secure as your weakest dependency. And your dependencies have dependencies.

No single tool will protect you. But layered defense — from ignore-scripts on your laptop to network policies in your Kubernetes cluster — makes the attack surface dramatically smaller.

Here's what you can do right now:

Pin your GitHub Actions by commit SHA — not tags, not branches
Enable ignore-scripts in npm — npm config set ignore-scripts true
Run pip-audit / npm audit in CI — block merges with known vulnerabilities
Scope your CI publishing tokens — only the publish job gets the publish token
Commit and enforce your lockfiles — use npm ci, not npm install, in CI

None of these take more than an hour. All of them would have blocked or limited the attacks we covered in this post.

Conclusion

Software supply chain attacks are no longer rare edge cases — they are actively targeting npm, pip, and CI/CD pipelines at scale. Instead of attacking your application directly, attackers exploit the tools and dependencies you trust. A single compromised package or token can cascade into widespread impact across systems, pipelines, and even Kubernetes clusters.

The key takeaway is simple: security can no longer stop at your code. You need visibility and control across your entire development lifecycle, from dependencies and CI/CD pipelines to runtime environments. Start with small, high-impact steps like pinning dependencies, securing CI/CD tokens, and monitoring runtime behavior.

These changes may seem simple, but they significantly reduce your exposure to modern supply chain threats. In today’s ecosystem, your application is only as secure as your weakest dependency.
