Google Cloud Storage and the Problem with Zero-Byte Folders

Table of Contents

Overview

On Google Cloud, we were creating a basic event-driven pipeline. The plan was to have a Cloud Function call our backend API whenever a file was uploaded to a GCS bucket.

However, an odd thing began to happen: even when no actual file was uploaded, our backend was being contacted. In several cases, the pipeline was getting called when a folder was getting created in the GCS bucket.

We discovered then that this wasn't a bug in our code.
This was a peculiarity of Google Cloud Storage's operation.

In this post, we’ll break down:

  • How GCS treats folders and objects
  • Why these folders “uploads” trigger events
  • What exactly went wrong in our setup
  • And how we updated our Cloud Function to filter these false positives

Our Initial Setup (The Problem Setup)

We had a simple architecture in mind:

GCS Bucket → Pub/Sub → Cloud Function → Our Backend

We created a bucket notification on the GCS bucket, which would publish messages to a Pub/Sub topic when a new object was finalized (uploaded). The Pub/Sub topic was connected to a Cloud Function (Gen2), which would receive the message and call an internal backend API.

We thought everything was working fine—until we started getting backend logs from “fake” events.

After digging through logs, we found these events looked like this:

{
  "name": "Inbound/",
  "size": "0"
}

No actual file—just a folder.
But wait... folders don’t even exist in GCS. Right?

Why Folders in GCS Are Not Real Folders

Here’s what we found:

  • GCS has a flat namespace, not a folder structure
  • When you "create a folder" in the UI or CLI, GCS adds a zero-byte object with a name like "FolderName/"
  • GCS emits notifications for every object—even those zero-byte folder placeholders

So every time someone created a folder, it looked like a new file had been uploaded, and the pipeline got triggered.

This is expected behavior, not a bug—but it caused real issues for us.

🔍 How We Discovered the Issue

We added logs inside our Cloud Function and noticed that these objects:

  • Had a name ending with /
  • Had a size of 0

We tested manually by uploading only folders from the GCP Console—and confirmed they were triggering events.

We also checked the behavior with multiple file uploads and found that each folder in the path was triggering one extra event.

This explained why we were getting so many unexpected backend hits.

Our Fix: Add Filtering Logic in the Function

Because we were using Cloud Functions Gen2, we couldn’t control the Pub/Sub subscription directly — it was auto-created by Eventarc.

That meant we couldn’t apply filters at the subscription level.
So the only reliable place to filter was inside the function code.

Here’s the Python code we used to filter out non-.pdf and zero-byte/folder objects:

import base64
import json
import requests

def process_gcs_event(event, context=None):
    try:
        pubsub_message = base64.b64decode(event['data']).decode('utf-8')
        payload = json.loads(pubsub_message)

        name = payload.get("name")
        size = int(payload.get("size", 0))

        print(f"[RECEIVED] Object: {name}, Size: {size}")

        if name.endswith("/") and size == 0:
            print(f"[SKIP] Folder-like object: {name}")
            return

        if size == 0:
            print(f"[SKIP] Zero-byte file: {name}")
            return

        if not name.endswith(".pdf"):
            print(f"[SKIP] Not a .pdf file: {name}")
            return

        print(f"[PROCESS] Valid .pdf file: {name}")

        # Send to your backend
        backend_url = "https://your-backend-url"
        headers = {"Content-Type": "application/json"}

        response = requests.post(backend_url, data=json.dumps(payload), headers=headers)
        print(f"[SUCCESS] Sent to backend. Status: {response.status_code}")

    except Exception as e:
        print(f"[ERROR] {str(e)}")

Test Cases We Tried

Scenario

Expected Result

Upload real .pdf file

✅ Backend hit

Upload zero-byte .pdf file

❌ Skipped

Upload folder only (no file)

❌ Skipped

Upload .txt file

❌ Skipped

This helped us confirm that the new logic was working.

Why This Behavior Exists (Not a Bug)

GCS was designed this way—and it's not a mistake.

  • Object names can have slashes (/) to make things look like folders
  • But everything is stored as flat objects
  • “Folders” are just zero-byte files with special names
  • GCS emits notifications for every object, regardless of size or type

This is useful in some cases (like folder indexing), but not when you're building pipelines that care about actual files only.


Conclusion

Google Cloud Storage’s simulated folder behavior can cause unexpected issues in event-driven pipelines. Placeholder objects for empty folders may trigger functions that were only meant to process real files.

Understanding how GCS handles object events helped us rework our event processing. By filtering based on size and file suffix inside the Cloud Function, we now handle only the events we care about—file.pdf uploads.

At Kubenine, we work with cloud-native architecture challenges almost each day. From infrastructure setup to application delivery pipelines, we help teams build and scale on Google Cloud without tripping over platform quirks.