It’s quite common to load the entire result of a SQL query into memory and then process it. As your data grows, this consumes a lot of memory and causes OOM issues in production. To add to it, Python is already memory hungry.
Consider a background job or API endpoint that needs to process records from a database: generating a report, exporting data, or running a migration. A straightforward implementation often fetches all matching records into memory and then processes them one by one. While this approach works, it stresses the memory available to your application.
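As a rough sketch of that eager version (db.query here is a hypothetical helper that returns every matching row as a list), it might look like this:

# Eager loading: the entire result set is materialized in memory at once.
rows = db.query("SELECT * FROM orders")

for row in rows:
    process(row)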
In this blog we present a solution that revolves around streaming the data rather than loading it all into memory: the application works with a small number of records at a time, processing each subset as it becomes available.
Rethinking Data Access Patterns
At production scale, the problem is no longer the business logic; it’s the data access pattern. We should not tightly couple the application’s memory usage to the size of the dataset. This requires a shift from eager data loading to incremental processing, where records are fetched, processed, and discarded continuously rather than accumulated in memory.
This approach is commonly referred to as streaming records from the database.
What Does Streaming Records Mean?
Rather than treating a dataset as a single unit, the application works with a small number of records at a time, processing each subset as it becomes available.
The key distinction is not what data is processed, but how much of it is held in memory at any given time. By limiting the number of records retained at any point, we can ensure that application memory usage remains predictable and bounded.
This pattern is widely used in large-scale systems, from database access layers to file processing and message-driven architectures. The underlying principle remains the same: process data as a stream, not as a bulk collection.
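For instance, Python already streams files this way: iterating over a file object reads one line at a time instead of loading the whole file into memory (handle_line below is a hypothetical per-line function):

# Streaming a file: only the current line is held in memory.
with open("orders.csv") as f:
    for line in f:
        handle_line(line)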
Application-Level Streaming
A simple and effective way to stream records in Python is by using iterators or generators. The data access layer yields records incrementally, allowing the application to process them one at a time.
def stream_orders(batch_size=1000):
    # Fetch orders in fixed-size batches and yield them one at a time,
    # so only one batch is ever held in memory.
    offset = 0
    while True:
        rows = db.query(
            "SELECT * FROM orders LIMIT %s OFFSET %s",
            (batch_size, offset)
        )
        if not rows:
            break
        for row in rows:
            yield row
        offset += batch_size
The processing logic remains simple:
for order in stream_orders():
    process(order)

For very large tables, OFFSET-based pagination can become inefficient because the database still has to scan past all the skipped rows. In those cases, keyset pagination (e.g., using a monotonically increasing ID or timestamp) is often preferable.
def stream_orders(batch_size=1000):
    last_id = 0
    while True:
        rows = db.query(
            """
            SELECT *
            FROM orders
            WHERE id > %s
            ORDER BY id
            LIMIT %s
            """,
            (last_id, batch_size)
        )
        if not rows:
            break
        for row in rows:
            yield row
            # Remember the last seen id so the next batch starts after it.
            last_id = row["id"]

Processing logic remains unchanged:
for order in stream_orders():
    process(order)

Database-Level Streaming with Cursors
In the previous example, records are streamed at the application level, with data fetched in batches from the database. In some cases, it’s possible to stream records even earlier in the pipeline, directly from the database itself.
Most database drivers support cursor-based iteration, allowing rows to be fetched incrementally as they are consumed:
cursor = connection.cursor()
cursor.execute("SELECT * FROM orders")

for row in cursor:
    process(row)
In this model, the database sends rows gradually rather than returning the entire result set at once. This enables row-by-row processing with significantly reduced memory usage inside the application.
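Whether rows are actually streamed depends on the driver and how the cursor is configured. As a minimal sketch, assuming a PostgreSQL database accessed with psycopg2 (the connection string and cursor name are illustrative), a named cursor keeps the result set on the server and fetches rows in chunks as you iterate:

import psycopg2

conn = psycopg2.connect("dbname=shop")  # illustrative connection string

# A named cursor in psycopg2 is a server-side cursor: the result set stays
# on the database server and rows are pulled over in chunks during iteration.
with conn, conn.cursor(name="orders_stream") as cursor:
    cursor.itersize = 2000  # rows fetched per round trip while iterating
    cursor.execute("SELECT * FROM orders")
    for row in cursor:
        process(row)

The transaction opened for the named cursor is committed (or rolled back) when the with block exits, which ties directly into the considerations below.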
Operational Considerations
Database-level streaming comes with important operational considerations. Because cursors keep database connections open while records are being consumed, long-running streaming operations can tie up connection pools and extend transaction lifetimes. Depending on the database driver and configuration, rows may still be partially buffered, and behavior can vary across systems. For these reasons, cursor-based streaming should be used deliberately, with attention to connection limits, transaction scope, and workload characteristics. In some databases, streaming queries may implicitly run inside a transaction, extending locks longer than expected.
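One practical way to keep both buffering and connection usage in check is to consume the cursor in explicit, fixed-size chunks using the standard DB-API fetchmany() call and to close the cursor as soon as processing finishes. A rough sketch, reusing the connection and process() from above:

def stream_rows(cursor, batch_size=1000):
    # Pull rows from an already-executed cursor in fixed-size chunks,
    # so at most one chunk is held in application memory at a time.
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row

cursor = connection.cursor()
cursor.execute("SELECT * FROM orders")
try:
    for row in stream_rows(cursor):
        process(row)
finally:
    cursor.close()  # release the cursor promptly to limit connection and transaction lifetime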
Choosing the Right Level of Streaming
Choosing the right level of streaming depends on where you want to place complexity and control. Application-level streaming offers a good balance of simplicity, performance, and predictable resource usage for most workloads. Database-level streaming minimizes memory usage further but requires stricter operational discipline. Both approaches follow the same principle: keep memory usage bounded and process data incrementally.
Closing Thoughts
Streaming records is less about optimization and more about building systems that behave predictably at scale. As data grows, small design decisions around how records are accessed can have an outsized impact on memory usage, stability, and infrastructure cost.
Streaming is not always necessary. For small, bounded datasets, eager loading can be simpler and perfectly acceptable.
By shifting from eager data loading to incremental processing, applications remain resilient even as datasets grow into the millions. Whether implemented at the application layer or directly at the database level, streaming ensures that memory usage stays bounded and operational behavior remains controlled.
The key takeaway is simple: scalable systems are built not just on correct logic, but on thoughtful data access patterns. Streaming records is one of those patterns.