@kholdrex
Owner

This PR adds a new .stream_each method to SimpleQuery’s Builder, providing row-by-row retrieval for large datasets in both PostgreSQL (via cursors) and MySQL (via the mysql2 gem’s streaming mode). This prevents loading entire result sets into memory at once, offering significant memory and performance benefits for multi-million-row queries.

Key Changes

  1. PostgresStream & MysqlStream Modules – Streaming logic lives in two dedicated modules:
  • stream_each_postgres(batch_size, &block) uses DECLARE ... CURSOR and FETCH ... inside a transaction.
  • stream_each_mysql(&block) uses raw_conn.query(sql, stream: true, cache_rows: false, as: :hash) from mysql2.

  2. Builder Integration – .stream_each(batch_size: 1000, &block) dispatches to the PostgreSQL or MySQL method based on the adapter and raises an error for any other adapter.

  3. Tests – Unit tests for each module (PostgresStream, MysqlStream) plus an integration test verifying Builder#stream_each.
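
For readers unfamiliar with the cursor approach, here is a minimal sketch of the DECLARE / FETCH loop described for PostgreSQL. It is not the code from this PR: the FakeCursorConnection class, the cursor name, and the method signature (which takes the connection and SQL as arguments) are assumptions made so the example runs without a database.

```ruby
# Stand-in for a database connection so the sketch runs without a server.
# Hypothetical: it understands only the statements the cursor loop issues.
class FakeCursorConnection
  attr_reader :log

  def initialize(rows)
    @rows = rows
    @remaining = []
    @log = []
  end

  # Mimics a transaction block; a real adapter would wrap this in BEGIN/COMMIT.
  def transaction
    yield
  end

  # Records each statement. FETCH returns up to N row hashes; everything
  # else returns an empty array.
  def execute(sql)
    @log << sql
    case sql
    when /\ADECLARE/
      @remaining = @rows.dup
      []
    when /\AFETCH (\d+)/
      @remaining.shift(Regexp.last_match(1).to_i)
    else
      []
    end
  end
end

# Hypothetical cursor loop: declare a cursor, fetch batch_size rows at a
# time, yield them one by one, and stop when a fetch comes back empty.
def stream_each_postgres(conn, sql, batch_size: 1000)
  cursor = "simple_query_cursor" # assumed name, not taken from the PR
  conn.transaction do
    conn.execute("DECLARE #{cursor} NO SCROLL CURSOR FOR #{sql}")
    loop do
      batch = conn.execute("FETCH #{batch_size} FROM #{cursor}")
      break if batch.empty?

      batch.each { |row| yield row }
    end
    conn.execute("CLOSE #{cursor}")
  end
end

rows = (1..5).map { |i| { "id" => i } }
conn = FakeCursorConnection.new(rows)
seen = []
stream_each_postgres(conn, "SELECT id FROM users", batch_size: 2) do |row|
  seen << row["id"]
end
```

Only batch_size rows are held in Ruby at any moment, which is why the batch size trades round trips against memory. The cursor has to live inside a transaction because PostgreSQL closes a cursor declared without WITH HOLD when its transaction ends.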

Performance Gains

Our benchmarks show .stream_each can use ~30-50% less memory than AR’s find_each, with 2-3× faster iteration on large datasets.

Usage Example:

User.simple_query
    .where(active: true)
    .stream_each(batch_size: 10_000) do |row|
  puts row.name
end

This yields each row individually, preventing large memory spikes.

@kholdrex kholdrex added the enhancement New feature or request label Mar 15, 2025
@kholdrex kholdrex self-assigned this Mar 15, 2025
@kholdrex kholdrex changed the title Feature/streaming for postgres and mysql feat: Add streaming (stream_each) support for PostgreSQL and MySQL Mar 15, 2025
@kholdrex kholdrex merged commit 0ca2bd7 into master Mar 15, 2025
20 checks passed
@kholdrex kholdrex deleted the feature/streaming-for-postgres-and-mysql branch March 15, 2025 16:24