@yjhjstz
Member

@yjhjstz yjhjstz commented Oct 16, 2025

Add parallel table scan support to the GPORCA optimizer, enabling worker-level parallelism within segments to improve query performance on large table scans.

Key components:

  • New CPhysicalParallelTableScan operator and CDistributionSpecWorkerRandom distribution specification for worker-level data distribution
  • CXformGet2ParallelTableScan transformation with parallel safety checks (excludes CTEs, dynamic scans, foreign tables, replicated tables, etc.)
  • Cost model integration: adds parallel_setup_cost and scales scan cost by a parallel efficiency that degrades logarithmically with worker count
  • DXL serialization/deserialization for CDXLPhysicalParallelTableScan
  • Plan translation to PostgreSQL SeqScan nodes with parallel_aware=true
  • Rewindability constraints (parallel scans are non-rewindable)
  • GUC integration: max_parallel_workers_per_gather controls worker count
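
To make the cost model item concrete, here is a minimal sketch of how a setup cost and a logarithmic efficiency penalty could combine. It is not the formula used in this PR: the function name `parallel_scan_cost`, the `degradation` coefficient, and the exact shape of the efficiency term are assumptions for illustration. The only value taken from PostgreSQL is the default of 1000 for `parallel_setup_cost`.

```python
import math


def parallel_scan_cost(serial_scan_cost, workers,
                       parallel_setup_cost=1000.0, degradation=0.1):
    """Hypothetical parallel scan cost, not GPORCA's actual formula.

    serial_scan_cost    -- cost of scanning the table with one process
    workers             -- number of parallel workers (>= 1)
    parallel_setup_cost -- fixed cost of launching workers
                           (1000 is PostgreSQL's default for this GUC)
    degradation         -- assumed coefficient controlling how quickly
                           per-worker efficiency drops as workers are added
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    # Efficiency is 1.0 for a single worker and shrinks logarithmically
    # as the worker count grows (assumed shape).
    efficiency = 1.0 / (1.0 + degradation * math.log2(workers))
    effective_workers = workers * efficiency
    return parallel_setup_cost + serial_scan_cost / effective_workers


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(parallel_scan_cost(100_000.0, n), 1))
```

Under these assumptions, a small table stays serial because the setup cost dominates, while a large table benefits from more workers with diminishing returns per worker.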

Implements #1316

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


@yjhjstz yjhjstz marked this pull request as ready for review October 17, 2025 16:41
@my-ship-it my-ship-it force-pushed the yjhjstz/orca_parallel branch from cde0dec to 05d9edf Compare October 20, 2025 06:54
@my-ship-it my-ship-it self-requested a review October 20, 2025 07:50
@my-ship-it
Contributor

Please add more test cases

@yjhjstz
Member Author

yjhjstz commented Oct 21, 2025

Please add more test cases

See src/test/regress:installcheck-orca-parallel

revert CDistributionSpecRandom.cpp
@yjhjstz yjhjstz force-pushed the yjhjstz/orca_parallel branch from 05d9edf to 8a5bc1e Compare October 21, 2025 15:24
Contributor

@my-ship-it my-ship-it left a comment

LGTM

@avamingli
Contributor

Add some cases to test the plan?

@yjhjstz
Member Author

yjhjstz commented Oct 24, 2025

Add some cases to test the plan?

Maybe after parallel hash join is implemented.
