-
Notifications
You must be signed in to change notification settings - Fork 140
fix(weave): actually support distributed replicated clickhouse tables #5864
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
| def wf_clickhouse_replicated() -> bool: | ||
| """Whether to use replicated clickhouse tables.""" | ||
| return os.environ.get("WF_CLICKHOUSE_REPLICATED", "false").lower() == "true" | ||
|
|
||
|
|
||
| def wf_clickhouse_replicated_path() -> str | None: | ||
| """The path of the replicated clickhouse tables.""" | ||
| return os.environ.get("WF_CLICKHOUSE_REPLICATED_PATH") | ||
|
|
||
|
|
||
| def wf_clickhouse_replicated_cluster() -> str | None: | ||
| """The cluster of the replicated clickhouse tables.""" | ||
| return os.environ.get("WF_CLICKHOUSE_REPLICATED_CLUSTER") | ||
|
|
||
|
|
||
| def wf_clickhouse_use_distributed_tables() -> bool: | ||
| """Whether to use distributed tables on top of replicated tables.""" | ||
| return ( | ||
| os.environ.get("WF_CLICKHOUSE_USE_DISTRIBUTED_TABLES", "false").lower() | ||
| == "true" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can drop the wf_/WF_ prefixes at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we probably can... but... wouldn't you rather have everything look uniform instead at the cost of a few characters?
|
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=a3aa361346468a2c4616d46c7dcdf4170ad1f85d |
| # Validate configuration | ||
| if self.use_distributed and not self.replicated: | ||
| raise MigrationError( | ||
| "Distributed tables can only be used with replicated tables. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a Clickhouse constraint or something we want to enforce ourselves?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a sanity check for ourselves, it doesn't make sense to have "distributed" tables over local ones if they aren't replicated. I guess if you had a sharding system without replication, but that would be an antipattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replication is easier than sharding and provides similar benefits
…lete branch
Pull in distributed mode and migrator enhancements from griffin/calls-complete-step-6-with-migrate:
Migrator improvements:
- Move LOCAL_TABLE_SUFFIX constant to settings for reusability
- Add support for ALTER TABLE DELETE/UPDATE mutations in distributed mode
- Add DROP TABLE/VIEW support for distributed mode with proper cleanup
- Fix replica naming to use {shard}-{replica} for unique identification
- Add _is_local_only_operation() helper for better distributed handling
Server enhancements:
- Add use_distributed_mode property to check environment configuration
- Add clickhouse_cluster_name property for cluster operations
- Add _get_calls_complete_table_name() for proper table targeting
Settings updates:
- Add LOCAL_TABLE_SUFFIX constant for distributed table naming
- Add distributed_product_mode to query settings for cross-table operations
- Add batch update constants for future use
These changes lay the groundwork for distributed ClickHouse deployments
without introducing the calls_complete migration.
|
❌ Documentation Reference Check Failed No documentation reference found in the PR description. Please add either:
This check is required for all PRs except those that start with "chore(weave)" or explicitly state "docs are not required". Please update your PR description and this check will run again automatically. |
Description
WB-29945
Enables proper distributed table support for replicated ClickHouse deployments by automatically creating distributed tables on top of replicated local tables during migrations. This allows queries to be routed correctly across the cluster rather than hitting individual replicas, no changes needed on the query side.
Follows implementation explained in the docs here: https://clickhouse.com/docs/engines/table-engines/special/distributed
This PR:
WF_CLICKHOUSE_REPLICATED,WF_CLICKHOUSE_REPLICATED_PATH,WF_CLICKHOUSE_REPLICATED_CLUSTER,WF_CLICKHOUSE_USE_DISTRIBUTED_TABLES)./migrate_localbut still we should do this)_localsuffix routing- push methods that just generate sql out of class. Now, all class methods actually execute sql.
- all SQL regex is in its own helper class
- split Base, Replicated, Distributed handling into their own classes with hierarchy
Testing
updates existing tests and adds new ones
Local testing: