@adriangb
Contributor

Most of these file source implementations cannot operate without a schema. They all contain .expect("schema must be set") calls, which defer the check to a runtime panic instead of using the language to enforce correctness.

This is an attempt to rework that by making it so you have to pass in a schema to construct them.
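To make the difference concrete, here is a minimal sketch of the two shapes. It does not use the real DataFusion types: `Schema`, `OptionalSchemaSource`, and `RequiredSchemaSource` are hypothetical stand-ins invented for illustration, and the real file sources carry much more state.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow schema: just a list of column names.
#[derive(Debug, PartialEq)]
pub struct Schema {
    pub columns: Vec<String>,
}

pub type SchemaRef = Arc<Schema>;

/// Before: the schema is optional and is set some time after construction.
#[derive(Default)]
pub struct OptionalSchemaSource {
    schema: Option<SchemaRef>,
}

impl OptionalSchemaSource {
    pub fn with_schema(mut self, schema: SchemaRef) -> Self {
        self.schema = Some(schema);
        self
    }

    /// Panics at runtime if `with_schema` was never called.
    pub fn column_count(&self) -> usize {
        self.schema
            .as_ref()
            .expect("schema must be set")
            .columns
            .len()
    }
}

/// After: the schema is a constructor argument, so there is no `Option`
/// to unwrap and a source without a schema cannot be built at all.
pub struct RequiredSchemaSource {
    schema: SchemaRef,
}

impl RequiredSchemaSource {
    pub fn new(schema: SchemaRef) -> Self {
        Self { schema }
    }

    pub fn column_count(&self) -> usize {
        self.schema.columns.len()
    }
}
```

With the second shape, forgetting the schema becomes a compile error at the call site rather than a panic somewhere inside the scan.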

That said there are downsides:

  1. More boilerplate.
  2. Requires that the schema passed into FileScanConfig and FileSource match.

I feel like this still needs another twist... maybe moving the schema out of FileScanConfig? That's not currently possible because the schema is used in both places. Another option might be to have both a FileScan and a FileScanConfig, with construction looking like FileScan::new(FileSource::new(config), config)?
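A rough sketch of what that last idea could look like, purely as an assumption about the shape: none of these types match the current DataFusion definitions, `batch_size` and `schemas_match` are hypothetical, and `FileSource::new` takes the config by reference here so the same value can then be moved into `FileScan::new`.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow schema.
#[derive(Debug, PartialEq)]
pub struct Schema {
    pub columns: Vec<String>,
}

pub type SchemaRef = Arc<Schema>;

/// Hypothetical config: the single place where the schema is supplied.
pub struct FileScanConfig {
    pub schema: SchemaRef,
    pub batch_size: usize,
}

/// Hypothetical file source that derives its schema from the config,
/// so the caller never passes a second, possibly different, schema.
pub struct FileSource {
    schema: SchemaRef,
}

impl FileSource {
    pub fn new(config: &FileScanConfig) -> Self {
        Self {
            schema: Arc::clone(&config.schema),
        }
    }

    pub fn schema(&self) -> &SchemaRef {
        &self.schema
    }
}

/// Hypothetical wrapper that owns both the source and the config.
pub struct FileScan {
    source: FileSource,
    config: FileScanConfig,
}

impl FileScan {
    pub fn new(source: FileSource, config: FileScanConfig) -> Self {
        Self { source, config }
    }

    /// True when the source and the config share the same schema allocation.
    pub fn schemas_match(&self) -> bool {
        Arc::ptr_eq(self.source.schema(), &self.config.schema)
    }
}
```

Under these assumptions, a call such as `FileScan::new(FileSource::new(&config), config)` compiles because the borrow ends before `config` is moved, and the two schemas agree as long as the source is built from the same config that is handed to the scan.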

@github-actions (bot) added the core, substrait, catalog, proto, and datasource labels on Oct 30, 2025
@adriangb marked this pull request as ready for review on October 30, 2025, 21:17