@lkstrp (Member) commented Nov 19, 2025

Supersedes #1547

Changes proposed in this Pull Request

This PR revives and supersedes #1547. This time it uses Pydantic, which can export to JSON Schema, instead of writing the JSON Schema by hand.

What we can do with this new approach:

  1. Docs: Replace all doctables and define descriptions, types and examples directly within the model.
  2. Validation: Run any validation: simple type checks (bool, list[str]), Literals (e.g. Literal["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"] for logging levels), as well as any custom validation you can come up with. As an example, I added a check that names either have to match the unix date format or start with cool_name (see here).
  3. IDE Docs: Export to a JSON Schema, which can then be used to get inline type hints and documentation in any IDE. See the comment in the previous PR.
  4. Defaults: Replace config.default.yaml and define the defaults in the model, which can then be exported to a similar YAML file again. See here the automatically exported config.default based on the Pydantic schema.
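
To make points 1 to 3 concrete, here is a minimal sketch of what such a model could look like. It is not the code in this PR: the class names, field names, default values and the date pattern are hypothetical and only illustrate the idea, assuming Pydantic v2.

```python
import re
from typing import Literal

from pydantic import BaseModel, Field, ValidationError, field_validator

# Hypothetical stand-in for the "unix date format" check mentioned above.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


class LoggingConfig(BaseModel):
    """Hypothetical logging section of the config."""

    # A Literal restricts the value to a fixed set of strings.
    level: Literal["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"] = Field(
        default="INFO",
        description="Logging level used by the workflow.",
    )


class RunConfig(BaseModel):
    """Hypothetical run section of the config."""

    name: str = Field(
        default="cool_name_default",
        description="Name of the run.",
        examples=["cool_name_test", "2025-11-19"],
    )

    @field_validator("name")
    @classmethod
    def name_is_date_or_cool(cls, value: str) -> str:
        # Custom validation: accept a date-like name or the cool_name prefix.
        if DATE_PATTERN.match(value) or value.startswith("cool_name"):
            return value
        raise ValueError("name must look like YYYY-MM-DD or start with 'cool_name'")


class ConfigSchema(BaseModel):
    """Hypothetical top-level config model."""

    run: RunConfig = Field(default_factory=RunConfig, description="Run settings.")
    logging: LoggingConfig = Field(
        default_factory=LoggingConfig, description="Logging settings."
    )


# Descriptions, types and allowed values end up in the exported JSON Schema,
# which an IDE can then use for inline hints.
schema = ConfigSchema.model_json_schema()

# Validation happens when a (parsed) config dict is loaded into the model.
valid = ConfigSchema.model_validate({"run": {"name": "2025-11-19"}})

try:
    ConfigSchema.model_validate({"logging": {"level": "VERBOSE"}})
    invalid_level_rejected = False
except ValidationError:
    invalid_level_rejected = True
```

The description, type, example values, default and validation rule for a setting all sit next to each other in one place, which is the main argument for this approach.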

This is a very minimal example right now, and not integrated into the Snakefile. I want to discuss two things first:

  1. On 4: We could define the defaults right away in the model, which means it would contain everything related: description, validation, type, examples as well as defaults. That would be the cleanest approach. config.default.yaml can still be generated automatically so that existing workflows do not break, but if you want to change defaults you would need to touch the model directly. This is what I vote for and what is done in this minimal version. Alternatively, we could keep config.default.yaml and only use the model for validation, so defaults stay separated from docs, validation, types etc. This needs to be decided now. @fneum @coroa @FabianHofmann
  2. On 2: In a first version we just want to do basic validation. But if we want to define more complex checks, the question is where to define them. The cleanest would be to keep all of that in lib/validation/config and not in any scripts, similar to the example I have right now. Nothing we need to decide right now; I just want to hear some opinions.
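
The two options for defaults discussed in point 1 could be sketched roughly as follows. This is only an illustration under assumptions: the model, its fields, the default values and both helper functions are hypothetical and not part of this PR, and Pydantic v2 is assumed.

```python
from pydantic import BaseModel, Field, ValidationError


class SolverConfig(BaseModel):
    """Hypothetical config section; field names and defaults are made up."""

    name: str = Field(default="highs", description="Solver to use.")
    threads: int = Field(default=4, ge=1, description="Number of solver threads.")


class ConfigSchema(BaseModel):
    """Hypothetical top-level config model."""

    solver: SolverConfig = Field(default_factory=SolverConfig)


def export_defaults() -> dict:
    """Option A: defaults live in the model and are dumped from it."""
    return ConfigSchema().model_dump()


def deep_merge(base: dict, overrides: dict) -> dict:
    """Recursively merge overrides into a copy of base."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def validate_with_external_defaults(defaults: dict, overrides: dict) -> ConfigSchema:
    """Option B: defaults come from a separate file; the model only validates."""
    return ConfigSchema.model_validate(deep_merge(defaults, overrides))


# Option A: the dict below could be written out as config.default.yaml.
defaults_from_model = export_defaults()

# Option B: defaults as they might be read from a hand-maintained YAML file.
external_defaults = {"solver": {"name": "highs", "threads": 2}}
config = validate_with_external_defaults(external_defaults, {"solver": {"threads": 8}})
```

With option A, the dict returned by `export_defaults()` could be passed to a YAML dumper (for example `yaml.safe_dump` from PyYAML) to regenerate the defaults file. With option B, the file stays the source of truth and the defaults declared in the model are only a fallback for missing keys.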

Checklist

  • I tested my contribution locally and it works as intended.
  • Code and workflow changes are sufficiently documented.
  • Changed dependencies are added to pixi.toml (using pixi add <dependency-name>).
  • Changes in configuration options are added in config/config.default.yaml.
  • Changes in configuration options are documented in doc/configtables/*.csv.
  • Sources of newly added data are documented in doc/data_sources.rst.
  • A release note doc/release_notes.rst is added.

@lkstrp lkstrp changed the title from "minimal version" to "Config Validation with Pydantic" Nov 19, 2025
@lkstrp lkstrp marked this pull request as draft November 19, 2025 13:09