Docs: Lifecycle of a dlt transformation (a sql query model of a transformation relation) #3329
Conversation
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! (View logs) | docs | 9e16473 | Commit Preview URL · Branch Preview URL | Nov 25 2025, 07:14 PM |
zilto left a comment
The documentation improvements are helpful. OTOH, some tests are failing, but that seems unrelated to the code changes?
Force-pushed from 54de0af to 95675ad (Compare)
Force-pushed from d0ab97c to aa93c6b (Compare)
| """Set up a fruitshop fixture dataset for transformations examples""" | ||
|
|
||
| # @@@DLT_SNIPPET_START quick_start_example | ||
|
|
Edited these because the docs say "you can copy paste", but the imports still needed to be added 👀
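For context, a minimal sketch of what such a copy-pasteable snippet looks like once the imports are included. The fruitshop import path and the `setup_fruitshop` helper are assumptions for illustration, not the exact docs snippet:

```python
import dlt

# Assumed import path for the fruitshop demo source; the real docs snippet
# may pull it from a different module.
from dlt.sources._single_file_templates.fruitshop_pipeline import fruitshop


def setup_fruitshop() -> dlt.Pipeline:
    """Set up a fruitshop fixture dataset for transformations examples"""
    pipeline = dlt.pipeline(pipeline_name="fruitshop", destination="duckdb")
    pipeline.run(fruitshop())
    return pipeline
```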
Force-pushed from 0e8b658 to 81d5665 (Compare)
Force-pushed from 81d5665 to 9e16473 (Compare)
* override local marimo theme for dashboard (#3337): ensures custom CSS is always readable
* Fix DocSearch v4 styles (#3338): fixes search input styles for light and dark modes
* docs: update weaviate destination docs and version (#3352)
* Redshift feature: include STS session token in COPY CREDENTIALS (#3307): if aws_session_token is present, append the session token; keeps the IAM_ROLE path and long-lived keys unchanged (co-authored by Tim Hable)
* fixes sqlglot from find (#3357)
* fixes athena refresh mode (#3313): adds a filter to exclude dropped tables in the staging destination and implements it for athena; enables refresh mode tests for athena and fixes tests; fixes staging_allowed_local_path on databricks and bumps the databricks connector in the lockfile; passes dropped table schemas to the filter and adjusts the athena filter; allows disabling lake formation
* fix: backwards compatible traces (#3354): makes traces backward compatible with 1.17.0 and earlier; skips the trace if any error occurs during unpickling; always saves the merged pipeline trace for a consistent pipeline.last_trace property; adds tests for past traces, broken traces, and other improvements
* (docs) adds community destinations (#3326): applies crate fixes from code review (co-authored by Andreas Motl)
* fix: dashboard no longer crashes on broken home cell (#3348): splits home and workspace render methods; makes the header row DRY-er; adds catch-all error handling to the home() cell; adds a local try/catch for broken traces; adds an e2e test for a broken trace; shows navigation on pipeline attach error (co-authored by Marcin Rudolf)
* (fix) use sparse checkout for dlt init dlthub (#3356): adds an option to sparse-checkout the repo; uses sparse checkout for the LLM context; fixes sqlglot from find; adds a checkout after the sparse clone; explains unknown path tests
* Fix: the child table column remains in the schema as a partial column with seen-null-first=True (#3131): removes the child table column from the parent; adds a utility function that checks whether a column has seen-null-first set; improves comments and docstrings with a separate method in the worker; does not infer a null column if it exists as a compound; moves column-level x-normalizer cleaning outside of the worker; adds tests for an empty column becoming compound and for clean_seen_null_first_hint
* Uncalled source in pipeline.run() (#3369)
* fix flaky dashboard tests (#3370): improves the dashboard multi-schema test; closes and waits for sections in the multi-schema test; removes a command line snippet with generic text in exceptions; disables the transformers pokeapi test
* feat: `Schema.to_mermaid()` (#3364): adds the dlt.Schema.to_mermaid() method (co-authored by jayant)
* Refactor boundary timestamp handling in SqlMergeFollowupJob and SqlalchemyMergeFollowupJob to ensure the current load package creation time is used when no boundary timestamp is provided; update DltResourceHints to streamline timestamp validation for active_record_timestamp and boundary_timestamp; adjust tests accordingly (#3378)
* feat: `snowflake` clustering key modifications (#3365): adds support for snowflake clustering key modifications; adds a cluster column order test case; updates the snowflake cluster hint docs; switches to reading snowflake cluster hints from the table schema
* docs: lifecycle of `@dlt.hub.transformation` and `dlt.Relation` (#3329): documents the lifecycle of a dlt transformation; adds a test to match the lifecycle docs
* (fix) 3351 fixes default type var (#3373): tests minimal typing extensions in an alpine docker image; keeps the typevar default but does not use it in the code, for backward compatibility

Co-authored-by: djudjuu <[email protected]>, Anton Burnashev <[email protected]>, Violetta Mishechkina <[email protected]>, Tim Hable <[email protected]>, rudolfix <[email protected]>, Andreas Motl <[email protected]>, anuunchin <[email protected]>, Thierry Jean <[email protected]>, jayant <[email protected]>, Menna <[email protected]>, Jorrit Sandbrink <[email protected]>
This PR explains what exactly happens in the extract, normalize, and load stages when a transformation is run through a pipeline.
Resolves #3206
A lifecycle test is added to show what happens to the model during pipeline.run().
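To make the described lifecycle concrete, here is a hedged sketch of running a transformation through a pipeline. The `@dlt.hub.transformation` decorator and `dlt.Relation` are the subjects of this PR, but the exact signatures, the dataset query call, and the run semantics below are assumptions for illustration only:

```python
import dlt

# Populate a small dataset first so the transformation has something to read.
pipeline = dlt.pipeline(pipeline_name="fruitshop", destination="duckdb")
pipeline.run(
    [{"id": 1, "amount": 150}, {"id": 2, "amount": 20}],
    table_name="orders",
)

# Assumed usage: the decorated function receives a dataset and returns a
# dlt.Relation, i.e. a lazily evaluated SQL query model rather than rows.
@dlt.hub.transformation
def high_value_orders(dataset):
    return dataset("SELECT * FROM orders WHERE amount > 100")

# Per this PR's docs, roughly:
# - extract: the relation's SQL query model is extracted, not the data
# - normalize: the model becomes a load package with a computed schema
# - load: the destination executes the query (e.g. CREATE TABLE ... AS SELECT)
pipeline.run(high_value_orders)
```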