Skip to content

Conversation

@anuunchin
Copy link
Contributor

@anuunchin anuunchin commented Nov 17, 2025

This PR explains what exactly happens in the extract, normalize and load stages when a transformation is run through a pipeline.

Resolves #3206

A lifecycle test is added to show what happens to the model in the pipeline.run().

@anuunchin anuunchin self-assigned this Nov 17, 2025
@anuunchin anuunchin requested a review from zilto November 17, 2025 17:07
@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Nov 17, 2025

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
docs 9e16473 Commit Preview URL

Branch Preview URL
Nov 25 2025, 07:14 PM

Copy link
Collaborator

@zilto zilto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation improvements were helpful. OTOH, some tests are failing, but it seems unrelated to code changes?

@zilto zilto added the documentation Improvements or additions to documentation label Nov 18, 2025
@anuunchin anuunchin force-pushed the docs/3206-query-lifecycle branch from 54de0af to 95675ad Compare November 25, 2025 13:26
@anuunchin anuunchin requested a review from zilto November 25, 2025 15:03
@anuunchin anuunchin force-pushed the docs/3206-query-lifecycle branch 2 times, most recently from d0ab97c to aa93c6b Compare November 25, 2025 16:25
"""Set up a fruitshop fixture dataset for transformations examples"""

# @@@DLT_SNIPPET_START quick_start_example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edited these because the docs say "you can copy paste", but imports still needed to be added 👀

@anuunchin anuunchin force-pushed the docs/3206-query-lifecycle branch 3 times, most recently from 0e8b658 to 81d5665 Compare November 25, 2025 19:02
@anuunchin anuunchin force-pushed the docs/3206-query-lifecycle branch from 81d5665 to 9e16473 Compare November 25, 2025 19:08
@anuunchin anuunchin requested a review from zilto November 25, 2025 19:14
@zilto zilto merged commit 7d7b7af into devel Nov 25, 2025
90 of 96 checks passed
@zilto zilto deleted the docs/3206-query-lifecycle branch November 25, 2025 19:55
sh-rp added a commit that referenced this pull request Nov 26, 2025
* override local marimo theme for dashboard (#3337)

This ensures custom CSS is always readable.

* Fix DocSearch v4 styles (#3338)

* Fix DocSearch v4 styles
* Fix search input styles for light and dark modes

* docs: update weaviate destination docs and version (#3352)

* Redshift feature: Include STS session token in COPY CREDENTIALS. (#3307)

* Redshift feature: Include STS session token in COPY CREDENTIALS. If aws_session_token is present, append the session token. Keeps IAM_ROLE path and long-lieved keys unchanged

---------

Co-authored-by: Tim Hable <[email protected]>

* fixes sqlglot from find (#3357)

* fixes athena refresh mode (#3313)

* adds filter to exclude dropped tables in staging destination, implements for athena

* enables refresh mode tests for athena, fixes tests

* fixes staging_allowed_local_path on databricks, bumps databricks connector in lockfile

* passes dropped tables schemas to filter, adjust athena filter

* allows to disable lake formation

* fix: backwards compatible traces (#3354)

* makes trace backward compat with 1.17.0 and earlier

* skips trace if any error in unpickle

* always saves merged pipeline trace to have consistent pipeline.last_trace property

* tests for past traces, broken traces and other improvements

* (docs) adds community destinations (#3326)

* adds community destinations

* Apply suggestions from code review

applies crate fixes

Co-authored-by: Andreas Motl <[email protected]>

---------

Co-authored-by: Andreas Motl <[email protected]>

* fix: dashboard no longer crashes on broken home cell (#3348)

* split home and workspace render methods

* header row dry-er

* catch-all errors in home()-cell

* local try-catch for broken traces

* e2e test for broken trace

* removes this

* shows navigation on pipeline attach error

---------

Co-authored-by: Marcin Rudolf <[email protected]>

* (fix) use sparse checkout for dlt init dlthub (#3356)

* adds option to sparse checkout repo

* use sparse checkout for llm context

* fixes sqlglot from find

* adds checkout after sparse clone

* explains unknown path tests

* Fix: The child table column remains in the schema as a partial column with seen-null-first=True (#3131)

* child table column removed from parent

* A utility functin that checks whether a column has seen-null-first set

* Improved comments and docstrings, separate method in worker

* null column not inferred if exists as compound

* Column level x-normalizer cleaning moved outside of worker

* Test for empty column becoming compound

* Test clean_seen_null_first_hint

* Uncalled source in pipeline.run( (#3369)

* fix flaky dashboard tests (#3370)

* improves dashboard multi schema test

* closes and waits for sections in multi-schema test

* removes command line snippet with generic text in exceptions

* disables transformers pokeapi test

* feat: `Schema.to_mermaid()` (#3364)


* Add dlt.Schema.to_mermaid() method

---------

Co-authored-by: jayant <[email protected]>

* Refactor boundary timestamp handling in SqlMergeFollowupJob and SqlalchemyMergeFollowupJob to ensure current load package creation time is used when no boundary timestamp is provided. Update DltResourceHints class to streamline timestamp validation for active_record_timestamp and boundary_timestamp. Adjust tests accordingly. (#3378)

* feat: `snowflake` clustering key modifications (#3365)

* add support for snowflake clustering key modifications

* add cluster column order test case

* update snowflake cluster hint docs

* switch to reading snowflake cluster hints from table schema

* docs: lifecycle of `@dlt.hub.transformation` and `dlt.Relation` (#3329)

* Lifecycle of a dlt transformation

* Added test to match lifecycle docs

* (fix) 3351 fixes default type var (#3373)

* tests minimal typing extensions in alpine docker

* keeps typevar default but does not use it in the code for backwart compat

---------

Co-authored-by: djudjuu <[email protected]>
Co-authored-by: Anton Burnashev <[email protected]>
Co-authored-by: Violetta Mishechkina <[email protected]>
Co-authored-by: Tim Hable <[email protected]>
Co-authored-by: Tim Hable <[email protected]>
Co-authored-by: rudolfix <[email protected]>
Co-authored-by: Andreas Motl <[email protected]>
Co-authored-by: anuunchin <[email protected]>
Co-authored-by: Thierry Jean <[email protected]>
Co-authored-by: jayant <[email protected]>
Co-authored-by: Menna <[email protected]>
Co-authored-by: Jorrit Sandbrink <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(relation): clarify dlt.Relation lifecycle

3 participants