@renovate
Contributor

@renovate renovate bot commented Jan 1, 2025

Note: This PR body was truncated due to platform limits.

This PR contains the following updates:

Package: io.openlineage:openlineage-java
Change: 1.23.0 -> 1.40.1

Release Notes

OpenLineage/OpenLineage (io.openlineage:openlineage-java)

v1.40.1

Compare Source

Fixed
  • Python: re-add missing version variables at top of releasable modules #4135 @​mobuchowski
    Fixes breaking change in version 1.40.0.

v1.40.0

Compare Source

Added
  • Spec: standardize batch API endpoint #4109 @​jakub-moravec
    Add a standardized batch API endpoint to OpenLineage specification for handling multiple events in a single request.
  • Spec: Add ordinal position to SchemaDatasetFacet #4116 @​mobuchowski
    Add ordinal_position field to track the position of fields in schema (1-indexed).
  • Spec: Add JobDependenciesRunFacet #4112 @​kacpermuda
    Introduce JobDependenciesRunFacet to track dependencies between jobs.
  • Spec: Add support for temporary datasets #4103 @​jakub-moravec
    Add support for temporary datasets to enable job-to-job lineage tracking.
  • Spark: Add fallback for BigQuery project ID configuration #4075 @​luke-hoffman1
    Add fallback configuration for BigQuery project ID in Metastore integration.
  • Spark: Add COALESCE transformation support #4123 @​kacpermuda
    Include examples in Python generated classes for better documentation.
  • Java: Add support for jTDS JDBC URL format #4077 @​dolfinus
    Add support for parsing jTDS JDBC URL format in Java client.
  • Hive: Add ParentRunFacet #4066 @​tnazarew
    Add ParentRunFacet to Hive integration for tracking parent-child run relationships.
  • Hive: Add LOAD and IMPORT handling #4097 @​tnazarew
    Add support for tracking LOAD and IMPORT operations in Hive.
  • Hive: Add EXPORT handling #4085 @​tnazarew
    Add support for tracking EXPORT operations in Hive.
  • Hive: Add START event emission #4079 @​tnazarew
    Add START event emission support to Hive integration.
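
For readers unfamiliar with the jTDS URL shape mentioned in the Java entry above, here is a minimal sketch of splitting such a URL into its parts. It assumes the layout documented by the jTDS driver, `jdbc:jtds:<server_type>://<server>[:<port>][/<database>][;<property>=<value>...]`. The class `JtdsUrlSketch` and its fields are hypothetical names for illustration and are not the OpenLineage Java client's parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative jTDS URL splitter (hypothetical; not the OpenLineage implementation). */
final class JtdsUrlSketch {
    // Assumed layout: jdbc:jtds:<server_type>://<server>[:<port>][/<database>][;<props>]
    private static final Pattern JTDS_URL = Pattern.compile(
        "^jdbc:jtds:(sqlserver|sybase)://([^:/;]+)(?::(\\d{1,5}))?(?:/([^;]*))?(?:;(.*))?$");

    final String serverType;
    final String host;
    final Integer port;     // null when the URL omits the port
    final String database;  // null when the URL omits the database

    private JtdsUrlSketch(String serverType, String host, Integer port, String database) {
        this.serverType = serverType;
        this.host = host;
        this.port = port;
        this.database = database;
    }

    /** Returns the parsed parts, or empty when the input is not a jTDS URL. */
    static Optional<JtdsUrlSketch> parse(String url) {
        Matcher m = JTDS_URL.matcher(url);
        if (!m.matches()) {
            return Optional.empty();
        }
        Integer port = m.group(3) == null ? null : Integer.valueOf(m.group(3));
        String db = (m.group(4) == null || m.group(4).isEmpty()) ? null : m.group(4);
        return Optional.of(new JtdsUrlSketch(m.group(1), m.group(2), port, db));
    }

    public static void main(String[] args) {
        JtdsUrlSketch parts =
            parse("jdbc:jtds:sqlserver://dbhost:1433/sales;instance=prod").get();
        System.out.println(parts.serverType + " " + parts.host + " " + parts.port + " " + parts.database);
    }
}
```

Connection properties after the first `;` are matched but deliberately discarded here; how the client maps the remaining parts to a dataset namespace is not described in these notes.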
Fixed
  • Spark: Fix dataset facet builders for inputs #4121 @​usamakunwar
    Fix Spark dataset facet builders for input datasets.
  • Spark: Fix job name trimming #4114 @​kchledowski
    Fix job name trimming logic in Spark integration.
  • Spark: Fix putAll on immutable maps #4113 @​pawel-big-lebowski
    Fix putAll operation failing on immutable maps.
  • Spark: Fix RDD job handling #4108 @​pawel-big-lebowski
    Fix multiple issues with RDD job handling in Spark.
  • Spark: Fix JDBC dbtable parsing #4102 @​kchledowski
    Fix JDBC dbtable parsing to support any FROM clauses.
  • Spark: Fix Databricks setup #4083 @​pawel-big-lebowski
    Fix Spark connector configuration for Databricks environments.
  • Spark: Catch NoClassDefFoundError #4099 @​mobuchowski
    Catch NoClassDefFoundError when buggy implementations exist on classpath.
  • Spark: Fix Snowflake identifier parsing #4104 @​mobuchowski
    Fix Snowflake identifier parsing to handle quoted identifiers correctly.
  • Spark: Fix Snowflake account name handling #4105 @​mobuchowski
    Strip quotes from Snowflake account names for proper handling.
  • Spec: Fix facet property names #4092 @​fm100
    Fix facet property names from snake_case to camelCase for consistency.
  • Python: Fix facet generator after UV migration #4111 @​kacpermuda
    Fix Python client facet generator after moving to UV build system.
  • Python: Fix retry config merge #4093 @​antonlin1
    Fix retry configuration default merge with user-defined config in HTTP transports.
  • Java: Fix CVE in commons-lang3 #4084 @​mandalbalmukund
    Upgrade commons-lang3 version to fix CVE security vulnerability.
  • Hive: Generate same runId for START and STOP events #4126 @​dolfinus
    Ensure START and STOP events share the same runId in Hive integration.
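
The "putAll on immutable maps" fix above refers to a general Java pitfall: calling `putAll` on an unmodifiable map throws `UnsupportedOperationException`. The sketch below shows the pitfall and one common workaround, copying into a mutable map first. The class and method names are hypothetical; the notes do not say how the Spark integration actually resolved it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Illustrates merging into a copy instead of mutating a possibly immutable map. */
final class ImmutableMapMergeSketch {
    /** Returns a new mutable map: entries of base overlaid with extra. Inputs are untouched. */
    static Map<String, String> merge(Map<String, String> base, Map<String, String> extra) {
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(extra);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> mutable = new HashMap<>();
        mutable.put("a", "1");
        Map<String, String> base = Collections.unmodifiableMap(mutable);
        Map<String, String> extra = Collections.singletonMap("b", "2");

        try {
            base.putAll(extra); // unmodifiable view rejects mutation
        } catch (UnsupportedOperationException e) {
            System.out.println("putAll rejected on unmodifiable map");
        }
        System.out.println(merge(base, extra).size());
    }
}
```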

v1.39.0

Compare Source

Added
  • Spark: Normalize dataset names with configurable trimmers #3996 @​pawel-big-lebowski
    Add configurable dataset name normalization with support for date patterns, key-value pairs, and S3 location detection to enable proper dataset subsetting.
  • Spark: Add missing facets in inputs for Databricks Unity Catalog #4057 @​kchledowski
    Add missing input symlink facets for Databricks Unity Catalog tables.
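
To make the idea of dataset name trimming more concrete, here is a rough sketch that strips trailing path segments that look like dates or `key=value` partitions, so that many partition paths collapse to one dataset name. Everything in it is an assumption for illustration: the class name, the two regular expressions, and the trimming rule. The actual trimmers, their configuration keys, and the S3 location detection are not shown in these notes.

```java
import java.util.regex.Pattern;

/** Hypothetical trailing-segment trimmer; not the Spark integration's implementation. */
final class DatasetNameTrimSketch {
    // Assumed date shapes: 2024-01-31 or 20240131
    private static final Pattern DATE_SEGMENT = Pattern.compile("\\d{4}-?\\d{2}-?\\d{2}");
    // Assumed partition shape: key=value
    private static final Pattern KEY_VALUE_SEGMENT = Pattern.compile("[^/=]+=[^/=]+");

    /** Drops trailing segments while they match one of the assumed patterns. */
    static String trim(String path) {
        String current = path;
        while (true) {
            int slash = current.lastIndexOf('/');
            if (slash <= 0) {
                return current; // never trim down to an empty name
            }
            String last = current.substring(slash + 1);
            boolean trimmable = DATE_SEGMENT.matcher(last).matches()
                || KEY_VALUE_SEGMENT.matcher(last).matches();
            if (!trimmable) {
                return current;
            }
            current = current.substring(0, slash);
        }
    }

    public static void main(String[] args) {
        System.out.println(trim("/data/orders/2024-01-01/region=eu"));
    }
}
```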
Changed
  • Spark: Refactor tests for dependency collector #4058 @​kchledowski
    Refactor column-level lineage dependency collector tests for better organization and maintainability.
Fixed
  • Spec: Fix typo in iceberg commit report facet spec file #4069 @​fm100
    Fix typo in IcebergCommitReportOutputDatasetFacet property name.
  • Spark: Fix dataset trimming for CLL inputs #4061 @​pawel-big-lebowski
    Fix dataset name trimming for column-level lineage inputs.
  • Python: Remove numpy import #4062 @​kacpermuda
    Remove unnecessary numpy import from Python client.
Removed
  • Dagster: Remove Dagster integration #3844 @​kacpermuda
    Remove Dagster integration from the repository.

v1.38.0

Compare Source

Added
  • Spec: Add subset dataset facets to spec #4008 @​pawel-big-lebowski
    Add subset dataset facets to OpenLineage specification for representing dataset relationships.
  • Spec: Add DatasetQualityMetricsDatasetFacet #3978 @​heron--
    Allow attaching dataset quality information outside of InputDatasetFacet.
  • Spark: Add support for microbatch source write #4018 @​tnazarew
    Add support for Spark structured streaming microbatch source write operations.
  • Spark: Add catalog properties to catalog facet #4016 @​ddebowczyk92
    Add catalog properties support to Spark integration for better catalog metadata tracking.
  • Spark: Add GCP project ID and location to BigQuery Metastore catalog properties #4039 @​ddebowczyk92
    Enhance BigQuery integration with GCP project ID and location in catalog properties.
  • Spark: Add support for COALESCE transformation #3972 @​kchledowski
    Add support for tracking COALESCE transformations in Spark jobs.
  • Spark: Add catalog facet when using vanilla Hive tables #3982 @​ddebowczyk92
    Add catalog facet support for vanilla Hive table operations.
  • Spark: Make output statistics available within complete event #4013 @​pawel-big-lebowski
    Output statistics now available in complete events for better observability.
  • Spark: Add output stats for RDD jobs #3977 @​pawel-big-lebowski
    Add output statistics tracking for Spark RDD-based jobs.
  • Java: Add equals and hashcode methods into generated classes #4050 @​pawel-big-lebowski
    Improve generated model classes with proper equals and hashcode implementations.
  • dbt: Capture dbt tags #4022 @​mobuchowski
    Add support for capturing dbt tags in OpenLineage events.
  • dbt: Add dbt Cloud account ID to DbtRunRunFacet #4017 @​mobuchowski
    Add dbt Cloud account ID tracking to dbt run facets.
  • dbt: Update DbtRunRunFacet to add more useful information #3987 @​mobuchowski
    Enhance DbtRunRunFacet with additional metadata for better observability.
  • Python: Add GCP Lineage transport #4006 @​ddebowczyk92
    Add native Google Cloud Platform Lineage transport for Python client.
  • Python: Add fsspec support for FileTransport #3983 @​JDarDagran
    Add fsspec filesystem support to FileTransport for broader filesystem compatibility.
  • Python: Add default tags with OL client version #3980 @​kacpermuda
    Automatically add OpenLineage client version as default tag in events.
  • Airflow: Add GCP Composer facets #3986 @​gabrysiaolsz
    Add GCP Cloud Composer environment metadata facets to Airflow integration.
Changed
  • dbt: Use alias when naming datasets #4055 @​mobuchowski
    Use dbt model aliases when generating dataset names for more accurate lineage.
  • Spark: Serialize event to JSON for logging #4029 @​EugeneYushin
    Serialize OpenLineage events to JSON format for improved debug logging.
  • Spark: Respect overridden appName in EventEmitter #4030 @​EugeneYushin
    Properly respect user-overridden application names in event emission.
  • Spark: Refactor CLL ExpressionDependencyCollector #4003 @​kchledowski
    Refactor column-level lineage expression dependency collector for better maintainability.
  • Spark: Improve logging in IcebergInputStatisticsInputDatasetFacetBuilder #3994 @​JDarDagran
    Enhance logging for Iceberg input statistics collection.
  • Spark: Limit external getFileStatus calls when dealing with lots of S3 objects #3985 @​pawel-big-lebowski
    Optimize S3 operations by limiting external getFileStatus calls for large object sets.
  • Java/Spark/Hive: Move TransformationInfo to Java client to reuse across integrations #3964 @​kchledowski
    Refactor TransformationInfo into shared Java client for cross-integration reuse.
  • Python: Improve logging in AsyncHttpTransport #4026 @​dolfinus
    Enhance logging capabilities in asynchronous HTTP transport.
  • Python: Allow type aliases #4000 @​JDarDagran
    Support Python type aliases in client code generation.
  • Python: Fix classes generation for almost identical classes #3997 @​JDarDagran
    Improve code generation to properly handle nearly identical class definitions.
  • Python: Raise errors if custom token provider cannot be loaded #4014 @​dolfinus
    Fail fast with clear errors when custom token providers fail to load.
  • Python: Don't silence import errors in DefaultTransportFactory #4015 @​dolfinus
    Improve error visibility by not silencing import errors in transport factory.
  • Python: Import from facet_v2 and event_v2 instead of generated modules #3968 @​kacpermuda
    Update import paths to use versioned facet and event modules.
  • Java: Refactor ExecutorService management in OpenLineageClientUtils #4012 @​JDarDagran
    Improve thread pool management in Java client utilities.
  • CI: Replace pre-commit with prek across CI and documentation #3965 @​JDarDagran
    Migrate from pre-commit to prek for pre-commit hook management.
Fixed
  • Spark: Fix false Hive Glue detection #4053 @​jsjasonseba
    Fix incorrect Glue catalog detection due to always attempting ARN resolution.
  • Spark: Fix CLL on hiveless runtimes #4052 @​kchledowski
    Fix column-level lineage failures on Spark runtimes without spark-hive package.
  • Spark: Fix missing inputs and CLL on some table creation commands #4031 @​kchledowski
    Fix missing input datasets and column-level lineage for CreateDataSourceTableAsSelect and CreateHiveTableAsSelect commands.
  • Spark: Rely on BQ bucket info inside BigQueryIntermediateJobFilter #4044 @​EugeneYushin
    Fix BigQuery intermediate job filtering by using bucket configuration.
  • Spark: Fix for TypeNotPresentException/RefreshTableCommand errors in Spark 3.0.2 #4002 @​MaciejGajewski
    Add additional exception handling for TypeNotPresentException in Spark 3.0.2.
  • Python: Fix license field in pyproject.toml when using build module #4034 @​JDarDagran
    Correct license field specification in Python package metadata.
  • Python: Accept both apikey and api_key in token provider #4045 @​kacpermuda
    Support both naming conventions for API key configuration parameter.
  • Java: Fix empty sources jar generation #4037 @​EugeneYushin
    Fix build issue causing empty sources JAR files to be generated.

v1.37.0

Compare Source

Added
  • Python: Add Datadog transport with configurable async routing #3950 @​mobuchowski
    Add Datadog transport with intelligent routing between sync/async transports based on configurable rules. Supports wildcard matching and provides seamless integration with Datadog's observability platform.
  • Spark: Implement support for WriteDelta, WriteIcebergDelta logical plan nodes #3860 @​orthoxerox
    Add support for WriteDelta and WriteIcebergDelta logical plan nodes in Spark integration.
  • dbt: Add option to override dbt job name #3933 @​mobuchowski
    Add configuration option to override dbt job names in OpenLineage events.
  • Java: Add Jackson Blackbird module for JSON performance optimization #3923 @​kyungryun
    Improve JSON serialization performance with Jackson Blackbird module.
Changed
  • Spark: Remove Spark 2 support #3904 @​pawel-big-lebowski
    Drop support for Spark 2.x versions. Minimum supported version is now Spark 3.x.
  • Python: Change gzip compression level in HTTP transport #3956 @​dolfinus
    Optimize HTTP transport performance by adjusting gzip compression level.
  • Spark: Add support for Spark 4 in streaming tests #3925 @​SalvadorRomo
    Extend streaming integration tests to support Spark 4.0.
Fixed
  • Spark: Improve performance of column level lineage #3946 @​pawel-big-lebowski
    Limit memory consumption, provide limits for the amount of dependencies processed (1M) and input fields returned in the facet (100K). Turns on dataset lineage by default.
  • Spark: Add schema size limit for column level lineage processing #3949 @​ddebowczyk92
    Add limits to prevent performance issues with large schemas in column-level lineage processing.
  • Spark: Fix context factory for Spark 4 #3934 @​pawel-big-lebowski
    Fix context factory implementation for Spark 4.0 compatibility.
  • Spark: Fix LogicalRelation constructor compatibility for Spark 4 #3930 @​yunchipang
    Fix LogicalRelation constructor to maintain compatibility with Spark 4.0.
  • Spark: Fix vendors parsing in SparkOpenLineageConfig #3947 @​ddebowczyk92
    Fix parsing of vendor configurations in Spark OpenLineage configuration.
  • dbt: Use correct namespace for dbt externalQuery facet #3953 @​jroachgolf84
    Fix namespace handling in dbt external query facets.
  • Python: Fix tags configuration #3943 @​JDarDagran
    Fix configuration handling for user-supplied tags in Python client.

v1.36.0

Compare Source

Added
  • Spark: support Delta 4.0 and cover it with tests on Spark 4.0. #3877 @​pawel-big-lebowski
    Fix failing tests for Spark 4.0. Make delta integration tests pass with Delta 4.0 on Spark 4.
  • Spark: Add memory info to debug facet. #3914 @​pawel-big-lebowski
    Extend DebugFacet with additional information on Spark's driver memory configuration and current memory usage.
  • Spark: Add new AlterTableCommandDatasetBuilder for Spark 4.0. #3921 @​pawel-big-lebowski
    Add support for AlterTableCommand dataset building in Spark 4.0.
  • dbt: Add query IDs for dbt. #3890 @​jroachgolf84
    Add query ID tracking to dbt integration.
  • dbt: Add query ID capture in structured logs. #3918 @​mobuchowski
    Capture query IDs from dbt structured logs for better traceability.
  • Python: Formalize dataset naming for Python client. #3816 @​ddebowczyk92
    Formalize dataset naming conventions in Python client implementation.
Changed
  • Spark: bump minor versions 3.4.3 -> 3.4.4, 3.5.4 -> 3.5.6. #3907 @​pawel-big-lebowski
    Bump tested Spark versions.
  • Spark: Close OpenLineageClient in onApplicationEnd. #3851 @​dolfinus
    Ensure proper cleanup of OpenLineageClient when Spark application ends.
  • Python: Do not use f-strings with logging module. #3895 @​dolfinus
    Replace f-string usage in logging calls with proper logging formatting.
  • Python: Update protobuf version to be compatible with newer libraries. #3899 @​Shadi
    Update protobuf dependency to maintain compatibility with newer library versions.
  • Website: Documentation for compatibility tests. #3869 @​mobuchowski
    Add documentation explaining compatibility testing processes.
Fixed
  • Spark: make visitors stateless - avoid memory leak. #3902 @​pawel-big-lebowski
    Merge SqlExecutionRDDVisitor and LogicalRDDVisitor classes to avoid memory leak.
  • Spark: refactor iceberg handler. #3909 @​pawel-big-lebowski
    Refactor Iceberg handler implementation for better maintainability.
  • Spark: retry exception on empty row. #3908 @​pawel-big-lebowski
    Add retry logic for handling empty row exceptions.
  • Spark: fix Spark version for databricks test. #3911 @​pawel-big-lebowski
    Fix Spark version configuration in Databricks test environment.
  • Flink: Fix connector of type kafka-upsert not identifying kafka topics correctly. #3915 @​fetta
    Fix kafka-upsert connector to properly identify kafka topics.
  • Airflow: Fail fast and reduce timeout for airflow tests. #3905 @​kacpermuda
    Improve test performance by implementing fail-fast behavior and reduced timeouts.
  • dbt: more telemetry, fix quadratic file reading. #3916 @​mobuchowski
    Improve telemetry collection and fix performance issues with file reading.
  • dbt: Fix dbt version. #3894 @​mobuchowski
    Fix dbt version compatibility issues.
  • Python: Fix filenames for windows users. #3889 @​pawel-big-lebowski
    Fix filename handling to work correctly on Windows systems.
  • Transport: Adjust log level when aliasing default_http transport. #3897 @​dolfinus
    Adjust logging level for transport aliasing messages.
  • Build: Improve comments and add some tests. #3901 @​kacpermuda
    Improve code documentation and add additional test coverage.

v1.35.0

Compare Source

Added
  • Spark: Include spark_applicationDetails facet to all events #3848 @​dolfinus
    Add spark_applicationDetails facet to all OpenLineage events emitted by the Spark integration
  • Spark: Support additional facets #3850 @​ddebowczyk92
    Adds support for additional facets in Spark integration
  • Spark: disable connector by Spark config parameter #3880 @​pawel-big-lebowski
    Add spark.openlineage.disabled entry to disable OpenLineage integration through Spark config parameters
  • Spark: Fine-grained timeout config #3779 @​pawel-big-lebowski
    Add extra timeout options to emit incomplete OpenLineage events in case of timeout when building facets. See buildDatasetsTimePercentage and facetsBuildingTimePercentage in docs for more details
  • Python: Asynchronous HTTP transport implementation #3812 @​mobuchowski
    Adds high-performance asynchronous HTTP transport with event ordering guarantees, configurable concurrency, and comprehensive error handling. Features START-before-completion event ordering, bounded queues, and real-time statistics
  • dbt: Add DbtRun facet to dbt run events #3764 @​dolfinus
    Adds DbtRun facet for tracking dbt run information
  • Python: Add continue_on_success and sorting transport in CompositeTransport #3829 @​kacpermuda
    Adds configuration options for CompositeTransport to control behavior and ordering
  • Hive: Add jobType facet #3789 @​dolfinus
    Adds jobType facet to Hive integration
  • Hive: Add dialect=hive to SqlJobFacet #3863 @​dolfinus
    Adds dialect field to SqlJobFacet for Hive integration
  • Spec: SqlJobFacet now contains dialect #3819 @​mobuchowski
    Adds dialect field to SqlJobFacet specification
  • Spec: Formalize job naming #3826 @​ddebowczyk92
    Formalizes job naming conventions in the specification
  • Spec: Formalize dataset naming #3775 @​ddebowczyk92
    Formalizes dataset naming conventions in the specification
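
As a small illustration of the `spark.openlineage.disabled` entry listed above, the sketch below turns a map of Spark settings into `spark-submit` style `--conf key=value` arguments. The configuration key comes from the release note; the value `true` and the helper names are assumptions, so check the OpenLineage Spark documentation for the accepted values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Builds spark-submit "--conf" arguments from a settings map (hypothetical helper). */
final class SparkConfArgsSketch {
    static Map<String, String> disableOpenLineage() {
        Map<String, String> conf = new LinkedHashMap<>();
        // Key is taken from the release note; "true" is an assumed value.
        conf.put("spark.openlineage.disabled", "true");
        return conf;
    }

    /** Emits one "--conf", "key=value" pair per entry, in insertion order. */
    static List<String> toSubmitArgs(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            args.add("--conf");
            args.add(entry.getKey() + "=" + entry.getValue());
        }
        return args;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", toSubmitArgs(disableOpenLineage())));
    }
}
```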
Changed
  • Spark: Use Hive as default Iceberg catalog implementation #3858 @​dolfinus
    Updates Spark integration to use Hive as the default catalog implementation for Iceberg tables
  • Spark: Replace weak hash-map with a map with weak keys and entries #3856 @​ddebowczyk92
    Improves memory management in Spark integration by replacing weak hash-map implementation
  • Spark: Support latest databricks runtime #3811 @​pawel-big-lebowski
    Updates Spark integration to support the latest Databricks runtime
  • Python: Remove transport.wait_for_completion() #3881 @​dolfinus
    Removes wait_for_completion() method from Python transport interface
  • Python: Reuse session in sync HttpTransport #3843 @​dolfinus
    Improves performance by reusing HTTP sessions in synchronous transport
  • Python: Implement Transport.close() for Datazone and Kinesis #3857 @​dolfinus
    Adds proper cleanup methods for Datazone and Kinesis transports
  • Python: Implement TransformTransport.close #3855 @​dolfinus
    Adds cleanup method for TransformTransport
  • Python: Implement KafkaTransport.wait_for_completion() and .close() #3838 @​dolfinus
    Adds proper cleanup and completion methods for Kafka transport
  • Java: Make CompositeTransport.close() more reliable #3841 @​dolfinus
    Improves reliability of CompositeTransport cleanup process
  • Java: Cover OpenLineageClient.close() with tests #3839 @​dolfinus
    Adds test coverage for OpenLineageClient cleanup methods
  • Java: Name threads used in Java client #3817 @​mobuchowski
    Adds meaningful names to threads used in Java client for better debugging
  • Flink: Close OpenLineageClient in onJobExecuted #3854 @​dolfinus
    Ensures proper cleanup of OpenLineageClient in Flink 1.x integration
  • Flink: Fixed a bug incorrectly loading configuration in Event Emitter #3799 @​pan-siekierski
    Fixes configuration loading issue in Flink Event Emitter
  • dbt: Make invocation_id field optional #3796 @​dolfinus
    Makes invocation_id field optional in dbt integration
  • dbt: More resiliency for missing dbt nodes #3836 @​mobuchowski
    Improves error handling for missing dbt nodes
  • Hive: Add docker-compose example for local testing #3800 @​dolfinus
    Adds docker-compose setup for local Hive integration testing
  • Airflow: Send pending events after Airflow DAG is finished #3849 @​dolfinus
    Ensures all pending events are sent after DAG completion in Airflow integration
Fixed
  • dbt: Fix log path, more precise file reading #3793 @​mobuchowski
    Improves log file handling in dbt integration
  • dbt: Fix deprecated configs #3859 @​kacpermuda
    Replaces deprecated dbt configurations with current alternatives
  • Spark: Fix missing .db suffix in database/namespace location name for BigQueryMetastoreCatalog #3874 @​ddebowczyk92
    Fixes database naming issue in BigQuery Metastore catalog implementation
  • Spark: Fix missing output's dataset catalog facet when running CTAS queries on Iceberg tables #3835 @​ddebowczyk92
    Fixes missing catalog facet in output datasets for CTAS queries on Iceberg tables
  • Spark: Delta merge with column #3871 @​pawel-big-lebowski
    Fixes Delta merge operation handling with column-level lineage
  • Spark: Support delta 3.3.2 #3861 @​pawel-big-lebowski
    Adds support for Delta Lake version 3.3.2
  • Spark: Call version utils method when dataset can be identified #3832 @​pawel-big-lebowski
    Fixes version handling when dataset can be properly identified
  • Java: Raise error if no events were emitted by composite transport #3853 @​kacpermuda
    Adds error handling when CompositeTransport fails to emit any events
  • Java: Fix IntelliJ reload #3825 @​mobuchowski
    Fixes IntelliJ project reload issues
  • Java: Fix spotless in hive integration #3806 @​mobuchowski
    Fixes code formatting issues in Hive integration
  • Java: Fix missing field in SQL facet test #3830 @​mobuchowski
    Fixes missing field in SQL facet test case
  • Java: Fix composite transport logging of multiple transports #3887 @​kacpermuda
    Fixes logging issue when using CompositeTransport with multiple transports
  • Rust: Fix new rust formatting warnings #3814 @​mobuchowski
    Fixes formatting warnings in Rust code

v1.34.0

Compare Source

Added
  • Hive: Integration added. #3555 @​tnazarew with @​ddebowczyk92, @​jphalip
    Added OpenLineage Hive integration
  • Spark: Support dynamic frames. #3691 @​pawel-big-lebowski
    Support lineage extraction from UnionRdd and NewHadoopRDD, which makes the Docker-based dynamic frames test pass.
  • Hive: Add hive_query facet #3781 @​dolfinus
    Adds hive_query facet for Hive integration
  • Hive: Add job sql facet #3777 @​dolfinus
    Adds job sql facet for Hive integration
  • Hive: Add hive session facet #3786 @​dolfinus
    Adds hive session facet for Hive integration
  • Java: Add Location Symlink type #3717 @​tnazarew
    Add new symlink type representing physical location of dataset
  • Spark: Smart debug facet. #3715 @​pawel-big-lebowski
    Automatically turn on the debug facet when Spark connector anomalies are detected.
  • Spark: Add support for Big Query Metastore catalog type #3760 @​ddebowczyk92
    Adds support for BigQuery Metastore catalog in Spark integration
  • dbt: Add DbtRun facet #3738 @​dolfinus
    Adds DbtRun facet for tracking dbt run information
  • dbt: Initial support for Clickhouse #3739 @​dolfinus
    Adds initial support for ClickHouse in dbt integration
  • dbt: Add processing_engine facet #3725 @​dolfinus
    Adds processing_engine facet for dbt integration
  • Flink: Add facet with Flink jobId #3744 @​dolfinus
    Adds facet containing Flink job ID information
  • Flink: Add processing_engine facet #3726 @​dolfinus
    Adds processing_engine facet for Flink integration
  • JDBC: Column level lineage for jdbc queries load #3763 @​pawel-big-lebowski
    Adds column-level lineage support for JDBC queries in Spark with a single input table.
  • Spec: Add contentType to documentation facet #3748 @​dolfinus
    Adds contentType field to documentation facet specification
Changed
  • Airflow: Remove Airflow < 2.5.0 support #3669 @​kacpermuda
    Drops support for Airflow versions below 2.5.0
  • dbt: Use adapter rows_affected as outputStatistics #3731 @​dolfinus
    Uses adapter's rows_affected for output statistics instead of custom calculation
  • dbt: Move facets from processor module #3713 @​dolfinus
    Refactors dbt facets organization by moving them from processor module
  • Java: Speedup generateNewUUID #3754 @​dolfinus
    Improves performance of UUID generation in Java client
  • Java: Make UUIDUtils.generateStaticUUID random part more variative #3709 @​dolfinus
    Increases randomness in static UUID generation
  • Java: Add log if load from yaml fails #3766 @​mvitale
    Adds logging when YAML configuration loading fails
  • Spark: Update Spark 4 dependency to 4.0.0 (remove -preview1 suffix) #3751 @​ddebowczyk92
    Updates Spark 4 dependency to final 4.0.0 release
  • Spark: Disable module metadata file generation #3785 @​ddebowczyk92
    Disables generation of module metadata files in Spark integration
  • Python: Use attr.define instead of attr.s #3776 @​kacpermuda
    Modernizes Python code to use newer attrs API
  • Proxy: Remove native proxy #3680 @​mobuchowski
    Removes the native proxy implementation
Fixed
  • Spark: Fix missing table path in InsertIntoHadoopFsRelationCommand #3773 @​dolfinus
    Fixes issue where table path was missing in InsertIntoHadoopFsRelationCommand
  • BigQuery: Filter temp inner jobs for bigquery indirect mode #3722 @​pawel-big-lebowski
    Filters out temporary inner jobs in BigQuery indirect mode
  • dbt: dbt-ol should not error on job complete if there is no start event #3749 @​mobuchowski
    Prevents errors when job completion occurs without corresponding start event
  • Flink: Do not hide OpenLineage config parsing errors #3724 @​dolfinus
    Improves error visibility for OpenLineage configuration parsing in Flink
  • Java: Prevent original events from being mutated in TransformTransport #3728 @​JDarDagran
    Ensures original events remain immutable during transport transformations
  • Java: Fix visibility of GcpLineageTransportConfig.Mode #3762 @​ngorchakova
    Corrects visibility modifier for GCP transport configuration mode
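
The TransformTransport fix above is about not mutating the caller's event. A minimal sketch of that general pattern, applying a transformation to a copy, could look like the following. A plain `Map` stands in for an event, and the class and method names are hypothetical; this is not the OpenLineage client's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Applies a transformer to a copy so the caller's event stays unchanged. */
final class CopyBeforeTransformSketch {
    static Map<String, Object> transformCopy(
            Map<String, Object> event, UnaryOperator<Map<String, Object>> transformer) {
        // Shallow copy: top-level entries are isolated, nested mutable values are still shared.
        Map<String, Object> copy = new HashMap<>(event);
        return transformer.apply(copy);
    }

    public static void main(String[] args) {
        Map<String, Object> original = new HashMap<>();
        original.put("job", "etl");
        Map<String, Object> transformed = transformCopy(original, m -> {
            m.put("job", "etl_renamed");
            return m;
        });
        System.out.println(original.get("job") + " -> " + transformed.get("job"));
    }
}
```

A real event model with nested objects would need a deep copy (for example by re-serializing), which this sketch leaves out.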

v1.33.0

Compare Source

Added
  • Python: add TransformTransport for Python client #3697 @​kacpermuda
    Introduces the TransformTransport class for event transformations.
  • Spec: add CatalogDatasetFacet #3659 @​mobuchowski
    Adds specification of the CatalogDatasetFacet.
  • Spark: implement CatalogDatasetFacet #3695 @​mobuchowski
    Implements CatalogDatasetFacet for Spark integration.
  • Transport: add Amazon DataZone transport #3685 @​shinabel
    Adds support for publishing events to Amazon DataZone via dedicated transport.
  • dbt: generate new runId for each start event #3696 @​dolfinus
    Ensures dbt start events have unique run identifiers when calling id generation methods multiple times.
  • JobNamespaceReplaceTransformer: modify parent run facet #3706 @​kacpermuda
    Allows modifications of the parent run facet through JobNamespaceReplaceTransformer.
Changed
  • dbt: remove empty facets #3700 @​dolfinus
    Prevents empty facets from being emitted by dbt integration.
  • Spark: broaden exception handling for Iceberg table writes #3686 @​luke-hoffman1
    Broadens exception handling to address missing column-level lineage issues.
Fixed
  • dbt: skip database if not set #3707 @​dolfinus
    Ensures database field is optional and handled gracefully.
  • Spark: don't swallow exceptions in getIcebergTable #3688 @​dolfinus
    Improves exception visibility during Iceberg table retrieval.
  • Flink 2: support JDBC naming #3681 @​pawel-big-lebowski
    Correctly handles JDBC dataset naming conventions in Flink 2 integration.

v1.32.1

Compare Source

Added
  • Avro: support schema facet for Avro datasets #3650 @​pawel-big-lebowski
    This PR adds support for schema facets in Avro datasets.
  • Java: add UUIDUtils.generateStaticUUID utility #3672 @​dolfinus
    Provides utility method for generating static UUIDs.
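
These notes do not show the signature or algorithm of `UUIDUtils.generateStaticUUID`. Assuming "static" means that the same input always yields the same UUID, the general idea can be sketched with the JDK's name-based (version 3) UUIDs. This is a swapped-in technique for illustration, not the OpenLineage implementation, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Deterministic UUIDs via java.util.UUID.nameUUIDFromBytes (stand-in technique). */
final class DeterministicUuidSketch {
    /** Same name in, same UUID out; different names give different UUIDs in practice. */
    static UUID fromName(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID first = fromName("namespace/job/2025-01-01");
        UUID second = fromName("namespace/job/2025-01-01");
        System.out.println(first.equals(second)); // true: derived only from the input bytes
    }
}
```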
Fixed
  • dbt: forward dbt's return code #3682 @​MassyB
    Forwards dbt's process return code correctly.
  • dbt: handle dbt log file rotation #3683 @​MassyB
    Implements log rotation handling to avoid log file overflow.
  • Flink2: fix dataset namespace resolvers #3676 @​pawel-big-lebowski
    Correctly resolves dataset namespaces in Flink2.
  • Spark: ensure Spark Job Name suffix extraction #3667 @​luke-hoffman1
    Adds condition and unit test to correctly extract Spark Job Name suffix.
  • Spark: fix NoSuchElementException when calling get() on Option.None #3673 @​ddebowczyk92
    Prevents exceptions when Option.None is encountered.
  • Spark: avoid InputDataset duplicates by skipping visited nodes #3663 @​ddebowczyk92
    Ensures nodes aren't revisited, avoiding duplicate InputDataset creation.

v1.32.0

Compare Source

Added
Changed

Configuration

📅 Schedule: Branch creation - "every 3 months on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@netlify

netlify bot commented Jan 1, 2025

Deploy Preview for peppy-sprite-186812 failed.

🔨 Latest commit: cfc84da
🔍 Latest deploy log: https://app.netlify.com/projects/peppy-sprite-186812/deploys/68e551cf01b1d20008f3ae52

@renovate renovate bot force-pushed the renovate/openlineageversion branch from d415db2 to a641137 Compare January 20, 2025 19:42
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.26.0 Update dependency io.openlineage:openlineage-java to v1.27.0 Jan 20, 2025
@codecov

codecov bot commented Jan 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.18%. Comparing base (a89b89c) to head (42f8ad2).

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #3002   +/-   ##
=========================================
  Coverage     81.18%   81.18%           
  Complexity     1506     1506           
=========================================
  Files           268      268           
  Lines          7356     7356           
  Branches        325      325           
=========================================
  Hits           5972     5972           
  Misses         1226     1226           
  Partials        158      158           

☔ View full report in Codecov by Sentry.

@renovate renovate bot force-pushed the renovate/openlineageversion branch 7 times, most recently from 403306b to 96a3b7b Compare January 22, 2025 13:56
@renovate renovate bot force-pushed the renovate/openlineageversion branch 3 times, most recently from dc448cf to ebaf22c Compare February 7, 2025 00:42
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.27.0 Update dependency io.openlineage:openlineage-java to v1.28.0 Feb 7, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from ebaf22c to a164571 Compare February 25, 2025 18:55
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.28.0 Update dependency io.openlineage:openlineage-java to v1.29.0 Feb 25, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from a164571 to f11be42 Compare March 17, 2025 12:11
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.29.0 Update dependency io.openlineage:openlineage-java to v1.30.0 Mar 17, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch 6 times, most recently from 432f8d4 to 9d7550b Compare March 26, 2025 17:23
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.30.0 Update dependency io.openlineage:openlineage-java to v1.30.1 Mar 26, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch 3 times, most recently from 43064c3 to 7c19900 Compare March 27, 2025 07:34
@renovate renovate bot force-pushed the renovate/openlineageversion branch from 7c19900 to 828aa83 Compare April 10, 2025 15:36
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.30.1 Update dependency io.openlineage:openlineage-java to v1.31.0 Apr 10, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from 828aa83 to 269418c Compare April 24, 2025 17:54
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.31.0 Update dependency io.openlineage:openlineage-java to v1.32.0 Apr 24, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from 269418c to b269206 Compare May 6, 2025 23:41
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.32.0 Update dependency io.openlineage:openlineage-java to v1.32.1 May 6, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from b269206 to 42f8ad2 Compare May 19, 2025 17:23
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.32.1 Update dependency io.openlineage:openlineage-java to v1.33.0 May 19, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from 42f8ad2 to ca5d001 Compare June 18, 2025 20:50
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.33.0 Update dependency io.openlineage:openlineage-java to v1.34.0 Jun 18, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from ca5d001 to dcce969 Compare July 11, 2025 20:06
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.34.0 Update dependency io.openlineage:openlineage-java to v1.35.0 Jul 11, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from dcce969 to d45585c Compare July 22, 2025 20:54
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.35.0 Update dependency io.openlineage:openlineage-java to v1.36.0 Jul 22, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from d45585c to d31b116 Compare August 11, 2025 22:00
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.36.0 Update dependency io.openlineage:openlineage-java to v1.37.0 Aug 11, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from d31b116 to bec3771 Compare October 1, 2025 22:10
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.37.0 Update dependency io.openlineage:openlineage-java to v1.38.0 Oct 1, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from bec3771 to cfc84da Compare October 7, 2025 17:45
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.38.0 Update dependency io.openlineage:openlineage-java to v1.39.0 Oct 7, 2025
@renovate renovate bot force-pushed the renovate/openlineageversion branch from cfc84da to f9f7c18 Compare November 14, 2025 02:13
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.39.0 Update dependency io.openlineage:openlineage-java to v1.40.0 Nov 14, 2025
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
@renovate renovate bot force-pushed the renovate/openlineageversion branch from f9f7c18 to 3b1b2ea Compare November 14, 2025 12:55
@renovate renovate bot changed the title Update dependency io.openlineage:openlineage-java to v1.40.0 Update dependency io.openlineage:openlineage-java to v1.40.1 Nov 14, 2025
