@MACKAT05

This pull request introduces a major refactor and enhancement to how local data files are injected into Jinja templates, replacing the legacy JinjaEnvVar approach with the new LocalDataInjection class. It adds new functions for loading CSV, JSON, and YAML files directly into templates, updates documentation, and provides comprehensive demos comparing the new and legacy methods. The changes are backward compatible and include deprecation notices for legacy features.

Local Data Injection Enhancements

  • Refactored the legacy JinjaEnvVar class into the new LocalDataInjection class, which now provides the env_var() function and introduces from_csv(), from_json(), and from_yaml() functions for loading local data files in Jinja templates. This enables direct access to structured data from CSV, JSON, and YAML files within templates. (CHANGELOG.md [1] README.md [2] [3])
  • Updated the Jinja template processor and documentation to use LocalDataInjection and its new functions, including detailed usage examples and a dedicated documentation file. (README.md [1] [2] docs/LocalDataInjection.md [3])
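
A minimal sketch of the shape such a class could take, using only the standard library. It is not the PR's implementation: the class name `LocalDataSketch`, the `template_globals()` helper, the signatures, and the error messages are assumptions made for illustration. Only the function names `env_var()` and `from_json()` come from the description above.

```python
from __future__ import annotations

import json
import os
from pathlib import Path
from typing import Any


class LocalDataSketch:
    """Hypothetical stand-in for LocalDataInjection (not the PR's code)."""

    @staticmethod
    def env_var(name: str, default: str | None = None) -> str:
        # Return the environment variable, or the default when it is unset.
        value = os.environ.get(name, default)
        if value is None:
            raise ValueError(f"Environment variable not set: {name}")
        return value

    @staticmethod
    def from_json(
        file_path: str,
        encoding: str = "utf-8",
    ) -> dict[str, Any] | list[Any]:
        # Parse a local JSON file into plain Python objects.
        path = Path(file_path)
        if not path.exists():
            raise FileNotFoundError(f"JSON file not found: {file_path}")
        with open(path, encoding=encoding) as jsonfile:
            return json.load(jsonfile)

    @classmethod
    def template_globals(cls) -> dict[str, Any]:
        # Hypothetical helper: the mapping a template engine could expose as globals.
        return {"env_var": cls.env_var, "from_json": cls.from_json}
```

With Jinja, a mapping like this would typically be registered through `Environment.globals`, so that a template can call `from_json(...)` directly. CSV and YAML loaders would follow the same pattern.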

Demo and Documentation

  • Added a comprehensive demo (demo/citibike_demo_jinja_data_injection) showcasing both the new LocalDataInjection approach and the legacy method, including SQL scripts and configuration files that demonstrate loading and processing data from external files. (demo/citibike_demo_jinja_data_injection/1_setup/A__setup.sql [1] 2_test/V1.1.0__initial_database_objects.sql [2] 2_test/V1.1.0__initial_database_objects_legacy.sql [3] and related files)

Deprecation and Backward Compatibility

  • Deprecated the JinjaEnvVar class in favor of LocalDataInjection, while maintaining backward compatibility for existing templates using env_var(). (CHANGELOG.md R6-R22)

Configuration Files for Demo

  • Added example configuration files (file_formats.json, stages.yaml) and their legacy Jinja equivalents to support the demo and illustrate the benefits of loading data from external sources. (file_formats.json [1] stages.yaml [2] file_formats_legacy.j2 [3])

These changes make it much easier and more maintainable to inject local structured data into Jinja templates, improving both developer experience and template flexibility.

…Injection class. Added functions to load data from CSV, JSON, and YAML files directly into Jinja templates. Updated documentation and demo to showcase new features and deprecated the legacy JinjaEnvVar class.
@MACKAT05

MACKAT05 commented Sep 1, 2025

I have yet to try deploying the new example. I have run the render command and inspected the results. The main focus was to show the macro pattern that expands structured data in place of hard-coded SQL, and to confirm that the structured data loads without errors about files not being found.
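
For illustration, the expansion described in this comment can be sketched in plain Python: structured records go in and SQL statements come out, which is the job a Jinja macro would do inside a template. The record fields (`name`, `type`, `field_delimiter`) and the function name `render_file_formats` are hypothetical and are not taken from the demo's file_formats.json.

```python
from __future__ import annotations


def render_file_formats(formats: list[dict[str, str]]) -> str:
    """Expand each record into one CREATE FILE FORMAT statement."""
    statements = []
    for fmt in formats:
        # Every key other than "name" becomes an OPTION = value pair.
        options = " ".join(
            f"{key.upper()} = {value}" for key, value in fmt.items() if key != "name"
        )
        statements.append(f"CREATE OR REPLACE FILE FORMAT {fmt['name']} {options};")
    return "\n".join(statements)


# Hypothetical data, standing in for the parsed contents of a JSON or YAML file.
formats = [
    {"name": "csv_fmt", "type": "CSV", "field_delimiter": "','"},
    {"name": "json_fmt", "type": "JSON"},
]
print(render_file_formats(formats))
```

Adding a third file format then means adding one record to the data file, with no change to the SQL-generating logic.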


##### from_csv, from_json, from_yaml

These functions provide access to local data files for use in Jinja templates. For detailed documentation and examples, see [LocalDataInjection.md](docs/LocalDataInjection.md).

The reference to LocalDataInjection.md is invalid. We can remove .md if the file is still valid.

Contributor Author

I missed this one. I used AI tooling to help create examples (without posting sensitive customer or client data). Unfortunately, it was very aggressive about littering the repository with .md files instead of using the existing Markdown file entries.

I think this should at this point be revised to:
For examples, see citibike_demo_jinja_data_injection.

How are you finding this feature branch otherwise?

I have not completed my review. I generally understand what you are doing.

Curious: was there a scenario you faced that led to developing this solution? Does it relate to an open PR?

@sahil-walia left a comment

Thank you so much for contributing.

Should we rename this file from schemachange/JinjaEnvVar.py to JinjaTemplateDataProvider and move the code from your localdatainjection.py into it, since we are extending the capability?

I think we should not use "Injection" in the file name, since the term generally carries a negative connotation.

        expected = "John is 30 years old\nJane is 25 years old\n"
        assert result == expected
    finally:
        os.unlink(csv_path)

nit: add one blank line at the end

    with open(yaml_path, encoding=encoding) as yamlfile:
        data = yaml.safe_load(yamlfile)

    return data

nit: add a blank line at the end.

        return data

    @staticmethod
    def from_json(

nit: formatting inconsistencies.

Suggested change
-    def from_json(
+    def from_json(
+        file_path: str,
+        encoding: str = "utf-8"
+    ) -> dict[str, Any] | list[Any]:

        return data

    @staticmethod
    def from_yaml(

nit: format this for consistency with from_csv.

Suggested change
-    def from_yaml(
+    def from_yaml(
+        file_path: str,
+        encoding: str = "utf-8"
+    ) -> dict[str, Any] | list[Any]:

    csv_path = Path(file_path)
    if not csv_path.exists():
        raise FileNotFoundError(f"CSV file not found: {file_path}")

Add validation for the input delimiter:

    if not isinstance(delimiter, str) or len(delimiter) != 1:
        raise ValueError("Delimiter must be a single character")

One more, to check that the file is actually a CSV:

    if csv_path.suffix.lower() != '.csv':
        raise ValueError(f"File {file_path} is not a CSV file")
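
Putting the two suggested checks together with the existing existence check, a loader could look like the sketch below. This is an illustration under assumptions, not the PR's code: the full signature, the default values, the return type, the use of `csv.DictReader`, and the order of the checks are all guesses. Only the three checks and their messages come from the snippets in this review.

```python
from __future__ import annotations

import csv
from pathlib import Path


def from_csv(
    file_path: str,
    delimiter: str = ",",
    encoding: str = "utf-8",
) -> list[dict[str, str]]:
    """Read a CSV file into a list of row dictionaries (illustrative sketch)."""
    # Validate the delimiter before touching the file system.
    if not isinstance(delimiter, str) or len(delimiter) != 1:
        raise ValueError("Delimiter must be a single character")

    csv_path = Path(file_path)
    if csv_path.suffix.lower() != ".csv":
        raise ValueError(f"File {file_path} is not a CSV file")
    if not csv_path.exists():
        raise FileNotFoundError(f"CSV file not found: {file_path}")

    # newline="" is what the csv module documentation recommends for file objects.
    with open(csv_path, newline="", encoding=encoding) as csvfile:
        return list(csv.DictReader(csvfile, delimiter=delimiter))
```

One design consequence worth weighing: the suffix check rejects delimited files that use another extension, such as .tsv or .txt, even when a matching delimiter is passed.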

@sfc-gh-tmathew added the labels Under Review (This is being discussed without planned changes), community-contribution (Submitted by community), and target: 4.3.0 (Planned for 4.3.0 release) on Nov 18, 2025
3 participants