Skip to content

Conversation

@hoboristi
Copy link

@hoboristi hoboristi commented Jul 27, 2025

Key Improvements:

  1. Configurable gsub filters: The hardcoded gsub! patterns are now in a default_gsub_filters array that can be overridden using vars(:gsub_filters)

  2. Configurable line filters: The hardcoded line rejection patterns are now in a default_line_filters array that can be overridden using vars(:line_filters)

  3. Cleaner code structure: Split the filtering logic into two separate private methods:

    • apply_gsub_filters(cfg) - handles string replacement patterns
    • apply_line_filters(cfg) - handles line rejection patterns
  4. Optional history: history now can be switched off if not neccesary by setting exclude_history to true

How to use the new configuration:

Users can now customize the filtering behavior in their Oxidized configuration by adding variables like:

vars:
gsub_filters:
- [ !ruby/regexp '/\\r?\n\s+/', ''] # Custom gsub pattern
- [ !ruby/regexp '# my custom comment', '']
line_filters:
- !ruby/regex '/^# custom line pattern.*$/'
exclude_history: true # set to true to skip history

Benefits:

  • Backward compatible: Uses the same default filters, so existing setups continue to work
  • Flexible: Users can add, remove, or modify filtering patterns without touching the model code
  • Maintainable: The filtering logic is now organized and documented
  • Extensible: Easy to add new types of filters in the future

The refactored code maintains all the original functionality while providing the flexibility you requested for ignoring certain changes through configuration.

Pre-Request Checklist

  • Passes rubocop code analysis (try rubocop --auto-correct)
  • Tests added or adapted (try rake test)
  • Changes are reflected in the documentation
  • User-visible changes appended to CHANGELOG.md

Description

## Key Improvements:

1. *Configurable gsub filters*: The hardcoded gsub! patterns are now in a default_gsub_filters array that can be overridden using vars(:gsub_filters)

2. *Configurable line filters*: The hardcoded line rejection patterns are now in a default_line_filters array that can be overridden using vars(:line_filters)

3. *Cleaner code structure*: Split the filtering logic into two separate private methods:
   - apply_gsub_filters(cfg) - handles string replacement patterns
   - apply_line_filters(cfg) - handles line rejection patterns

4. Removed the backup of change history, its important for security point of view, not for config backup.
 
## How to use the new configuration:

Users can now customize the filtering behavior in their Oxidized configuration by adding variables like:

vars:
  gsub_filters:
    - [ !ruby/regexp '/\\\r?\n\s+/', '']  # Custom gsub pattern
    - [ !ruby/regexp '# my custom comment', '']
  line_filters:
    - !ruby/regex '/^# custom line pattern.*$/'

## Benefits:

- *Backward compatible*: Uses the same default filters, so existing setups continue to work
- *Flexible*: Users can add, remove, or modify filtering patterns without touching the model code
- *Maintainable*: The filtering logic is now organized and documented
- *Extensible*: Easy to add new types of filters in the future

The refactored code maintains all the original functionality while providing the flexibility you requested for ignoring certain changes through configuration.
@hoboristi hoboristi marked this pull request as draft July 27, 2025 21:11
Added optional history
@ytti
Copy link
Owner

ytti commented Jul 28, 2025

At least the issue text is obvious LLM slop, I think the code is LLM slop too. Vars is not Hash, so fetch won't work.

Now ignoring the LLM slop we should prefix model based bards with model name, to create namespace, otherwise we have high risk that someone else uses same variable in in some other model specific context and creates unexpected behaviour.

If that is fixed, I'm still on the fence on why the filtering should be configurable. The issue doesn't address what problem is being solved and why it is well solved here, compared to other options, like fixing it in the model or expecting users to create custom models which superclass the distribution model.

changed logic to make history default, it can be omitted by defining exclude_history var
@hoboristi
Copy link
Author

At least the issue text is obvious LLM slop, I think the code is LLM slop too. Vars is not Hash, so fetch won't work.

Now ignoring the LLM slop we should prefix model based bards with model name, to create namespace, otherwise we have high risk that someone else uses same variable in in some other model specific context and creates unexpected behaviour.

If that is fixed, I'm still on the fence on why the filtering should be configurable. The issue doesn't address what problem is being solved and why it is well solved here, compared to other options, like fixing it in the model or expecting users to create custom models which superclass the distribution model.

Got you, I'll try to fix this. The idea was modifying the model instead of override, to not to miss updates from the official repo, and make filtering user-friendly

@github-actions
Copy link

This PR is stale because it has been open 90 days with no activity.

@github-actions github-actions bot added the Stale label Oct 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants