Conversation

@valentijnscholten
Member

@valentijnscholten valentijnscholten commented Dec 13, 2025

Reimport processes every finding in the report. For each finding it tries to find a match among the findings already present in the Test. This matching used to be done on a per-finding basis, so 1000 findings resulted in 1000 database queries. This PR batches the matching, based on the code we also use for deduplication: match candidates are retrieved for 1000 incoming findings at once. This also allows prefetching vulnerability_ids, endpoints and endpoint statuses, which makes reimport a lot more efficient.
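To illustrate the idea for reviewers, here is a minimal, framework-free sketch of batched candidate lookup. It is not the code in this PR: `FakeStore`, `find_by_hash_codes` and `match_in_batches` are hypothetical names, and it assumes matching on a single `hash_code` field, whereas the real reimporter works against the Django ORM and supports other matching algorithms.

```python
from collections import defaultdict
from itertools import islice

BATCH_SIZE = 1000  # batch size mentioned in the description


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


class FakeStore:
    """Hypothetical stand-in for the database that counts issued queries."""

    def __init__(self, existing):
        self.existing = existing  # dicts with "id" and "hash_code"
        self.query_count = 0

    def find_by_hash_codes(self, hash_codes):
        # One call models one query, comparable to `hash_code IN (...)`.
        self.query_count += 1
        wanted = set(hash_codes)
        return [f for f in self.existing if f["hash_code"] in wanted]


def match_in_batches(incoming, store, batch_size=BATCH_SIZE):
    """Return, per incoming finding, the id of a matching existing finding or None."""
    matched_ids = []
    for chunk in batched(incoming, batch_size):
        # Fetch all candidates for this chunk with a single lookup.
        candidates = defaultdict(list)
        for existing in store.find_by_hash_codes(f["hash_code"] for f in chunk):
            candidates[existing["hash_code"]].append(existing)
        # Resolve each incoming finding in memory, without further queries.
        for finding in chunk:
            found = candidates.get(finding["hash_code"])
            matched_ids.append(found[0]["id"] if found else None)
    return matched_ids
```

With 2500 incoming findings this sketch issues three lookups instead of 2500, which is the effect the batching aims for.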

Reimporting an identical report takes roughly 78% less time (116.76 seconds down to 25.78 seconds):

dev: Successfully reimported 'jfrog_xray_unified/very_many_vulns.json' into test ID 1008 (took 116.76 seconds)
batch: Successfully reimported 'jfrog_xray_unified/very_many_vulns.json' into test ID 1008 (took 25.78 seconds)

If the report being reimported contains new or mitigated findings, some time is still spent processing those changes, so real-world gains will be a little lower.

The PR also includes some other optimizations/fixes:

  • Only reconcile vulnerability ids if they have changed
  • No longer delete existing vulnerability ids twice
  • Separate vulnerability_id handling for new and existing findings to allow optimization
  • Some low-hanging bulk-update optimizations for endpoint statuses
  • Fix a bug where a change in cve was not stored during reimport
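
As a rough illustration of the "only reconcile if changed" item above, the check can be sketched as a pure function. This is a hypothetical helper, not the PR's implementation; it assumes vulnerability ids are plain strings and that their order is significant.

```python
def reconcile_vulnerability_ids(existing_ids, incoming_ids):
    """Return (changed, ids_to_store) for a matched finding.

    Hypothetical helper: `changed` is False when the stored ids already
    equal the incoming ones, so the caller can skip the delete/insert work.
    """
    # Drop duplicates from the report while preserving order.
    incoming = list(dict.fromkeys(incoming_ids))
    current = list(existing_ids)
    if current == incoming:
        return False, current
    return True, incoming
```

A caller would only delete and recreate the stored ids when `changed` is true, which avoids write queries for the common case of an identical report.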

This PR affects Pro.

@valentijnscholten valentijnscholten added this to the 2.54.0 milestone Dec 13, 2025
@github-actions github-actions bot added settings_changes Needs changes to settings.py based on changes in settings.dist.py included in this PR unittests labels Dec 13, 2025
@github-actions github-actions bot added the lint label Dec 14, 2025
@valentijnscholten valentijnscholten changed the title from "Reimport batching" to "reimport: match findings in batches" Dec 14, 2025
@valentijnscholten valentijnscholten added the performance performance improvement or bug report label Dec 14, 2025
@github-actions github-actions bot added the docs label Dec 14, 2025
@valentijnscholten valentijnscholten added the affects_pro PRs that affect Pro and need a coordinated release/merge moment. label Dec 14, 2025
@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions
Contributor

Conflicts have been resolved. A maintainer will review the pull request shortly.

@valentijnscholten valentijnscholten marked this pull request as ready for review December 17, 2025 15:41
@dryrunsecurity
DryRun Security

🔴 Risk threshold exceeded.

This pull request modifies sensitive codepaths (dojo/finding/deduplication.py and dojo/importers/default_reimporter.py) and the scanner flagged those edits as sensitive; you can configure allowed authors and sensitive paths in .dryrunsecurity.yaml to adjust risk handling.

🔴 Configured Codepaths Edit in dojo/finding/deduplication.py
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/importers/default_reimporter.py
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.

We've notified @mtesauro.


All finding details can be found in the DryRun Security Dashboard.

Contributor

@mtesauro mtesauro left a comment

Approved

Labels

affects_pro PRs that affect Pro and need a coordinated release/merge moment. docs lint performance performance improvement or bug report settings_changes Needs changes to settings.py based on changes in settings.dist.py included in this PR unittests

3 participants