reimport: match findings in batches #13889
base: dev
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Conflicts have been resolved. A maintainer will review the pull request shortly.
🔴 Risk threshold exceeded. This pull request modifies sensitive codepaths (dojo/finding/deduplication.py and dojo/importers/default_reimporter.py) and the scanner flagged those edits as sensitive; you can configure allowed authors and sensitive paths in .dryrunsecurity.yaml to adjust risk handling.
🔴 Configured Codepaths Edit in
| Vulnerability | Configured Codepaths Edit |
|---|---|
| Description | Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml. |
🔴 Configured Codepaths Edit in dojo/importers/default_reimporter.py
| Vulnerability | Configured Codepaths Edit |
|---|---|
| Description | Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml. |
We've notified @mtesauro.
All finding details can be found in the DryRun Security Dashboard.
mtesauro
left a comment
Approved
Reimport processes every finding in the report. For each finding, it tries to find a match among the existing findings already present in the Test. This matching used to be done on a per-finding basis, so 1000 findings resulted in 1000 database queries. This PR batches the matching, based on the code we also use for deduplication. Candidates for matches are retrieved in batches, for 1000 incoming findings at once. This also allows for prefetching vulnerability_ids, endpoints and endpoint statuses. This makes reimport a lot more efficient.
An improvement of ~77.7% on reimport of identical reports:
If the report being reimported has some new or mitigated findings, some time will of course still be spent processing these changes, making the real-world gains a little lower.
The PR also includes some other optimizations/fixes:
- `cve` was not stored during reimport

This PR affects Pro.