fix(taskomatic): remove systems with null inventory ids #2106

mtclinton · 2025-12-16T23:05:12Z

fix(taskomatic): remove systems with null inventory ids

Systems with null as inventory_id are not properly deleted (see steps below)
The delete_systems job selects inventory_id without filtering NULL values.
When inventory_id is NULL, fetchone() returns (None,), and delete_system(None) is called.
The delete_system() function uses WHERE inventory_id = inventory_id_in, which fails when inventory_id_in is NULL (because NULL = NULL is false in SQL).
This leaves systems with NULL inventory_id stuck in the database.

Secure Coding Practices Checklist GitHub Link

https://github.com/RedHatInsights/secure-coding-checklist

Secure Coding Checklist

Summary by Sourcery

Bug Fixes:

Ensure systems with NULL inventory_id are also deleted during periodic cleanup without leaving orphaned related data.

jira-linking · 2025-12-16T23:05:16Z

Commits missing Jira IDs:
a152235

sourcery-ai · 2025-12-16T23:05:19Z

Reviewer's Guide

This PR updates the taskomatic cleanup job to ensure that systems with null inventory_ids are also fully cleaned up, by tightening the existing deletion query to only process valid inventory_ids and adding a new deletion loop that safely removes legacy/orphaned systems lacking inventory_ids along with their related records.

ER diagram for cleanup of systems with null inventory_id

erDiagram
    system_platform {
        bigint id PK
        uuid inventory_id
        bigint rh_account_id
        timestamp when_deleted
        boolean opt_out
    }

    system_vulnerabilities {
        bigint id PK
        bigint system_id FK
        bigint rh_account_id
        text vulnerability_data
    }

    system_vulnerable_package {
        bigint id PK
        bigint system_id FK
        bigint rh_account_id
        text package_data
    }

    system_repo {
        bigint id PK
        bigint system_id FK
        text repo_data
    }

    system_cve_data {
        bigint id PK
        bigint system_id FK
        text cve_data
    }

    system_platform ||--o{ system_vulnerabilities : system_id
    system_platform ||--o{ system_vulnerable_package : system_id
    system_platform ||--o{ system_repo : system_id
    system_platform ||--o{ system_cve_data : system_id

Flow diagram for updated system deletion job logic

flowchart TD
    A[start_run] --> B[open_db_connection]
    B --> C[init_deleted_counter]

    subgraph Loop_valid_inventory_id
        C --> D[query_next_system_where_when_deleted_threshold_and_inventory_id_not_null]
        D -->|no_row| H[proceed_to_null_inventory_loop]
        D -->|row_found| E[delete_system_with_inventory_id_via_existing_logic]
        E --> F[increment_deleted_counter]
        F --> G[commit_transaction]
        G --> D
    end

    H --> I[query_next_system_where_when_deleted_threshold_and_inventory_id_null]

    subgraph Loop_null_inventory_id
        I -->|no_row| O[close_cursor_and_connection]
        I -->|row_found| J[load_system_id_and_rh_account_id]
        J --> K[update_system_platform_set_opt_out_true]
        K --> L[delete_from_system_vulnerabilities_by_system_id_and_rh_account_id]
        L --> M[delete_from_system_vulnerable_package_by_system_id_and_rh_account_id]
        M --> N[delete_from_system_repo_by_system_id]
        N --> P[delete_from_system_cve_data_by_system_id]
        P --> Q{delete_from_system_platform_by_id}
        Q -->|success| S[increment_deleted_counter]
        Q -->|foreign_key_violation| R[delete_from_system_vulnerabilities_all_partitions_and_system_vulnerable_package_all_partitions_then_delete_system_platform]
        R --> S
        S --> T[log_success_for_null_inventory_system]
        T --> U[commit_transaction]
        U --> I
    end

    O --> V[end_run]

File-Level Changes

Change	Details	Files
Restrict the existing deletion loop to systems with non-null inventory_ids and add a new cleanup loop to safely delete systems with null inventory_ids and their related data.	Add an inventory_id IS NOT NULL predicate to the existing system selection query used for inventory-based deletion. Introduce a second while loop that selects one legacy system at a time by id and rh_account_id where inventory_id IS NULL and when_deleted is older than the configured threshold, using FOR UPDATE for row locking. Within the new loop, mark each system as opted out and delete associated records from system_vulnerabilities, system_vulnerable_package, system_repo, and system_cve_data before attempting to delete the system row itself. Handle ForeignKeyViolation when deleting the system row by retrying deletions on system_vulnerabilities and system_vulnerable_package without rh_account_id filtering, then deleting the system row again. Increment the deletion counter and add structured logging for both successful deletions and failures when processing systems with null inventory_ids.	`taskomatic/jobs/delete_systems.py`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

The two while True cleanup loops share a lot of structure (time threshold, single-row SELECT with FOR UPDATE, per-row commit, logging); consider extracting a helper to reduce duplication and make the deletion strategies (with/without inventory_id) easier to reason about and maintain.
The nested try for handling psycopg2.errors.ForeignKeyViolation could be made clearer and safer by using an explicit savepoint/rollback to avoid partially-applied statements in the outer try and by documenting why a second round of deletions without rh_account_id is necessary.
Committing inside each loop iteration can be expensive and may lead to inconsistent batches; consider processing multiple rows per transaction (e.g., by increasing the LIMIT or batching deletes) to improve performance while still keeping transactions small.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The two `while True` cleanup loops share a lot of structure (time threshold, single-row SELECT with `FOR UPDATE`, per-row commit, logging); consider extracting a helper to reduce duplication and make the deletion strategies (with/without inventory_id) easier to reason about and maintain.
- The nested `try` for handling `psycopg2.errors.ForeignKeyViolation` could be made clearer and safer by using an explicit savepoint/rollback to avoid partially-applied statements in the outer `try` and by documenting why a second round of deletions without `rh_account_id` is necessary.
- Committing inside each loop iteration can be expensive and may lead to inconsistent batches; consider processing multiple rows per transaction (e.g., by increasing the `LIMIT` or batching deletes) to improve performance while still keeping transactions small.

## Individual Comments

### Comment 1
<location> `taskomatic/jobs/delete_systems.py:31` </location>
<code_context>
     deleted = 0

+    # Delete systems with valid inventory_id
     while True:
         curr_time = datetime.now(tz=pytz.utc)
         cur.execute(
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring the two deletion loops into shared helper functions that encapsulate the looping, per-row deletion, and error handling logic to reduce duplication and flatten control flow.

You can keep the new behavior but reduce complexity by extracting the common loop structure and the per-system delete logic into helpers. That removes duplication, flattens control flow, and centralizes error handling.

### 1. Extract shared cursor loop

Both loops share the same pattern: “select one row with `FOR UPDATE`, process it, commit, repeat”. You can move that into a small helper to avoid the duplicated `while True` logic:

```python
def process_deletions(cur, select_sql, select_params_fn, handle_row_fn):
    deleted = 0
    while True:
        curr_time = datetime.now(tz=pytz.utc)
        cur.execute(select_sql, select_params_fn(curr_time))
        row = cur.fetchone()
        if not row:
            break

        try:
            if handle_row_fn(row):
                deleted += 1
        except Exception as e:
            LOGGER.error("Error processing row %s: %s", row, e)

        cur.connection.commit()

    return deleted
```

Then use it for both cases:

```python
# 1) inventory_id not null
deleted += process_deletions(
    cur,
    """SELECT inventory_id FROM system_platform sp
       WHERE when_deleted IS NOT NULL
         AND when_deleted < %s
         AND inventory_id IS NOT NULL
       LIMIT 1 FOR UPDATE OF sp""",
    lambda curr_time: (curr_time - timedelta(hours=CFG.system_deletion_threshold),),
    lambda row: delete_with_inventory_id(cur, row[0]),
)

# 2) inventory_id is null
deleted += process_deletions(
    cur,
    """SELECT id, rh_account_id FROM system_platform sp
       WHERE when_deleted IS NOT NULL
         AND when_deleted < %s
         AND inventory_id IS NULL
       LIMIT 1 FOR UPDATE OF sp""",
    lambda curr_time: (curr_time - timedelta(hours=CFG.system_deletion_threshold),),
    lambda row: delete_orphan_system(cur, row[0], row[1]),
)
```

### 2. Extract per-system delete logic for NULL `inventory_id`

Move the “delete all data for system X” sequence into its own function. This keeps the main loop linear and removes nested `try/except`:

```python
def delete_orphan_system(cur, system_id, rh_account_id):
    # Mark as opt_out to match current behavior
    cur.execute(
        "UPDATE system_platform SET opt_out = true WHERE id = %s",
        (system_id,),
    )

    # Delete partitioned data first (with account filter)
    cur.execute(
        "DELETE FROM system_vulnerabilities "
        "WHERE system_id = %s AND rh_account_id = %s",
        (system_id, rh_account_id),
    )
    cur.execute(
        "DELETE FROM system_vulnerable_package "
        "WHERE system_id = %s AND rh_account_id = %s",
        (system_id, rh_account_id),
    )

    # Non-partitioned tables
    cur.execute("DELETE FROM system_repo WHERE system_id = %s", (system_id,))
    cur.execute("DELETE FROM system_cve_data WHERE system_id = %s", (system_id,))

    # Try delete system_platform, fall back to “all partitions” delete if needed
    try:
        cur.execute("DELETE FROM system_platform WHERE id = %s", (system_id,))
    except psycopg2.errors.ForeignKeyViolation:
        cur.execute(
            "DELETE FROM system_vulnerabilities WHERE system_id = %s",
            (system_id,),
        )
        cur.execute(
            "DELETE FROM system_vulnerable_package WHERE system_id = %s",
            (system_id,),
        )
        cur.execute("DELETE FROM system_platform WHERE id = %s", (system_id,))

    LOGGER.info(
        "Deleted system with NULL inventory_id: system_id=%s, rh_account_id=%s",
        system_id,
        rh_account_id,
    )
    return True
```

### 3. Reuse existing `delete_system` behavior

Similarly, wrap the existing `delete_system` call so the loop uses the same error/logging pattern:

```python
def delete_with_inventory_id(cur, inventory_id):
    cur.execute("SELECT deleted_inventory_id FROM delete_system(%s)", (inventory_id,))
    success = cur.fetchone()
    if success:
        return True

    LOGGER.error("Unable to delete inventory_id: %s", inventory_id)
    return False
```

This keeps all existing behavior (including the partition-aware FK handling) but:

- Removes the duplicated `while True` loops.
- Centralizes error handling and logging in one place.
- Makes the per-system delete path linear and easier to reason about.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-12-16T23:06:07Z

taskomatic/jobs/delete_systems.py

    deleted = 0

+    # Delete systems with valid inventory_id
    while True:


issue (complexity): Consider refactoring the two deletion loops into shared helper functions that encapsulate the looping, per-row deletion, and error handling logic to reduce duplication and flatten control flow.

You can keep the new behavior but reduce complexity by extracting the common loop structure and the per-system delete logic into helpers. That removes duplication, flattens control flow, and centralizes error handling.

1. Extract shared cursor loop

Both loops share the same pattern: “select one row with FOR UPDATE, process it, commit, repeat”. You can move that into a small helper to avoid the duplicated while True logic:

def process_deletions(cur, select_sql, select_params_fn, handle_row_fn): deleted = 0 while True: curr_time = datetime.now(tz=pytz.utc) cur.execute(select_sql, select_params_fn(curr_time)) row = cur.fetchone() if not row: break try: if handle_row_fn(row): deleted += 1 except Exception as e: LOGGER.error("Error processing row %s: %s", row, e) cur.connection.commit() return deleted

Then use it for both cases:

# 1) inventory_id not null deleted += process_deletions( cur, """SELECT inventory_id FROM system_platform sp WHERE when_deleted IS NOT NULL AND when_deleted < %s AND inventory_id IS NOT NULL LIMIT 1 FOR UPDATE OF sp""", lambda curr_time: (curr_time - timedelta(hours=CFG.system_deletion_threshold),), lambda row: delete_with_inventory_id(cur, row[0]), ) # 2) inventory_id is null deleted += process_deletions( cur, """SELECT id, rh_account_id FROM system_platform sp WHERE when_deleted IS NOT NULL AND when_deleted < %s AND inventory_id IS NULL LIMIT 1 FOR UPDATE OF sp""", lambda curr_time: (curr_time - timedelta(hours=CFG.system_deletion_threshold),), lambda row: delete_orphan_system(cur, row[0], row[1]), )

2. Extract per-system delete logic for NULL inventory_id

Move the “delete all data for system X” sequence into its own function. This keeps the main loop linear and removes nested try/except:

def delete_orphan_system(cur, system_id, rh_account_id): # Mark as opt_out to match current behavior cur.execute( "UPDATE system_platform SET opt_out = true WHERE id = %s", (system_id,), ) # Delete partitioned data first (with account filter) cur.execute( "DELETE FROM system_vulnerabilities " "WHERE system_id = %s AND rh_account_id = %s", (system_id, rh_account_id), ) cur.execute( "DELETE FROM system_vulnerable_package " "WHERE system_id = %s AND rh_account_id = %s", (system_id, rh_account_id), ) # Non-partitioned tables cur.execute("DELETE FROM system_repo WHERE system_id = %s", (system_id,)) cur.execute("DELETE FROM system_cve_data WHERE system_id = %s", (system_id,)) # Try delete system_platform, fall back to “all partitions” delete if needed try: cur.execute("DELETE FROM system_platform WHERE id = %s", (system_id,)) except psycopg2.errors.ForeignKeyViolation: cur.execute( "DELETE FROM system_vulnerabilities WHERE system_id = %s", (system_id,), ) cur.execute( "DELETE FROM system_vulnerable_package WHERE system_id = %s", (system_id,), ) cur.execute("DELETE FROM system_platform WHERE id = %s", (system_id,)) LOGGER.info( "Deleted system with NULL inventory_id: system_id=%s, rh_account_id=%s", system_id, rh_account_id, ) return True

3. Reuse existing delete_system behavior

Similarly, wrap the existing delete_system call so the loop uses the same error/logging pattern:

def delete_with_inventory_id(cur, inventory_id): cur.execute("SELECT deleted_inventory_id FROM delete_system(%s)", (inventory_id,)) success = cur.fetchone() if success: return True LOGGER.error("Unable to delete inventory_id: %s", inventory_id) return False

This keeps all existing behavior (including the partition-aware FK handling) but:

Removes the duplicated while True loops.

Centralizes error handling and logging in one place.

Makes the per-system delete path linear and easier to reason about.

fix(taskomatic): remove systems with null inventory ids

mtclinton · 2025-12-16T23:49:00Z

/retest

mtclinton · 2025-12-16T23:50:07Z

@jdobes not sure if I should add a new database function to migrations or if this is even needed (based on how FedRamp is testing systems)

jdobes · 2026-01-02T15:49:46Z

Closing as NULL inventory IDs are not supported, they can't occur in database - https://github.com/RedHatInsights/vulnerability-engine/blob/master/database/schema/ve_db_postgresql.sql#L197

sourcery-ai bot reviewed Dec 16, 2025

View reviewed changes

fix(taskomatic): remove systems with null inventory ids

a152235

fix(taskomatic): remove systems with null inventory ids

mtclinton force-pushed the remove-system-null-inventory-id branch from fd03f27 to a152235 Compare December 16, 2025 23:15

jdobes closed this Jan 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(taskomatic): remove systems with null inventory ids #2106

fix(taskomatic): remove systems with null inventory ids #2106

Uh oh!

mtclinton commented Dec 16, 2025 •

edited

Loading

Uh oh!

jira-linking bot commented Dec 16, 2025 •

edited

Loading

Uh oh!

sourcery-ai bot commented Dec 16, 2025 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Uh oh!

sourcery-ai bot Dec 16, 2025

Uh oh!

mtclinton commented Dec 16, 2025

Uh oh!

mtclinton commented Dec 16, 2025

Uh oh!

jdobes commented Jan 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix(taskomatic): remove systems with null inventory ids #2106

fix(taskomatic): remove systems with null inventory ids #2106

Uh oh!

Conversation

mtclinton commented Dec 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Secure Coding Practices Checklist GitHub Link

Secure Coding Checklist

Summary by Sourcery

Uh oh!

jira-linking bot commented Dec 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sourcery-ai bot commented Dec 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

ER diagram for cleanup of systems with null inventory_id

Flow diagram for updated system deletion job logic

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Dec 16, 2025

Choose a reason for hiding this comment

1. Extract shared cursor loop

2. Extract per-system delete logic for NULL inventory_id

3. Reuse existing delete_system behavior

Uh oh!

mtclinton commented Dec 16, 2025

Uh oh!

mtclinton commented Dec 16, 2025

Uh oh!

jdobes commented Jan 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mtclinton commented Dec 16, 2025 •

edited

Loading

jira-linking bot commented Dec 16, 2025 •

edited

Loading

sourcery-ai bot commented Dec 16, 2025 •

edited

Loading

2. Extract per-system delete logic for NULL `inventory_id`

3. Reuse existing `delete_system` behavior