@whatisgalen whatisgalen commented May 3, 2025

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Description of Change

Deleting data should be the fastest CRUD operation, so why does it take as long as (or longer than) Load inserts? The short answer is:

  • too much data held in memory, which can lead to costly swapping
  • various unnecessary indexing operations
  • a few too many ORM queries
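
As a rough illustration of the memory point (a generic sketch, not the code in this PR), processing IDs in fixed-size batches keeps only one batch in memory at a time. The names `batched` and `delete_in_batches` and the batch size of 500 are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of at most `size` items without materializing `items`."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def delete_in_batches(resource_ids: Iterable[str], size: int = 500) -> int:
    """Hypothetical driver: handle one batch at a time, return the total handled."""
    deleted = 0
    for batch in batched(resource_ids, size):
        # A real implementation would delete and de-index `batch` here.
        deleted += len(batch)
    return deleted
```

Because `batched` consumes a plain iterator, it can be fed a generator or a lazily evaluated query result rather than a fully loaded list.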

A few highlights:

  • the ResourceXResource.delete method now deletes a tile whose tile.data array has no related resources
  • adds a new kwarg to tile.save and tile.delete, recalculate_descriptors=True, so callers can skip descriptor recalculation; most of the time it isn't needed, especially since bulk indexing usually handles descriptor recalculation in its own block of code
  • Resource proxy model now has self.torelations = [] and self.fromrelations = [] properties for storing prefetched relations
  • Resource proxy model save method now calls self.save_descriptors and passes recalculate_descriptors=False to local tile.saves
  • index_database.optimize_resource_iteration now prefetches a resource's related resources
  • BulkDataDeletion.delete_resources uses optimize_resource_iteration instead of doing its own thing
  • reverse_edit_log_entries now uses iterators, bulk indexes tile deletion, and doesn't index until necessary
  • a failed load.reverse now resets the load status to its original status instead of leaving it stuck in "unloading"
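
The recalculate_descriptors highlights above can be sketched roughly as follows. This is a simplified stand-in, not the actual Arches models: the `Tile` and `Resource` classes here are plain Python objects, and the counters exist only to show how often the expensive step runs.

```python
class Tile:
    """Stand-in for a tile; counts how often descriptors are recalculated."""

    recalculations = 0  # class-level counter, for illustration only

    def __init__(self, data):
        self.data = data
        self.saved = False

    def save(self, recalculate_descriptors=True):
        self.saved = True
        if recalculate_descriptors:
            # The expensive step that callers can now opt out of.
            Tile.recalculations += 1


class Resource:
    """Stand-in for the resource proxy model."""

    def __init__(self, tiles):
        self.tiles = tiles
        self.descriptor_saves = 0

    def save_descriptors(self):
        self.descriptor_saves += 1

    def save(self):
        # Skip per-tile recalculation, then compute descriptors once.
        for tile in self.tiles:
            tile.save(recalculate_descriptors=False)
        self.save_descriptors()
```

With this shape, saving a resource with N tiles does one descriptor pass instead of N, while a standalone `tile.save()` keeps the old default behavior.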

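The status reset for a failed load.reverse can be sketched as below. This is a minimal illustration under assumptions: `Load`, `reverse_load`, and the `undo` callable are hypothetical, and the final status name "unloaded" is assumed; only the "unloading" status comes from the description above.

```python
class Load:
    """Hypothetical stand-in for a load event record."""

    def __init__(self, status):
        self.status = status


def reverse_load(load, undo):
    """Run `undo(load)`; restore the original status if it raises."""
    original_status = load.status
    load.status = "unloading"
    try:
        undo(load)
    except Exception:
        # Do not leave the load stuck in "unloading" after a failure.
        load.status = original_status
        raise
    load.status = "unloaded"  # assumed name for the final status
```

Re-raising after the reset keeps the failure visible to the caller while leaving the load in a state from which the reversal can be retried.
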
Issues Solved

Addresses #12015

Checklist

  • I targeted one of these branches:
    • dev/8.1.x (under development): features, bugfixes not covered below
    • dev/8.0.x (under development): features, bugfixes not covered below
    • dev/7.6.x (main support): regressions, crashing bugs, security issues, major bugs in new features
    • dev/6.2.x (extended support): major security issues, data loss issues
  • I added a changelog in arches/releases
  • I submitted a PR to arches-docs (if appropriate)
  • Unit tests pass locally with my changes
  • I added tests that prove my fix is effective or that my feature works
  • My test fails on the target branch

Accessibility Checklist

Developer Guide

Topic Changed Retested
Color contrast
Form fields
Headings
Links
Keyboard
Responsive Design
HTML validation
Screen reader

Ticket Background

  • Sponsored by:
  • Found by: @
  • Tested by: @
  • Designed by: @

Further comments

@whatisgalen whatisgalen changed the title from "Gvm/bulk delete improvements" to "bulk delete speed improvements" May 3, 2025
@whatisgalen whatisgalen changed the title from "bulk delete speed improvements" to "BDM bulk delete speed improvements" May 3, 2025
@whatisgalen whatisgalen marked this pull request as draft May 3, 2025 16:35
@whatisgalen whatisgalen marked this pull request as ready for review May 8, 2025 14:30
@whatisgalen
Member Author

I didn't run a numerical comparison, but reversing loads is significantly faster: somewhere between 10x and 50x.

@whatisgalen whatisgalen requested a review from aarongundel May 23, 2025 22:49
@whatisgalen whatisgalen linked an issue May 24, 2025 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

speed improvement to bulk deletion of resources

3 participants