
Conversation

@juanmanzojr (Contributor) commented Oct 29, 2025

This PR changes how the limit parameter is used in the InactiveUserDeleter job to prevent a large number of users from being deleted in a single run.

Previously, a dry run of the job completely locked up the machine it ran on: because we have so many users considered inactive, the inactive_users.size check ran a very expensive query over the whole set. This PR removes that safety constraint and instead enforces the limit with a break inside the loop, so we no longer run the expensive count but still cap how many users can be deleted per run.
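Roughly, the shape of the change looks like this (a sketch based on the snippets quoted later in this thread; the exact behavior of the old guard when the limit was exceeded is not shown here):

# Before (sketch): the safety constraint counted every inactive user up front.
# .size on an unloaded relation runs a COUNT over the whole matching set.
total_size = inactive_users.size
if total_size > limit
  # bail out; exact behavior not shown in this thread
end

# After (sketch): the cap is applied in the query itself, so the full set of
# inactive users is never counted or loaded in one go.
inactive_users = Queries::User::Inactive.
  call(inactive_since: inactive_since).
  limit(@query_limit)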

This will only really occur for this first batch of inactive users since we have so many.

Follow-up work: bring back the safety constraint?


@carl-codeorg (Contributor) left a comment

Now that I look at it, having two different limit parameters feels confusing to me. What do you think about simplifying it to just having the single limit and removing the safety constraint limit? It also seems to me that if we're artificially limiting the query that would trigger the safety constraint to a number lower than what would actually trigger it, then the safety check becomes useless. What do you think?

@carl-codeorg (Contributor) left a comment

Also, if we were already calling account_batch = inactive_users.limit(BATCH_SIZE), I wonder if the limit in the query itself would even help? Starting to second guess whether it's the query, or if the logic in the loop needs to be refactored.

@juanmanzojr (Contributor, Author) commented Oct 29, 2025

> Now that I look at it, having two different limit parameters feels confusing to me. What do you think about simplifying it to just having the single limit and removing the safety constraint limit? It also seems to me that if we're artificially limiting the query that would trigger the safety constraint to a number lower than what would actually trigger it, then the safety check becomes useless. What do you think?

Yeah, I agree.
If we use just one limit, it effectively makes the safety constraint useless for sizes under 8,000. I know we mentioned an estimate of how many inactive users we'd delete per run, but I've forgotten it. If we remove the safety constraint and rely on the query limit, are you suggesting there would then be no point in account_batch = inactive_users.limit(BATCH_SIZE)? If we have a large limit, say 100,000, would batching not be beneficial?

@carl-codeorg (Contributor) commented

> Now that I look at it, having two different limit parameters feels confusing to me. What do you think about simplifying it to just having the single limit and removing the safety constraint limit? It also seems to me that if we're artificially limiting the query that would trigger the safety constraint to a number lower than what would actually trigger it, then the safety check becomes useless. What do you think?

> Yeah, I agree. If we use just one limit, it effectively makes the safety constraint useless for sizes under 8,000. I know we mentioned an estimate of how many inactive users we'd delete per run, but I've forgotten it. If we remove the safety constraint and rely on the query limit, are you suggesting there would then be no point in account_batch = inactive_users.limit(BATCH_SIZE)? If we have a large limit, say 100,000, would batching not be beneficial?

I think batching could still potentially be useful. I was more getting at the safety constraint not being beneficial if we have another limit on the query prior to batching. Before we process any users, the whole query would return millions of records, so if we use a reasonable safety limit it'll always fail, and if we artificially limit the query to something much lower, say 100k per run, it'll never hit that safety limit. I was thinking that, at least for the initial large group of users, it'd be more useful to just have one upper limit on the number of records processed per run of the job, but keep that limit low enough to avoid the resource consumption issue from the dry run last week.

But now I'm second guessing whether the limit approach is actually going to fix that issue.

@artem-vavilov (Member) left a comment

Could you please share more details about this change? Was it made due to query timeouts or another reason?

# @param limit [Integer] The maximum number of accounts to delete in a single run.
# This is a safety limit to prevent accidental deletion of too many accounts.
def initialize(dry_run: false, inactive_since: nil, limit: ACCOUNT_DELETION_LIMIT)
def initialize(dry_run: false, inactive_since: nil, query_limit: nil)
A reviewer (Member) commented:

What is the reason for renaming limit to query_limit?

@juanmanzojr (Contributor, Author) replied:

We renamed limit to query_limit because we wanted to apply the limit directly to the query rather than as a separate safety constraint after fetching all inactive users. Previously, the safety limit (limit) was meant to prevent accidentally deleting too many accounts, but in practice it wasn't very effective: the initial query could still return millions of records before that limit was enforced.
This change also makes the safety constraint somewhat redundant, since we're now controlling how many users can even be considered for deletion in the first place. I like the idea of having an upper limit on how many records are processed per run of the job.
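For illustration, a run capped this way might be kicked off roughly like this (a sketch; the entry-point method name and the 100,000 figure are assumptions, only the keyword arguments come from the new initialize signature):

# Hypothetical invocation (sketch): caps a single run at 100,000 accounts.
# The .run entry point is illustrative; only the keyword arguments match
# the initialize signature shown in the diff above.
InactiveUserDeleter.new(dry_run: true, query_limit: 100_000).run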

Queries::User::Inactive.
  call(inactive_since: inactive_since).
  where.not(id: processed_user_ids).
  limit(@query_limit)
@artem-vavilov (Member) commented Nov 3, 2025

The query_limit will be overridden by the BATCH_SIZE limit, meaning it has no real effect, and all matched accounts will be deleted in batches of 1,000 records:

loop do
  account_batch = inactive_users.limit(BATCH_SIZE)
  account_batch.each do |user|
    delete_user(user)
    self.num_accounts_deleted += 1
  rescue StandardError => exception
    self.num_errors += 1
    Honeybadger.notify(exception, context: {user_id: user.id})
    log_message("Error deleting user_id #{user.id}: #{exception.message}")
  ensure
    processed_user_ids << user.id
  end
  break if account_batch.size < BATCH_SIZE
end

We need to add another break condition based on num_accounts_deleted in the loop:

break if num_accounts_deleted >= query_limit
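Put together, the loop might end up looking roughly like this (a sketch; it assumes query_limit is stored in @query_limit as in the query above, and guards against the nil default from initialize):

loop do
  account_batch = inactive_users.limit(BATCH_SIZE)
  account_batch.each do |user|
    delete_user(user)
    self.num_accounts_deleted += 1
  rescue StandardError => exception
    self.num_errors += 1
    Honeybadger.notify(exception, context: {user_id: user.id})
    log_message("Error deleting user_id #{user.id}: #{exception.message}")
  ensure
    processed_user_ids << user.id
  end

  # New break condition (sketch): stop once the per-run cap is reached.
  # Guarded because query_limit defaults to nil in initialize.
  break if @query_limit && num_accounts_deleted >= @query_limit
  break if account_batch.size < BATCH_SIZE
end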

@juanmanzojr requested review from Copilot and removed request for carl-codeorg and Copilot, November 7, 2025 08:33
@juanmanzojr (Contributor, Author) commented

> Could you please share more details about this change? Was it made due to query timeouts or another reason?

Yeah, it was because the initial batch of inactive users is so large: when we ran this query, it locked up the machine it ran on. We need a way, at least for this first pass, to limit how many users we process per run, since the old check (total_size = inactive_users.size followed by if total_size > limit) was already running the expensive query. At least for the initial batch of inactive users, the limit constraint will not be very useful.

