Skip to content

Conversation

@damiendoligez
Copy link
Member

This PR is a prerequisite of #14324 .

Here is the original message for this patch:


Major GC work accounting fixes (oxcaml/oxcaml#3618)

The goal is to make the amount of "work" done by the major GC in a cycle match what the pacing logic expects:

  • Marking: work is the size of the reachable data
  • Sweeping: work is the size of the allocated data (reachable or not)

There are two fixes in this patch to make this closer to true:

  • Extra marking work done by caml_darken should count towards the total
  • Free blocks should not count for sweeping work

The former makes a noticeable difference on some weird programs, and a new test (darkening_work.ml) is added to demonstrate this. (It fails on all prior versions of OCaml, as far as I've tested). The latter doesn't really make much difference in practice, as the amount of free blocks in in-use pools tends to be low, but it makes the code closer to the theory.

------

Major GC work accounting fixes (oxcaml#3618)

The goal is to make the amount of "work" done by the major GC in a cycle match
what the pacing logic expects:

  - Marking: work is the size of the reachable data
  - Sweeping: work is the size of the allocated data (reachable or not)

There are two fixes in this patch to make this closer to true:

  - Extra marking work done by caml_darken should count towards the total
  - Free blocks should not count for sweeping work

The former makes a noticeable difference on some weird programs, and a new test
(darkening_work.ml) is added to demonstrate this. (It fails on all prior
versions of OCaml, as far as I've tested). The latter doesn't really make much
difference in practice, as the amount of free blocks in in-use pools tends to
be low, but it makes the code closer to the theory.
@damiendoligez damiendoligez changed the title This is a rebase of oxcaml/oxcaml#3618 This is a rebase of oxcaml#3618 Oct 30, 2025
@damiendoligez damiendoligez changed the title This is a rebase of oxcaml#3618 GC work accounting fixes (oxcaml#3618) Oct 30, 2025
@gasche
Copy link
Member

gasche commented Oct 31, 2025

@damiendoligez @stedolan I understand the small book-keeping fixes, but I have two related question:

  1. What is the reason for separating the tracking of "work done between slices" in two separate counters, one for marking work and one for sweeping work? From a distance it looks like both counters end up added to work_counter anyway. Is the fact that they are added at different times important?

  2. What is the intended usage discipline for work_done_between_slices() functions and commit_major_slice_work? From a distance it looks like you consistently ask mark_work_done_between_slices() before committing major slice marking work, but you do not consistently ask sweep_work_done_between_slices() before committing major slice sweepking work. I don't understand if this discrepancy is intended.

Naively I would expect one of the following:

a. commit_major_slice_work is in charge of adding both mark_work_done_between_slices and sweep_work_done_between_slices, so its callers do not need to take care of it, or

b. commit_major_slice_work takes an extra work_between_slices parameter, and the caller passes either mark_work_done_between_slices() or sweep_work_done_between_slices() (or 0 in some odd cases), so they are called consistently at the same moment and in the same way.

@stedolan
Copy link
Contributor

stedolan commented Nov 3, 2025

  1. What is the reason for separating the tracking of "work done between slices" in two separate counters, one for marking work and one for sweeping work? From a distance it looks like both counters end up added to work_counter anyway. Is the fact that they are added at different times important?

At the moment that's true. The next GC pacing patch uses separate units for mark work and sweep work, and separating the "between slices" counters here makes that easier.

  1. What is the intended usage discipline for work_done_between_slices() functions and commit_major_slice_work? From a distance it looks like you consistently ask mark_work_done_between_slices() before committing major slice marking work, but you do not consistently ask sweep_work_done_between_slices() before committing major slice sweepking work. I don't understand if this discrepancy is intended.

There are two places where mark_work_done_between_slices() is called outside of commit_major_slice_work:

      /* It is possible to call caml_darken directly during marking,
         if we e.g. discover a continuation and mark its stack.
         This work should count towards this slice */
      work_done += mark_work_done_between_slices();
...
          /* caml_darken is called by ephe_mark, so count the work it does */
          work_done += mark_work_done_between_slices();

These places are where some work has been incorrectly attributed to "work done between slices" despite being part of the slice, because of the structure of caml_darken (which is used both from inside and outside slices). Sweeping does not cause such situations.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am okay with the proposed changes, approved.

@gasche gasche merged commit 1a7c388 into ocaml:trunk Nov 3, 2025
32 of 33 checks passed
@damiendoligez damiendoligez deleted the gc-pacing-ephe-accounting branch November 3, 2025 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants