Skip to content

Conversation

@weihanglo
Copy link
Member

@weihanglo weihanglo commented Nov 28, 2025

What does this PR try to resolve?

This is an unstable feature that we designed to fix several performance problems with the old system:

  1. You couldn't easily build crate docs in hermetic environments. This doesn't matter for Cargo, but it was one of the original reasons to implement the feature.
  2. We have to build all the doc resources in their final form at every step, instead of delaying slow parts (mostly the search index) until the end and only doing them once.
  3. It requires rustdoc to take a lock at the end. This reduces available concurrency for generating docs.

Part of

How to test and review this PR?

Design decisions and questions:

Simple benchmark

  • Tested against this PR on top of 15fde07
  • rustdoc 1.93.0-nightly (1be6b13be 2025-11-26)
  • macOS Sequoia 15.7.1
  • Apple M1 Pro 2021
  • 32GB RAM; 10 cores
  • 300 crate doc merge
  • Test steps:
     cargo check
     cargo doc
     cargo clean --doc && rm -rf target/debug/build/**/deps/*.json
     cargo doc -Zrustdoc-mergeable-info
     cargo clean --doc && rm -rf target/debug/build/**/deps/*.json

With -Zrustdoc-mergeable-info:

  • <40s: ~30s documenting time + 3-5s merge time
  • 76262 files, 970.5MiB total

Without -Zrustdoc-mergeable-info:

  • ~500s: 8m10s - 8m30s
  • 75242 files, 932.8MiB total

@rustbot
Copy link
Collaborator

rustbot commented Nov 28, 2025

r? @epage

rustbot has assigned @epage.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-build-execution Area: anything dealing with executing the compiler A-cfg-expr Area: Platform cfg expressions A-documenting-cargo-itself Area: Cargo's documentation A-layout Area: target output directory layout, naming, and organization A-testing-cargo-itself Area: cargo's tests A-unstable Area: nightly unstable support Command-doc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 28, 2025
assert!(unit.mode.is_doc());
assert!(self.metas.contains_key(unit));
let dir = self.pkg_dir(unit);
self.layout(unit.kind).build_dir().doc_parts(&dir)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cargo clean need to take care of this. I don't because there will be conflicts and I'd rather waiting for them merging.

cmd.arg("-o")
.arg(out_dir)
.arg("-Zunstable-options")
.arg("--merge=finalize");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't add --emit=toolchain-shared-resources here. See https://github.com/rust-lang/cargo/pull/16167/files#r2570359518.

}

/// Returns the directory where mergeable cross crate info for docs is stored.
pub fn doc_parts_dir(&self, unit: &Unit) -> PathBuf {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new intermediate build directory target/debug/doc-parts/. Doc parts of each doc unit will be put in a sub-directory of <crate>-<hash>/ for artifact isolation and rebuild detection purpose. Example full path to foo crate-info JSON file: target/debug/doc-parts/foo-12ab34cd/foo.json.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this fit in with the effort to re-organize build-dir?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#16309 (comment)
It is now a shared directory for all invocation to achieve the additive nature of rustdoc output.

How does this fit in with the effort to re-organize build-dir?

Given rustdoc immediate artifacts are pretty tied to its final arifacts, maybe we should have a package local private build dir?

The other way is to have a <build-dir>/<workspace-hash>/docdeps so it is not shared with other workspaces.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One alternative: <build-dir>/docdeps/<crate-name>/<hash>/crate-info.json

So that new layout can still be a per-unit layout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest commit 93ab48d always uses new layout for doc parts files on per-unit basis. It would look like <build-dir>/target/debug/build/foo-12ab34cd/deps/foo.json

The commits also starts storing information of previous rustdoc invocations in .rustdoc_fingerprint.json file.

@weihanglo weihanglo force-pushed the rustdoc-mergeable-info branch from 15fde07 to 8b5d4d8 Compare November 28, 2025 03:23

if ws.gctx().cli_unstable().rustdoc_mergeable_info {
let now = std::time::Instant::now();
for (kind, doc_merge_info) in compilation.doc_merge_info.iter() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • One doc-merge per target platform
  • and run in serial

I feel like these design decisions are quite straightforwards

  • Cargo already organizes docs by target platforms.
  • Multi-target is not common. If it becomes a performance issue, we can simply run in parallel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting this in the build graph would also solve this.

@weihanglo weihanglo force-pushed the rustdoc-mergeable-info branch from 8cf3ec7 to 93ab48d Compare December 3, 2025 14:56
@rustbot rustbot added the A-rebuild-detection Area: rebuild detection and fingerprinting label Dec 3, 2025
@rustbot

This comment has been minimized.

github-merge-queue bot pushed a commit that referenced this pull request Dec 3, 2025
### What does this PR try to resolve?

Previously if rustc version mismatches,
Cargo removes all `doc` directories including target platforms that are
not part of the build.

This PR makes it `--target` aware and stops excessive cleanup,
by putting `.rustdoc_fingerprint.json` in each target platform
directory.

### How to test and review this PR?

I'd like to reuse this file to track doc parts directories in
<#16309>,
and noticed that this file is for the entire workspace rather than
per-target,
hich is not compatible with mergeable cross-crate info JSONs.

For concurrent safety,
`build-dir` should be locked already since Cargo locks every intent
except check (see <#16307>).
This file is touched only under `UserIntent::Doc`.
Co-authored-by: Michael Howell <[email protected]>
Co-authored-by: Weihang Lo <[email protected]>
@weihanglo weihanglo force-pushed the rustdoc-mergeable-info branch from 93ab48d to 80bf923 Compare December 3, 2025 18:35
@rustbot
Copy link
Collaborator

rustbot commented Dec 3, 2025

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@weihanglo weihanglo force-pushed the rustdoc-mergeable-info branch from 80bf923 to 07c4885 Compare December 3, 2025 19:45
Copy link
Member Author

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ready for another round of review.

View changes since this review

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented on a random file to make it a thread

cc @notriddle

I've discovered some panics in rustdoc. Here are reproducible steps (you need to build cargo from source from this PR):

rustup run nightly target/release/cargo check -Zrustdoc-mergeable-info -p cargo

# `--merge=finalize` took 3secs in this call
rustup run nightly target/release/cargo doc -Zrustdoc-mergeable-info -p cargo

# `--merge=finalize` took 1mins in this call
rustup run nightly target/release/cargo doc -Zrustdoc-mergeable-info -p cargo-utils-schemas

# `rustdoc --merge=none` failed on random crates
rustup run nightly target/release/cargo doc -Zrustdoc-mergeable-info -p cargo

Rustdoc version:

rustdoc 1.93.0-nightly (646a3f8c1 2025-12-02)

I feel like it is search index on disk are not prepared for reuse across multiuple --merge=finalize. The code panicked at https://github.com/rust-lang/rust/blob/83e49b75e7daf827e4390ae0ccbcb0d0e2c96493/src/librustdoc/html/render/search_index.rs#L1347.

Also, mind the 1min slow merge in the second cargo doc call.

click to see the log
Caused by:
  process didn't exit successfully: `/Users/user/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/rustdoc --edition=2018 --crate-type lib --crate-name smallvec /Users/user/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/smallvec-1.15.1/src/lib.rs --cap-lints allow -o /Users/user/dev/cargo-master/target/doc --cfg 'feature="const_generics"' --cfg 'feature="write"' --check-cfg 'cfg(docsrs,test)' --check-cfg 'cfg(feature, values("arbitrary", "bincode", "const_generics", "const_new", "debugger_visualizer", "drain_filter", "drain_keep_rest", "impl_bincode", "malloc_size_of", "may_dangle", "serde", "specialization", "union", "unty", "write"))' --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --diagnostic-width=177 --emit=invocation-specific -Zunstable-options --merge=none --parts-out-dir=/Users/user/dev/cargo-master/target/debug/build/smallvec-8f6fc8f61133a3c6/deps -C metadata=39fc2b80d6a50c9c -L dependency=/Users/user/dev/cargo-master/target/debug/deps --crate-version 1.15.1` (exit status: 101)

thread 'rustc' (107763199) panicked at src/librustdoc/html/render/search_index.rs:1347:55:
index out of bounds: the len is 7516184 but the index is 7526966
stack backtrace:
   0:        0x10f1930ac - <<std[d4ca2ac3ebc8103a]::sys::backtrace::BacktraceLock>::print::DisplayBacktrace as core[b5ad265dbd7a4861]::fmt::Display>::fmt
   1:        0x10c65a8d4 - core[b5ad265dbd7a4861]::fmt::write
   2:        0x10f1a9f78 - <std[d4ca2ac3ebc8103a]::sys::stdio::unix::Stderr as std[d4ca2ac3ebc8103a]::io::Write>::write_fmt
   3:        0x10f16a658 - std[d4ca2ac3ebc8103a]::panicking::default_hook::{closure#0}
   4:        0x10f185cdc - std[d4ca2ac3ebc8103a]::panicking::default_hook
   5:        0x10d1c1720 - std[d4ca2ac3ebc8103a]::panicking::update_hook::<alloc[1bda16a0b8fec711]::boxed::Box<rustc_driver_impl[78050a7ccd3dc9d6]::install_ice_hook::{closure#1}>>::{closure#0}
   6:        0x10f186040 - std[d4ca2ac3ebc8103a]::panicking::panic_with_hook
   7:        0x10f16a700 - std[d4ca2ac3ebc8103a]::panicking::panic_handler::{closure#0}
   8:        0x10f15f43c - std[d4ca2ac3ebc8103a]::sys::backtrace::__rust_end_short_backtrace::<std[d4ca2ac3ebc8103a]::panicking::panic_handler::{closure#0}, !>
   9:        0x10f16bd18 - __rustc[2b496e5da265df0c]::rust_begin_unwind
  10:        0x111edf55c - core[b5ad265dbd7a4861]::panicking::panic_fmt
  11:        0x111edf370 - core[b5ad265dbd7a4861]::panicking::panic_bounds_check
  12:        0x1008406a0 - rustdoc[7ccbdf937b02d493]::html::render::write_shared::write_shared
  13:        0x100783690 - <rustdoc[7ccbdf937b02d493]::html::render::context::Context>::init
  14:        0x100717c64 - rustdoc[7ccbdf937b02d493]::main_args::{closure#2}::{closure#0}
  15:        0x10070e5bc - rustc_interface[2208a15c44c71837]::interface::run_compiler::<(), rustdoc[7ccbdf937b02d493]::main_args::{closure#2}>::{closure#1}
  16:        0x10067df54 - std[d4ca2ac3ebc8103a]::sys::backtrace::__rust_begin_short_backtrace::<rustc_interface[2208a15c44c71837]::util::run_in_thread_with_globals<rustc_interface[2208a15c44c71837]::util::run_in_thread_pool_with_globals<rustc_interface[2208a15c44c71837]::interface::run_compiler<(), rustdoc[7ccbdf937b02d493]::main_args::{closure#2}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}, ()>
  17:        0x10073be30 - <std[d4ca2ac3ebc8103a]::thread::lifecycle::spawn_unchecked<rustc_interface[2208a15c44c71837]::util::run_in_thread_with_globals<rustc_interface[2208a15c44c71837]::util::run_in_thread_pool_with_globals<rustc_interface[2208a15c44c71837]::interface::run_compiler<(), rustdoc[7ccbdf937b02d493]::main_args::{closure#2}>::{closure#1}, ()>::{closure#0}, ()>::{closure#0}::{closure#0}, ()>::{closure#1} as core[b5ad265dbd7a4861]::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}
  18:        0x10f190784 - <std[d4ca2ac3ebc8103a]::sys::thread::unix::Thread>::new::thread_start
  19:        0x1828ffbc8 - __pthread_cond_wait

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weihanglo weihanglo force-pushed the rustdoc-mergeable-info branch 2 times, most recently from 02db666 to f2bd8dd Compare December 4, 2025 04:47
Comment on lines +146 to +147
// rustdoc needs to read doc parts files from build dir
build_dir.open_ro_shared_create(".cargo-lock", ws.gctx(), "build directory")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocker: technically there is a race condition between the build and this but the window is small and likely the users fault if they have a cargo clean running in parallel, blocked on the build.

This would be solved by us moving the merge into the build graph

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was noticed but agreed it is user's fault. Tracked in the issue

This is an unstable feature that we designed to fix several
performance problems with the old system:

1. You couldn't easily build crate docs in hermetic environments.
   This doesn't matter for Cargo, but it was one of the original
   reasons to implement the feature.
2. We have to build all the doc resources in their final form at
   every step, instead of delaying slow parts (mostly the
   search index) until the end and only doing them once.
3. It requires rustdoc to take a lock at the end. This reduces
   available concurrency for generating docs.

A nightly feature `-Zrustdoc-mergeable-info` is added.

Co-authored-by: Michael Howell <[email protected]>
Co-authored-by: Weihang Lo <[email protected]>
@epage epage enabled auto-merge December 4, 2025 14:44
@epage epage added this pull request to the merge queue Dec 4, 2025
Merged via the queue into rust-lang:master with commit 1d3fb78 Dec 4, 2025
26 checks passed
@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 4, 2025
bors added a commit to rust-lang/rust that referenced this pull request Dec 5, 2025
Update cargo submodule

9 commits in bd979347d814dfe03bba124165dbce9554d0b4d8..2c283a9a5c5968eeb9a8f12313f04feb1ff8dfac
2025-12-02 16:03:50 +0000 to 2025-12-04 16:47:28 +0000
- fix(publish): Move `.crate` out of final artifact location (rust-lang/cargo#15915)
- Remove legacy tmpdir support (rust-lang/cargo#16342)
- Run clippy CI on more targets (rust-lang/cargo#16340)
- feat: support for rustdoc mergeable cross-crate info (rust-lang/cargo#16309)
- fix(timings): unlocked -&gt; unblocked (rust-lang/cargo#16337)
- fix(layout): Put examples in their unit dir in new layout (rust-lang/cargo#16335)
- fix(frontmatter): Restrict code fence length  (rust-lang/cargo#16334)
- Update resolver.md: fix compile errors in pseudocode (rust-lang/cargo#16333)
- fix(fingerprint): clean doc dirs for only requested targets (rust-lang/cargo#16331)

This is best being merged before beta cutoff because rust-lang/cargo#16337 is a nightly regression.
@rustbot rustbot added this to the 1.93.0 milestone Dec 5, 2025
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Dec 5, 2025
Update cargo submodule

9 commits in bd979347d814dfe03bba124165dbce9554d0b4d8..2c283a9a5c5968eeb9a8f12313f04feb1ff8dfac
2025-12-02 16:03:50 +0000 to 2025-12-04 16:47:28 +0000
- fix(publish): Move `.crate` out of final artifact location (rust-lang/cargo#15915)
- Remove legacy tmpdir support (rust-lang/cargo#16342)
- Run clippy CI on more targets (rust-lang/cargo#16340)
- feat: support for rustdoc mergeable cross-crate info (rust-lang/cargo#16309)
- fix(timings): unlocked -&gt; unblocked (rust-lang/cargo#16337)
- fix(layout): Put examples in their unit dir in new layout (rust-lang/cargo#16335)
- fix(frontmatter): Restrict code fence length  (rust-lang/cargo#16334)
- Update resolver.md: fix compile errors in pseudocode (rust-lang/cargo#16333)
- fix(fingerprint): clean doc dirs for only requested targets (rust-lang/cargo#16331)

This is best being merged before beta cutoff because rust-lang/cargo#16337 is a nightly regression.
@weihanglo weihanglo deleted the rustdoc-mergeable-info branch December 6, 2025 04:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-build-execution Area: anything dealing with executing the compiler A-cfg-expr Area: Platform cfg expressions A-documenting-cargo-itself Area: Cargo's documentation A-layout Area: target output directory layout, naming, and organization A-rebuild-detection Area: rebuild detection and fingerprinting A-testing-cargo-itself Area: cargo's tests A-unstable Area: nightly unstable support Command-doc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants