@qwang98 qwang98 commented Nov 27, 2025

Reproduces the bug as described in this comment: powdr-labs/openvm#50 (comment)

Note that the CUDA memory error occurs, "very unrelatedly", in RangeTupleChecker.

It is also non-deterministic and shows up only in some runs (you might need to run guest_prove_simple a few times before hitting it).
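
Because the failure is intermittent, a small retry harness can make reproduction less tedious. The following is a minimal sketch and not part of this PR: `first_failure` is a hypothetical helper, and its closure argument stands in for whatever invokes guest_prove_simple once. It assumes the failure surfaces as a Rust panic, as described in this thread.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Runs `attempt` up to `max_runs` times and returns the 1-based index of the
/// first run that panics, or `None` if every run completes normally.
/// Hypothetical helper: `attempt` stands in for one invocation of the prover.
pub fn first_failure<F: FnMut(usize)>(max_runs: usize, mut attempt: F) -> Option<usize> {
    for run in 1..=max_runs {
        // catch_unwind converts a panic in this run into an Err,
        // so the loop can report which run failed instead of aborting.
        let result = catch_unwind(AssertUnwindSafe(|| attempt(run)));
        if result.is_err() {
            return Some(run);
        }
    }
    None
}
```

The loop stops at the first failure on purpose: after an illegal memory access the CUDA context is generally unusable, so further runs in the same process would not be meaningful.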

Comment on lines +260 to +268
// let mut output = DeviceMatrix::<BabyBear>::with_capacity(height, width);
use openvm_stark_backend::p3_field::FieldAlgebra;
let zeros = vec![BabyBear::ZERO; height * width];
let device_buffer = zeros
    .to_device()
    .expect("copy zero trace to device failed");
println!("output len: {}", device_buffer.len());
let mut output =
    DeviceMatrix::<BabyBear>::new(Arc::new(device_buffer), height, width);

Zeroing out the output buffer so that we can be sure any illegal memory access isn't caused by the cells being left empty.
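
As a host-side illustration of the distinction this comment relies on, here is a minimal sketch with no GPU involved. `u32` stands in for `BabyBear`, and both function names are hypothetical. It assumes the commented-out `with_capacity` call allocates without initializing, the way `Vec::with_capacity` does on the host; that has not been checked against `DeviceMatrix` itself.

```rust
/// Builds a `height x width` buffer with every cell set to zero.
/// Host-side stand-in: `u32` replaces `BabyBear` and no device copy happens.
pub fn zeroed_trace(height: usize, width: usize) -> Vec<u32> {
    vec![0u32; height * width]
}

/// Reserves room for the same number of cells but initializes none of them,
/// so the vector's length stays 0.
pub fn reserved_trace(height: usize, width: usize) -> Vec<u32> {
    Vec::with_capacity(height * width)
}
```

With the zeroed variant every cell has a known value before any kernel writes to it, which is what rules out "empty cells" as the cause of the illegal access.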

Comment on lines 230 to 232
if air_name == "VmAirWrapper<Rv32BaseAluAdapterAir, BaseAluCoreAir<4, 8>" {
    return None;
}

Skip dummy trace generation.

Comment on lines 288 to 290
if *air_name == "VmAirWrapper<Rv32BaseAluAdapterAir, BaseAluCoreAir<4, 8>" {
    return (airs, substitutions)
}

Skip creating the Subst, so the dummy trace isn't written to the APC trace and we end up with an incorrect APC trace. This should normally cause a panic in the prover, but in some runs it instead panics with a CUDA memory access error at RangeTupleCheckerChipGPU, which is exactly the bug this PR tries to reproduce.

qwang98 commented Dec 4, 2025

Closing as fixed.

@qwang98 qwang98 closed this Dec 4, 2025