ThreadId generation fallback path: avoid spurious yields #149481

bors merged 1 commit into rust-lang:main
Conversation
r? @ChrisDenton

rustbot has assigned @ChrisDenton.
library/std/src/thread/id.rs (outdated)
```diff
-while COUNTER_LOCKED.compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed).is_err() {
+// Miri doesn't like it when we yield here as it interferes with deterministically
+// scheduling threads, so use `compare_exchange` to avoid spurious yields.
+while COUNTER_LOCKED.compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed).is_err() {
```
In this case, I'd use swap(true) here, as it can generate better assembly on platforms that don't have a CAS instruction, but do have swap (like RISC-V, even though it always has 64-bit atomics if I understand correctly).
Force-pushed bc59557 to 1fccfa6 (compare).
Like this?

Yes, exactly!
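The resulting acquire loop presumably looks something like this (a sketch reconstructed from the diff and the suggestion above, not the exact std source; the function names and the yield in the loop body are assumptions):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

static COUNTER_LOCKED: AtomicBool = AtomicBool::new(false);

fn lock_counter() {
    // `swap` always stores, so it lowers to a single atomic exchange on
    // architectures (like RISC-V) that have a swap instruction but no native
    // compare-and-swap. Unlike `compare_exchange_weak`, it cannot fail
    // spuriously, so the loop body only runs under real contention.
    while COUNTER_LOCKED.swap(true, Ordering::Acquire) {
        // The real code yields here; this is what the Miri comment in the
        // diff above is about.
        std::thread::yield_now();
    }
}

fn unlock_counter() {
    COUNTER_LOCKED.store(false, Ordering::Release);
}
```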
ThreadId generation fallback path: avoid spurious yields

Fixes rust-lang/miri#4737
Alternative to rust-lang#149476
Cc @orlp @joboet
💔 Test failed - checks-actions
Force-pushed 1fccfa6 to 14830bf (compare).
I build-checked it this time. 🙈 @bors r=joboet
@joboet If there is interest, I also have this lock-free version:

```rust
#![feature(allocator_api)]
use std::alloc::System;
use std::ptr;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
struct State {
base_value: u64,
increment: AtomicUsize,
prev: AtomicPtr<State>,
}
impl State {
fn new_ptr(base_value: u64, prev: *mut State) -> *mut State {
Box::into_raw_with_allocator(Box::new_in(
State {
base_value,
increment: AtomicUsize::new(0),
prev: AtomicPtr::new(prev),
},
System,
))
.0
}
unsafe fn drop_ptr(mut state: *mut State) {
unsafe {
while state != ptr::null_mut() {
let next = (*state).prev.load(Ordering::Acquire);
unsafe { drop(Box::from_raw_in(state, System)) };
state = next;
}
}
}
}
// A lock-free 64-bit counter using just AtomicPtr and AtomicUsize.
pub struct LockFreeU64Counter {
state: AtomicPtr<State>,
in_progress_increments: AtomicUsize,
}
impl LockFreeU64Counter {
pub fn new() -> Self {
LockFreeU64Counter {
state: AtomicPtr::new(State::new_ptr(0, ptr::null_mut())),
in_progress_increments: AtomicUsize::new(0),
}
}
/// Increments the counter and returns the old value.
///
/// Returns None if the counter has reached its maximum value.
pub fn increment(&self) -> Option<u64> {
self.in_progress_increments.fetch_add(1, Ordering::Relaxed);
loop {
let current_state_ptr = self.state.load(Ordering::Acquire);
let current_state = unsafe { &*current_state_ptr };
let base_value = current_state.base_value;
let increment = current_state.increment.load(Ordering::Relaxed);
let cur_value = base_value.checked_add(increment as u64)?;
if increment == usize::MAX {
let new_state_ptr = State::new_ptr(cur_value, current_state_ptr);
let cx = self.state.compare_exchange(
current_state_ptr,
new_state_ptr,
Ordering::AcqRel,
Ordering::Relaxed,
);
if cx.is_err() {
unsafe {
(*new_state_ptr).prev.store(ptr::null_mut(), Ordering::Relaxed);
State::drop_ptr(new_state_ptr)
};
}
continue;
}
let cx = current_state.increment.compare_exchange(
increment,
increment + 1,
Ordering::Relaxed,
Ordering::Relaxed,
);
if cx.is_err() {
continue;
}
if self.in_progress_increments.fetch_sub(1, Ordering::AcqRel) == 1 {
if current_state.prev.load(Ordering::Relaxed) != ptr::null_mut() {
let prev = current_state.prev.swap(ptr::null_mut(), Ordering::Acquire);
unsafe { State::drop_ptr(prev) };
}
}
return Some(cur_value);
}
}
}
impl Drop for LockFreeU64Counter {
fn drop(&mut self) {
unsafe { State::drop_ptr(self.state.load(Ordering::Relaxed)) };
}
}
```

EDIT: as Ralf points out, this has a bug; working on a fix.
@orlp I think that has a potential UAF since you are using […]
@RalfJung I had already spotted another bug, but yes, that is also an issue; let me fix that.
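For readers following along: one interleaving under which the first version can free a `State` that another thread is still reading goes roughly as follows (my reconstruction of a possible race given the truncated comments above; it may not be the exact bug being discussed). The crux is that a thread decrements `in_progress_increments` before its final read of `current_state.prev`:

```text
Thread A                              Thread B
--------                              --------
fetch_add in_progress (count: 1)
loads current_state_ptr
CAS on increment succeeds
fetch_sub == 1 (count: 0)
                                      fetch_add in_progress (count: 1)
                                      sees increment == usize::MAX,
                                      installs a new State with
                                      prev = current_state_ptr
                                      CAS on the new increment succeeds
                                      fetch_sub == 1 (count: 0)
                                      new state's prev != null, so it
                                      swaps it out and drops the chain,
                                      freeing *current_state_ptr
reads current_state.prev  <-- use-after-free
```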
@RalfJung Please see the new implementation of the lock-free counter, which has a completely separate freelist:

```rust
#![feature(allocator_api)]
use std::alloc::System;
use std::cell::UnsafeCell;
use std::ptr;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
struct State {
base_value: u64,
increment: AtomicUsize,
// UnsafeCell because the freelist accesses this field exclusively while
// the rest of the state may still be accessed concurrently.
next_free: UnsafeCell<*mut State>,
}
impl State {
fn new_ptr(base_value: u64) -> *mut State {
Box::into_raw_with_allocator(Box::new_in(
State {
base_value,
increment: AtomicUsize::new(0),
next_free: UnsafeCell::new(ptr::null_mut()),
},
System,
))
.0
}
unsafe fn drop_ptr(state: *mut State) {
unsafe { drop(Box::from_raw_in(state, System)); }
}
}
// A lock-free 64-bit counter using just AtomicPtr and AtomicUsize.
pub struct LockFreeU64Counter {
state: AtomicPtr<State>,
in_progress_increments: AtomicUsize,
free_head: AtomicPtr<State>,
}
impl LockFreeU64Counter {
pub fn new() -> Self {
LockFreeU64Counter {
state: AtomicPtr::new(State::new_ptr(0)),
in_progress_increments: AtomicUsize::new(0),
free_head: AtomicPtr::new(ptr::null_mut()),
}
}
/// Increments the counter and returns the old value.
///
/// Returns None if the counter has reached its maximum value.
pub fn increment(&self) -> Option<u64> {
// Acquire to ensure we don't see an old state that was reclaimed.
self.in_progress_increments.fetch_add(1, Ordering::Acquire);
loop {
let current_state_ptr = self.state.load(Ordering::Acquire);
let current_state = unsafe { &*current_state_ptr };
let base_value = current_state.base_value;
let increment = current_state.increment.load(Ordering::Acquire);
let cur_value = base_value.checked_add(increment as u64)?;
if increment == usize::MAX {
let new_state_ptr = State::new_ptr(cur_value);
let cx = self.state.compare_exchange(
current_state_ptr,
new_state_ptr,
Ordering::AcqRel,
Ordering::Relaxed,
);
match cx {
Ok(_) => unsafe { self.add_to_freelist(current_state_ptr) },
Err(_) => unsafe { State::drop_ptr(new_state_ptr) },
}
continue;
}
let cx = current_state.increment.compare_exchange(
increment,
increment + 1,
Ordering::Release,
Ordering::Relaxed,
);
if cx.is_err() {
continue;
}
if self.in_progress_increments.fetch_sub(1, Ordering::AcqRel) == 1 {
// Pass base_value as the limit to avoid reclaiming states that were added after
// we established in_progress_increments == 0, which can still be accessed.
unsafe { self.try_reclaim_freelist(base_value) };
}
return Some(cur_value);
}
}
unsafe fn try_reclaim_freelist(&self, base_value_limit: u64) {
if self.free_head.load(Ordering::Relaxed) == ptr::null_mut() {
return;
}
unsafe {
let mut head = self.free_head.swap(ptr::null_mut(), Ordering::Acquire);
let mut filtered_head = ptr::null_mut();
while head != ptr::null_mut() {
let next = *(*head).next_free.get_mut();
if (*head).base_value < base_value_limit {
State::drop_ptr(head);
head = next;
} else {
*(*head).next_free.get_mut() = filtered_head;
filtered_head = head;
head = next;
}
}
if filtered_head != ptr::null_mut() {
self.add_to_freelist(filtered_head);
}
}
}
unsafe fn add_to_freelist(&self, state: *mut State) {
unsafe {
let mut state_tail = state;
while *(*state_tail).next_free.get_mut() != ptr::null_mut() {
state_tail = *(*state_tail).next_free.get_mut();
}
let mut head = self.free_head.load(Ordering::Acquire);
loop {
*(*state_tail).next_free.get_mut() = head;
match self.free_head.compare_exchange(
head,
state,
Ordering::Release,
Ordering::Acquire,
) {
Ok(_) => break,
Err(new_head) => head = new_head,
}
}
}
}
}
impl Drop for LockFreeU64Counter {
fn drop(&mut self) {
unsafe {
let mut head = *self.free_head.get_mut();
while head != ptr::null_mut() {
let next = *(*head).next_free.get_mut();
State::drop_ptr(head);
head = next;
}
State::drop_ptr(*self.state.get_mut());
}
}
}
```
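As a quick sanity check of the counter above, a smoke test might look like this (my own example, not from the PR; nightly only because of `allocator_api`, and it obviously cannot exercise the `usize::MAX` epoch rollover):

```rust
use std::collections::HashSet;
use std::sync::Arc;
use std::thread;

fn main() {
    // 8 threads x 10_000 increments: every returned value must be unique,
    // and together they must cover exactly 0..80_000.
    let counter = Arc::new(LockFreeU64Counter::new());
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                (0..10_000)
                    .map(|_| counter.increment().unwrap())
                    .collect::<Vec<u64>>()
            })
        })
        .collect();

    let mut seen = HashSet::new();
    for handle in handles {
        for value in handle.join().unwrap() {
            assert!(seen.insert(value), "value {value} handed out twice");
        }
    }
    assert_eq!(seen.len(), 8 * 10_000);
    assert!(seen.iter().all(|&v| v < 80_000));
}
```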
💥 Test timed out
@bors retry
☀️ Test successful - checks-actions
What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR. Comparing dfe1b8c (parent) -> 9b82a4f (this PR).

Test differences: 2 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard: run `cargo run --manifest-path src/ci/citool/Cargo.toml -- test-dashboard 9b82a4fffe2b215f488a6dfdc0b508235f37c85c --output-dir test-dashboard` and then open the generated dashboard.

How to interpret the job duration changes? Job durations can vary a lot, based on the actual runner instance.
Finished benchmarking commit (9b82a4f): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count: our most reliable metric, used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage): results (primary 2.5%). A less reliable metric; may be of interest, but not used to determine the overall result above.

Cycles: results (secondary 2.1%). A less reliable metric; may be of interest, but not used to determine the overall result above.

Binary size: results (primary 0.0%, secondary 0.0%). A less reliable metric; may be of interest, but not used to determine the overall result above.

Bootstrap: 472.475s -> 474.873s (0.51%)