Skip to content

Conversation

@yanglimingcn
Copy link
Contributor

What problem does this PR solve?

Issue Number: #3132

Problem Summary:

What is changed and the side effects?

Changed:

Side effects:

  • Performance effects:

  • Breaking backward compatibility:


Check List:

@yanglimingcn yanglimingcn force-pushed the bugfix/pending_signaled_wwrs_checking branch 2 times, most recently from 3fb155a to bb827cc Compare November 4, 2025 11:42
@wwbmmm wwbmmm requested a review from Copilot November 5, 2025 02:14
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements tracking of pending signaled RDMA send work requests (WRs) to optimize window management. The changes add a counter for signaled WRs and conditionally wake up the writing thread based on this counter to prevent premature thread switching.

  • Adds _pending_signaled_wrs counter to track signaled send WRs
  • Introduces WR ID generation for RDMA work requests
  • Updates window wake-up logic to consider pending signaled WRs

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/brpc/rdma/rdma_endpoint.h Adds atomic counter for pending signaled WRs
src/brpc/rdma/rdma_endpoint.cpp Implements WR ID generation, counter management, and conditional window wake-up logic

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

switch (wc.opcode) {
case IBV_WC_SEND: { // send completion
// Do nothing
if (wc.wr_id == 0) {
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition checks if wc.wr_id == 0, but GenerateWrId() explicitly generates non-zero IDs (comment at line 930 states '0 is an invalid Id'). The signaled WRs in CutFromIOBufList at line 880 don't set wr.wr_id, leaving it as 0 after memset at line 808. This logic appears inverted - it should decrement when wc.wr_id != 0 (for SendImm) or the counter should be incremented for signaled sends in CutFromIOBufList that have wr_id == 0.

Suggested change
if (wc.wr_id == 0) {
if (wc.wr_id != 0) {

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@chenBright chenBright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR causes the RdmaTest.send_rpcs_in_one_qp of brpc_rdma_unittest.cpp to fail.

uint32_t wnd_thresh = _local_window_capacity / 8;
if (_window_size.fetch_add(acks, butil::memory_order_relaxed) >= wnd_thresh
|| acks >= wnd_thresh) {
if (_pending_signaled_wrs.load(butil::memory_order_relaxed) <= 1 &&
Copy link
Contributor

@chenBright chenBright Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When _pending_signaled_wrs is greater than 1, _window_size is not updated, and acks will be lost.


ibv_send_wr wr;
memset(&wr, 0, sizeof(wr));
wr.wr_id = GenerateWrId();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be sufficient to set wr_id to a fixed value for Imm?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, just use a fixed value

@yanglimingcn yanglimingcn force-pushed the bugfix/pending_signaled_wwrs_checking branch 6 times, most recently from d31e8c3 to f7f03e6 Compare November 5, 2025 08:39
return -1;
}

if (signaled) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line, I'm not sure if send cqe can generate energy after this atomic operation.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// The number of pending signaled send WRs
butil::atomic<uint16_t> _pending_signaled_wrs;
// The number of pending acks
uint16_t _pending_acks;
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _pending_acks member is not thread-safe (plain uint16_t instead of atomic). In polling mode, multiple poller threads can call HandleCompletion concurrently for the same endpoint, leading to race conditions on lines 997, 1016, and 1018 where this variable is read and written without synchronization.

Suggested change
uint16_t _pending_acks;
butil::atomic<uint16_t> _pending_acks;

Copilot uses AI. Check for mistakes.

ssize_t RdmaEndpoint::HandleCompletion(ibv_wc& wc) {
bool zerocopy = FLAGS_rdma_recv_zerocopy;
uint16_t pending_signaled_wrs = 2;
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable pending_signaled_wrs is initialized to 2, which appears to be a magic number. This initialization value doesn't reflect the actual number of pending signaled WRs and relies on being overwritten in the IBV_WC_SEND case. Consider initializing it to the actual value from _pending_signaled_wrs or using a constant with a descriptive name to clarify its purpose as a sentinel value.

Suggested change
uint16_t pending_signaled_wrs = 2;
uint16_t pending_signaled_wrs = _pending_signaled_wrs.load(butil::memory_order_relaxed);

Copilot uses AI. Check for mistakes.
@yanglimingcn yanglimingcn force-pushed the bugfix/pending_signaled_wwrs_checking branch 4 times, most recently from 583a2af to 078e868 Compare November 8, 2025 04:12
@yanglimingcn yanglimingcn force-pushed the bugfix/pending_signaled_wwrs_checking branch from 078e868 to ab7debe Compare November 8, 2025 12:26
Copy link
Contributor

@chenBright chenBright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use two windows: one representing the local sending window size and the other representing the remote receiving window size. Sending should only occur if both window values ​​are greater than 0.

See #3145 . We encountered the same problem, and the above method solved it.

<< wc.opcode;
return -1;
}
if (pending_signaled_wrs <= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that _window_size may still be larger than the actual sq size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give an example to describe it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the client sends _local_window_capacity / 4 + 1 requests, it receives an ACK and a send CQE from the server. At this point, _window_size increases by _local_window_capacity / 4 + 1, and sq size increases by _local_window_capacity / 4.

The issue of inconsistency between _window_size and sq size still exists.

@yanglimingcn
Copy link
Contributor Author

I think we should use two windows: one representing the local sending window size and the other representing the remote receiving window size. Sending should only occur if both window values ​​are greater than 0.

See #3145 . We encountered the same problem, and the above method solved it.

Yes, I think your PR is better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants