[improve][pip] PIP-429: Optimize Handling of Compacted Last Entry by Skipping Payload Buffer Parsing #24439
I prefer to make the changes on the server side, for several reasons:
- The broker can handle this issue well and provide a compatible solution.
- Adding a flag for the valid compacted position avoids deserializing and decompressing the entire batch message metadata header.
- Making this change on the client side increases client complexity, and every language client would need to change.
- It's weird to return EntryBuffer when getting the last message ID.
- Actually, Pulsar only needs to retain valid message data in a compacted entry, but it retains all compacted messages with an "empty header" and "empty payload".
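The flag idea in the list above can be sketched as follows. This is a minimal illustration only: `lastValidBatchIndex` is a hypothetical field name, not the actual Pulsar metadata schema, and the fallback for entries written without the flag is an assumption about what a compatible solution might look like.

```java
import java.util.OptionalInt;

class LastValidIndexFlag {
    // numMessagesInBatch: the batch size recorded in the batch-level metadata.
    // lastValidBatchIndex: HYPOTHETICAL flag written at compaction time;
    // empty for entries produced before such a flag existed.
    static int lastBatchIndex(int numMessagesInBatch, OptionalInt lastValidBatchIndex) {
        // With the flag present, no per-message metadata has to be parsed.
        // Without it, fall back to the last index of the batch (assumed legacy behavior).
        return lastValidBatchIndex.orElse(numMessagesInBatch - 1);
    }
}
```

The point of the sketch is that the answer comes from batch-level metadata alone, so the payload buffer never needs to be decompressed or deserialized.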
Please check the
Yes. Unfortunately, it's limited by the client-side logic. The default logic for parsing the payload buffer looks like this:
int batchSize = msgMetadata.getNumMessagesInBatch();
for (int i = 0; i < batchSize; i++) {
    int batchIndex = i;
    final var singleMessageMetadata = parse(payload);
    if (singleMessageMetadata.isCompactedOut()) {
        break;
    }
    // Create a message, whose batch index is i, from the payload buffer
}
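To make the cost of that loop concrete, here is a small self-contained sketch that mirrors it. The `boolean[]` is a hypothetical stand-in for the per-message `compactedOut` flags that the real client only learns by parsing each single-message metadata out of the payload buffer; the class and method names are illustrative, not Pulsar APIs.

```java
class CompactedBatchScan {
    // Mirrors the quoted loop: walk the batch in order and stop at the first
    // message marked as compacted out. Returns the index of the last message
    // reached before stopping, or -1 if none was reached.
    static int lastReadableIndex(boolean[] compactedOut) {
        int last = -1;
        for (int i = 0; i < compactedOut.length; i++) {
            // In the real client, this check requires parsing the
            // single-message metadata from the payload buffer.
            if (compactedOut[i]) {
                break;
            }
            last = i;
        }
        return last;
    }
}
```

Under these assumptions, finding the last readable batch index takes one metadata parse per message scanned, which is the work the proposal aims to skip.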
This makes sense to me. Adding a new field to |
If our purpose is to ensure that the compaction task succeeds, we only need to check If we need to make the Pulsar reader read Kafka-format data, then we need this change. |
No. We need to ensure the The consumer is able to configure a |
Co-authored-by: Penghui Li <[email protected]>
dao-jun left a comment
LGTM, good change!
…Skipping Payload Buffer Parsing (apache#24439)