[Bugfix][Connector-V2] Doris sink check load error before stopLoad to interrupt blocking poll() in RecordBuffer #10083
Purpose of this pull request
Problem
The `stopBufferData()` method in `RecordBuffer` can block indefinitely in a `while` loop waiting for a buffer from `writeQueue.poll(100ms)` to send an EOF marker. The `checkErrorMessageByStreamLoad()` check inside the loop never receives an error signal to interrupt the wait, even when Doris FE has already returned an error response.

[Image: flink jstack]
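The wait described above can be modeled in a few lines. This is a minimal sketch under stated assumptions, not the connector's code: the class `EofWaitSketch`, the method `awaitEofBuffer`, the `loadError` supplier, the `maxWaitMs` deadline, and the use of `IllegalStateException` are all hypothetical names introduced for illustration. The deadline exists only so the demonstration terminates; per the description, the loop in the PR has no such bound.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical, simplified model of the EOF wait; not the actual RecordBuffer class.
class EofWaitSketch {
    // Buffers that are free to be written into. Empty when the pool is exhausted.
    final LinkedBlockingDeque<ByteBuffer> writeQueue = new LinkedBlockingDeque<>();
    // Stand-in for the error check inside the loop; returns null when no error is known.
    private final Supplier<String> loadError;

    EofWaitSketch(Supplier<String> loadError) {
        this.loadError = loadError;
    }

    // Polls writeQueue in 100 ms slices until a buffer arrives, an error is reported,
    // or the demo-only deadline passes (returns null in that last case).
    ByteBuffer awaitEofBuffer(long maxWaitMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMs);
        ByteBuffer buffer = null;
        while (buffer == null) {
            String error = loadError.get();
            if (error != null) {
                // An error signal is the only way out when no buffer is ever recycled.
                throw new IllegalStateException(error);
            }
            if (System.nanoTime() - deadline >= 0) {
                return null; // demo-only exit; without it this loop would spin forever
            }
            try {
                buffer = writeQueue.poll(100, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a buffer", e);
            }
        }
        return buffer;
    }
}
```

With an empty `writeQueue` and a supplier that always returns `null`, the call only returns because of the artificial deadline. With a supplier that returns a message, the loop exits on its first iteration, which is the behavior the error check is meant to provide.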

Root Cause
1. Buffer pool exhaustion: The write thread moves all buffers from `writeQueue` to `readQueue`, but the read thread (HTTP upload) does not consume them (e.g., because of a Doris node restart or network issues). Without consumption, no buffers are recycled back to `writeQueue`.
2. Error message not propagated: In `DorisStreamLoad.stopLoad()`, `loading = false` is set immediately, which prevents `getLoadFailedMsg()` from checking `pendingLoadFuture` and setting the error message. The `endInput()` call blocks before `pendingLoadFuture.get()` can be reached to parse the error response.
3. Circular dependency: Need EOF to finish → need buffer for EOF → buffer unavailable → need error to interrupt → error not propagated → infinite wait.
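The ordering problem in the second point can be illustrated with a small model. This is a sketch under assumptions, not the connector's implementation: the class `LoadStateSketch`, the method `markStopped`, and the rule that a response containing "Fail" counts as an error are hypothetical. Only the names `loading`, `pendingLoadFuture`, and `getLoadFailedMsg()` come from the description above.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical, simplified model; not the actual DorisStreamLoad class.
class LoadStateSketch {
    private volatile boolean loading = true;
    private volatile String loadFailedMsg;
    private final CompletableFuture<String> pendingLoadFuture;

    LoadStateSketch(CompletableFuture<String> pendingLoadFuture) {
        this.pendingLoadFuture = pendingLoadFuture;
    }

    // Inspects the pending response only while a load is marked as in progress.
    String getLoadFailedMsg() {
        if (!loading) {
            return loadFailedMsg; // the future is never looked at once loading is false
        }
        if (loadFailedMsg == null
                && pendingLoadFuture.isDone()
                && !pendingLoadFuture.isCompletedExceptionally()) {
            String response = pendingLoadFuture.getNow(null);
            // Assumption for this sketch: a response containing "Fail" is an error.
            if (response != null && response.contains("Fail")) {
                loadFailedMsg = response;
            }
        }
        return loadFailedMsg;
    }

    // Stand-in for the first step of stopping: the flag is cleared immediately.
    void markStopped() {
        loading = false;
    }
}
```

In this model, clearing the flag first hides an error response that has already arrived, while reading the message before clearing the flag records it and keeps it visible afterwards.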
Solution
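Check for a load error with `checkDone()` before calling `stopLoad()`, so that an error already returned by Doris can interrupt the blocking `poll()` in `RecordBuffer`.
Does this PR introduce any user-facing change?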
no
How was this patch tested?
Check list
New License Guide
`incompatible-changes.md` to describe the incompatibility caused by this PR.