-
Notifications
You must be signed in to change notification settings - Fork 23
:fix: improve diff line ending filtering to handle CRLF changes #930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
:fix: improve diff line ending filtering to handle CRLF changes #930
Conversation
- Enhanced filterLineEndingOnlyChanges to handle block-style diffs where all removed lines appear before added lines - Updated isOnlyLineEndingDiff to handle special diff markers like '\ No newline at end of file' - Added combineIdenticalTrimmedLines function to reduce noise from whitespace-only changes - Added cleanDiff utility that combines all filtering strategies - Added ignoreNewlineAtEof option to diff creation to prevent trailing newline diffs - Improved hasNoMeaningfulDiffContent for better performance These changes prevent noisy diffs when files only differ in line endings or trailing whitespace, particularly important for cross-platform development where CRLF/LF differences are common. Signed-off-by: Ian Bolton <[email protected]>
d28135d to
d9740bd
Compare
Walkthrough
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
shared/src/utils/diffUtils.ts (1)
15-90: Fix pairing logic and hunk-awareness inisOnlyLineEndingDiff(block diffs misclassified).Current logic pairs only the first “-” with the next “+” and skips intervening “-” lines, then resumes at a “+” (Line 86), causing false negatives on common “---+++” blocks and ignoring hunk boundaries. Also, when a diff has no +/- lines (e.g., rename-only), it should not return true unless it’s solely the EOF marker.
Proposed hunk-aware implementation that:
- Pairs removes/adds within each hunk in order.
- Ignores “\ No newline at end of file”.
- Returns false for diffs with no +/- unless the EOF marker is present.
Apply:
export function isOnlyLineEndingDiff(unifiedDiff: string): boolean { - const lines = unifiedDiff.split("\n"); - const changeLines: string[] = []; - const specialMarkers: string[] = []; - - // Collect all +/- lines and special markers - for (const line of lines) { - // Skip diff headers and context markers - if ( - line.startsWith("diff ") || - line.startsWith("index ") || - line.startsWith("--- ") || - line.startsWith("+++ ") || - line.startsWith("@@") || - line.startsWith(" ") - ) { - continue; - } - - // Collect special markers (e.g., "\ No newline at end of file") - if (line.startsWith("\\")) { - specialMarkers.push(line); - continue; - } - - // Collect actual change lines - if (line.startsWith("+") || line.startsWith("-")) { - changeLines.push(line); - } - } - - // If no changes, not a line ending diff - if (changeLines.length === 0) { - // Check if only special markers exist (which might indicate line ending differences) - return specialMarkers.some((marker) => marker.includes("No newline at end of file")); - } - - // Process changes to check if they're only line ending differences - let i = 0; - while (i < changeLines.length) { - const removedLine = changeLines[i]; - - // Must start with - - if (!removedLine.startsWith("-")) { - return false; - } - - // Find the corresponding + line (might not be immediately after) - let j = i + 1; - while (j < changeLines.length && changeLines[j].startsWith("-")) { - j++; - } - - if (j >= changeLines.length || !changeLines[j].startsWith("+")) { - return false; // No matching + line found - } - - const addedLine = changeLines[j]; - const removedContent = removedLine.substring(1); - const addedContent = addedLine.substring(1); - - // Normalize and compare, handling various line ending representations - const normalizedRemoved = normalizeLineEndings(removedContent).trimEnd(); - const normalizedAdded = normalizeLineEndings(addedContent).trimEnd(); - - // If content differs after normalization, it's not just a line ending change - if (normalizedRemoved !== normalizedAdded) { - return false; - } - - // Move to next unprocessed line - i = j + 1; - } - - return true; + const lines = unifiedDiff.split("\n"); + let inHunk = false; + let removed: string[] = []; + let added: string[] = []; + let sawChange = false; + let sawEofMarker = false; + + const flush = (): boolean => { + if (removed.length === 0 && added.length === 0) return true; + if (removed.length !== added.length) return false; + for (let i = 0; i < removed.length; i++) { + const a = normalizeLineEndings(removed[i]).trimEnd(); + const b = normalizeLineEndings(added[i]).trimEnd(); + if (a !== b) return false; + } + removed = []; + added = []; + return true; + }; + + for (const line of lines) { + if (line.startsWith("diff ") || line.startsWith("index ") || line.startsWith("--- ") || line.startsWith("+++ ")) { + if (!flush()) return false; + inHunk = false; + continue; + } + if (line.startsWith("@@")) { + if (!flush()) return false; + inHunk = true; + continue; + } + if (!inHunk) { + if (line.startsWith("\\")) { + if (line.includes("No newline at end of file")) sawEofMarker = true; + } + continue; + } + if (line.startsWith(" ")) { + if (!flush()) return false; + continue; + } + if (line.startsWith("\\")) { + if (line.includes("No newline at end of file")) sawEofMarker = true; + continue; + } + if (line.startsWith("-")) { + removed.push(line.substring(1)); + sawChange = true; + continue; + } + if (line.startsWith("+")) { + added.push(line.substring(1)); + sawChange = true; + continue; + } + } + const onlyLineEndings = flush(); + if (!sawChange) return sawEofMarker; + return onlyLineEndings; }
🧹 Nitpick comments (1)
shared/src/utils/diffUtils.ts (1)
304-329: Avoid duplicate filtering work incleanDiff→hasNoMeaningfulDiffContent.
cleanDifffilters and combines lines, then callshasNoMeaningfulDiffContent, which re-filters internally. Consider adding a variant likehasNoMeaningfulDiffContentFromLines(filteredLines: string[])or letting the existing function accept pre-filtered lines to avoid a second pass. Behavior unchanged; improves perf/readability.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
shared/src/utils/diffUtils.ts(7 hunks)vscode/src/utilities/ModifiedFiles/handleModifiedFile.ts(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build (windows)
🔇 Additional comments (1)
vscode/src/utilities/ModifiedFiles/handleModifiedFile.ts (1)
76-79: VerifyignoreNewlineAtEofsupport in jsdiff v7.x and v8.x; if it isn’t honored, wrap the generated patch incleanDiff(raw)before use.
| // Helper function to process collected lines | ||
| const processHunkLines = () => { | ||
| if (removedLines.length === 0 || addedLines.length === 0) { | ||
| // No pairs to compare, keep all lines | ||
| removedLines.forEach((item) => filtered.push(diffLines[item.index])); | ||
| addedLines.forEach((item) => filtered.push(diffLines[item.index])); | ||
| } else if (removedLines.length === addedLines.length) { | ||
| // Check if all pairs only differ in line endings | ||
| let allLineEndingChanges = true; | ||
| for (let k = 0; k < removedLines.length; k++) { | ||
| const normalizedRemoved = normalizeLineEndings(removedLines[k].content).trimEnd(); | ||
| const normalizedAdded = normalizeLineEndings(addedLines[k].content).trimEnd(); | ||
| if (normalizedRemoved !== normalizedAdded) { | ||
| allLineEndingChanges = false; | ||
| break; | ||
| } | ||
| } | ||
|
|
||
| if (!allLineEndingChanges) { | ||
| // Not all changes are line-ending only, keep all lines | ||
| removedLines.forEach((item) => filtered.push(diffLines[item.index])); | ||
| addedLines.forEach((item) => filtered.push(diffLines[item.index])); | ||
| } | ||
| // If all are line-ending changes, we skip them (don't add to filtered) | ||
| } else { | ||
| // Different number of removed/added lines, keep all | ||
| removedLines.forEach((item) => filtered.push(diffLines[item.index])); | ||
| addedLines.forEach((item) => filtered.push(diffLines[item.index])); | ||
| } | ||
|
|
||
| // Clear collections | ||
| removedLines.length = 0; | ||
| addedLines.length = 0; | ||
| }; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve original line order when keeping changes in filterLineEndingOnlyChanges.
When not filtering a hunk, the current code pushes all removed lines first, then all added (Lines 169-193), which reorders the diff and can produce invalid/unexpected patches. Keep original order by merging and sorting by original indices.
Apply:
const processHunkLines = () => {
if (removedLines.length === 0 || addedLines.length === 0) {
- // No pairs to compare, keep all lines
- removedLines.forEach((item) => filtered.push(diffLines[item.index]));
- addedLines.forEach((item) => filtered.push(diffLines[item.index]));
+ // No pairs to compare, keep all lines in original order
+ const all = [
+ ...removedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+ ...addedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+ ].sort((a, b) => a.index - b.index);
+ all.forEach(({ line }) => filtered.push(line));
} else if (removedLines.length === addedLines.length) {
// Check if all pairs only differ in line endings
let allLineEndingChanges = true;
for (let k = 0; k < removedLines.length; k++) {
const normalizedRemoved = normalizeLineEndings(removedLines[k].content).trimEnd();
const normalizedAdded = normalizeLineEndings(addedLines[k].content).trimEnd();
if (normalizedRemoved !== normalizedAdded) {
allLineEndingChanges = false;
break;
}
}
if (!allLineEndingChanges) {
- // Not all changes are line-ending only, keep all lines
- removedLines.forEach((item) => filtered.push(diffLines[item.index]));
- addedLines.forEach((item) => filtered.push(diffLines[item.index]));
+ // Not all changes are line-ending only, keep all lines in original order
+ const all = [
+ ...removedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+ ...addedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+ ].sort((a, b) => a.index - b.index);
+ all.forEach(({ line }) => filtered.push(line));
}
// If all are line-ending changes, we skip them (don't add to filtered)
} else {
- // Different number of removed/added lines, keep all
- removedLines.forEach((item) => filtered.push(diffLines[item.index]));
- addedLines.forEach((item) => filtered.push(diffLines[item.index]));
+ // Different number of removed/added lines, keep all in original order
+ const all = [
+ ...removedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+ ...addedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+ ].sort((a, b) => a.index - b.index);
+ all.forEach(({ line }) => filtered.push(line));
}
// Clear collections
removedLines.length = 0;
addedLines.length = 0;
};Also applies to: 183-193
🤖 Prompt for AI Agents
In shared/src/utils/diffUtils.ts around lines 165-199, when a hunk is not
filtered the code currently pushes all removedLines then all addedLines which
reorders the original diff; instead merge removedLines and addedLines into a
single array, sort that merged array by item.index (ascending) to restore
original file order, then push items into filtered in that sorted order; apply
the same merge-sort-and-push fix to the other similar block referenced (lines
~183-193), and finally clear the removedLines/addedLines collections as before.
| if ( | ||
| line && | ||
| nextLine && | ||
| line.startsWith("-") && | ||
| nextLine.startsWith("+") && | ||
| line.substring(1).trim() === nextLine.substring(1).trim() | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid collapsing leading‑whitespace changes; use trimEnd() instead of trim().
combineIdenticalTrimmedLines currently ignores both leading and trailing whitespace (Line 286), which can hide indentation-only edits. PR intent mentions trailing whitespace. Use trimEnd() to limit the collapse to trailing whitespace-only changes.
Apply:
- line.substring(1).trim() === nextLine.substring(1).trim()
+ line.substring(1).trimEnd() === nextLine.substring(1).trimEnd()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if ( | |
| line && | |
| nextLine && | |
| line.startsWith("-") && | |
| nextLine.startsWith("+") && | |
| line.substring(1).trim() === nextLine.substring(1).trim() | |
| ) { | |
| if ( | |
| line && | |
| nextLine && | |
| line.startsWith("-") && | |
| nextLine.startsWith("+") && | |
| line.substring(1).trimEnd() === nextLine.substring(1).trimEnd() | |
| ) { |
🤖 Prompt for AI Agents
In shared/src/utils/diffUtils.ts around lines 281 to 287, the comparison
currently uses substring(1).trim() which removes both leading and trailing
whitespace and therefore collapses indentation-only changes; change the calls to
substring(1).trimEnd() for both line and nextLine so only trailing whitespace is
ignored and leading indentation differences are preserved when deciding to
combine lines.
These changes prevent noisy diffs when files only differ in line endings or trailing whitespace, particularly important for cross-platform development where CRLF/LF differences are common.
Summary by CodeRabbit
Bug Fixes
New Features