Skip to content

Conversation

@ibolton336
Copy link
Member

@ibolton336 ibolton336 commented Oct 14, 2025

  • Enhanced filterLineEndingOnlyChanges to handle block-style diffs where all removed lines appear before added lines
  • Updated isOnlyLineEndingDiff to handle special diff markers like '\ No newline at end of file'
  • Added combineIdenticalTrimmedLines function to reduce noise from whitespace-only changes
  • Added cleanDiff utility that combines all filtering strategies
  • Added ignoreNewlineAtEof option to diff creation to prevent trailing newline diffs
  • Improved hasNoMeaningfulDiffContent for better performance

These changes prevent noisy diffs when files only differ in line endings or trailing whitespace, particularly important for cross-platform development where CRLF/LF differences are common.

Summary by CodeRabbit

  • Bug Fixes

    • Diffs now ignore trailing newline-at-EOF differences and special diff markers, reducing noise.
    • Line-ending–only and whitespace-only changes are filtered out so only meaningful content changes appear.
    • More reliable pairing of removed/added lines within hunks prevents false positives.
  • New Features

    • Added automatic diff cleanup that collapses identical +/- pairs into a single context line.
    • Normalization ensures identical files (after trimming/line-ending fixes) produce empty diffs.

@ibolton336 ibolton336 requested a review from a team as a code owner October 14, 2025 19:55
@ibolton336 ibolton336 changed the title fix: improve diff line ending filtering to handle CRLF changes :fix: improve diff line ending filtering to handle CRLF changes Oct 14, 2025
- Enhanced filterLineEndingOnlyChanges to handle block-style diffs where all removed lines appear before added lines
- Updated isOnlyLineEndingDiff to handle special diff markers like '\ No newline at end of file'
- Added combineIdenticalTrimmedLines function to reduce noise from whitespace-only changes
- Added cleanDiff utility that combines all filtering strategies
- Added ignoreNewlineAtEof option to diff creation to prevent trailing newline diffs
- Improved hasNoMeaningfulDiffContent for better performance

These changes prevent noisy diffs when files only differ in line endings or trailing whitespace, particularly important for cross-platform development where CRLF/LF differences are common.

Signed-off-by: Ian Bolton <[email protected]>
@ibolton336 ibolton336 force-pushed the fix/improve-diff-line-ending-filtering branch from d28135d to d9740bd Compare October 14, 2025 19:56
@coderabbitai
Copy link

coderabbitai bot commented Oct 14, 2025

Walkthrough

  • Enhanced diff processing in shared/src/utils/diffUtils.ts:
    • Reworked isOnlyLineEndingDiff to pair removed/added lines, normalize (trim/line-endings), handle special markers (e.g., “\ No newline at end of file”), and compare.
    • Added hunk-wise filtering to remove line-ending-only changes and logic to collect/flush hunk lines.
    • Introduced helpers: combineIdenticalTrimmedLines, cleanDiff, normalizeUnifiedDiff, and updated hasNoMeaningfulDiffContent.
  • Updated vscode/src/utilities/ModifiedFiles/handleModifiedFile.ts to pass ignoreNewlineAtEof: true for new, deleted, and modified file patch generation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • djzager
  • fabianvf

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The title clearly summarizes the main change to diff line ending filtering but includes an emoji alias that doesn’t align with the repository’s prescribed prefix conventions and adds unnecessary noise. Please replace the :fix: prefix with the approved patch fix alias :bug: or remove the emoji entirely so the title reads, for example, :bug: improve diff line ending filtering to handle CRLF changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed The pull request description offers a concise list of enhancements, explains their purpose, and provides adequate context, meeting the repository’s expectations for a substantive summary.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
shared/src/utils/diffUtils.ts (1)

15-90: Fix pairing logic and hunk-awareness in isOnlyLineEndingDiff (block diffs misclassified).

Current logic pairs only the first “-” with the next “+” and skips intervening “-” lines, then resumes at a “+” (Line 86), causing false negatives on common “---+++” blocks and ignoring hunk boundaries. Also, when a diff has no +/- lines (e.g., rename-only), it should not return true unless it’s solely the EOF marker.

Proposed hunk-aware implementation that:

  • Pairs removes/adds within each hunk in order.
  • Ignores “\ No newline at end of file”.
  • Returns false for diffs with no +/- unless the EOF marker is present.

Apply:

 export function isOnlyLineEndingDiff(unifiedDiff: string): boolean {
-  const lines = unifiedDiff.split("\n");
-  const changeLines: string[] = [];
-  const specialMarkers: string[] = [];
-
-  // Collect all +/- lines and special markers
-  for (const line of lines) {
-    // Skip diff headers and context markers
-    if (
-      line.startsWith("diff ") ||
-      line.startsWith("index ") ||
-      line.startsWith("--- ") ||
-      line.startsWith("+++ ") ||
-      line.startsWith("@@") ||
-      line.startsWith(" ")
-    ) {
-      continue;
-    }
-
-    // Collect special markers (e.g., "\ No newline at end of file")
-    if (line.startsWith("\\")) {
-      specialMarkers.push(line);
-      continue;
-    }
-
-    // Collect actual change lines
-    if (line.startsWith("+") || line.startsWith("-")) {
-      changeLines.push(line);
-    }
-  }
-
-  // If no changes, not a line ending diff
-  if (changeLines.length === 0) {
-    // Check if only special markers exist (which might indicate line ending differences)
-    return specialMarkers.some((marker) => marker.includes("No newline at end of file"));
-  }
-
-  // Process changes to check if they're only line ending differences
-  let i = 0;
-  while (i < changeLines.length) {
-    const removedLine = changeLines[i];
-
-    // Must start with -
-    if (!removedLine.startsWith("-")) {
-      return false;
-    }
-
-    // Find the corresponding + line (might not be immediately after)
-    let j = i + 1;
-    while (j < changeLines.length && changeLines[j].startsWith("-")) {
-      j++;
-    }
-
-    if (j >= changeLines.length || !changeLines[j].startsWith("+")) {
-      return false; // No matching + line found
-    }
-
-    const addedLine = changeLines[j];
-    const removedContent = removedLine.substring(1);
-    const addedContent = addedLine.substring(1);
-
-    // Normalize and compare, handling various line ending representations
-    const normalizedRemoved = normalizeLineEndings(removedContent).trimEnd();
-    const normalizedAdded = normalizeLineEndings(addedContent).trimEnd();
-
-    // If content differs after normalization, it's not just a line ending change
-    if (normalizedRemoved !== normalizedAdded) {
-      return false;
-    }
-
-    // Move to next unprocessed line
-    i = j + 1;
-  }
-
-  return true;
+  const lines = unifiedDiff.split("\n");
+  let inHunk = false;
+  let removed: string[] = [];
+  let added: string[] = [];
+  let sawChange = false;
+  let sawEofMarker = false;
+
+  const flush = (): boolean => {
+    if (removed.length === 0 && added.length === 0) return true;
+    if (removed.length !== added.length) return false;
+    for (let i = 0; i < removed.length; i++) {
+      const a = normalizeLineEndings(removed[i]).trimEnd();
+      const b = normalizeLineEndings(added[i]).trimEnd();
+      if (a !== b) return false;
+    }
+    removed = [];
+    added = [];
+    return true;
+  };
+
+  for (const line of lines) {
+    if (line.startsWith("diff ") || line.startsWith("index ") || line.startsWith("--- ") || line.startsWith("+++ ")) {
+      if (!flush()) return false;
+      inHunk = false;
+      continue;
+    }
+    if (line.startsWith("@@")) {
+      if (!flush()) return false;
+      inHunk = true;
+      continue;
+    }
+    if (!inHunk) {
+      if (line.startsWith("\\")) {
+        if (line.includes("No newline at end of file")) sawEofMarker = true;
+      }
+      continue;
+    }
+    if (line.startsWith(" ")) {
+      if (!flush()) return false;
+      continue;
+    }
+    if (line.startsWith("\\")) {
+      if (line.includes("No newline at end of file")) sawEofMarker = true;
+      continue;
+    }
+    if (line.startsWith("-")) {
+      removed.push(line.substring(1));
+      sawChange = true;
+      continue;
+    }
+    if (line.startsWith("+")) {
+      added.push(line.substring(1));
+      sawChange = true;
+      continue;
+    }
+  }
+  const onlyLineEndings = flush();
+  if (!sawChange) return sawEofMarker;
+  return onlyLineEndings;
 }
🧹 Nitpick comments (1)
shared/src/utils/diffUtils.ts (1)

304-329: Avoid duplicate filtering work in cleanDiffhasNoMeaningfulDiffContent.

cleanDiff filters and combines lines, then calls hasNoMeaningfulDiffContent, which re-filters internally. Consider adding a variant like hasNoMeaningfulDiffContentFromLines(filteredLines: string[]) or letting the existing function accept pre-filtered lines to avoid a second pass. Behavior unchanged; improves perf/readability.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94e5d81 and d9740bd.

📒 Files selected for processing (2)
  • shared/src/utils/diffUtils.ts (7 hunks)
  • vscode/src/utilities/ModifiedFiles/handleModifiedFile.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build (windows)
🔇 Additional comments (1)
vscode/src/utilities/ModifiedFiles/handleModifiedFile.ts (1)

76-79: Verify ignoreNewlineAtEof support in jsdiff v7.x and v8.x; if it isn’t honored, wrap the generated patch in cleanDiff(raw) before use.

Comment on lines +165 to +199
// Helper function to process collected lines
const processHunkLines = () => {
if (removedLines.length === 0 || addedLines.length === 0) {
// No pairs to compare, keep all lines
removedLines.forEach((item) => filtered.push(diffLines[item.index]));
addedLines.forEach((item) => filtered.push(diffLines[item.index]));
} else if (removedLines.length === addedLines.length) {
// Check if all pairs only differ in line endings
let allLineEndingChanges = true;
for (let k = 0; k < removedLines.length; k++) {
const normalizedRemoved = normalizeLineEndings(removedLines[k].content).trimEnd();
const normalizedAdded = normalizeLineEndings(addedLines[k].content).trimEnd();
if (normalizedRemoved !== normalizedAdded) {
allLineEndingChanges = false;
break;
}
}

if (!allLineEndingChanges) {
// Not all changes are line-ending only, keep all lines
removedLines.forEach((item) => filtered.push(diffLines[item.index]));
addedLines.forEach((item) => filtered.push(diffLines[item.index]));
}
// If all are line-ending changes, we skip them (don't add to filtered)
} else {
// Different number of removed/added lines, keep all
removedLines.forEach((item) => filtered.push(diffLines[item.index]));
addedLines.forEach((item) => filtered.push(diffLines[item.index]));
}

// Clear collections
removedLines.length = 0;
addedLines.length = 0;
};

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve original line order when keeping changes in filterLineEndingOnlyChanges.

When not filtering a hunk, the current code pushes all removed lines first, then all added (Lines 169-193), which reorders the diff and can produce invalid/unexpected patches. Keep original order by merging and sorting by original indices.

Apply:

   const processHunkLines = () => {
     if (removedLines.length === 0 || addedLines.length === 0) {
-      // No pairs to compare, keep all lines
-      removedLines.forEach((item) => filtered.push(diffLines[item.index]));
-      addedLines.forEach((item) => filtered.push(diffLines[item.index]));
+      // No pairs to compare, keep all lines in original order
+      const all = [
+        ...removedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+        ...addedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+      ].sort((a, b) => a.index - b.index);
+      all.forEach(({ line }) => filtered.push(line));
     } else if (removedLines.length === addedLines.length) {
       // Check if all pairs only differ in line endings
       let allLineEndingChanges = true;
       for (let k = 0; k < removedLines.length; k++) {
         const normalizedRemoved = normalizeLineEndings(removedLines[k].content).trimEnd();
         const normalizedAdded = normalizeLineEndings(addedLines[k].content).trimEnd();
         if (normalizedRemoved !== normalizedAdded) {
           allLineEndingChanges = false;
           break;
         }
       }
 
       if (!allLineEndingChanges) {
-        // Not all changes are line-ending only, keep all lines
-        removedLines.forEach((item) => filtered.push(diffLines[item.index]));
-        addedLines.forEach((item) => filtered.push(diffLines[item.index]));
+        // Not all changes are line-ending only, keep all lines in original order
+        const all = [
+          ...removedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+          ...addedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+        ].sort((a, b) => a.index - b.index);
+        all.forEach(({ line }) => filtered.push(line));
       }
       // If all are line-ending changes, we skip them (don't add to filtered)
     } else {
-      // Different number of removed/added lines, keep all
-      removedLines.forEach((item) => filtered.push(diffLines[item.index]));
-      addedLines.forEach((item) => filtered.push(diffLines[item.index]));
+      // Different number of removed/added lines, keep all in original order
+      const all = [
+        ...removedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+        ...addedLines.map((it) => ({ index: it.index, line: diffLines[it.index] })),
+      ].sort((a, b) => a.index - b.index);
+      all.forEach(({ line }) => filtered.push(line));
     }
 
     // Clear collections
     removedLines.length = 0;
     addedLines.length = 0;
   };

Also applies to: 183-193

🤖 Prompt for AI Agents
In shared/src/utils/diffUtils.ts around lines 165-199, when a hunk is not
filtered the code currently pushes all removedLines then all addedLines which
reorders the original diff; instead merge removedLines and addedLines into a
single array, sort that merged array by item.index (ascending) to restore
original file order, then push items into filtered in that sorted order; apply
the same merge-sort-and-push fix to the other similar block referenced (lines
~183-193), and finally clear the removedLines/addedLines collections as before.

Comment on lines +281 to +287
if (
line &&
nextLine &&
line.startsWith("-") &&
nextLine.startsWith("+") &&
line.substring(1).trim() === nextLine.substring(1).trim()
) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid collapsing leading‑whitespace changes; use trimEnd() instead of trim().

combineIdenticalTrimmedLines currently ignores both leading and trailing whitespace (Line 286), which can hide indentation-only edits. PR intent mentions trailing whitespace. Use trimEnd() to limit the collapse to trailing whitespace-only changes.

Apply:

-      line.substring(1).trim() === nextLine.substring(1).trim()
+      line.substring(1).trimEnd() === nextLine.substring(1).trimEnd()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (
line &&
nextLine &&
line.startsWith("-") &&
nextLine.startsWith("+") &&
line.substring(1).trim() === nextLine.substring(1).trim()
) {
if (
line &&
nextLine &&
line.startsWith("-") &&
nextLine.startsWith("+") &&
line.substring(1).trimEnd() === nextLine.substring(1).trimEnd()
) {
🤖 Prompt for AI Agents
In shared/src/utils/diffUtils.ts around lines 281 to 287, the comparison
currently uses substring(1).trim() which removes both leading and trailing
whitespace and therefore collapses indentation-only changes; change the calls to
substring(1).trimEnd() for both line and nextLine so only trailing whitespace is
ignored and leading indentation differences are preserved when deciding to
combine lines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant