Skip to content

Conversation

@Alex-1089
Copy link
Contributor

@Alex-1089 Alex-1089 commented Oct 16, 2025

Summary by CodeRabbit

  • New Features

    • Added a "delta copy" CLI command (aliases: dl, cp) to batch-copy files listed in an input file.
    • Option to write the resulting delta folder path to an output file or stdout.
    • Specify a destination folder, a custom root directory, or auto-create a temporary delta directory. Preserves subdirectory structure and creates needed directories; skips empty/invalid entries and reports errors with exit status and optional debug/trace output.
  • Documentation

    • Release notes updated and package version bumped for the new feature.

@Alex-1089 Alex-1089 requested a review from eeisegn October 16, 2025 14:49
@coderabbitai
Copy link

coderabbitai bot commented Oct 16, 2025

Walkthrough

Adds a new Delta class and a CLI "delta" subcommand group with a "copy" action that reads an input file of source paths, creates a delta directory (optionally under a root), copies listed files preserving structure, emits the delta path (stdout or file), and returns an exit status.

Changes

Cohort / File(s) Summary
CLI Delta Command Handler
src/scanoss/cli.py
Added from scanoss.delta import Delta. Introduced delta subcommand group and copy subcommand (aliases: dl, cp) with --filepath, --folder, --root-dir, and --output options. Added public delta_copy(parser, args) handler that validates filepath, instantiates Delta(..., root_dir=...), calls copy(), handles exceptions (optional traceback when debugging), writes delta path to output (file or stdout), and exits with the returned status. Exposed delta copy in CLI output configuration list.
Delta Directory and Copy Operations
src/scanoss/delta.py
Added Delta class (subclass of ScanossBase) with constructor accepting debug, trace, quiet, filepath, folder, output, root_dir. Implemented Delta.copy(input_file=None) returning (status_code, folder_path), plus helpers create_delta_dir(folder, root_dir) and load_input_file(input_file). Behavior: load input file lines, create or validate delta directory (or temp unique), normalize/validate source paths, create destination directories, copy files preserving directory structure, skip or log invalid/path-escaping entries, and return success/error tuple.
Version and Changelog
src/scanoss/__init__.py, CHANGELOG.md
Bumped __version__ from 1.36.01.37.0. Added CHANGELOG entry [1.37.0] (2025-10-17) documenting "Added delta folder and file copy command" and added compare link for 1.37.0.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as "scanoss CLI"
    participant Delta
    participant FS as "File System / IO"

    User->>CLI: run "scanoss delta copy" --filepath <file> [--folder <f>] [--root-dir <r>] [--output <o>]
    CLI->>CLI: validate args (filepath required)
    CLI->>Delta: instantiate Delta(debug, trace, quiet, filepath, folder, output, root_dir)
    CLI->>Delta: call copy(input_file)
    Delta->>FS: create_delta_dir(folder, root_dir)
    Delta->>FS: read input file lines
    loop each input path
        Delta->>FS: normalize & validate source path
        alt valid and inside delta root
            Delta->>FS: makedirs destination
            Delta->>FS: copy source -> delta destination
        else skip / log warning
        end
    end
    Delta-->>CLI: return (status_code, delta_folder)
    CLI->>FS: write delta_folder to output (file or stdout)
    CLI->>User: exit with status_code
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped through lists of paths at dawn,
I carved a cozy delta, neat and drawn.
I fetched each file and tucked it in a row,
Kept roots intact where little folders grow.
Hooray — the copy’s done; a carrot for the show!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "SP-3354_delta-scan-setup" is related to the main change in the changeset. The PR introduces a new Delta class with copy functionality and integrates it into the CLI with a new delta copy subcommand, which aligns with the title's reference to delta scan setup. However, the title uses "setup" which is somewhat generic and doesn't clearly convey that the primary change is implementing a new delta folder and file copy command, as documented in the CHANGELOG. A teammate scanning through history would understand the PR involves delta scan functionality, though the specific nature of the copy/folder feature is not immediately apparent from the title alone.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/SP-3354_delta-scan-setup

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f3f8ac and c7c1e50.

📒 Files selected for processing (2)
  • CHANGELOG.md (2 hunks)
  • src/scanoss/__init__.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/scanoss/init.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
CHANGELOG.md (1)

12-14: Changelog entry looks good.

The new version entry follows the project's conventions: version is sequential, date format is consistent, description aligns with the PR objectives, and the comparison link is properly formatted.

Also applies to: 691-691


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
src/scanoss/cli.py (1)

2646-2691: Consider adding validation for the folder argument.

The handler correctly validates the input file and initializes the output file. However, when args.folder is provided, consider validating it's not an absolute path outside the working directory to prevent potential security issues.

Additionally, the error message on line 2688 prints the raw exception object, which may not provide a user-friendly message. Consider formatting it:

-    except Exception as e:
-        print_stderr(e)
+    except Exception as e:
+        print_stderr(f'ERROR: {e}')
src/scanoss/delta.py (2)

30-49: Add class and method docstrings.

The Delta class and __init__ method are missing descriptive docstrings. Comprehensive documentation helps users understand the purpose and usage of the class.

Add docstrings:

 class Delta(ScanossBase):
     """
-
+    Manages delta directory creation and file copying operations.
+    
+    Provides functionality to copy files listed in an input file to a target
+    directory while preserving directory structure, supporting delta/incremental
+    scanning workflows.
     """
     def __init__(
         self,
         debug: bool = False,
         trace: bool = False,
         quiet: bool = False,
         filepath: str = None,
         folder: str = None,
         output: str = None,
     ):
         """
-
+        Initialize the Delta instance.
+        
+        :param debug: Enable debug logging
+        :param trace: Enable trace logging
+        :param quiet: Enable quiet mode (suppress non-error output)
+        :param filepath: Path to input file containing list of files to copy
+        :param folder: Optional target directory for delta files
+        :param output: Optional output file to write delta directory path
         """

101-108: Simplify the conditional logic.

Line 105 uses exist_ok=True immediately after checking that the folder doesn't exist on line 101, which is redundant. Additionally, consider using os.path.join() instead of string concatenation for the prefix in tempfile creation for better cross-platform compatibility.

         if folder and os.path.exists(folder):
             self.print_stderr(f'Folder {folder} already exists')
             return ''
         elif folder:
-            os.makedirs(folder, exist_ok=True)
+            os.makedirs(folder)
         else:
             folder = tempfile.mkdtemp(prefix="delta-", dir='.')
         return folder
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between afc8cb6 and e5c958e.

📒 Files selected for processing (2)
  • src/scanoss/cli.py (5 hunks)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (2)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
src/scanoss/cli.py (2)
src/scanoss/delta.py (2)
  • Delta (30-108)
  • copy (51-88)
src/scanoss/scanossbase.py (1)
  • print_stderr (45-49)
🪛 GitHub Actions: Lint
src/scanoss/delta.py

[error] 24-24: I001 Import block is un-sorted or un-formatted. Organize imports. 2 issues fixable with --fix.

src/scanoss/cli.py

[error] 25-25: I001 Import block is un-sorted or un-formatted. Organize imports. 2 issues fixable with --fix.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (4)
src/scanoss/cli.py (2)

923-959: LGTM!

The delta subcommand structure follows the established patterns in the codebase (e.g., component, utils, crypto). The command hierarchy (delta copy) is well-defined with appropriate aliases, help text, and argument configuration.


980-980: LGTM!

The additions correctly integrate the p_copy parser into the existing option groups, ensuring consistency with other subcommands.

Also applies to: 1178-1178

src/scanoss/delta.py (2)

63-65: LGTM!

The input file validation correctly checks for existence before proceeding.


24-26: Fix import sorting issue.

The linter has flagged the import block as improperly sorted. Organize imports with standard library modules sorted alphabetically.

Apply this diff:

 import os
 import shutil
 import tempfile

Note: The imports appear correctly ordered visually. Run ruff check --fix to apply the auto-fix suggested by the linter, which may address whitespace or other formatting concerns.

Likely an incorrect or invalid review comment.

from .spdxlite import SpdxLite
from .threadeddependencies import SCOPE
from .utils.file import validate_json_file
from scanoss.delta import Delta
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix import sorting issue.

The import statement is flagged by the linter as being out of order. Following Python conventions, imports should be organized by standard library, third-party, and local modules, sorted alphabetically within each group.

Apply this diff to fix the import order:

+from .scantype import ScanType
+from .spdxlite import SpdxLite
 from .threadeddependencies import SCOPE
 from .utils.file import validate_json_file
-from scanoss.delta import Delta
+
+from .delta import Delta
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from scanoss.delta import Delta
from .scantype import ScanType
from .spdxlite import SpdxLite
from .threadeddependencies import SCOPE
from .utils.file import validate_json_file
from .delta import Delta
🤖 Prompt for AI Agents
In src/scanoss/cli.py around line 87, the import "from scanoss.delta import
Delta" is out of order per lint rules; reorder imports into standard library,
third-party, then local modules and alphabetically within each group—place this
local package import with other local imports in the correct alphabetical
position (or move it below third-party imports) so import sorting conforms to
the project's linter.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/scanoss/cli.py (1)

36-36: Import ordering issue remains unresolved.

The previous review comment about import ordering has not been addressed. The from scanoss.delta import Delta import should be placed with other local imports in alphabetical order.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5c958e and 7ed6040.

📒 Files selected for processing (2)
  • src/scanoss/cli.py (5 hunks)
  • src/scanoss/delta.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/scanoss/delta.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/cli.py (2)
src/scanoss/delta.py (2)
  • Delta (31-109)
  • copy (52-89)
src/scanoss/scanossbase.py (1)
  • print_stderr (45-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (2)
src/scanoss/cli.py (2)

980-980: LGTM!

Correctly adds the p_copy parser to the common options lists for output and debugging flags, following the established pattern.

Also applies to: 1178-1178


2646-2691: LGTM!

The delta_copy handler correctly:

  • Validates required inputs
  • Initializes optional output file
  • Instantiates Delta with CLI arguments
  • Unpacks the (status, folder) tuple from Delta.copy()
  • Handles exceptions with appropriate error reporting

The implementation follows established patterns from similar handlers.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ed6040 and 5fafc3a.

📒 Files selected for processing (2)
  • src/scanoss/cli.py (6 hunks)
  • src/scanoss/delta.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/scanoss/cli.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (2)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
src/scanoss/delta.py (1)

1-28: LGTM!

License header and imports are appropriate for the functionality.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/scanoss/delta.py (2)

85-134: Strong improvements to security and error handling!

The implementation now includes:

  • UTF-8 encoding specification
  • Comprehensive error handling
  • Path normalization and sanitization
  • Final safety check to prevent directory escapes

The path traversal concerns from previous reviews have been substantially addressed.

Optional simplification: The manual while loop at lines 112-113 for stripping leading .. components is functional but could be simplified:

-                        # Use a normalised source for destination to prevent traversal
-                        # Remove leading path separators and '..' components from destination
-                        safe_dest_path = normalised_source.lstrip(os.sep).lstrip('/')
-                        while safe_dest_path.startswith('..'):
-                            safe_dest_path = safe_dest_path[2:].lstrip(os.sep).lstrip('/')
+                        # Use a normalised source for destination to prevent traversal
+                        # Remove leading path separators and ensure relative path
+                        safe_dest_path = normalised_source.lstrip(os.sep).lstrip('/')
+                        # Reject any remaining paths that start with '..'
+                        if safe_dest_path.startswith('..'):
+                            self.print_stderr(f'WARNING: Path traversal detected in {source_file}, skipping')
+                            continue

Alternatively, you could use os.path.relpath to ensure the path is relative, though the current approach is adequate.


88-88: Consider using strip() for line processing.

Using strip() instead of rstrip('\n\r') would be more robust and handle all types of whitespace and line endings (including potential leading whitespace).

-                    source_file = line.rstrip('\n\r')
+                    source_file = line.strip()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5fafc3a and b158a6e.

📒 Files selected for processing (1)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (2)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (2)
src/scanoss/delta.py (2)

31-61: Excellent documentation!

The class and constructor docstrings are now comprehensive and clearly document the purpose, parameters, and behavior. This addresses the previous review feedback effectively.


137-163: Well-structured directory management!

The method properly handles all scenarios:

  • Validates existing folders before creation
  • Includes comprehensive error handling for both explicit and temporary directory creation
  • Clear error messages guide users when operations fail

The previous review feedback regarding redundant exist_ok=True has been successfully addressed.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b158a6e and 6f5117e.

📒 Files selected for processing (2)
  • src/scanoss/cli.py (6 hunks)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (3)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
  • print_trace (65-70)
src/scanoss/cli.py (2)
src/scanoss/delta.py (2)
  • Delta (31-189)
  • copy (65-127)
src/scanoss/scanossbase.py (1)
  • print_stderr (45-49)
🪛 GitHub Actions: Lint
src/scanoss/delta.py

[error] 39-39: PLR0913 Too many arguments in function definition (7 > 6).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (5)
src/scanoss/cli.py (5)

923-949: LGTM!

The delta subcommand structure is well-organized and follows the established patterns in the codebase. The argument definitions are clear and appropriate for the copy operation.


970-970: Correct integration of output options.

The p_copy parser is properly added to the list of commands that support the --output flag.


1168-1168: Correct integration of debug options.

The p_copy parser is properly added to the list of commands that support debug, trace, and quiet flags.


1189-1190: Correct validation for delta subcommand.

The validation now properly includes 'delta' and 'dl', ensuring that users are prompted with help when running the command without a subcommand.


2637-2680: Well-structured handler function.

The delta_copy handler follows the established patterns in the codebase:

  • Validates required arguments
  • Properly initializes output files
  • Delegates business logic to the Delta class
  • Includes appropriate exception handling with debug tracebacks

Comment on lines 39 to 64
def __init__(
self,
debug: bool = False,
trace: bool = False,
quiet: bool = False,
filepath: str = None,
folder: str = None,
output: str = None,
root_dir: str = '.',
):
"""
Initialise the Delta instance.
:param debug: Enable debug logging
:param trace: Enable trace logging
:param quiet: Enable quiet mode (suppress non-essential output)
:param filepath: Path to input file containing list of files to copy
:param folder: Target delta directory path (auto-generated if not provided)
:param output: Output file path for delta directory location (stdout if not provided)
"""
super().__init__(debug, trace, quiet)
self.filepath = filepath
self.folder = folder
self.output = output
self.root_dir = root_dir if root_dir else '.'

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Consider consolidating parameters to address linter warning.

The linter flags this function for having too many arguments (7 > 6). While not critical, you could optionally consolidate related parameters (e.g., debug, trace, quiet are often bundled in a config object) to improve maintainability.

Minor: Redundant fallback logic.

Line 63 has self.root_dir = root_dir if root_dir else '.', but the parameter already defaults to '.' on line 47. The fallback is unnecessary unless None can be explicitly passed.

🧰 Tools
🪛 GitHub Actions: Lint

[error] 39-39: PLR0913 Too many arguments in function definition (7 > 6).

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/scanoss/delta.py (1)

64-64: Remove redundant fallback logic.

Since root_dir already defaults to '.' in the parameter definition (line 48), the conditional assignment is unnecessary.

Apply this diff:

-        self.root_dir = root_dir if root_dir else '.'
+        self.root_dir = root_dir
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f5117e and 0070f09.

📒 Files selected for processing (1)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/delta.py (2)
src/scanoss/scanossbase.py (3)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
  • print_trace (65-70)
src/scanoss/cli.py (1)
  • print_stderr (92-96)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (2)
src/scanoss/delta.py (2)

1-38: LGTM! Well-documented class definition.

The license header, imports, and class docstring are clear and comprehensive. The docstring effectively explains the class purpose and behavior.


130-191: LGTM! Robust error handling and security practices.

Both methods demonstrate solid implementation:

  • create_delta_dir: Properly validates folder existence, handles errors, and creates temporary directories with appropriate error catching.
  • load_input_file: Uses explicit UTF-8 encoding, strips whitespace correctly, and removes leading separators (line 185) to prevent absolute path exploitation.

The comprehensive error handling and return of None on failure enables proper error propagation to callers.

@Alex-1089 Alex-1089 requested a review from agustingroh October 16, 2025 19:18
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30d5abf and e8de7a4.

📒 Files selected for processing (2)
  • src/scanoss/cli.py (6 hunks)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/cli.py (2)
src/scanoss/delta.py (2)
  • Delta (31-191)
  • copy (66-128)
src/scanoss/scanossbase.py (1)
  • print_stderr (45-49)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (3)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
  • print_trace (65-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (10)
src/scanoss/cli.py (6)

36-36: Import placement looks good.

The Delta import is correctly placed in the local imports section and properly ordered alphabetically with other scanoss module imports.


923-949: Well-structured subcommand setup.

The delta subcommand group and copy subcommand follow the established patterns in the codebase. The argument definitions are clear and properly typed, with appropriate help text for users.


970-970: Proper integration with output options.

Adding p_copy to the output options loop ensures the delta copy command supports the standard --output flag consistently with other commands.


1168-1168: Correct integration with debug options.

Including p_copy in the debug/trace/quiet options loop ensures consistent logging behavior across all commands.


1189-1190: Validation properly updated.

The addition of 'delta' and 'dl' to the subcommand validation check correctly addresses the past review concern. This ensures proper help display when the subcommand is omitted (e.g., scanoss-py delta).


2637-2673: Well-implemented command handler.

The delta_copy handler follows the established pattern used by other inspection and processing commands in the codebase. It properly validates inputs, initializes resources, handles errors, and provides appropriate exit codes.

src/scanoss/delta.py (4)

96-128: Comprehensive security implementation.

The file copying logic includes multiple layers of path traversal protection:

  • Path normalization and validation for '..' sequences
  • Verification that sources exist within the expected root
  • Final safety check ensuring destinations remain within the delta directory

The consistent use of self.root_dir in both dest_path construction (line 109) and abs_folder resolution (line 112) ensures proper path alignment, addressing the critical concern from past reviews.


130-163: Robust directory creation logic.

The create_delta_dir method properly handles both explicit folder creation and automatic temporary directory generation. Error handling is comprehensive, and the method correctly returns None on failure to allow the caller to detect and handle errors appropriately.


165-191: Proper file loading with security considerations.

The load_input_file method correctly:

  • Specifies UTF-8 encoding for cross-platform compatibility
  • Uses rstrip() to handle various line ending formats
  • Strips leading path separators to neutralize absolute paths from the input file
  • Includes proper error handling

This implementation addresses the security and robustness concerns raised in past reviews.


31-191: Well-architected Delta class with strong security posture.

The Delta class implementation demonstrates good software engineering practices:

  • Clear separation of concerns across methods
  • Comprehensive error handling and validation
  • Multiple layers of security checks against path traversal attacks
  • Consistent logging and debugging support
  • Proper integration with the ScanossBase parent class

The implementation successfully addresses all critical and major issues raised in previous reviews.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
src/scanoss/delta.py (5)

82-84: Validate the input is a file, not just a path that exists.

-        if not os.path.exists(input_file):
+        if not os.path.isfile(input_file):
             self.print_stderr(f'ERROR: Input file {input_file} does not exist')
             return 1, ''

24-28: Clean type hints: use Optional and standard generics.

@@
-import os
+import os
 import shutil
 import tempfile
+from typing import Optional, List
@@
-    def create_delta_dir(self, folder: str, root_dir: str = '.') -> str or None:
+    def create_delta_dir(self, folder: str, root_dir: str = '.') -> Optional[str]:
@@
-    def load_input_file(self, input_file: str) -> list[str] or None:
+    def load_input_file(self, input_file: str) -> Optional[List[str]]:

Also applies to: 130-131, 165-166


94-95: Emit an absolute delta path for clarity/consumers.

-        self.print_to_file_or_stdout(delta_folder, self.output)
+        self.print_to_file_or_stdout(os.path.abspath(os.path.join(self.root_dir, delta_folder)), self.output)

118-127: Optionally continue on per-file copy errors to process remaining entries.

Current behavior aborts the entire run on first copy error. Consider logging and continuing, returning non‑zero if any failure occurred.

-            except (OSError, shutil.Error) as e:
-                self.print_stderr(f'ERROR: Failed to copy {source_file} to {dest_path}: {e}')
-                return 1, ''
+            except (OSError, shutil.Error) as e:
+                self.print_stderr(f'ERROR: Failed to copy {source_file} to {dest_path}: {e}')
+                had_errors = True
@@
-        return 0, delta_folder
+        return (1 if 'had_errors' in locals() and had_errors else 0), delta_folder

180-186: Normalize both path separators when stripping leading separators.

On Windows, inputs may use '/' or ''. Strip both for consistency, or defer all sanitization to the copy() normalization.

-                        if source_file:
-                            # Save the file path without any leading separators
-                            files.append(source_file.lstrip(os.sep))
+                        if source_file:
+                            # Save path without any leading separators (handle '/' and '\')
+                            leading = '/\\' if os.sep == '\\' else os.sep
+                            files.append(source_file.lstrip(leading))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8de7a4 and a2cb500.

📒 Files selected for processing (1)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (3)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
  • print_trace (65-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

Comment on lines +96 to +107
for source_file in files:
# Normalise the source path to handle ".." and redundant separators
normalised_source = os.path.normpath(source_file)
if '..' in normalised_source:
self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
continue
# Resolve to the absolute path for source validation
abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
# Check if the source file exists and is a file
if not os.path.exists(abs_source) or not os.path.isfile(abs_source):
self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping')
continue
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Harden path validation: reject absolute/drive-qualified paths and block symlink escapes; use realpath for copy.

Prevents cross-platform escapes (Windows drive letters) and copying through symlinks that resolve outside root_dir.

@@
-        for source_file in files:
-            # Normalise the source path to handle ".." and redundant separators
-            normalised_source = os.path.normpath(source_file)
-            if '..' in normalised_source:
-                self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
-                continue
-            # Resolve to the absolute path for source validation
-            abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
-            # Check if the source file exists and is a file
-            if not os.path.exists(abs_source) or not os.path.isfile(abs_source):
-                self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping')
-                continue
+        for source_file in files:
+            # Normalize the source path to handle ".." and redundant separators
+            normalised_source = os.path.normpath(source_file)
+            # Reject absolute or drive-qualified paths (cross‑platform)
+            drive, _ = os.path.splitdrive(normalised_source)
+            if os.path.isabs(normalised_source) or drive:
+                self.print_stderr(f'WARNING: Absolute or drive-qualified path {source_file} not allowed. Skipping.')
+                continue
+            # Reject traversal that escapes the root
+            if normalised_source == '..' or normalised_source.startswith('..' + os.sep):
+                self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
+                continue
+            # Resolve to the absolute path for source validation
+            abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
+            # Resolve symlinks and ensure the real path stays within root_dir
+            real_root = os.path.realpath(self.root_dir)
+            real_source = os.path.realpath(abs_source)
+            if not real_source.startswith(real_root + os.sep):
+                self.print_stderr(f'WARNING: Resolved source {real_source} escapes root directory. Skipping.')
+                continue
+            # Check if the resolved source is a regular file
+            if not os.path.isfile(real_source):
+                self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping.')
+                continue
@@
-                self.print_debug(f'Copying {source_file} to {dest_path} ...')
-                shutil.copy(abs_source, dest_path)
+                self.print_debug(f'Copying {source_file} to {dest_path} ...')
+                shutil.copy(real_source, dest_path)

Also applies to: 118-126

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/scanoss/delta.py (1)

96-107: Harden source-path validation (absolute/drive paths, traversal, symlinks) and copy the real target.

Current check if '..' in normalised_source can false‑positive and misses absolute/drive-qualified paths; symlink escapes aren’t blocked. Use segment-based traversal checks, reject abs/drive paths, verify resolved real path stays within root_dir, and copy from real_source.

         for source_file in files:
             # Normalise the source path to handle ".." and redundant separators
             normalised_source = os.path.normpath(source_file)
-            if '..' in normalised_source:
-                self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
-                continue
-            # Resolve to the absolute path for source validation
-            abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
-            # Check if the source file exists and is a file
-            if not os.path.exists(abs_source) or not os.path.isfile(abs_source):
-                self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping')
-                continue
+            # Reject absolute or drive-qualified paths (cross-platform)
+            drive, _ = os.path.splitdrive(normalised_source)
+            if os.path.isabs(normalised_source) or drive:
+                self.print_stderr(f'WARNING: Absolute or drive-qualified path {source_file} not allowed. Skipping.')
+                continue
+            # Reject traversal that escapes the root (segment-aware)
+            if normalised_source == '..' or normalised_source.startswith('..' + os.sep):
+                self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
+                continue
+            # Resolve and validate within root_dir, following symlinks
+            real_root = os.path.realpath(self.root_dir)
+            abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
+            real_source = os.path.realpath(abs_source)
+            if not real_source.startswith(real_root + os.sep):
+                self.print_stderr(f'WARNING: Resolved source {real_source} escapes root directory. Skipping.')
+                continue
+            # Check that the resolved source is a regular file
+            if not os.path.isfile(real_source):
+                self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping')
+                continue
@@
-                shutil.copy(abs_source, dest_path)
+                shutil.copy(real_source, dest_path)

Also applies to: 124-124

🧹 Nitpick comments (5)
src/scanoss/delta.py (5)

53-59: Docstring: document root_dir.

__init__ accepts root_dir but the docstring doesn’t document it. Add a brief description.

         :param output: Output file path for the delta directory location (stdout if not provided).
+        :param root_dir: Root directory under which the delta directory is created and from which
+            relative source paths are resolved.

64-64: Normalize root_dir once to avoid CWD drift.

Make root_dir absolute during init to prevent subtle path mix-ups later.

-        self.root_dir = root_dir if root_dir else '.'
+        self.root_dir = os.path.abspath(root_dir) if root_dir else os.getcwd()

82-84: Stronger input validation: ensure it’s a regular file.

exists() allows directories. Use isfile() for clarity.

-        if not os.path.exists(input_file):
+        if not os.path.isfile(input_file):
             self.print_stderr(f'ERROR: Input file {input_file} does not exist')
             return 1, ''

95-128: Don’t abort on first copy failure; continue and return non‑zero if any errors.

Fail-fast stops processing good files. Continue, log errors, and reflect a nonzero exit code at the end.

-        # Process each file and copy it to the delta dir
-        for source_file in files:
+        # Process each file and copy it to the delta dir
+        had_errors = False
+        for source_file in files:
@@
             except (OSError, shutil.Error) as e:
                 self.print_stderr(f'ERROR: Failed to copy {source_file} to {dest_path}: {e}')
-                return 1, ''
-        return 0, delta_folder
+                had_errors = True
+                continue
+        return (1, delta_folder) if had_errors else (0, delta_folder)

183-189: Trim robustly and strip both separators for cross‑platform inputs.

Handle CRLF and leading whitespace with strip(), and strip both '/' and '' to neutralize absolute paths in mixed inputs.

-                    for line in f:
-                        source_file = line.rstrip()
+                    for line in f:
+                        source_file = line.strip()
                         if source_file:
                             # Save the file path without any leading separators
-                            files.append(source_file.lstrip(os.sep))
+                            files.append(source_file.lstrip('/\\'))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a2cb500 and e3cfeb4.

📒 Files selected for processing (1)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (3)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
  • print_trace (65-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

Comment on lines +142 to +156
if folder:
# Resolve a relative folder under root_dir so checks/creation apply to the right place
resolved = folder if os.path.isabs(folder) else os.path.join(root_dir, folder)
resolved = os.path.normpath(resolved)
# Validate the target directory doesn't already exist and create it
if os.path.exists(resolved):
self.print_stderr(f'ERROR: Folder {resolved} already exists.')
return None
else:
try:
self.print_debug(f'Creating delta directory {resolved}...')
os.makedirs(resolved)
except (OSError, IOError) as e:
self.print_stderr(f'ERROR: Failed to create directory {resolved}: {e}')
return None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bug: create_delta_dir returns the un-resolved folder when provided; return the resolved path.

When folder is relative, you create <root_dir>/<folder> but return folder, which makes printed/output path inconsistent with the actual directory used.

         if folder:
             # Resolve a relative folder under root_dir so checks/creation apply to the right place
             resolved = folder if os.path.isabs(folder) else os.path.join(root_dir, folder)
             resolved = os.path.normpath(resolved)
@@
                 try:
                     self.print_debug(f'Creating delta directory {resolved}...')
                     os.makedirs(resolved)
                 except (OSError, IOError) as e:
                     self.print_stderr(f'ERROR: Failed to create directory {resolved}: {e}')
                     return None
+                # Ensure we return the resolved path
+                folder = resolved
@@
-        return os.path.normpath(folder)
+        return os.path.normpath(folder)

Also applies to: 166-166

🤖 Prompt for AI Agents
In src/scanoss/delta.py around lines 142 to 156 (and also at line 166), the
function resolves a relative folder into a full path (root_dir + folder) and
creates that directory, but still returns the original un-resolved folder which
causes inconsistent/incorrect paths; change the function to return the resolved
(normalized) path variable instead of the original folder variable and ensure
any subsequent return at line 166 also returns the resolved path so callers
always receive the actual created directory path.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
src/scanoss/delta.py (3)

96-107: Add symlink validation to prevent directory escape.

The current validation doesn't resolve symlinks, which means a symbolic link could point outside root_dir and bypass the path safety checks.

Apply this diff to resolve and validate symlinks:

         # Resolve to the absolute path for source validation
         abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
+        # Resolve symlinks and ensure the real path stays within root_dir
+        real_root = os.path.realpath(self.root_dir)
+        real_source = os.path.realpath(abs_source)
+        if not real_source.startswith(real_root + os.sep) and real_source != real_root:
+            self.print_stderr(f'WARNING: Resolved source {real_source} escapes root directory. Skipping.')
+            continue
         # Check if the source file exists and is a file
-        if not os.path.exists(abs_source) or not os.path.isfile(abs_source):
+        if not os.path.isfile(real_source):
             self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping')
             continue

Then update line 124 to copy the resolved path:

             self.print_debug(f'Copying {source_file} to {dest_path} ...')
-            shutil.copy(abs_source, dest_path)
+            shutil.copy(real_source, dest_path)

96-107: Refine path traversal check and add absolute path rejection.

Line 99's substring check for '..' will incorrectly reject legitimate filenames like "foo..bar.txt". The check should verify that the normalized path starts with '..' rather than merely containing it.

Additionally, absolute paths in the input should be explicitly rejected before normalization to prevent confusion.

Apply this diff:

     for source_file in files:
         # Normalise the source path to handle ".." and redundant separators
         normalised_source = os.path.normpath(source_file)
+        # Reject absolute paths
+        if os.path.isabs(normalised_source):
+            self.print_stderr(f'WARNING: Absolute path {source_file} not allowed. Skipping.')
+            continue
-        if '..' in normalised_source:
+        # Reject paths that escape the root directory
+        if normalised_source == '..' or normalised_source.startswith('..' + os.sep):
             self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
             continue
         # Resolve to the absolute path for source validation
         abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))

142-168: CRITICAL: Return the resolved path, not the original folder argument.

When a relative folder is provided, the code resolves it to an absolute path (line 144), creates the directory at that resolved location (line 153), but then returns the original folder variable at line 168 instead of the resolved path. This creates an inconsistency where callers receive a path that doesn't match the actual directory that was created.

Impact: The copy method receives an incorrect path from create_delta_dir, which could lead to files being copied to the wrong location or path validation failures.

Apply this diff:

     if folder:
         # Resolve a relative folder under root_dir so checks/creation apply to the right place
         resolved = folder if os.path.isabs(folder) else os.path.join(root_dir, folder)
         resolved = os.path.normpath(resolved)
         # Validate the target directory doesn't already exist and create it
         if os.path.exists(resolved):
             self.print_stderr(f'ERROR: Folder {resolved} already exists.')
             return None
         else:
             try:
                 self.print_debug(f'Creating delta directory {resolved}...')
                 os.makedirs(resolved)
             except (OSError, IOError) as e:
                 self.print_stderr(f'ERROR: Failed to create directory {resolved}: {e}')
                 return None
+            # Update folder to the resolved path for consistent return
+            folder = resolved
     else:
         # Create a unique temporary directory in the given root directory
         try:
             self.print_debug(f'Creating temporary delta directory in {root_dir} ...')
             folder = tempfile.mkdtemp(prefix="delta-", dir=root_dir)
             if folder:
                 folder = folder.removeprefix(root_dir).lstrip(os.sep)  # Strip the root from the temp folder
             self.print_debug(f'Created temporary delta directory: {folder}')
         except (OSError, IOError) as e:
             self.print_stderr(f'ERROR: Failed to create temporary directory in {root_dir}: {e}')
             return None
     return folder
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c22bf5b and 154e6ff.

📒 Files selected for processing (1)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (3)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
  • print_trace (65-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

return 1, ''
return 0, delta_folder

def create_delta_dir(self, folder: str, root_dir: str = '.') -> str or None:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use proper union type syntax for return annotations.

The return type str or None is not valid type annotation syntax. Use str | None (Python 3.10+) or Optional[str] from typing.

Apply this diff:

-    def create_delta_dir(self, folder: str, root_dir: str = '.') -> str or None:
+    def create_delta_dir(self, folder: str, root_dir: str = '.') -> str | None:
🤖 Prompt for AI Agents
In src/scanoss/delta.py around line 130, the return annotation uses invalid
syntax ("str or None"); change the signature to use a proper union type like
"Optional[str]" or "str | None" (prefer Optional[str] for older Python
compatibility), and if you choose Optional[str] add "from typing import
Optional" to the module imports; update the function signature accordingly and
ensure any type-checking or docstrings reflect the new annotation.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
src/scanoss/delta.py (4)

131-131: Fix type hint: use a proper union for the return type.

-    def create_delta_dir(self, folder: str, root_dir: str = '.') -> str or None:
+    def create_delta_dir(self, folder: str, root_dir: str = '.') -> Optional[str]:

143-170: Return the resolved directory path when a folder is provided.

Currently you create “resolved” but return the original “folder”, which can be relative and inconsistent with the actual on-disk path and the printed location. Return the resolved path.

@@
         if folder:
             # Resolve a relative folder under root_dir so checks/creation apply to the right place
             resolved = folder if os.path.isabs(folder) else os.path.join(root_dir, folder)
             resolved = os.path.normpath(resolved)
@@
             else:
                 try:
                     self.print_debug(f'Creating delta directory {resolved}...')
                     os.makedirs(resolved)
                 except (OSError, IOError) as e:
                     self.print_stderr(f'ERROR: Failed to create directory {resolved}: {e}')
                     return None
+                # Ensure we return the resolved path
+                folder = resolved
@@
-        return folder
+        return os.path.normpath(folder)

97-108: Block absolute/drive-qualified inputs and symlink escapes; refine traversal check.

Current logic allows drive-qualified paths on Windows and follows symlinks that resolve outside root_dir. The traversal check also false-positives on filenames containing “..”.

Harden validation and use realpath for enforcement.

@@
-        for source_file in files:
-            # Normalise the source path to handle ".." and redundant separators
-            normalised_source = os.path.normpath(source_file)
-            if '..' in normalised_source:
-                self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
-                continue
-            # Resolve to the absolute path for source validation
-            abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
-            # Check if the source file exists and is a file
-            if not os.path.exists(abs_source) or not os.path.isfile(abs_source):
-                self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping')
-                continue
+        for source_file in files:
+            # Normalize and harden against traversal/absolute/drive-qualified inputs
+            normalised_source = os.path.normpath(source_file)
+            drive, _ = os.path.splitdrive(normalised_source)
+            if os.path.isabs(normalised_source) or drive:
+                self.print_stderr(f'WARNING: Absolute or drive-qualified path {source_file} not allowed. Skipping.')
+                continue
+            if normalised_source == '..' or normalised_source.startswith('..' + os.sep):
+                self.print_stderr(f'WARNING: Source path escapes root directory for {source_file}. Skipping.')
+                continue
+            # Resolve real paths and ensure source stays within root_dir (blocks symlink escapes)
+            real_root = os.path.realpath(self.root_dir)
+            abs_source = os.path.abspath(os.path.join(self.root_dir, normalised_source))
+            real_source = os.path.realpath(abs_source)
+            if not real_source.startswith(real_root + os.sep):
+                self.print_stderr(f'WARNING: Resolved source {real_source} escapes root directory. Skipping.')
+                continue
+            if not os.path.isfile(real_source):
+                self.print_stderr(f'WARNING: File {source_file} does not exist or is not a file, skipping')
+                continue

124-125: Copy from the resolved real path to avoid symlink surprises.

-                self.print_debug(f'Copying {source_file} to {dest_path} ...')
-                shutil.copy(abs_source, dest_path)
+                self.print_debug(f'Copying {source_file} to {dest_path} ...')
+                shutil.copy(real_source, dest_path)
🧹 Nitpick comments (4)
src/scanoss/delta.py (4)

110-117: Ensure dest_path can’t be misread as absolute due to alt separators.

Also strip alt separators when building the destination component.

-            dest_path = os.path.normpath(os.path.join(self.root_dir, delta_folder, normalised_source.lstrip(os.sep)))
+            dest_path = os.path.normpath(
+                os.path.join(self.root_dir, delta_folder, normalised_source.lstrip('/\\'))
+            )

171-197: Normalize input lines and strip both separators cross‑platform.

Handle CRLF/leading whitespace robustly and remove both “/” and “\” as leading separators.

@@
-                with open(input_file, 'r', encoding='utf-8') as f:
+                with open(input_file, 'r', encoding='utf-8') as f:
                     for line in f:
-                        source_file = line.rstrip()
+                        source_file = line.strip()
                         if source_file:
                             # Save the file path without any leading separators
-                            files.append(source_file.lstrip(os.sep))
+                            leading_seps = os.sep + (os.path.altsep or '')
+                            files.append(source_file.lstrip(leading_seps))

95-129: Don’t abort on first copy failure; continue and return partial success code.

Fail-fast leaves a half-populated delta; prefer copying as much as possible and reporting a non-zero status if any error occurred.

@@
-        # Process each file and copy it to the delta dir
-        for source_file in files:
+        # Process each file and copy it to the delta dir
+        errors = 0
+        for source_file in files:
@@
-            except (OSError, shutil.Error) as e:
-                self.print_stderr(f'ERROR: Failed to copy {source_file} to {dest_path}: {e}')
-                return 1, ''
-        return 0, delta_folder
+            except (OSError, shutil.Error) as e:
+                self.print_stderr(f'ERROR: Failed to copy {source_file} to {dest_path}: {e}')
+                errors += 1
+                continue
+        return (1 if errors else 0), delta_folder

41-50: Optional: consolidate flags into a config to satisfy linter PLR0913.

Consider a small config dataclass or a single options dict for debug/trace/quiet/output to reduce parameter count.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 154e6ff and 8427495.

📒 Files selected for processing (1)
  • src/scanoss/delta.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/delta.py (1)
src/scanoss/scanossbase.py (3)
  • ScanossBase (28-107)
  • print_to_file_or_stdout (83-94)
  • print_trace (65-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 83
  • Detected files undeclared: 0
  • Detected files declared: 83
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@eeisegn eeisegn requested a review from matiasdaloia October 17, 2025 13:08
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 82
  • Detected files undeclared: 0
  • Detected files declared: 82
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@matiasdaloia
Copy link
Contributor

Hi @Alex-1089 could you add a brief guide on what/how i should test this pr?

@Alex-1089 Alex-1089 merged commit ff6b8f7 into main Oct 17, 2025
6 checks passed
@Alex-1089 Alex-1089 deleted the feat/SP-3354_delta-scan-setup branch October 17, 2025 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants