Feat/added code quality format to convert subcomand and added new match summary into inspect subcommand #157

agustingroh · 2025-10-22T20:20:24Z

What's Changed

Added

Added glc-codequality format to convert subcomand
Added inspect gitlab matches subcommand to generate GitLab-compatible Markdown match summary from SCANOSS scan results
Added utility modules for shared functionality (markdown_utils.py and file_utils.py)

Changed

Refactored table generation utilities into shared markdown_utils module
Refactored JSON file loading into shared file_utils module

Summary by CodeRabbit

New Features
- GitLab Code Quality output and a CLI command to generate GitLab component match summaries (line-range and output options); Markdown match-summary report generator.
Documentation
- Updated help/docs with GitLab integration examples, new conversion pathway, Folder-Scan usage, and raw-results conversion updates.
Refactor
- Table generation moved to shared utilities; JSON loading delegated to a helper; unified output formatting interfaces.
Bug Fixes
- Safer variable initialization to avoid exceptions during folder hashing.
Chores
- Added Unreleased [1.39.0] entry and manifest link to the changelog.

coderabbitai · 2025-10-22T20:20:52Z

Warning

Rate limit exceeded

@agustingroh has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 11 minutes and 11 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 26eb9b8 and 585bdde.

📒 Files selected for processing (16)

CHANGELOG.md (2 hunks)
CLIENT_HELP.md (4 hunks)
src/scanoss/cli.py (9 hunks)
src/scanoss/gitlabqualityreport.py (1 hunks)
src/scanoss/inspection/dependency_track/project_violation.py (3 hunks)
src/scanoss/inspection/policy_check.py (0 hunks)
src/scanoss/inspection/raw/component_summary.py (2 hunks)
src/scanoss/inspection/raw/copyleft.py (3 hunks)
src/scanoss/inspection/raw/license_summary.py (3 hunks)
src/scanoss/inspection/raw/match_summary.py (1 hunks)
src/scanoss/inspection/raw/raw_base.py (2 hunks)
src/scanoss/inspection/raw/undeclared_component.py (3 hunks)
src/scanoss/inspection/utils/file_utils.py (1 hunks)
src/scanoss/inspection/utils/markdown_utils.py (1 hunks)
src/scanoss/scanners/folder_hasher.py (1 hunks)
src/scanoss/utils/scanoss_scan_results_utils.py (1 hunks)

Walkthrough

Adds GitLab Code Quality JSON output and a GitLab matches Markdown inspector; extracts JSON loading, table generation, and line-range parsing into new utilities; replaces instance table generators with shared helpers; updates CLI wiring, docs, and small robustness/typing fixes.

Changes

Cohort / File(s)	Summary
CLI & GitLab integration `src/scanoss/cli.py`, `src/scanoss/gitlabqualityreport.py`	Add `glc-codequality` format; new `GitLabQualityReport` to produce GitLab Code Quality JSON; add `inspect gitlab matches` subcommand and handler `inspect_gitlab_matches`; CLI wiring and error handling.
GitLab match summary report `src/scanoss/inspection/raw/match_summary.py`	New `MatchSummary` dataclasses and class to load SCANOSS results, validate, build file/snippet match summaries, and emit Markdown (stdout/file).
Utility extraction `src/scanoss/inspection/utils/markdown_utils.py`, `src/scanoss/inspection/utils/file_utils.py`, `src/scanoss/utils/scanoss_scan_results_utils.py`	Add `generate_table` and `generate_jira_table`; add `load_json_file` for validated JSON loading; add `get_lines` for line-range parsing.
Table generator replacements `src/scanoss/inspection/raw/copyleft.py`, `src/scanoss/inspection/raw/undeclared_component.py`, `src/scanoss/inspection/dependency_track/project_violation.py`	Replace instance `generate_table`/`generate_jira_table` calls with imports from `markdown_utils`; note `project_violation.py` contains a likely typo `qgenerate_jira_table`.
Raw input loading centralization `src/scanoss/inspection/raw/raw_base.py`	Replace manual JSON file I/O with `load_json_file` helper and standardized parse/error handling.
Formatter stubs & output pathing `src/scanoss/inspection/raw/component_summary.py`, `src/scanoss/inspection/raw/license_summary.py`	Add `_json`, `_markdown`, `_jira_markdown` stubs and `_format` method; route `run()` output through `_format` before writing.
Minor robustness `src/scanoss/scanners/folder_hasher.py`	Initialize `full_file_path` before loop to avoid uninitialized-variable in exception paths.
Docs / changelog `CHANGELOG.md`, `CLIENT_HELP.md`	Add 1.39.0 entry, document `glc-codequality`, `gitlab matches` inspector, new conversion examples, and new folder-scan options.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant MatchSummary
    participant GitLabQR
    participant FileUtils
    participant MarkdownUtils
    participant Output

    rect rgba(240,249,255,1)
    Note over CLI,MatchSummary: Inspect GitLab matches flow
    User->>CLI: inspect gitlab matches --input results.json [--output out.md]
    CLI->>MatchSummary: instantiate(args)
    MatchSummary->>FileUtils: load_json_file(results.json)
    FileUtils-->>MatchSummary: parsed JSON
    MatchSummary->>MatchSummary: validate & build summaries
    MatchSummary->>MarkdownUtils: generate_table(...) for sections
    MarkdownUtils-->>MatchSummary: markdown string
    MatchSummary->>Output: print_to_file_or_stdout(markdown)
    end

    rect rgba(255,247,240,1)
    Note over CLI,GitLabQR: Convert → GitLab Code Quality JSON
    User->>CLI: convert --input results.json --format glc-codequality [--output out.json]
    CLI->>GitLabQR: produce_from_file(results.json, output)
    GitLabQR->>FileUtils: load_json_file(results.json)
    FileUtils-->>GitLabQR: parsed JSON
    GitLabQR->>GitLabQR: map results → CodeQuality items
    GitLabQR->>Output: write JSON (stdout/file)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas needing extra attention:
- GitLabQualityReport mapping (fingerprint, severity, location/lines) and exact GitLab Code Quality JSON schema.
- MatchSummary validation, URL/line-range construction, and Markdown rendering correctness.
- The project_violation.py reference to qgenerate_jira_table (likely typo).
- New centralized load_json_file error propagation and callers' assumptions.
- CLI subcommand wiring and argument validation for inspect gitlab matches.

Possibly related PRs

[SP-2587] Add directory simhash, modify concatenated names to remove extensions #119 — touches src/scanoss/scanners/folder_hasher.py; related at the folder_hasher change level.
feat: ES-163 Add folder hashing support #98 — earlier folder_hasher additions; related to the robustness tweak.
Feat/dep track inspection #140 — refactors inspection formatting and table generation; related to replacing instance table generators with markdown_utils.

Suggested labels

enhancement

Suggested reviewers

isasmendiagus
eeisegn
matiasdaloia

Poem

🐰 A hop of JSON, tables and light,
New match lines stitched through day and night.
CLI whispers, reports set free,
Markdown blooms, glc sings with glee.
The rabbit smiles: "Ship it, whee!"

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title "Feat/added code quality format to convert subcomand and added new match summary into inspect subcommand" directly addresses the two primary changes in the PR: the addition of the `glc-codequality` format to the convert subcommand and the new `inspect gitlab matches` subcommand. The title is specific and communicates the main features added without being vague or generic. However, the title contains a typo ("subcomand" instead of "subcommand"), uses a prefixed convention ("Feat/"), and doesn't mention the utility module refactoring—though the latter is appropriately secondary to the main feature additions. A teammate reviewing git history would clearly understand the purpose of this PR from the title alone.
Docstring Coverage	✅ Passed	Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2025-10-22T20:20:59Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 86
Detected files undeclared: 0
Detected files declared: 86
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-22T20:44:37Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 85
Detected files undeclared: 0
Detected files declared: 85
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-22T20:56:44Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 85
Detected files undeclared: 0
Detected files declared: 85
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-22T20:59:20Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 85
Detected files undeclared: 0
Detected files declared: 85
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

src/scanoss/inspection/raw/match_summary.py

src/scanoss/cli.py

github-actions · 2025-10-23T10:27:51Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 76
Detected files undeclared: 0
Detected files declared: 76
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-23T10:37:47Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 67
Detected files undeclared: 0
Detected files declared: 67
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-23T12:09:31Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 74
Detected files undeclared: 0
Detected files declared: 74
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (4)

src/scanoss/inspection/utils/markdown_utils.py (1)

52-52: Consider removing the unused centered_columns parameter.

The centered_columns parameter is declared but never used in the function body. If Jira tables don't support column centering, consider removing this parameter to avoid confusion.

src/scanoss/inspection/raw/match_summary.py (1)

174-179: Redundant purl validation checks.

Lines 174-179 contain redundant validation. The check on line 177 (if not len(result.get('purl')) > 0) is redundant with line 174 (if not result.get('purl')) since an empty list is falsy in Python.

Consider consolidating to a single, clearer check:

-                if not result.get('purl'):
-                    self.print_debug(f'ERROR: No purl found for file {file_name}')
-                    continue
-                if not len(result.get('purl')) > 0:
+                if not result.get('purl') or len(result.get('purl')) == 0:
                     self.print_debug(f'ERROR: No purl found for file {file_name}')
                     continue

Or even simpler, since the first check catches both None and empty list:

                 if not result.get('purl'):
                     self.print_debug(f'ERROR: No purl found for file {file_name}')
                     continue
-                if not len(result.get('purl')) > 0:
-                    self.print_debug(f'ERROR: No purl found for file {file_name}')
-                    continue

src/scanoss/gitlabqualityreport.py (2)

58-64: Redundant assignment in constructor.

Line 64 (self.debug = debug) is redundant since the parent class ScanossBase.__init__(debug) already sets self.debug.

Remove the redundant assignment:

     def __init__(self, debug: bool = False, output_file: str = None):
         """
         Initialise the GitLabCodeQuality class
         """
         super().__init__(debug)
         self.output_file = output_file
-        self.debug = debug

92-103: Consider using context manager for file handling.

The file handling could be improved by using a context manager consistently, though the current implementation works correctly.

     def _write_output(self, data: list[CodeQuality], output_file: str = None) -> bool:
         """Write the Gitlab Code Quality Report to output."""
         try:
             json_data = [item.to_dict() for item in data]
-            file = open(output_file, 'w') if output_file else sys.stdout
-            print(json.dumps(json_data, indent=2), file=file)
-            if output_file:
-                file.close()
+            if output_file:
+                with open(output_file, 'w') as file:
+                    print(json.dumps(json_data, indent=2), file=file)
+            else:
+                print(json.dumps(json_data, indent=2), file=sys.stdout)
             return True
         except Exception as e:
             self.print_stderr(f'Error writing output: {str(e)}')
             return False

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 099ede5 and 1e03685.

📒 Files selected for processing (13)

CHANGELOG.md (2 hunks)
CLIENT_HELP.md (4 hunks)
src/scanoss/__init__.py (1 hunks)
src/scanoss/cli.py (8 hunks)
src/scanoss/gitlabqualityreport.py (1 hunks)
src/scanoss/inspection/dependency_track/project_violation.py (3 hunks)
src/scanoss/inspection/policy_check.py (0 hunks)
src/scanoss/inspection/raw/copyleft.py (3 hunks)
src/scanoss/inspection/raw/match_summary.py (1 hunks)
src/scanoss/inspection/raw/raw_base.py (2 hunks)
src/scanoss/inspection/raw/undeclared_component.py (3 hunks)
src/scanoss/inspection/utils/file_utils.py (1 hunks)
src/scanoss/inspection/utils/markdown_utils.py (1 hunks)

💤 Files with no reviewable changes (1)

src/scanoss/inspection/policy_check.py

🧰 Additional context used

🧬 Code graph analysis (7)

src/scanoss/inspection/dependency_track/project_violation.py (2)

src/scanoss/inspection/utils/markdown_utils.py (2)

generate_jira_table (52-63)

generate_table (25-50)

src/scanoss/inspection/raw/copyleft.py (1)

_md_summary_generator (126-166)

src/scanoss/gitlabqualityreport.py (2)

src/scanoss/scanossbase.py (1)

ScanossBase (28-107)

src/scanoss/inspection/raw/match_summary.py (1)

_get_lines (96-112)

src/scanoss/inspection/raw/copyleft.py (2)

src/scanoss/inspection/utils/markdown_utils.py (2)

generate_jira_table (52-63)

generate_table (25-50)

src/scanoss/inspection/dependency_track/project_violation.py (1)

_md_summary_generator (380-431)

src/scanoss/inspection/raw/raw_base.py (2)

src/scanoss/inspection/utils/file_utils.py (1)

load_json_file (29-44)

src/scanoss/cli.py (1)

print_stderr (94-98)

src/scanoss/inspection/raw/undeclared_component.py (1)

src/scanoss/inspection/utils/markdown_utils.py (2)

generate_jira_table (52-63)

generate_table (25-50)

src/scanoss/inspection/raw/match_summary.py (3)

src/scanoss/scanossbase.py (2)

ScanossBase (28-107)

print_to_file_or_stdout (83-94)

src/scanoss/inspection/utils/file_utils.py (1)

load_json_file (29-44)

src/scanoss/inspection/utils/markdown_utils.py (1)

generate_table (25-50)

src/scanoss/cli.py (2)

src/scanoss/inspection/raw/match_summary.py (2)

MatchSummary (60-277)

run (259-277)

src/scanoss/gitlabqualityreport.py (2)

GitLabQualityReport (52-155)

produce_from_file (140-155)

🪛 LanguageTool

CHANGELOG.md

[grammar] ~14-~14: Ensure spelling is correct
Context: ...ded glc-codequality format to convert subcomand - Added inspect gitlab matches subcomman...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🔇 Additional comments (27)

src/scanoss/__init__.py (1)

25-25: LGTM! Version bump aligns with new features.

The version bump to 1.38.0 correctly reflects the addition of new features (GitLab Code Quality support, new inspect commands, and utility modules) per semantic versioning conventions.

src/scanoss/inspection/dependency_track/project_violation.py (3)

31-31: LGTM! Refactor to shared markdown utilities.

The import of generate_table and generate_jira_table from the new markdown_utils module centralizes table generation logic, eliminating duplication across the codebase.

199-199: LGTM! Delegating to shared utility.

The call to generate_table (module-level helper) replaces the previous instance method, correctly delegating table generation to the centralized utility.

211-211: LGTM! Delegating to shared utility.

The call to generate_jira_table (module-level helper) replaces the previous instance method, correctly delegating Jira table generation to the centralized utility.

CHANGELOG.md (1)

708-708: LGTM! Manifest link added.

The manifest link for version 1.38.0 correctly points to the comparison between 1.37.1 and 1.38.0.

src/scanoss/inspection/raw/undeclared_component.py (3)

30-30: LGTM! Refactor to shared markdown utilities.

The import of table generation helpers from markdown_utils aligns with the broader refactoring to centralize Markdown formatting logic.

197-197: LGTM! Using shared utility.

Correctly delegates Markdown table generation to the module-level generate_table helper.

215-215: LGTM! Using shared utility.

Correctly delegates Jira Markdown table generation to the module-level generate_jira_table helper.

src/scanoss/inspection/raw/copyleft.py (3)

30-30: LGTM! Refactor to shared markdown utilities.

Consistent with the broader refactoring effort to centralize table generation logic in the markdown_utils module.

115-115: LGTM! Using shared utility.

Correctly delegates to the module-level generate_table helper.

124-124: LGTM! Using shared utility.

Correctly delegates to the module-level generate_jira_table helper.

src/scanoss/inspection/raw/raw_base.py (2)

30-30: LGTM! Refactor to shared file utilities.

The import of load_json_file from the new file_utils module centralizes JSON loading logic, reducing code duplication.

315-319: LGTM! Simplified JSON loading with shared utility.

The refactored _load_input_file method correctly delegates JSON loading to the load_json_file helper while preserving error handling behavior (returning None on exceptions).

CLIENT_HELP.md (4)

241-241: LGTM! Documentation updated for new formats.

The updated description correctly reflects the addition of GitLab Code Quality Report format alongside existing formats.

252-256: LGTM! Clear example for GitLab Code Quality Report conversion.

The command example clearly demonstrates how to convert scan results to GitLab Code Quality Report format using the new glc-codequality format option.

419-419: LGTM! Inspect commands list updated.

The addition of "GitLab Components Match Summary (gitlab matches)" correctly documents the new inspect subcommand.

539-543: LGTM! Clear example for GitLab match summary.

The command example clearly demonstrates how to generate a GitLab-compatible Markdown match summary from SCANOSS scan results.

src/scanoss/inspection/utils/markdown_utils.py (1)

25-50: LGTM! Well-structured markdown table generator.

The function correctly handles centered columns using markdown syntax and properly constructs the table with headers, separators, and data rows.

src/scanoss/cli.py (5)

43-43: LGTM! Imports are correctly added.

The new imports for MatchSummary and GitLabQualityReport are properly placed and follow the existing import organization.

Also applies to: 77-77

288-288: LGTM! Format choice correctly added.

The glc-codequality format has been properly added to the choices list, consistent with the naming convention used in the convert handler.

799-873: LGTM! Well-structured GitLab subcommand configuration.

The GitLab integration subcommand is properly configured with:

Clear hierarchical structure (inspect → gitlab → matches)

Appropriate required and optional arguments

Comprehensive help text and descriptions

Proper handler function binding

1680-1684: LGTM! Convert handler properly implemented.

The GitLab Code Quality format handler follows the same pattern as existing format handlers and correctly instantiates and calls the GitLabQualityReport class.

1958-2008: LGTM! Comprehensive handler implementation.

The inspect_gitlab_matches handler is well-implemented with:

Proper output file initialization

Clear parameter passing to MatchSummary

Robust error handling with debug support

Excellent documentation explaining the command's purpose and parameters

src/scanoss/inspection/raw/match_summary.py (3)

32-58: LGTM! Well-defined data structures.

The dataclasses provide clear, well-documented structures for organizing match results. The separation between MatchSummaryItem and ComponentMatchSummary supports clean categorization of match types.

200-257: LGTM! Markdown generation is well-structured.

The markdown generation creates a clean, collapsible report structure. The use of HTML <details> tags for collapsible sections is appropriate for GitLab/GitHub markdown rendering.

Note: Empty match lists will generate tables with headers only, which is acceptable behavior.

259-277: LGTM! Clean orchestration of the workflow.

The run method provides a clear entry point and properly orchestrates the entire summary generation process.

src/scanoss/gitlabqualityreport.py (1)

9-39: LGTM! Well-structured dataclasses for GitLab Code Quality format.

The dataclasses correctly model the GitLab Code Quality report structure with proper nesting and a clean to_dict method for JSON serialization.

CHANGELOG.md

src/scanoss/gitlabqualityreport.py

src/scanoss/inspection/raw/match_summary.py

src/scanoss/inspection/utils/file_utils.py

github-actions · 2025-10-23T17:22:37Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 77
Detected files undeclared: 0
Detected files declared: 77
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (1)

src/scanoss/inspection/utils/file_utils.py (1)
25-31: Use precise typing for JSON payload.

Prefer Dict[str, Any] to document structure clearly; aligns with prior feedback.
-from typing import Any, Dict
+from typing import Any, Dict
@@
-def load_json_file(file_path: str) -> dict:
+def load_json_file(file_path: str) -> Dict[str, Any]:
     """
     Load the file
@@
-      Returns:
-          Dict[str, Any]: The parsed JSON data
+      Returns:
+          Dict[str, Any]: The parsed JSON data
Also applies to: 29-37

🧹 Nitpick comments (5)

src/scanoss/inspection/utils/file_utils.py (1)

25-26: Harden file reading and JSON error handling.

Specify UTF‑8 and catch JSONDecodeError/UnicodeDecodeError explicitly.

 import json
 import os
+from json import JSONDecodeError
@@
-    with open(file_path, 'r') as jsonfile:
+    with open(file_path, 'r', encoding='utf-8') as jsonfile:
         try:
             return json.load(jsonfile)
-        except Exception as e:
-            raise ValueError(f'ERROR: Problem parsing input JSON: {e}')
+        except (JSONDecodeError, UnicodeDecodeError) as e:
+            raise ValueError(f'Problem parsing input JSON: {e}') from e

Also applies to: 40-44

src/scanoss/gitlabqualityreport.py (2)

25-32: Fix import order (I001).

Sort/ group stdlib vs local imports.

-import json
-import os
-import sys
-from dataclasses import dataclass
-
-from .scanossbase import ScanossBase
-from .utils import scanoss_scan_results_utils
+from dataclasses import dataclass
+import json
+import os
+import sys
+
+from .scanossbase import ScanossBase
+from .utils import scanoss_scan_results_utils

105-117: Use context manager and propagate write result.

Close files reliably and return the writer’s status.

-    def _write_output(self, data: list[CodeQuality], output_file: str = None) -> bool:
+    def _write_output(self, data: list[CodeQuality], output_file: str = None) -> bool:
@@
-            json_data = [item.to_dict() for item in data]
-            file = open(output_file, 'w') if output_file else sys.stdout
-            print(json.dumps(json_data, indent=2), file=file)
-            if output_file:
-                file.close()
+            json_data = [item.to_dict() for item in data]
+            payload = json.dumps(json_data, indent=2)
+            if output_file:
+                with open(output_file, 'w', encoding='utf-8') as fh:
+                    print(payload, file=fh)
+            else:
+                print(payload, file=sys.stdout)
             return True
@@
-        self._write_output(data=code_quality,output_file=output_file)
-        return True
+        return self._write_output(data=code_quality, output_file=output_file)

Also applies to: 118-134

src/scanoss/inspection/raw/match_summary.py (2)

25-31: Fix import order and spacing (I001).

Sort imports and remove double space.

-from dataclasses import dataclass
-
-from ...scanossbase import ScanossBase
-from ..utils.file_utils import load_json_file
-from ..utils.markdown_utils import generate_table
-from ...utils import  scanoss_scan_results_utils
+from dataclasses import dataclass
+
+from ...scanossbase import ScanossBase
+from ..utils.file_utils import load_json_file
+from ..utils.markdown_utils import generate_table
+from ...utils import scanoss_scan_results_utils

137-182: Reduce branches in validation (PLR0912) by extracting a validator.

Consolidate required‑field checks to pass linter and simplify flow.

-        for file_name, results in scan_results.items():
-            for result in results:
-                # Validate required fields - skip invalid results with debug messages
-                if not result.get('id'):
-                    self.print_debug(f'ERROR: No id found for file {file_name}')
-                    continue
-                if result.get('id') == "none":  # Skip non-matches
-                    continue
-                if not result.get('lines'):
-                    self.print_debug(f'ERROR: No lines found for file {file_name}')
-                    continue
-                if not result.get('purl'):
-                    self.print_debug(f'ERROR: No purl found for file {file_name}')
-                    continue
-                if not len(result.get('purl')) > 0:
-                    self.print_debug(f'ERROR: No purl found for file {file_name}')
-                    continue
-                if not result.get('licenses'):
-                    self.print_debug(f'ERROR: No licenses found for file {file_name}')
-                    continue
-                if len(result.get('licenses')) == 0:
-                    self.print_debug(f'ERROR: Empty licenses list for file {file_name}')
-                    continue
-                if not result.get('version'):
-                    self.print_debug(f'ERROR: No version found for file {file_name}')
-                    continue
-                if not result.get('matched'):
-                    self.print_debug(f'ERROR: No matched found for file {file_name}')
-                    continue
-                if not result.get('url'):
-                    self.print_debug(f'ERROR: No url found for file {file_name}')
-                    continue
+        for file_name, results in scan_results.items():
+            for result in results:
+                # Skip non-matches early
+                if not result.get('id') or result.get('id') == "none":
+                    if not result.get('id'):
+                        self.print_debug(f'ERROR: No id found for file {file_name}')
+                    continue
+                # Required presence checks
+                required_fields = ['lines', 'purl', 'licenses', 'version', 'matched', 'url']
+                missing = [k for k in required_fields if not result.get(k)]
+                if missing:
+                    self.print_debug(f"ERROR: Missing {', '.join(missing)} for file {file_name}")
+                    continue
+                # Non-empty list checks
+                if not result.get('purl') or len(result['purl']) == 0:
+                    self.print_debug(f'ERROR: Empty purl list for file {file_name}')
+                    continue
+                if not result.get('licenses') or len(result['licenses']) == 0:
+                    self.print_debug(f'ERROR: Empty licenses list for file {file_name}')
+                    continue

If preferred, extract this into a small _is_valid_result(file_name, result) -> bool helper to further reduce branches.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e03685 and e370ba8.

📒 Files selected for processing (5)

CHANGELOG.md (2 hunks)
src/scanoss/gitlabqualityreport.py (1 hunks)
src/scanoss/inspection/raw/match_summary.py (1 hunks)
src/scanoss/inspection/utils/file_utils.py (1 hunks)
src/scanoss/utils/scanoss_scan_results_utils.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

CHANGELOG.md

🧰 Additional context used

🧬 Code graph analysis (2)

src/scanoss/gitlabqualityreport.py (2)

src/scanoss/scanossbase.py (1)

ScanossBase (28-107)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

src/scanoss/inspection/raw/match_summary.py (4)

src/scanoss/scanossbase.py (3)

ScanossBase (28-107)

print_stdout (73-81)

print_to_file_or_stdout (83-94)

src/scanoss/inspection/utils/file_utils.py (1)

load_json_file (29-44)

src/scanoss/inspection/utils/markdown_utils.py (1)

generate_table (25-50)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

🪛 GitHub Actions: Lint

src/scanoss/gitlabqualityreport.py

[error] 25-25: I001 Import block is un-sorted or un-formatted. Organize imports.

src/scanoss/inspection/raw/match_summary.py

[error] 25-25: I001 Import block is un-sorted or un-formatted. Organize imports.

[error] 137-137: PLR0912 Too many branches (14 > 12).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (1)

src/scanoss/gitlabqualityreport.py (1)

79-103: Remove or complete the diff; Lines class lacks end field.

The proposed diff attempts to add end=end_line to the Lines() instantiation, but the Lines dataclass (line 34–36) only defines a begin: int field. Applying this diff as-is will raise a TypeError at runtime.

Either:

Add end: int field to the Lines dataclass definition and update its to_dict() serialization, OR

Revert the diff change that adds end=end_line to Lines().

The other refactoring suggestions (type hint syntax, variable rename, check_name change) are reasonable, but the diff must be corrected before merging.

Likely an incorrect or invalid review comment.

src/scanoss/gitlabqualityreport.py

src/scanoss/inspection/raw/match_summary.py

src/scanoss/utils/scanoss_scan_results_utils.py

github-actions · 2025-10-23T17:31:21Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 69
Detected files undeclared: 0
Detected files declared: 69
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/scanoss/inspection/raw/match_summary.py (1)

111-130: Guard missing prefix and anchor only the first range; preserve original ranges.

Avoid “None/…” URLs, and don’t create a cross‑range anchor; show original ranges string.

-        if result.get('id') == "snippet":
+        if result.get('id') == "snippet":
             # Snippet match: create URL with line range anchor
-            lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
-            end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]
-            file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}"
+            ranges_str = str(result.get('lines'))
+            first_range = ranges_str.split(',')[0].strip() if ranges_str else ""
+            anchor = ""
+            if first_range:
+                if '-' in first_range:
+                    start_s, end_s = (p.strip() for p in first_range.split('-', 1))
+                    anchor = f"#L{int(start_s)}-L{int(end_s)}"
+                else:
+                    anchor = f"#L{int(first_range)}"
+            file_url = f"{self.line_range_prefix}/{file_name}{anchor}" if self.line_range_prefix else f"{file_name}{anchor}"
             return MatchSummaryItem(
-                file_url=file_url,
+                file_url=file_url,
                 file=file_name,
-                license=result.get('licenses')[0].get('name'),
-                similarity=result.get('matched'),
+                license=result.get('licenses')[0].get('name'),
+                similarity=str(result.get('matched')),
                 purl=result.get('purl')[0],
                 purl_url=result.get('url'),
                 version=result.get('version'),
-                lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}"
+                lines=ranges_str
             )
-        # File match: create URL without line range
+        # File match: create URL without line range
         return MatchSummaryItem(
             file=file_name,
-            file_url=f"{self.line_range_prefix}/{file_name}",
+            file_url=f"{self.line_range_prefix}/{file_name}" if self.line_range_prefix else file_name,
             license=result.get('licenses')[0].get('name'),
-            similarity=result.get('matched'),
+            similarity=str(result.get('matched')),
             purl=result.get('purl')[0],
             purl_url=result.get('url'),
             version=result.get('version'),
             lines="all"
         )

🧹 Nitpick comments (3)

CHANGELOG.md (1)

14-16: Tighten wording and pluralization for clarity.

Use “to the … subcommand” and plural “summaries” to match the command name.

-- Added `glc-codequality` format to convert subcommand
+- Added `glc-codequality` format to the `convert` subcommand
- - Added `inspect gitlab matches` subcommand to generate GitLab-compatible Markdown match summary from SCANOSS scan results
+ - Added `inspect gitlab matches` subcommand to generate GitLab-compatible Markdown match summaries from SCANOSS scan results

src/scanoss/inspection/utils/file_utils.py (1)

29-44: Improve typing, encoding, and error handling.

More precise types, UTF‑8 decoding, and clearer JSON error help robustness.

+from typing import Any, Dict
@@
-def load_json_file(file_path: str) -> dict:
+def load_json_file(file_path: str) -> Dict[str, Any]:
@@
-    if not os.path.exists(file_path):
+    if not os.path.isfile(file_path):
         raise ValueError(f'The file "{file_path}" does not exist.')
-    with open(file_path, 'r') as jsonfile:
-        try:
-            return json.load(jsonfile)
-        except Exception as e:
-            raise ValueError(f'ERROR: Problem parsing input JSON: {e}')
+    try:
+        with open(file_path, 'r', encoding='utf-8') as jsonfile:
+            return json.load(jsonfile)
+    except json.JSONDecodeError as e:
+        # include location details when available
+        raise ValueError(f'ERROR: Problem parsing input JSON at line {e.lineno}, column {e.colno}: {e.msg}') from e

src/scanoss/inspection/raw/match_summary.py (1)

219-247: Center the Similarity column for readability.

Minor presentation tweak for Markdown tables.

-        file_match_table = generate_table(file_match_headers, file_match_rows)
+        file_match_table = generate_table(file_match_headers, file_match_rows, centered_columns=[2])
@@
-        snippet_match_table = generate_table(snippet_match_headers, snippet_match_rows)
+        snippet_match_table = generate_table(snippet_match_headers, snippet_match_rows, centered_columns=[2])

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e370ba8 and acb0e86.

📒 Files selected for processing (5)

CHANGELOG.md (2 hunks)
src/scanoss/gitlabqualityreport.py (1 hunks)
src/scanoss/inspection/raw/match_summary.py (1 hunks)
src/scanoss/inspection/utils/file_utils.py (1 hunks)
src/scanoss/utils/scanoss_scan_results_utils.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

src/scanoss/utils/scanoss_scan_results_utils.py
src/scanoss/gitlabqualityreport.py

🧰 Additional context used

🧬 Code graph analysis (1)

src/scanoss/inspection/raw/match_summary.py (4)

src/scanoss/scanossbase.py (3)

ScanossBase (28-107)

print_stdout (73-81)

print_to_file_or_stdout (83-94)

src/scanoss/inspection/utils/file_utils.py (1)

load_json_file (29-44)

src/scanoss/inspection/utils/markdown_utils.py (1)

generate_table (25-50)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

src/scanoss/inspection/raw/match_summary.py

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (5)

src/scanoss/gitlabqualityreport.py (3)

34-36: Add end field to Lines dataclass.

The Lines dataclass is missing an end field. GitLab Code Quality (Code Climate) format supports both begin and end line numbers, and including the end line improves the precision and usefulness of the report. This was flagged in a previous review but has not been fully addressed.

Apply this diff to add the end field:

 @dataclass
 class Lines:
     begin: int
+    end: int

79-119: Fix spacing and populate end field in Lines construction.

Two issues need to be addressed:

Spacing inconsistencies: Lines 93-94 have extra spaces around = operators (should be lines=Lines( and begin=0 without extra spaces)
Missing end field: After adding the end field to the Lines dataclass, both Lines constructions (for 'file' and 'snippet' matches) need to populate it

Apply this diff:

     def _get_code_quality(self, file_name: str, result: dict) -> CodeQuality or None:
         if not result.get('file_hash'):
             self.print_debug(f"Warning: no hash found for result: {result}")
             return None
 
         if result.get('id') == 'file':
             description = f"File match found in: {file_name}"
             return CodeQuality(
                 description=description,
                 check_name=file_name,
                 fingerprint=result.get('file_hash'),
                 severity="info",
                 location=Location(
                     path=file_name,
-                    lines = Lines(
-                        begin= 0
+                    lines=Lines(
+                        begin=0,
+                        end=0
                     )
                 )
             )
 
         if not result.get('lines'):
             self.print_debug(f"Warning: No lines found for result: {result}")
             return None
         lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
         if len(lines) == 0:
             self.print_debug(f"Warning: empty lines for result: {result}")
             return None
         end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]
         description = f"Snippet found in: {file_name} - lines {lines[0]}-{end_line}"
         return CodeQuality(
             description=description,
             check_name=file_name,
             fingerprint=result.get('file_hash'),
             severity="info",
             location=Location(
                 path=file_name,
                 lines=Lines(
-                    begin=lines[0]
+                    begin=lines[0],
+                    end=end_line
                 )
             )
         )

51-64: Update to_dict to include end line.

Once the end field is added to the Lines dataclass, the to_dict method should serialize both begin and end to the output JSON for completeness.

Apply this diff after adding the end field to Lines:

     def to_dict(self):
         """Convert to dictionary for JSON serialization."""
         return {
             "description": self.description,
             "check_name": self.check_name,
             "fingerprint": self.fingerprint,
             "severity": self.severity,
             "location": {
                 "path": self.location.path,
                 "lines": {
-                    "begin": self.location.lines.begin
+                    "begin": self.location.lines.begin,
+                    "end": self.location.lines.end
                 }
             }
         }

src/scanoss/inspection/raw/match_summary.py (2)

111-125: Past issues remain unaddressed: None in URLs and line range reconstruction.

Despite being marked as addressed in previous commits, the following issues are still present:

Lines 115: When self.line_range_prefix is None, the URL becomes "None/filename#L..." instead of being handled gracefully.
Line 124: Line ranges are reconstructed as "{start}-{end}", which loses comma-separated ranges. For example, "10-20,25-30" becomes "10-30", incorrectly implying a continuous range.

Apply this diff to fix both issues:

     if result.get('id') == "snippet":
         # Snippet match: create URL with line range anchor
         lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
         end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]
-        file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}"
+        file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}" if self.line_range_prefix else file_name
         return MatchSummaryItem(
             file_url=file_url,
             file=file_name,
             license=result.get('licenses')[0].get('name'),
             similarity=result.get('matched'),
             purl=result.get('purl')[0],
             purl_url=result.get('url'),
             version=result.get('version'),
-            lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}"
+            lines=result.get('lines')  # Preserve original format
         )
     # File match: create URL without line range
     return MatchSummaryItem(
         file=file_name,
-        file_url=f"{self.line_range_prefix}/{file_name}",
+        file_url=f"{self.line_range_prefix}/{file_name}" if self.line_range_prefix else file_name,
         license=result.get('licenses')[0].get('name'),

146-154: Past issue remains: 'lines' required for all matches.

Despite being marked as addressed in a previous commit, line 148 still includes 'lines' in the mandatory validations. This incorrectly rejects valid file-level matches that don't have a 'lines' field, since only snippet matches require line information.

Apply this diff:

     validations = [
         ('id', 'No id found'),
-        ('lines', 'No lines found'),
         ('purl', 'No purl found'),
         ('licenses', 'No licenses found'),
         ('version', 'No version found'),
         ('matched', 'No matched found'),
         ('url', 'No url found'),
     ]
     
     for field, error_msg in validations:
         if not result.get(field):
             self.print_debug(f'ERROR: {error_msg} for file {file_name}')
             return False
+    
+    # 'lines' is required only for snippet matches
+    if result.get('id') == "snippet" and not result.get('lines'):
+        self.print_debug(f'ERROR: No lines found for snippet match in file {file_name}')
+        return False

🧹 Nitpick comments (7)

src/scanoss/inspection/raw/component_summary.py (2)

33-74: Reconsider the premature abstraction and inconsistent type signatures.

The _json method currently returns data unchanged, adding an extra layer without transformation. The stub methods _markdown and _jira_markdown accept list[T] while _json accepts dict[str, Any], creating a type signature mismatch that suggests these methods don't share a unified interface.

Consider deferring these formatters until the actual formatting logic is needed, or establish a consistent signature across all three methods.

135-137: The _json formatter adds no value currently.

Since _json returns its input unchanged, calling it before json.dumps adds an unnecessary function call. While this may be preparing for future formatting logic, the current implementation provides no benefit.

src/scanoss/inspection/raw/license_summary.py (2)

41-79: Identical pattern to ComponentSummary with the same concerns.

This file mirrors the formatter infrastructure added to component_summary.py, including the pass-through _json method and unimplemented stub methods with inconsistent type signatures (dict[str, Any] vs list[T]). The same refactoring considerations apply here.

179-181: Pass-through _json adds no transformation.

Like in component_summary.py, the _json call is currently a no-op. The code is clear and correct, but the extra indirection provides no immediate benefit.
src/scanoss/gitlabqualityreport.py (1)
134-150: Fix spacing after comma in method call.

Line 149 is missing a space after the comma in the method call parameters.

Apply this diff:
-        self._write_output(data=code_quality,output_file=output_file)
+        self._write_output(data=code_quality, output_file=output_file)
src/scanoss/inspection/raw/match_summary.py (2)
114-114: Use idiomatic Python indexing for last element.

Replace lines[len(lines) - 1] with lines[-1] for more concise and Pythonic code.

Apply this diff:
-        end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]
+        end_line = lines[-1] if len(lines) > 1 else lines[0]
         file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}"
         return MatchSummaryItem(
             ...
-            lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}"
+            lines=f"{lines[0]}-{lines[-1] if len(lines) > 1 else lines[0]}"
Also applies to: 124-124

162-167: Remove redundant list length validation.

Lines 162-167 check if purl and licenses lists are empty, but this is redundant. Line 157 already checks if not result.get(field), and in Python, empty lists are falsy, so if not [] evaluates to True and the validation would already fail at line 159.

Apply this diff:
     for field, error_msg in validations:
         if not result.get(field):
             self.print_debug(f'ERROR: {error_msg} for file {file_name}')
             return False
-
-    # Additional validation for non-empty lists
-    if len(result.get('purl')) == 0:
-        self.print_debug(f'ERROR: No purl found for file {file_name}')
-        return False
-    if len(result.get('licenses')) == 0:
-        self.print_debug(f'ERROR: Empty licenses list for file {file_name}')
-        return False
-
     return True

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11afe36 and 7befd76.

📒 Files selected for processing (4)

src/scanoss/gitlabqualityreport.py (1 hunks)
src/scanoss/inspection/raw/component_summary.py (2 hunks)
src/scanoss/inspection/raw/license_summary.py (3 hunks)
src/scanoss/inspection/raw/match_summary.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (4)

src/scanoss/inspection/raw/license_summary.py (5)

src/scanoss/inspection/raw/component_summary.py (3)

_json (33-50)

_markdown (52-62)

_jira_markdown (64-74)

src/scanoss/inspection/policy_check.py (3)

_json (101-112)

_markdown (115-125)

_jira_markdown (128-138)

src/scanoss/inspection/raw/copyleft.py (3)

_json (91-106)

_markdown (108-115)

_jira_markdown (117-124)

src/scanoss/inspection/raw/undeclared_component.py (3)

_json (166-181)

_markdown (183-199)

_jira_markdown (201-217)

src/scanoss/scanossbase.py (1)

print_to_file_or_stdout (83-94)

src/scanoss/inspection/raw/component_summary.py (4)

src/scanoss/inspection/raw/raw_base.py (1)

RawBase (54-424)

src/scanoss/inspection/raw/license_summary.py (3)

_json (41-55)

_markdown (57-67)

_jira_markdown (69-79)

src/scanoss/inspection/policy_check.py (3)

_json (101-112)

_markdown (115-125)

_jira_markdown (128-138)

src/scanoss/scanossbase.py (1)

print_to_file_or_stdout (83-94)

src/scanoss/gitlabqualityreport.py (2)

src/scanoss/scanossbase.py (1)

ScanossBase (28-107)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

src/scanoss/inspection/raw/match_summary.py (4)

src/scanoss/scanossbase.py (3)

ScanossBase (28-107)

print_stdout (73-81)

print_to_file_or_stdout (83-94)

src/scanoss/inspection/utils/file_utils.py (1)

load_json_file (29-44)

src/scanoss/inspection/utils/markdown_utils.py (1)

generate_table (25-50)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (9)

src/scanoss/gitlabqualityreport.py (5)

1-23: LGTM! License header properly added.

The SPDX MIT license header has been correctly added as requested in previous reviews.

25-31: LGTM! Imports are well-organized.

The imports are appropriate and follow standard conventions.

66-76: LGTM! Class initialization is correct.

The initialization properly calls the parent class constructor and follows expected patterns.

121-132: LGTM! Output writing is properly implemented.

The method correctly serializes CodeQuality objects to JSON and handles both file and stdout output with appropriate error handling.

170-185: LGTM! File reading and delegation are properly implemented.

The method correctly validates the file path, reads the content, and delegates to _produce_from_str with appropriate error handling. The docstring has been properly updated from previous reviews.

src/scanoss/inspection/raw/match_summary.py (4)

1-23: License header is now present.

The MIT license header properly addresses the previous review comment.

171-202: Clean processing logic with proper validation and categorization.

The method correctly:

Loads and parses the JSON file

Skips non-matches

Validates required fields before processing

Categorizes matches by type (file vs snippet)

205-265: Well-structured Markdown generation with clean table formatting.

The method properly:

Returns early when no matches exist

Generates separate tables for file and snippet matches using the shared utility

Creates collapsible sections for better readability

Builds clickable links for files and PURLs

267-287: Clean orchestration of the workflow.

The method properly coordinates the entire process:

Loads and processes matches

Generates Markdown

Handles the no-matches case with a user-friendly message

Outputs to the configured destination

src/scanoss/gitlabqualityreport.py

src/scanoss/inspection/raw/component_summary.py

src/scanoss/inspection/raw/license_summary.py

src/scanoss/inspection/raw/match_summary.py

github-actions · 2025-10-26T14:05:47Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 75
Detected files undeclared: 0
Detected files declared: 75
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-26T14:08:07Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 77
Detected files undeclared: 0
Detected files declared: 77
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (7)

src/scanoss/inspection/raw/license_summary.py (1)

26-26: Use built‑in generics and fix spacing (duplicate of prior nit).

Prefer PEP 585 generics and correct spacing; also drop Dict import.
-from typing import Any, Dict
+from typing import Any
@@
-    def _json(self, data: Dict[str,Any]) -> Dict[str, Any]:
+    def _json(self, data: dict[str, Any]) -> dict[str, Any]:
Also applies to: 41-41

src/scanoss/inspection/raw/component_summary.py (1)

33-33: Fix type annotation syntax and use built‑in generics (duplicate of prior nit).

Remove the space before '[' and prefer built‑ins.
-    def _json(self, data: dict [str,Any]) -> dict[str,Any]:
+    def _json(self, data: dict[str, Any]) -> dict[str, Any]:

src/scanoss/cli.py (1)

2005-2016: Validate input path and show scoped help on missing args.

Add clear error messages and file existence/readability checks; show help for the exact subcommand. This prevents obscure tracebacks and improves UX.

-    if args.input is None:
-        parser.parse_args([args.subparser, '-h'])
-        sys.exit(1)
-
-    if args.line_range_prefix is None:
-        parser.parse_args([args.subparser, '-h'])
-        sys.exit(1)
+    # Required args (parser enforces, but keep defensive with scoped help)
+    if not args.input:
+        print_stderr('ERROR: Input file is required for GitLab matches summary')
+        parser.parse_args([args.subparser, args.subparsercmd, args.subparser_subcmd, '-h'])
+        sys.exit(1)
+    if not args.line_range_prefix:
+        print_stderr('ERROR: --line-range-prefix is required for GitLab file links')
+        parser.parse_args([args.subparser, args.subparsercmd, args.subparser_subcmd, '-h'])
+        sys.exit(1)
+    # Validate input file path
+    if not os.path.exists(args.input) or not os.path.isfile(args.input) or not os.access(args.input, os.R_OK):
+        print_stderr(f'ERROR: Input file does not exist, is not a file, or is not readable: {args.input}')
+        sys.exit(1)

src/scanoss/gitlabqualityreport.py (4)

34-37: Include end in line range model.

Expose both begin/end as supported by GitLab Code Quality (Code Climate).

 @dataclass
 class Lines:
-    begin: int
+    begin: int
+    end: int

51-64: Emit both begin and end in JSON.

Update serializer to include end.

         return {
             "description": self.description,
             "check_name": self.check_name,
             "fingerprint": self.fingerprint,
             "severity": self.severity,
             "location": {
                 "path": self.location.path,
-                "lines": {
-                    "begin": self.location.lines.begin
-                }
+                "lines": {
+                    "begin": self.location.lines.begin,
+                    "end": self.location.lines.end
+                }
             }
         }

99-120: Populate end line for snippets and fix spacing.

Ensure both begin and end are present.

-        lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
+        lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
         if len(lines) == 0:
             self.print_debug(f"Warning: empty lines for result: {result}")
             return None
         end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]
         description = f"Snippet found in: {file_name} - lines {lines[0]}-{end_line}"
         return CodeQuality(
@@
-            location=Location(
-                path=file_name,
-                lines=Lines(
-                    begin=lines[0]
-                )
-            )
+            location=Location(
+                path=file_name,
+                lines=Lines(
+                    begin=lines[0],
+                    end=end_line
+                )
+            )
         )

152-158: Docstring typo.

Fix “Reportoutput” → “Report output”.

-        Produce Gitlab Code Quality Reportoutput from input JSON string
+        Produce GitLab Code Quality Report output from input JSON string

🧹 Nitpick comments (11)

src/scanoss/inspection/raw/license_summary.py (3)

42-54: Docstring mismatch: method is implemented as pass‑through.

Update wording to reflect behavior (returns input unchanged and ready for serialization).

-        """
-        Format license summary data as JSON (not yet implemented).
-
-        This method is intended to return the license summary data in JSON structure
-        for serialization. The data should include license information with copyleft
-        analysis and license statistics.
-
-        :param data: List of license summary items to format
-        :return: Dictionary containing license summary information including:
-                 - licenses: List of detected licenses with SPDX IDs, URLs, and copyleft status
-                 - detectedLicenses: Total number of unique licenses
-                 - detectedLicensesWithCopyleft: Count of licenses marked as copyleft
-        """
+        """
+        Format license summary data as JSON (pass‑through).
+
+        Returns the provided license summary dictionary unchanged for JSON serialization.
+
+        :param data: Dictionary with license summary information, including:
+                     - 'licenses': list of detected licenses with SPDX IDs, URLs, and copyleft
+                     - 'detectedLicenses': total unique licenses
+                     - 'detectedLicensesWithCopyleft': count of copyleft licenses
+        """

57-79: Fail fast for unimplemented formatters.

Raise NotImplementedError instead of silently returning None.

-    def _markdown(self, data: list[T]) -> Dict[str, Any]:
+    def _markdown(self, data: list[T]) -> dict[str, Any]:
@@
-        pass
+        raise NotImplementedError("LicenseSummary._markdown is not implemented yet.")
@@
-    def _jira_markdown(self, data: list[T]) -> Dict[str, Any]:
+    def _jira_markdown(self, data: list[T]) -> dict[str, Any]:
@@
-        pass
+        raise NotImplementedError("LicenseSummary._jira_markdown is not implemented yet.")

179-181: Minor consistency: variable name and stable JSON output.

Align with ComponentSummary and reduce diff churn by sorting keys.

-        # TODO: Implement formatter to support dynamic outputs
-        json_format = self._json(license_summary)
-        self.print_to_file_or_stdout(json.dumps(json_format, indent=2), self.output)
+        # TODO: Implement formatter to support dynamic outputs
+        json_data = self._json(license_summary)
+        self.print_to_file_or_stdout(json.dumps(json_data, indent=2, sort_keys=True), self.output)

src/scanoss/inspection/raw/component_summary.py (2)

52-74: Fail fast for unimplemented Markdown formatters.

Raise NotImplementedError; also standardize return annotation to built‑ins.

-    def _markdown(self, data: list[T]) -> Dict[str, Any]:
+    def _markdown(self, data: list[T]) -> dict[str, Any]:
@@
-        pass
+        raise NotImplementedError("ComponentSummary._markdown is not implemented yet.")
@@
-    def _jira_markdown(self, data: list[T]) -> Dict[str, Any]:
+    def _jira_markdown(self, data: list[T]) -> dict[str, Any]:
@@
-        pass
+        raise NotImplementedError("ComponentSummary._jira_markdown is not implemented yet.")

135-137: Stable JSON output.

Add sort_keys for deterministic output ordering.

-        self.print_to_file_or_stdout(json.dumps(json_data, indent=2), self.output)
+        self.print_to_file_or_stdout(json.dumps(json_data, indent=2, sort_keys=True), self.output)

src/scanoss/cli.py (4)

843-848: Use a simple short option for line-range prefix.

-lpr is unconventional; prefer a single-letter short flag for usability, e.g. -L.

-    p_gl_inspect_matches.add_argument(
-        '-lpr',
+    p_gl_inspect_matches.add_argument(
+        '-L',
         '--line-range-prefix',
         required=True,
         type=str,

1275-1276: Tighten inspect help guard conditional.

args.subparser in 'inspect' relies on substring matching; use explicit tuple to avoid surprises.

-    elif (args.subparser in 'inspect') and (args.subparsercmd in ('raw', 'dt', 'glc', 'gitlab')) and (args.subparser_subcmd is None):
+    elif (args.subparser in ('inspect', 'insp', 'ins')) and (args.subparsercmd in ('raw', 'dt', 'glc', 'gitlab')) and (args.subparser_subcmd is None):

1696-1701: Nit: brand casing.

Prefer “GitLab” in user messages.

-        if not args.quiet:
-            print_stderr('Producing Gitlab code quality report...')
+        if not args.quiet:
+            print_stderr('Producing GitLab Code Quality report...')

1976-2003: Docstring grammar and clarity.

Tighten phrasing and align with command name.

-    Handle GitLab matches the summary inspection command.
+    Handle GitLab matches summary inspection command.
@@
-    This command processes SCANOSS scan output and creates human-readable Markdown.
+    This command processes SCANOSS scan output and creates a human‑readable Markdown report.

src/scanoss/gitlabqualityreport.py (2)

92-97: Normalize spacing and populate end for file-level matches.

Also set a sensible end (0 or 1) consistently.

-                location=Location(
-                    path=file_name,
-                    lines = Lines(
-                        begin= 0
-                    )
-                )
+                location=Location(
+                    path=file_name,
+                    lines=Lines(
+                        begin=0,
+                        end=0
+                    )
+                )

149-150: Minor: spacing around arguments.

PEP 8: add a space after the comma.

-        self._write_output(data=code_quality,output_file=output_file)
+        self._write_output(data=code_quality, output_file=output_file)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7befd76 and 80a0004.

📒 Files selected for processing (5)

src/scanoss/cli.py (9 hunks)
src/scanoss/gitlabqualityreport.py (1 hunks)
src/scanoss/inspection/raw/component_summary.py (2 hunks)
src/scanoss/inspection/raw/license_summary.py (3 hunks)
src/scanoss/inspection/raw/match_summary.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/scanoss/inspection/raw/match_summary.py

🧰 Additional context used

🧬 Code graph analysis (4)

src/scanoss/inspection/raw/license_summary.py (3)

src/scanoss/inspection/raw/component_summary.py (3)

_json (33-50)

_markdown (52-62)

_jira_markdown (64-74)

src/scanoss/inspection/policy_check.py (3)

_json (101-112)

_markdown (115-125)

_jira_markdown (128-138)

src/scanoss/scanossbase.py (1)

print_to_file_or_stdout (83-94)

src/scanoss/cli.py (2)

src/scanoss/inspection/raw/match_summary.py (2)

MatchSummary (63-287)

run (267-287)

src/scanoss/gitlabqualityreport.py (2)

GitLabQualityReport (66-185)

produce_from_file (170-185)

src/scanoss/inspection/raw/component_summary.py (5)

src/scanoss/inspection/raw/license_summary.py (3)

_json (41-55)

_markdown (57-67)

_jira_markdown (69-79)

src/scanoss/inspection/policy_check.py (3)

_json (101-112)

_markdown (115-125)

_jira_markdown (128-138)

src/scanoss/inspection/raw/copyleft.py (3)

_json (91-106)

_markdown (108-115)

_jira_markdown (117-124)

src/scanoss/inspection/raw/undeclared_component.py (3)

_json (166-181)

_markdown (183-199)

_jira_markdown (201-217)

src/scanoss/scanossbase.py (1)

print_to_file_or_stdout (83-94)

src/scanoss/gitlabqualityreport.py (2)

src/scanoss/scanossbase.py (1)

ScanossBase (28-107)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

🔇 Additional comments (1)

src/scanoss/cli.py (1)

288-289: Format choice LGTM.

New choice 'glc-codequality' is correctly wired.

github-actions · 2025-10-26T14:11:56Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 71
Detected files undeclared: 0
Detected files declared: 71
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-26T14:42:42Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 70
Detected files undeclared: 0
Detected files declared: 70
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-26T14:45:48Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 70
Detected files undeclared: 0
Detected files declared: 70
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

coderabbitai

Actionable comments posted: 6

♻️ Duplicate comments (8)

src/scanoss/inspection/raw/component_summary.py (1)

33-33: Type annotation spacing issue (already flagged).

The type annotation still has the spacing issue previously identified: dict[str,Any] should be dict[str, Any] (space after comma, no space before bracket).

Apply this diff to fix the spacing:
-    def _json(self, data: dict[str,Any]) -> dict[str,Any]:
+    def _json(self, data: dict[str, Any]) -> dict[str, Any]:

src/scanoss/inspection/raw/license_summary.py (1)

41-41: Fix inconsistent type annotation spacing within the same signature.

The parameter type uses dict[str,Any] (no space after comma) while the return type uses dict[str, Any] (with space). This inconsistency violates PEP 8 style guidelines.

Apply this diff to fix the spacing:
-    def _json(self, data: dict[str,Any]) -> dict[str, Any]:
+    def _json(self, data: dict[str, Any]) -> dict[str, Any]:

src/scanoss/inspection/raw/match_summary.py (3)

127-135: Guard file URL when prefix is absent and normalize slash.

Avoid emitting “None/…” and double slashes.

-        return MatchSummaryItem(
-            file=file_name,
-            file_url=f"{self.line_range_prefix}/{file_name}",
+        prefix = (self.line_range_prefix.rstrip('/') if self.line_range_prefix else '')
+        return MatchSummaryItem(
+            file=file_name,
+            file_url=f"{prefix}/{file_name}" if prefix else file_name,
             license=result.get('licenses')[0].get('name'),
             similarity=result.get('matched'),
             purl=result.get('purl')[0],
             purl_url=result.get('url'),
             version=result.get('version'),
             lines="all"
         )

111-125: Robust snippet URL and lines: guard prefix, anchor first range only, preserve original ranges, handle malformed lines.

Avoid “None/…” URLs and invalid anchors across disjoint ranges; keep the original ranges string for display and tolerate malformed input.

-        if result.get('id') == "snippet":
-            # Snippet match: create URL with line range anchor
-            lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
-            end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]
-            file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}"
-            return MatchSummaryItem(
-                file_url=file_url,
-                file=file_name,
-                license=result.get('licenses')[0].get('name'),
-                similarity=result.get('matched'),
-                purl=result.get('purl')[0],
-                purl_url=result.get('url'),
-                version=result.get('version'),
-                lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}"
-            )
+        if result.get('id') == "snippet":
+            # Build a URL anchoring only the first contiguous range; preserve the original ranges string for display
+            ranges_str = str(result.get('lines') or '').strip()
+            first_range = ranges_str.split(',', 1)[0].strip() if ranges_str else ''
+            try:
+                if first_range and '-' in first_range:
+                    start_s, end_s = (p.strip() for p in first_range.split('-', 1))
+                    start, end = int(start_s), int(end_s)
+                    anchor = f"#L{start}-L{end}"
+                elif first_range:
+                    start = int(first_range)
+                    anchor = f"#L{start}"
+                else:
+                    start = 1
+                    anchor = f"#L{start}"
+            except ValueError as e:
+                self.print_debug(f'ERROR: Malformed line range "{ranges_str}" for file {file_name}: {e}')
+                start, anchor = 1, "#L1"
+            prefix = (self.line_range_prefix.rstrip('/') if self.line_range_prefix else '')
+            file_url = f"{prefix}/{file_name}{anchor}" if prefix else file_name
+            return MatchSummaryItem(
+                file_url=file_url,
+                file=file_name,
+                license=result.get('licenses')[0].get('name'),
+                similarity=result.get('matched'),
+                purl=result.get('purl')[0],
+                purl_url=result.get('url'),
+                version=result.get('version'),
+                lines=ranges_str or str(start)
+            )

146-169: Do not require ‘lines’ for file matches; validate it only for snippet. Also avoid truthiness checks for numeric fields.

Current validation drops valid file matches and may reject matched=0.

-        validations = [
-            ('id', 'No id found'),
-            ('lines', 'No lines found'),
-            ('purl', 'No purl found'),
-            ('licenses', 'No licenses found'),
-            ('version', 'No version found'),
-            ('matched', 'No matched found'),
-            ('url', 'No url found'),
-        ]
+        validations = [
+            ('id', 'No id found'),
+            ('purl', 'No purl found'),
+            ('licenses', 'No licenses found'),
+            ('version', 'No version found'),
+            ('matched', 'No matched found'),
+            ('url', 'No url found'),
+        ]
 
-        for field, error_msg in validations:
-            if not result.get(field):
+        for field, error_msg in validations:
+            if field not in result or result.get(field) is None:
                 self.print_debug(f'ERROR: {error_msg} for file {file_name}')
                 return False
+        # 'lines' required only for snippet matches
+        if result.get('id') == "snippet" and not result.get('lines'):
+            self.print_debug(f'ERROR: No lines found for snippet match in file {file_name}')
+            return False
 
         # Additional validation for non-empty lists
         if len(result.get('purl')) == 0:
             self.print_debug(f'ERROR: No purl found for file {file_name}')
             return False
         if len(result.get('licenses')) == 0:
             self.print_debug(f'ERROR: Empty licenses list for file {file_name}')
             return False

src/scanoss/gitlabqualityreport.py (2)

34-64: Include ‘end’ in Lines and emit it in JSON.

Improves fidelity with Code Climate/GitLab expectations.

 @dataclass
 class Lines:
-    begin: int
+    begin: int
+    end: int
@@
         return {
             "description": self.description,
             "check_name": self.check_name,
             "fingerprint": self.fingerprint,
             "severity": self.severity,
             "location": {
                 "path": self.location.path,
-                "lines": {
-                    "begin": self.location.lines.begin
-                }
+                "lines": {
+                    "begin": self.location.lines.begin,
+                    "end": self.location.lines.end
+                }
             }
         }

99-119: Handle malformed line strings; include end in snippet location.

Avoid crashes on bad ranges and improve location detail.

-        if not result.get('lines'):
+        if not result.get('lines'):
             self.print_debug(f"Warning: No lines found for result: {result}")
             return None
-        lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
+        try:
+            lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
+        except Exception as e:
+            self.print_debug(f"Warning: Malformed line range for {file_name}: {e} ({result.get('lines')})")
+            return None
         if len(lines) == 0:
             self.print_debug(f"Warning: empty lines for result: {result}")
             return None
         end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]
         description = f"Snippet found in: {file_name} - lines {lines[0]}-{end_line}"
         return CodeQuality(
             description=description,
             check_name=file_name,
             fingerprint=result.get('file_hash'),
             severity="info",
             location=Location(
                 path=file_name,
-                lines=Lines(
-                    begin=lines[0]
-                )
+                lines=Lines(begin=lines[0], end=end_line)
             )
         )

src/scanoss/cli.py (1)

1978-2040: Validate input path and show nested help; keep user-facing errors friendly.

Add existence check and display the correct subcommand help.

-    if args.input is None:
-        parser.parse_args([args.subparser, '-h'])
-        sys.exit(1)
+    if not args.input:
+        print_stderr('ERROR: Input file is required for GitLab matches summary')
+        parser.parse_args([args.subparser, args.subparsercmd, 'matches', '-h'])
+        sys.exit(1)
+    if not os.path.exists(args.input) or not os.path.isfile(args.input):
+        print_stderr(f'ERROR: Input file does not exist or is not a file: {args.input}')
+        sys.exit(1)
 
-    if args.line_range_prefix is None:
-        parser.parse_args([args.subparser, '-h'])
-        sys.exit(1)
+    # line_range_prefix is optional; MatchSummary will handle absence (links without prefix)
+    # if you want to enforce it, keep the next block and adjust help-path as below
+    # if args.line_range_prefix is None:
+    #     print_stderr('ERROR: --line-range-prefix is required')
+    #     parser.parse_args([args.subparser, args.subparsercmd, 'matches', '-h'])
+    #     sys.exit(1)

To verify repo-wide, run:

#!/bin/bash
# Check all CLI help fallbacks and input validations related to gitlab matches
rg -nC2 -e 'inspect_gitlab_matches\(' -e 'subparser.*glc|gitlab' --type=py

🧹 Nitpick comments (3)

src/scanoss/inspection/raw/component_summary.py (1)

52-74: Verify parameter types when implementing these methods.

The _markdown and _jira_markdown methods accept list[T] parameters, while _json accepts dict[str, Any]. This inconsistency suggests these methods may expect different data structures (perhaps extracting the components list from the summary dict). Ensure the parameter types align with the intended usage when these placeholder methods are implemented.
src/scanoss/inspection/raw/license_summary.py (1)
26-28: Consider defining TypeVar T locally instead of importing from policy_check.

Importing T from ..policy_check creates an unusual dependency. TypeVars are typically defined within the module that uses them or in a dedicated types module. This approach creates tight coupling to policy_check solely for a type annotation.

Consider defining T locally:
 from typing import Any
 
-from ..policy_check import T
 from .raw_base import RawBase
+
+from typing import TypeVar
+T = TypeVar('T')
src/scanoss/cli.py (1)
832-858: Make --line-range-prefix optional to match docs and broader use-cases.

Docs say links are included “when a line range prefix is provided.” Keep optional in CLI and handle None in MatchSummary (guard already suggested).
-    p_gl_inspect_matches.add_argument(
+    p_gl_inspect_matches.add_argument(
         '-lpr',
         '--line-range-prefix',
-        required=True,
+        required=False,
         type=str,
         help='Base URL prefix for GitLab file links with line ranges (e.g., https://gitlab.com/org/project/-/blob/main)'
     )

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80a0004 and 26eb9b8.

📒 Files selected for processing (5)

src/scanoss/cli.py (9 hunks)
src/scanoss/gitlabqualityreport.py (1 hunks)
src/scanoss/inspection/raw/component_summary.py (2 hunks)
src/scanoss/inspection/raw/license_summary.py (3 hunks)
src/scanoss/inspection/raw/match_summary.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (5)

src/scanoss/inspection/raw/match_summary.py (4)

src/scanoss/scanossbase.py (3)

ScanossBase (28-107)

print_stdout (73-81)

print_to_file_or_stdout (83-94)

src/scanoss/inspection/utils/file_utils.py (1)

load_json_file (29-44)

src/scanoss/inspection/utils/markdown_utils.py (1)

generate_table (25-50)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

src/scanoss/cli.py (3)

src/scanoss/inspection/raw/match_summary.py (2)

MatchSummary (63-287)

run (267-287)

src/scanoss/gitlabqualityreport.py (2)

GitLabQualityReport (66-185)

produce_from_file (170-185)

src/scanoss/scanossbase.py (1)

print_stderr (45-49)

src/scanoss/inspection/raw/license_summary.py (3)

src/scanoss/inspection/raw/component_summary.py (6)

_json (33-50)

_markdown (52-62)

_jira_markdown (64-74)

_format (132-135)

run (137-142)

_get_components (113-130)

src/scanoss/inspection/policy_check.py (4)

_json (101-112)

_markdown (115-125)

_jira_markdown (128-138)

run (83-98)

src/scanoss/scanossbase.py (1)

print_to_file_or_stdout (83-94)

src/scanoss/gitlabqualityreport.py (2)

src/scanoss/scanossbase.py (1)

ScanossBase (28-107)

src/scanoss/utils/scanoss_scan_results_utils.py (1)

get_lines (25-41)

src/scanoss/inspection/raw/component_summary.py (3)

src/scanoss/inspection/raw/raw_base.py (2)

RawBase (54-424)

_get_components (83-102)

src/scanoss/inspection/raw/license_summary.py (6)

_json (41-55)

_markdown (57-67)

_jira_markdown (69-79)

_format (176-179)

run (181-186)

_get_components (156-174)

src/scanoss/scanossbase.py (1)

print_to_file_or_stdout (83-94)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (8)

src/scanoss/inspection/raw/component_summary.py (3)

24-27: Imports look good.

The new imports of Any and T are appropriate and necessary for the type annotations in the new formatting methods.

132-135: Good refactoring for extensibility.

The _format method centralizes the formatting logic and sets up a good foundation for supporting multiple output formats in the future. The TODO comment appropriately tracks the remaining work.

137-142: LGTM - Clean refactoring.

The refactoring to use _format method improves code organization and prepares the class for supporting multiple output formats. The change maintains existing behavior while improving maintainability.

src/scanoss/inspection/raw/license_summary.py (4)

57-67: LGTM!

The stub method with clear documentation is appropriate for future implementation.

69-79: LGTM!

The stub method with clear documentation is appropriate for future implementation.

176-179: LGTM!

The _format method provides a clean abstraction layer for future format support. The TODO comment appropriately signals the planned extension for additional output formats.

181-186: LGTM!

The refactoring to use _format improves consistency with other summary classes and provides better extensibility for supporting multiple output formats in the future.

src/scanoss/cli.py (1)

288-291: Add glc-codequality format: LGTM.

New convert choice is correctly wired.

src/scanoss/cli.py

src/scanoss/gitlabqualityreport.py

src/scanoss/inspection/raw/license_summary.py

src/scanoss/inspection/raw/match_summary.py

…ct command Implements GitLab-compatible Markdown summary reports for SCANOSS matches with the following changes: - Add new 'inspect gitlab matches' CLI command to generate match summaries - Create MatchSummary class to process and format scan results with collapsible sections - Extract table generation utilities into shared markdown_utils module - Extract JSON file loading into shared file_utils module - Add support for GitLab file links with line range anchors

…at and GitLab matches summary features in CLIENT_HELP.md

…roved formatting

github-actions · 2025-10-26T15:00:29Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 80
Detected files undeclared: 0
Detected files declared: 80
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

github-actions · 2025-10-26T15:01:50Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 68
Detected files undeclared: 0
Detected files declared: 68
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

…ry commands

github-actions · 2025-10-26T15:03:10Z

SCANOSS SCAN Completed 🚀

Detected components: 1
Undeclared components: 0
Declared components: 1
Detected files: 70
Detected files undeclared: 0
Detected files declared: 70
Licenses detected: 1
Licenses detected with copyleft: 0
Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

matiasdaloia · 2025-10-27T11:21:30Z

src/scanoss/inspection/raw/component_summary.py

+        This method returns the component summary data in its original JSON structure
+        without any transformation. The data can be directly serialized to JSON format.
+
+        :param data: Dictionary containing component summary information including:


In theory we are moving towards google's way of writing python docstrings

You can see examples in folder_hasher.py

def hash_directory(self, path: str) -> dict: """ Generate the folder hashing request structure from a directory path. This method builds a directory tree (DirectoryNode) and computes the associated hash data for the folder. Args: path (str): The root directory path. Returns: dict: The folder hash request structure. """ root_node = self._build_root_node(path) tree = self._hash_calc_from_node(root_node) self.tree = tree return tree

matiasdaloia · 2025-10-27T11:25:31Z

src/scanoss/inspection/raw/license_summary.py

+        analysis and license statistics.
+
+        :param data: List of license summary items to format
+        :return: Dictionary containing license summary information including:


If we know the output schema of this, maybe could be nice to add typings

matiasdaloia · 2025-10-27T11:28:33Z

src/scanoss/inspection/raw/match_summary.py

+            file_url=f"{self.line_range_prefix}/{file_name}",
+            license=result.get('licenses')[0].get('name'),
+            similarity=result.get('matched'),
+            purl=result.get('purl')[0],


We should check if licenses is not an empty list before trying to access [0] otherwise we'll have a runtime error

Same thing for result.get('purl')[0] which can be an empty list

matiasdaloia · 2025-10-27T11:30:26Z

src/scanoss/inspection/raw/match_summary.py

+        if result.get('id') == "snippet":
+            # Snippet match: create URL with line range anchor
+            lines = scanoss_scan_results_utils.get_lines(result.get('lines'))
+            end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0]


wouldn't we need to point to the first line matched?

matiasdaloia · 2025-10-27T11:31:24Z

src/scanoss/inspection/raw/match_summary.py

+        Skips invalid or incomplete results with debug messages.
+        """
+        # Load scan results from JSON file
+        scan_results = load_json_file(self.scanoss_results_path)


we need to handle possible ValueError here if the path doesn't exist

matiasdaloia · 2025-10-27T11:32:03Z

src/scanoss/inspection/raw/match_summary.py

+        """
+
+        if len(gitlab_matches_summary.files) == 0 and len(gitlab_matches_summary.snippet) == 0:
+            return ""


we could add a debug log here

matiasdaloia · 2025-10-27T11:35:39Z

src/scanoss/inspection/raw/match_summary.py

+        # Format matches as GitLab-compatible Markdown
+        matches_md = self._markdown(matches)
+        if matches_md == "":
+            self.print_stdout("No matches found.")


I wonder if there are no matches, we're just presenting a plain text. Wouldn't be better to also give it md format? So it looks nicer when displaying on the web

matiasdaloia · 2025-10-27T11:43:49Z

src/scanoss/inspection/utils/file_utils.py

+import os
+
+
+def load_json_file(file_path: str) -> dict:


We already have a util for this in src/scanoss/utils/file.py that handles different types of errors, and returns data already parsed as dict

matiasdaloia · 2025-10-27T11:49:21Z

src/scanoss/cli.py

        '-f',
        type=str,
-        choices=['cyclonedx', 'spdxlite', 'csv'],
+        choices=['cyclonedx', 'spdxlite', 'csv', 'glc-codequality'],


for me it's not clear what glc-codequality is unless i have context, but even so if you search glc code quality in google, it shows results about mercedes benz glc model

i would use something like gitlab-codequality or something clearer

matiasdaloia · 2025-10-27T11:55:38Z

src/scanoss/cli.py

+        '--line-range-prefix',
+        required=True,
+        type=str,
+        help='Base URL prefix for GitLab file links with line ranges (e.g., https://gitlab.com/org/project/-/blob/main)'


sounds confusing --line-range-prefix for the actual base url

agustingroh requested a review from eeisegn October 22, 2025 20:20

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 0a2581c to 5f85c67 Compare October 22, 2025 20:43

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 5f85c67 to 619900e Compare October 22, 2025 20:56

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 619900e to 6cee353 Compare October 22, 2025 20:58

eeisegn reviewed Oct 23, 2025

View reviewed changes

src/scanoss/inspection/raw/match_summary.py Show resolved Hide resolved

eeisegn reviewed Oct 23, 2025

View reviewed changes

src/scanoss/cli.py Outdated Show resolved Hide resolved

eeisegn reviewed Oct 23, 2025

View reviewed changes

src/scanoss/cli.py Outdated Show resolved Hide resolved

eeisegn reviewed Oct 23, 2025

View reviewed changes

src/scanoss/cli.py Outdated Show resolved Hide resolved

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 6cee353 to 748a91c Compare October 23, 2025 10:27

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 748a91c to 6c17e2f Compare October 23, 2025 10:37

agustingroh requested review from eeisegn, isasmendiagus and matiasdaloia October 23, 2025 10:39

agustingroh marked this pull request as ready for review October 23, 2025 15:51

coderabbitai bot reviewed Oct 23, 2025

View reviewed changes

src/scanoss/gitlabqualityreport.py Show resolved Hide resolved

src/scanoss/inspection/raw/match_summary.py Show resolved Hide resolved

src/scanoss/utils/scanoss_scan_results_utils.py Show resolved Hide resolved

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from e370ba8 to acb0e86 Compare October 23, 2025 17:30

agustingroh changed the title ~~Feat/added code quality format to convert subcomand and added new match summary into inspect subcomand~~ Feat/added code quality format to convert subcomand and added new match summary into inspect subcommand Oct 23, 2025

coderabbitai bot reviewed Oct 23, 2025

View reviewed changes

src/scanoss/inspection/raw/match_summary.py Show resolved Hide resolved

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from acb0e86 to 4e872a7 Compare October 24, 2025 15:25

coderabbitai bot reviewed Oct 26, 2025

View reviewed changes

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 7befd76 to 80a0004 Compare October 26, 2025 14:05

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 80a0004 to 1ff52e8 Compare October 26, 2025 14:07

agustingroh requested a review from eeisegn October 26, 2025 14:07

coderabbitai bot reviewed Oct 26, 2025

View reviewed changes

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 1ff52e8 to 66938e5 Compare October 26, 2025 14:11

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 66938e5 to f72e7fb Compare October 26, 2025 14:42

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from f72e7fb to 26eb9b8 Compare October 26, 2025 14:45

coderabbitai bot reviewed Oct 26, 2025

View reviewed changes

agustingroh and others added 4 commits October 26, 2025 11:59

chore:SP-3582 Adds documentation for the new GitLab Code Quality form…

16ea4c2

…at and GitLab matches summary features in CLIENT_HELP.md

chore:SP-3589 Enhanced the match summary with clickable links and imp…

b36b74e

…roved formatting

fixed exception print variable bug

2d48fe3

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 26eb9b8 to 38aebf0 Compare October 26, 2025 14:59

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 38aebf0 to 4b6597a Compare October 26, 2025 15:01

chore:implements Rawbase methods on ComponentSummary and LicenseSumma…

585bdde

…ry commands

agustingroh force-pushed the feat/implement-gitlab--code-quality-report branch from 4b6597a to 585bdde Compare October 26, 2025 15:02

eeisegn approved these changes Oct 27, 2025

View reviewed changes

matiasdaloia reviewed Oct 27, 2025

View reviewed changes

agustingroh merged commit 5650419 into main Oct 27, 2025
6 checks passed

agustingroh deleted the feat/implement-gitlab--code-quality-report branch October 27, 2025 14:38

coderabbitai bot mentioned this pull request Oct 28, 2025

Chore/refactor inspect module #161

Merged

Feat/added code quality format to convert subcomand and added new match summary into inspect subcommand #157

Feat/added code quality format to convert subcomand and added new match summary into inspect subcommand #157

Uh oh!

Conversation

agustingroh commented Oct 22, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What's Changed

Added

Changed

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

github-actions bot commented Oct 22, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

github-actions bot commented Oct 22, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

github-actions bot commented Oct 22, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

github-actions bot commented Oct 22, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Oct 23, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

github-actions bot commented Oct 23, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

github-actions bot commented Oct 23, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Oct 23, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Oct 23, 2025

SCANOSS SCAN Completed 🚀

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

agustingroh commented Oct 22, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 22, 2025 •

edited

Loading