-
Notifications
You must be signed in to change notification settings - Fork 24
Feat/added code quality format to convert subcomand and added new match summary into inspect subcommand #157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Warning Rate limit exceeded@agustingroh has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 11 minutes and 11 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (16)
WalkthroughAdds GitLab Code Quality JSON output and a GitLab matches Markdown inspector; extracts JSON loading, table generation, and line-range parsing into new utilities; replaces instance table generators with shared helpers; updates CLI wiring, docs, and small robustness/typing fixes. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant MatchSummary
participant GitLabQR
participant FileUtils
participant MarkdownUtils
participant Output
rect rgba(240,249,255,1)
Note over CLI,MatchSummary: Inspect GitLab matches flow
User->>CLI: inspect gitlab matches --input results.json [--output out.md]
CLI->>MatchSummary: instantiate(args)
MatchSummary->>FileUtils: load_json_file(results.json)
FileUtils-->>MatchSummary: parsed JSON
MatchSummary->>MatchSummary: validate & build summaries
MatchSummary->>MarkdownUtils: generate_table(...) for sections
MarkdownUtils-->>MatchSummary: markdown string
MatchSummary->>Output: print_to_file_or_stdout(markdown)
end
rect rgba(255,247,240,1)
Note over CLI,GitLabQR: Convert → GitLab Code Quality JSON
User->>CLI: convert --input results.json --format glc-codequality [--output out.json]
CLI->>GitLabQR: produce_from_file(results.json, output)
GitLabQR->>FileUtils: load_json_file(results.json)
FileUtils-->>GitLabQR: parsed JSON
GitLabQR->>GitLabQR: map results → CodeQuality items
GitLabQR->>Output: write JSON (stdout/file)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
0a2581c to
5f85c67
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
5f85c67 to
619900e
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
619900e to
6cee353
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
6cee353 to
748a91c
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
748a91c to
6c17e2f
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Nitpick comments (4)
src/scanoss/inspection/utils/markdown_utils.py (1)
52-52: Consider removing the unusedcentered_columnsparameter.The
centered_columnsparameter is declared but never used in the function body. If Jira tables don't support column centering, consider removing this parameter to avoid confusion.src/scanoss/inspection/raw/match_summary.py (1)
174-179: Redundant purl validation checks.Lines 174-179 contain redundant validation. The check on line 177 (
if not len(result.get('purl')) > 0) is redundant with line 174 (if not result.get('purl')) since an empty list is falsy in Python.Consider consolidating to a single, clearer check:
- if not result.get('purl'): - self.print_debug(f'ERROR: No purl found for file {file_name}') - continue - if not len(result.get('purl')) > 0: + if not result.get('purl') or len(result.get('purl')) == 0: self.print_debug(f'ERROR: No purl found for file {file_name}') continueOr even simpler, since the first check catches both None and empty list:
if not result.get('purl'): self.print_debug(f'ERROR: No purl found for file {file_name}') continue - if not len(result.get('purl')) > 0: - self.print_debug(f'ERROR: No purl found for file {file_name}') - continuesrc/scanoss/gitlabqualityreport.py (2)
58-64: Redundant assignment in constructor.Line 64 (
self.debug = debug) is redundant since the parent classScanossBase.__init__(debug)already setsself.debug.Remove the redundant assignment:
def __init__(self, debug: bool = False, output_file: str = None): """ Initialise the GitLabCodeQuality class """ super().__init__(debug) self.output_file = output_file - self.debug = debug
92-103: Consider using context manager for file handling.The file handling could be improved by using a context manager consistently, though the current implementation works correctly.
def _write_output(self, data: list[CodeQuality], output_file: str = None) -> bool: """Write the Gitlab Code Quality Report to output.""" try: json_data = [item.to_dict() for item in data] - file = open(output_file, 'w') if output_file else sys.stdout - print(json.dumps(json_data, indent=2), file=file) - if output_file: - file.close() + if output_file: + with open(output_file, 'w') as file: + print(json.dumps(json_data, indent=2), file=file) + else: + print(json.dumps(json_data, indent=2), file=sys.stdout) return True except Exception as e: self.print_stderr(f'Error writing output: {str(e)}') return False
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(4 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(8 hunks)src/scanoss/gitlabqualityreport.py(1 hunks)src/scanoss/inspection/dependency_track/project_violation.py(3 hunks)src/scanoss/inspection/policy_check.py(0 hunks)src/scanoss/inspection/raw/copyleft.py(3 hunks)src/scanoss/inspection/raw/match_summary.py(1 hunks)src/scanoss/inspection/raw/raw_base.py(2 hunks)src/scanoss/inspection/raw/undeclared_component.py(3 hunks)src/scanoss/inspection/utils/file_utils.py(1 hunks)src/scanoss/inspection/utils/markdown_utils.py(1 hunks)
💤 Files with no reviewable changes (1)
- src/scanoss/inspection/policy_check.py
🧰 Additional context used
🧬 Code graph analysis (7)
src/scanoss/inspection/dependency_track/project_violation.py (2)
src/scanoss/inspection/utils/markdown_utils.py (2)
generate_jira_table(52-63)generate_table(25-50)src/scanoss/inspection/raw/copyleft.py (1)
_md_summary_generator(126-166)
src/scanoss/gitlabqualityreport.py (2)
src/scanoss/scanossbase.py (1)
ScanossBase(28-107)src/scanoss/inspection/raw/match_summary.py (1)
_get_lines(96-112)
src/scanoss/inspection/raw/copyleft.py (2)
src/scanoss/inspection/utils/markdown_utils.py (2)
generate_jira_table(52-63)generate_table(25-50)src/scanoss/inspection/dependency_track/project_violation.py (1)
_md_summary_generator(380-431)
src/scanoss/inspection/raw/raw_base.py (2)
src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)src/scanoss/cli.py (1)
print_stderr(94-98)
src/scanoss/inspection/raw/undeclared_component.py (1)
src/scanoss/inspection/utils/markdown_utils.py (2)
generate_jira_table(52-63)generate_table(25-50)
src/scanoss/inspection/raw/match_summary.py (3)
src/scanoss/scanossbase.py (2)
ScanossBase(28-107)print_to_file_or_stdout(83-94)src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)src/scanoss/inspection/utils/markdown_utils.py (1)
generate_table(25-50)
src/scanoss/cli.py (2)
src/scanoss/inspection/raw/match_summary.py (2)
MatchSummary(60-277)run(259-277)src/scanoss/gitlabqualityreport.py (2)
GitLabQualityReport(52-155)produce_from_file(140-155)
🪛 LanguageTool
CHANGELOG.md
[grammar] ~14-~14: Ensure spelling is correct
Context: ...ded glc-codequality format to convert subcomand - Added inspect gitlab matches subcomman...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🔇 Additional comments (27)
src/scanoss/__init__.py (1)
25-25: LGTM! Version bump aligns with new features.The version bump to 1.38.0 correctly reflects the addition of new features (GitLab Code Quality support, new inspect commands, and utility modules) per semantic versioning conventions.
src/scanoss/inspection/dependency_track/project_violation.py (3)
31-31: LGTM! Refactor to shared markdown utilities.The import of
generate_tableandgenerate_jira_tablefrom the newmarkdown_utilsmodule centralizes table generation logic, eliminating duplication across the codebase.
199-199: LGTM! Delegating to shared utility.The call to
generate_table(module-level helper) replaces the previous instance method, correctly delegating table generation to the centralized utility.
211-211: LGTM! Delegating to shared utility.The call to
generate_jira_table(module-level helper) replaces the previous instance method, correctly delegating Jira table generation to the centralized utility.CHANGELOG.md (1)
708-708: LGTM! Manifest link added.The manifest link for version 1.38.0 correctly points to the comparison between 1.37.1 and 1.38.0.
src/scanoss/inspection/raw/undeclared_component.py (3)
30-30: LGTM! Refactor to shared markdown utilities.The import of table generation helpers from
markdown_utilsaligns with the broader refactoring to centralize Markdown formatting logic.
197-197: LGTM! Using shared utility.Correctly delegates Markdown table generation to the module-level
generate_tablehelper.
215-215: LGTM! Using shared utility.Correctly delegates Jira Markdown table generation to the module-level
generate_jira_tablehelper.src/scanoss/inspection/raw/copyleft.py (3)
30-30: LGTM! Refactor to shared markdown utilities.Consistent with the broader refactoring effort to centralize table generation logic in the
markdown_utilsmodule.
115-115: LGTM! Using shared utility.Correctly delegates to the module-level
generate_tablehelper.
124-124: LGTM! Using shared utility.Correctly delegates to the module-level
generate_jira_tablehelper.src/scanoss/inspection/raw/raw_base.py (2)
30-30: LGTM! Refactor to shared file utilities.The import of
load_json_filefrom the newfile_utilsmodule centralizes JSON loading logic, reducing code duplication.
315-319: LGTM! Simplified JSON loading with shared utility.The refactored
_load_input_filemethod correctly delegates JSON loading to theload_json_filehelper while preserving error handling behavior (returningNoneon exceptions).CLIENT_HELP.md (4)
241-241: LGTM! Documentation updated for new formats.The updated description correctly reflects the addition of GitLab Code Quality Report format alongside existing formats.
252-256: LGTM! Clear example for GitLab Code Quality Report conversion.The command example clearly demonstrates how to convert scan results to GitLab Code Quality Report format using the new
glc-codequalityformat option.
419-419: LGTM! Inspect commands list updated.The addition of "GitLab Components Match Summary (
gitlab matches)" correctly documents the new inspect subcommand.
539-543: LGTM! Clear example for GitLab match summary.The command example clearly demonstrates how to generate a GitLab-compatible Markdown match summary from SCANOSS scan results.
src/scanoss/inspection/utils/markdown_utils.py (1)
25-50: LGTM! Well-structured markdown table generator.The function correctly handles centered columns using markdown syntax and properly constructs the table with headers, separators, and data rows.
src/scanoss/cli.py (5)
43-43: LGTM! Imports are correctly added.The new imports for
MatchSummaryandGitLabQualityReportare properly placed and follow the existing import organization.Also applies to: 77-77
288-288: LGTM! Format choice correctly added.The
glc-codequalityformat has been properly added to the choices list, consistent with the naming convention used in the convert handler.
799-873: LGTM! Well-structured GitLab subcommand configuration.The GitLab integration subcommand is properly configured with:
- Clear hierarchical structure (inspect → gitlab → matches)
- Appropriate required and optional arguments
- Comprehensive help text and descriptions
- Proper handler function binding
1680-1684: LGTM! Convert handler properly implemented.The GitLab Code Quality format handler follows the same pattern as existing format handlers and correctly instantiates and calls the
GitLabQualityReportclass.
1958-2008: LGTM! Comprehensive handler implementation.The
inspect_gitlab_matcheshandler is well-implemented with:
- Proper output file initialization
- Clear parameter passing to
MatchSummary- Robust error handling with debug support
- Excellent documentation explaining the command's purpose and parameters
src/scanoss/inspection/raw/match_summary.py (3)
32-58: LGTM! Well-defined data structures.The dataclasses provide clear, well-documented structures for organizing match results. The separation between
MatchSummaryItemandComponentMatchSummarysupports clean categorization of match types.
200-257: LGTM! Markdown generation is well-structured.The markdown generation creates a clean, collapsible report structure. The use of HTML
<details>tags for collapsible sections is appropriate for GitLab/GitHub markdown rendering.Note: Empty match lists will generate tables with headers only, which is acceptable behavior.
259-277: LGTM! Clean orchestration of the workflow.The
runmethod provides a clear entry point and properly orchestrates the entire summary generation process.src/scanoss/gitlabqualityreport.py (1)
9-39: LGTM! Well-structured dataclasses for GitLab Code Quality format.The dataclasses correctly model the GitLab Code Quality report structure with proper nesting and a clean
to_dictmethod for JSON serialization.
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/scanoss/inspection/utils/file_utils.py (1)
25-31: Use precise typing for JSON payload.Prefer Dict[str, Any] to document structure clearly; aligns with prior feedback.
-from typing import Any, Dict +from typing import Any, Dict @@ -def load_json_file(file_path: str) -> dict: +def load_json_file(file_path: str) -> Dict[str, Any]: """ Load the file @@ - Returns: - Dict[str, Any]: The parsed JSON data + Returns: + Dict[str, Any]: The parsed JSON dataAlso applies to: 29-37
🧹 Nitpick comments (5)
src/scanoss/inspection/utils/file_utils.py (1)
25-26: Harden file reading and JSON error handling.Specify UTF‑8 and catch JSONDecodeError/UnicodeDecodeError explicitly.
import json import os +from json import JSONDecodeError @@ - with open(file_path, 'r') as jsonfile: + with open(file_path, 'r', encoding='utf-8') as jsonfile: try: return json.load(jsonfile) - except Exception as e: - raise ValueError(f'ERROR: Problem parsing input JSON: {e}') + except (JSONDecodeError, UnicodeDecodeError) as e: + raise ValueError(f'Problem parsing input JSON: {e}') from eAlso applies to: 40-44
src/scanoss/gitlabqualityreport.py (2)
25-32: Fix import order (I001).Sort/ group stdlib vs local imports.
-import json -import os -import sys -from dataclasses import dataclass - -from .scanossbase import ScanossBase -from .utils import scanoss_scan_results_utils +from dataclasses import dataclass +import json +import os +import sys + +from .scanossbase import ScanossBase +from .utils import scanoss_scan_results_utils
105-117: Use context manager and propagate write result.Close files reliably and return the writer’s status.
- def _write_output(self, data: list[CodeQuality], output_file: str = None) -> bool: + def _write_output(self, data: list[CodeQuality], output_file: str = None) -> bool: @@ - json_data = [item.to_dict() for item in data] - file = open(output_file, 'w') if output_file else sys.stdout - print(json.dumps(json_data, indent=2), file=file) - if output_file: - file.close() + json_data = [item.to_dict() for item in data] + payload = json.dumps(json_data, indent=2) + if output_file: + with open(output_file, 'w', encoding='utf-8') as fh: + print(payload, file=fh) + else: + print(payload, file=sys.stdout) return True @@ - self._write_output(data=code_quality,output_file=output_file) - return True + return self._write_output(data=code_quality, output_file=output_file)Also applies to: 118-134
src/scanoss/inspection/raw/match_summary.py (2)
25-31: Fix import order and spacing (I001).Sort imports and remove double space.
-from dataclasses import dataclass - -from ...scanossbase import ScanossBase -from ..utils.file_utils import load_json_file -from ..utils.markdown_utils import generate_table -from ...utils import scanoss_scan_results_utils +from dataclasses import dataclass + +from ...scanossbase import ScanossBase +from ..utils.file_utils import load_json_file +from ..utils.markdown_utils import generate_table +from ...utils import scanoss_scan_results_utils
137-182: Reduce branches in validation (PLR0912) by extracting a validator.Consolidate required‑field checks to pass linter and simplify flow.
- for file_name, results in scan_results.items(): - for result in results: - # Validate required fields - skip invalid results with debug messages - if not result.get('id'): - self.print_debug(f'ERROR: No id found for file {file_name}') - continue - if result.get('id') == "none": # Skip non-matches - continue - if not result.get('lines'): - self.print_debug(f'ERROR: No lines found for file {file_name}') - continue - if not result.get('purl'): - self.print_debug(f'ERROR: No purl found for file {file_name}') - continue - if not len(result.get('purl')) > 0: - self.print_debug(f'ERROR: No purl found for file {file_name}') - continue - if not result.get('licenses'): - self.print_debug(f'ERROR: No licenses found for file {file_name}') - continue - if len(result.get('licenses')) == 0: - self.print_debug(f'ERROR: Empty licenses list for file {file_name}') - continue - if not result.get('version'): - self.print_debug(f'ERROR: No version found for file {file_name}') - continue - if not result.get('matched'): - self.print_debug(f'ERROR: No matched found for file {file_name}') - continue - if not result.get('url'): - self.print_debug(f'ERROR: No url found for file {file_name}') - continue + for file_name, results in scan_results.items(): + for result in results: + # Skip non-matches early + if not result.get('id') or result.get('id') == "none": + if not result.get('id'): + self.print_debug(f'ERROR: No id found for file {file_name}') + continue + # Required presence checks + required_fields = ['lines', 'purl', 'licenses', 'version', 'matched', 'url'] + missing = [k for k in required_fields if not result.get(k)] + if missing: + self.print_debug(f"ERROR: Missing {', '.join(missing)} for file {file_name}") + continue + # Non-empty list checks + if not result.get('purl') or len(result['purl']) == 0: + self.print_debug(f'ERROR: Empty purl list for file {file_name}') + continue + if not result.get('licenses') or len(result['licenses']) == 0: + self.print_debug(f'ERROR: Empty licenses list for file {file_name}') + continueIf preferred, extract this into a small
_is_valid_result(file_name, result) -> boolhelper to further reduce branches.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.md(2 hunks)src/scanoss/gitlabqualityreport.py(1 hunks)src/scanoss/inspection/raw/match_summary.py(1 hunks)src/scanoss/inspection/utils/file_utils.py(1 hunks)src/scanoss/utils/scanoss_scan_results_utils.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/gitlabqualityreport.py (2)
src/scanoss/scanossbase.py (1)
ScanossBase(28-107)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
src/scanoss/inspection/raw/match_summary.py (4)
src/scanoss/scanossbase.py (3)
ScanossBase(28-107)print_stdout(73-81)print_to_file_or_stdout(83-94)src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)src/scanoss/inspection/utils/markdown_utils.py (1)
generate_table(25-50)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
🪛 GitHub Actions: Lint
src/scanoss/gitlabqualityreport.py
[error] 25-25: I001 Import block is un-sorted or un-formatted. Organize imports.
src/scanoss/inspection/raw/match_summary.py
[error] 25-25: I001 Import block is un-sorted or un-formatted. Organize imports.
[error] 137-137: PLR0912 Too many branches (14 > 12).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (1)
src/scanoss/gitlabqualityreport.py (1)
79-103: Remove or complete the diff; Lines class lacks end field.The proposed diff attempts to add
end=end_lineto theLines()instantiation, but theLinesdataclass (line 34–36) only defines abegin: intfield. Applying this diff as-is will raise aTypeErrorat runtime.Either:
- Add
end: intfield to theLinesdataclass definition and update itsto_dict()serialization, OR- Revert the diff change that adds
end=end_linetoLines().The other refactoring suggestions (type hint syntax, variable rename, check_name change) are reasonable, but the diff must be corrected before merging.
Likely an incorrect or invalid review comment.
e370ba8 to
acb0e86
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/scanoss/inspection/raw/match_summary.py (1)
111-130: Guard missing prefix and anchor only the first range; preserve original ranges.Avoid “None/…” URLs, and don’t create a cross‑range anchor; show original ranges string.
- if result.get('id') == "snippet": + if result.get('id') == "snippet": # Snippet match: create URL with line range anchor - lines = scanoss_scan_results_utils.get_lines(result.get('lines')) - end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] - file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}" + ranges_str = str(result.get('lines')) + first_range = ranges_str.split(',')[0].strip() if ranges_str else "" + anchor = "" + if first_range: + if '-' in first_range: + start_s, end_s = (p.strip() for p in first_range.split('-', 1)) + anchor = f"#L{int(start_s)}-L{int(end_s)}" + else: + anchor = f"#L{int(first_range)}" + file_url = f"{self.line_range_prefix}/{file_name}{anchor}" if self.line_range_prefix else f"{file_name}{anchor}" return MatchSummaryItem( - file_url=file_url, + file_url=file_url, file=file_name, - license=result.get('licenses')[0].get('name'), - similarity=result.get('matched'), + license=result.get('licenses')[0].get('name'), + similarity=str(result.get('matched')), purl=result.get('purl')[0], purl_url=result.get('url'), version=result.get('version'), - lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}" + lines=ranges_str ) - # File match: create URL without line range + # File match: create URL without line range return MatchSummaryItem( file=file_name, - file_url=f"{self.line_range_prefix}/{file_name}", + file_url=f"{self.line_range_prefix}/{file_name}" if self.line_range_prefix else file_name, license=result.get('licenses')[0].get('name'), - similarity=result.get('matched'), + similarity=str(result.get('matched')), purl=result.get('purl')[0], purl_url=result.get('url'), version=result.get('version'), lines="all" )
🧹 Nitpick comments (3)
CHANGELOG.md (1)
14-16: Tighten wording and pluralization for clarity.Use “to the … subcommand” and plural “summaries” to match the command name.
-- Added `glc-codequality` format to convert subcommand +- Added `glc-codequality` format to the `convert` subcommand - - Added `inspect gitlab matches` subcommand to generate GitLab-compatible Markdown match summary from SCANOSS scan results + - Added `inspect gitlab matches` subcommand to generate GitLab-compatible Markdown match summaries from SCANOSS scan resultssrc/scanoss/inspection/utils/file_utils.py (1)
29-44: Improve typing, encoding, and error handling.More precise types, UTF‑8 decoding, and clearer JSON error help robustness.
+from typing import Any, Dict @@ -def load_json_file(file_path: str) -> dict: +def load_json_file(file_path: str) -> Dict[str, Any]: @@ - if not os.path.exists(file_path): + if not os.path.isfile(file_path): raise ValueError(f'The file "{file_path}" does not exist.') - with open(file_path, 'r') as jsonfile: - try: - return json.load(jsonfile) - except Exception as e: - raise ValueError(f'ERROR: Problem parsing input JSON: {e}') + try: + with open(file_path, 'r', encoding='utf-8') as jsonfile: + return json.load(jsonfile) + except json.JSONDecodeError as e: + # include location details when available + raise ValueError(f'ERROR: Problem parsing input JSON at line {e.lineno}, column {e.colno}: {e.msg}') from esrc/scanoss/inspection/raw/match_summary.py (1)
219-247: Center the Similarity column for readability.Minor presentation tweak for Markdown tables.
- file_match_table = generate_table(file_match_headers, file_match_rows) + file_match_table = generate_table(file_match_headers, file_match_rows, centered_columns=[2]) @@ - snippet_match_table = generate_table(snippet_match_headers, snippet_match_rows) + snippet_match_table = generate_table(snippet_match_headers, snippet_match_rows, centered_columns=[2])
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.md(2 hunks)src/scanoss/gitlabqualityreport.py(1 hunks)src/scanoss/inspection/raw/match_summary.py(1 hunks)src/scanoss/inspection/utils/file_utils.py(1 hunks)src/scanoss/utils/scanoss_scan_results_utils.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/scanoss/utils/scanoss_scan_results_utils.py
- src/scanoss/gitlabqualityreport.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/inspection/raw/match_summary.py (4)
src/scanoss/scanossbase.py (3)
ScanossBase(28-107)print_stdout(73-81)print_to_file_or_stdout(83-94)src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)src/scanoss/inspection/utils/markdown_utils.py (1)
generate_table(25-50)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
acb0e86 to
4e872a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (5)
src/scanoss/gitlabqualityreport.py (3)
34-36: Addendfield to Lines dataclass.The
Linesdataclass is missing anendfield. GitLab Code Quality (Code Climate) format supports bothbeginandendline numbers, and including the end line improves the precision and usefulness of the report. This was flagged in a previous review but has not been fully addressed.Apply this diff to add the
endfield:@dataclass class Lines: begin: int + end: int
79-119: Fix spacing and populateendfield in Lines construction.Two issues need to be addressed:
- Spacing inconsistencies: Lines 93-94 have extra spaces around
=operators (should belines=Lines(andbegin=0without extra spaces)- Missing end field: After adding the
endfield to theLinesdataclass, both Lines constructions (for 'file' and 'snippet' matches) need to populate itApply this diff:
def _get_code_quality(self, file_name: str, result: dict) -> CodeQuality or None: if not result.get('file_hash'): self.print_debug(f"Warning: no hash found for result: {result}") return None if result.get('id') == 'file': description = f"File match found in: {file_name}" return CodeQuality( description=description, check_name=file_name, fingerprint=result.get('file_hash'), severity="info", location=Location( path=file_name, - lines = Lines( - begin= 0 + lines=Lines( + begin=0, + end=0 ) ) ) if not result.get('lines'): self.print_debug(f"Warning: No lines found for result: {result}") return None lines = scanoss_scan_results_utils.get_lines(result.get('lines')) if len(lines) == 0: self.print_debug(f"Warning: empty lines for result: {result}") return None end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] description = f"Snippet found in: {file_name} - lines {lines[0]}-{end_line}" return CodeQuality( description=description, check_name=file_name, fingerprint=result.get('file_hash'), severity="info", location=Location( path=file_name, lines=Lines( - begin=lines[0] + begin=lines[0], + end=end_line ) ) )
51-64: Updateto_dictto includeendline.Once the
endfield is added to theLinesdataclass, theto_dictmethod should serialize bothbeginandendto the output JSON for completeness.Apply this diff after adding the
endfield toLines:def to_dict(self): """Convert to dictionary for JSON serialization.""" return { "description": self.description, "check_name": self.check_name, "fingerprint": self.fingerprint, "severity": self.severity, "location": { "path": self.location.path, "lines": { - "begin": self.location.lines.begin + "begin": self.location.lines.begin, + "end": self.location.lines.end } } }src/scanoss/inspection/raw/match_summary.py (2)
111-125: Past issues remain unaddressed: None in URLs and line range reconstruction.Despite being marked as addressed in previous commits, the following issues are still present:
- Lines 115: When
self.line_range_prefixisNone, the URL becomes"None/filename#L..."instead of being handled gracefully.- Line 124: Line ranges are reconstructed as
"{start}-{end}", which loses comma-separated ranges. For example,"10-20,25-30"becomes"10-30", incorrectly implying a continuous range.Apply this diff to fix both issues:
if result.get('id') == "snippet": # Snippet match: create URL with line range anchor lines = scanoss_scan_results_utils.get_lines(result.get('lines')) end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] - file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}" + file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}" if self.line_range_prefix else file_name return MatchSummaryItem( file_url=file_url, file=file_name, license=result.get('licenses')[0].get('name'), similarity=result.get('matched'), purl=result.get('purl')[0], purl_url=result.get('url'), version=result.get('version'), - lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}" + lines=result.get('lines') # Preserve original format ) # File match: create URL without line range return MatchSummaryItem( file=file_name, - file_url=f"{self.line_range_prefix}/{file_name}", + file_url=f"{self.line_range_prefix}/{file_name}" if self.line_range_prefix else file_name, license=result.get('licenses')[0].get('name'),
146-154: Past issue remains: 'lines' required for all matches.Despite being marked as addressed in a previous commit, line 148 still includes
'lines'in the mandatory validations. This incorrectly rejects valid file-level matches that don't have a'lines'field, since only snippet matches require line information.Apply this diff:
validations = [ ('id', 'No id found'), - ('lines', 'No lines found'), ('purl', 'No purl found'), ('licenses', 'No licenses found'), ('version', 'No version found'), ('matched', 'No matched found'), ('url', 'No url found'), ] for field, error_msg in validations: if not result.get(field): self.print_debug(f'ERROR: {error_msg} for file {file_name}') return False + + # 'lines' is required only for snippet matches + if result.get('id') == "snippet" and not result.get('lines'): + self.print_debug(f'ERROR: No lines found for snippet match in file {file_name}') + return False
🧹 Nitpick comments (7)
src/scanoss/inspection/raw/component_summary.py (2)
33-74: Reconsider the premature abstraction and inconsistent type signatures.The
_jsonmethod currently returns data unchanged, adding an extra layer without transformation. The stub methods_markdownand_jira_markdownacceptlist[T]while_jsonacceptsdict[str, Any], creating a type signature mismatch that suggests these methods don't share a unified interface.Consider deferring these formatters until the actual formatting logic is needed, or establish a consistent signature across all three methods.
135-137: The _json formatter adds no value currently.Since
_jsonreturns its input unchanged, calling it beforejson.dumpsadds an unnecessary function call. While this may be preparing for future formatting logic, the current implementation provides no benefit.src/scanoss/inspection/raw/license_summary.py (2)
41-79: Identical pattern to ComponentSummary with the same concerns.This file mirrors the formatter infrastructure added to
component_summary.py, including the pass-through_jsonmethod and unimplemented stub methods with inconsistent type signatures (dict[str, Any]vslist[T]). The same refactoring considerations apply here.
179-181: Pass-through _json adds no transformation.Like in
component_summary.py, the_jsoncall is currently a no-op. The code is clear and correct, but the extra indirection provides no immediate benefit.src/scanoss/gitlabqualityreport.py (1)
134-150: Fix spacing after comma in method call.Line 149 is missing a space after the comma in the method call parameters.
Apply this diff:
- self._write_output(data=code_quality,output_file=output_file) + self._write_output(data=code_quality, output_file=output_file)src/scanoss/inspection/raw/match_summary.py (2)
114-114: Use idiomatic Python indexing for last element.Replace
lines[len(lines) - 1]withlines[-1]for more concise and Pythonic code.Apply this diff:
- end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] + end_line = lines[-1] if len(lines) > 1 else lines[0] file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}" return MatchSummaryItem( ... - lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}" + lines=f"{lines[0]}-{lines[-1] if len(lines) > 1 else lines[0]}"Also applies to: 124-124
162-167: Remove redundant list length validation.Lines 162-167 check if
purlandlicenseslists are empty, but this is redundant. Line 157 already checksif not result.get(field), and in Python, empty lists are falsy, soif not []evaluates toTrueand the validation would already fail at line 159.Apply this diff:
for field, error_msg in validations: if not result.get(field): self.print_debug(f'ERROR: {error_msg} for file {file_name}') return False - - # Additional validation for non-empty lists - if len(result.get('purl')) == 0: - self.print_debug(f'ERROR: No purl found for file {file_name}') - return False - if len(result.get('licenses')) == 0: - self.print_debug(f'ERROR: Empty licenses list for file {file_name}') - return False - return True
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/scanoss/gitlabqualityreport.py(1 hunks)src/scanoss/inspection/raw/component_summary.py(2 hunks)src/scanoss/inspection/raw/license_summary.py(3 hunks)src/scanoss/inspection/raw/match_summary.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
src/scanoss/inspection/raw/license_summary.py (5)
src/scanoss/inspection/raw/component_summary.py (3)
_json(33-50)_markdown(52-62)_jira_markdown(64-74)src/scanoss/inspection/policy_check.py (3)
_json(101-112)_markdown(115-125)_jira_markdown(128-138)src/scanoss/inspection/raw/copyleft.py (3)
_json(91-106)_markdown(108-115)_jira_markdown(117-124)src/scanoss/inspection/raw/undeclared_component.py (3)
_json(166-181)_markdown(183-199)_jira_markdown(201-217)src/scanoss/scanossbase.py (1)
print_to_file_or_stdout(83-94)
src/scanoss/inspection/raw/component_summary.py (4)
src/scanoss/inspection/raw/raw_base.py (1)
RawBase(54-424)src/scanoss/inspection/raw/license_summary.py (3)
_json(41-55)_markdown(57-67)_jira_markdown(69-79)src/scanoss/inspection/policy_check.py (3)
_json(101-112)_markdown(115-125)_jira_markdown(128-138)src/scanoss/scanossbase.py (1)
print_to_file_or_stdout(83-94)
src/scanoss/gitlabqualityreport.py (2)
src/scanoss/scanossbase.py (1)
ScanossBase(28-107)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
src/scanoss/inspection/raw/match_summary.py (4)
src/scanoss/scanossbase.py (3)
ScanossBase(28-107)print_stdout(73-81)print_to_file_or_stdout(83-94)src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)src/scanoss/inspection/utils/markdown_utils.py (1)
generate_table(25-50)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (9)
src/scanoss/gitlabqualityreport.py (5)
1-23: LGTM! License header properly added.The SPDX MIT license header has been correctly added as requested in previous reviews.
25-31: LGTM! Imports are well-organized.The imports are appropriate and follow standard conventions.
66-76: LGTM! Class initialization is correct.The initialization properly calls the parent class constructor and follows expected patterns.
121-132: LGTM! Output writing is properly implemented.The method correctly serializes CodeQuality objects to JSON and handles both file and stdout output with appropriate error handling.
170-185: LGTM! File reading and delegation are properly implemented.The method correctly validates the file path, reads the content, and delegates to
_produce_from_strwith appropriate error handling. The docstring has been properly updated from previous reviews.src/scanoss/inspection/raw/match_summary.py (4)
1-23: License header is now present.The MIT license header properly addresses the previous review comment.
171-202: Clean processing logic with proper validation and categorization.The method correctly:
- Loads and parses the JSON file
- Skips non-matches
- Validates required fields before processing
- Categorizes matches by type (file vs snippet)
205-265: Well-structured Markdown generation with clean table formatting.The method properly:
- Returns early when no matches exist
- Generates separate tables for file and snippet matches using the shared utility
- Creates collapsible sections for better readability
- Builds clickable links for files and PURLs
267-287: Clean orchestration of the workflow.The method properly coordinates the entire process:
- Loads and processes matches
- Generates Markdown
- Handles the no-matches case with a user-friendly message
- Outputs to the configured destination
7befd76 to
80a0004
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
80a0004 to
1ff52e8
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (7)
src/scanoss/inspection/raw/license_summary.py (1)
26-26: Use built‑in generics and fix spacing (duplicate of prior nit).Prefer PEP 585 generics and correct spacing; also drop Dict import.
-from typing import Any, Dict +from typing import Any @@ - def _json(self, data: Dict[str,Any]) -> Dict[str, Any]: + def _json(self, data: dict[str, Any]) -> dict[str, Any]:Also applies to: 41-41
src/scanoss/inspection/raw/component_summary.py (1)
33-33: Fix type annotation syntax and use built‑in generics (duplicate of prior nit).Remove the space before '[' and prefer built‑ins.
- def _json(self, data: dict [str,Any]) -> dict[str,Any]: + def _json(self, data: dict[str, Any]) -> dict[str, Any]:src/scanoss/cli.py (1)
2005-2016: Validate input path and show scoped help on missing args.Add clear error messages and file existence/readability checks; show help for the exact subcommand. This prevents obscure tracebacks and improves UX.
- if args.input is None: - parser.parse_args([args.subparser, '-h']) - sys.exit(1) - - if args.line_range_prefix is None: - parser.parse_args([args.subparser, '-h']) - sys.exit(1) + # Required args (parser enforces, but keep defensive with scoped help) + if not args.input: + print_stderr('ERROR: Input file is required for GitLab matches summary') + parser.parse_args([args.subparser, args.subparsercmd, args.subparser_subcmd, '-h']) + sys.exit(1) + if not args.line_range_prefix: + print_stderr('ERROR: --line-range-prefix is required for GitLab file links') + parser.parse_args([args.subparser, args.subparsercmd, args.subparser_subcmd, '-h']) + sys.exit(1) + # Validate input file path + if not os.path.exists(args.input) or not os.path.isfile(args.input) or not os.access(args.input, os.R_OK): + print_stderr(f'ERROR: Input file does not exist, is not a file, or is not readable: {args.input}') + sys.exit(1)src/scanoss/gitlabqualityreport.py (4)
34-37: Includeendin line range model.Expose both begin/end as supported by GitLab Code Quality (Code Climate).
@dataclass class Lines: - begin: int + begin: int + end: int
51-64: Emit both begin and end in JSON.Update serializer to include end.
return { "description": self.description, "check_name": self.check_name, "fingerprint": self.fingerprint, "severity": self.severity, "location": { "path": self.location.path, - "lines": { - "begin": self.location.lines.begin - } + "lines": { + "begin": self.location.lines.begin, + "end": self.location.lines.end + } } }
99-120: Populate end line for snippets and fix spacing.Ensure both begin and end are present.
- lines = scanoss_scan_results_utils.get_lines(result.get('lines')) + lines = scanoss_scan_results_utils.get_lines(result.get('lines')) if len(lines) == 0: self.print_debug(f"Warning: empty lines for result: {result}") return None end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] description = f"Snippet found in: {file_name} - lines {lines[0]}-{end_line}" return CodeQuality( @@ - location=Location( - path=file_name, - lines=Lines( - begin=lines[0] - ) - ) + location=Location( + path=file_name, + lines=Lines( + begin=lines[0], + end=end_line + ) + ) )
152-158: Docstring typo.Fix “Reportoutput” → “Report output”.
- Produce Gitlab Code Quality Reportoutput from input JSON string + Produce GitLab Code Quality Report output from input JSON string
🧹 Nitpick comments (11)
src/scanoss/inspection/raw/license_summary.py (3)
42-54: Docstring mismatch: method is implemented as pass‑through.Update wording to reflect behavior (returns input unchanged and ready for serialization).
- """ - Format license summary data as JSON (not yet implemented). - - This method is intended to return the license summary data in JSON structure - for serialization. The data should include license information with copyleft - analysis and license statistics. - - :param data: List of license summary items to format - :return: Dictionary containing license summary information including: - - licenses: List of detected licenses with SPDX IDs, URLs, and copyleft status - - detectedLicenses: Total number of unique licenses - - detectedLicensesWithCopyleft: Count of licenses marked as copyleft - """ + """ + Format license summary data as JSON (pass‑through). + + Returns the provided license summary dictionary unchanged for JSON serialization. + + :param data: Dictionary with license summary information, including: + - 'licenses': list of detected licenses with SPDX IDs, URLs, and copyleft + - 'detectedLicenses': total unique licenses + - 'detectedLicensesWithCopyleft': count of copyleft licenses + """
57-79: Fail fast for unimplemented formatters.Raise NotImplementedError instead of silently returning None.
- def _markdown(self, data: list[T]) -> Dict[str, Any]: + def _markdown(self, data: list[T]) -> dict[str, Any]: @@ - pass + raise NotImplementedError("LicenseSummary._markdown is not implemented yet.") @@ - def _jira_markdown(self, data: list[T]) -> Dict[str, Any]: + def _jira_markdown(self, data: list[T]) -> dict[str, Any]: @@ - pass + raise NotImplementedError("LicenseSummary._jira_markdown is not implemented yet.")
179-181: Minor consistency: variable name and stable JSON output.Align with ComponentSummary and reduce diff churn by sorting keys.
- # TODO: Implement formatter to support dynamic outputs - json_format = self._json(license_summary) - self.print_to_file_or_stdout(json.dumps(json_format, indent=2), self.output) + # TODO: Implement formatter to support dynamic outputs + json_data = self._json(license_summary) + self.print_to_file_or_stdout(json.dumps(json_data, indent=2, sort_keys=True), self.output)src/scanoss/inspection/raw/component_summary.py (2)
52-74: Fail fast for unimplemented Markdown formatters.Raise NotImplementedError; also standardize return annotation to built‑ins.
- def _markdown(self, data: list[T]) -> Dict[str, Any]: + def _markdown(self, data: list[T]) -> dict[str, Any]: @@ - pass + raise NotImplementedError("ComponentSummary._markdown is not implemented yet.") @@ - def _jira_markdown(self, data: list[T]) -> Dict[str, Any]: + def _jira_markdown(self, data: list[T]) -> dict[str, Any]: @@ - pass + raise NotImplementedError("ComponentSummary._jira_markdown is not implemented yet.")
135-137: Stable JSON output.Add sort_keys for deterministic output ordering.
- self.print_to_file_or_stdout(json.dumps(json_data, indent=2), self.output) + self.print_to_file_or_stdout(json.dumps(json_data, indent=2, sort_keys=True), self.output)src/scanoss/cli.py (4)
843-848: Use a simple short option for line-range prefix.
-lpris unconventional; prefer a single-letter short flag for usability, e.g.-L.- p_gl_inspect_matches.add_argument( - '-lpr', + p_gl_inspect_matches.add_argument( + '-L', '--line-range-prefix', required=True, type=str,
1275-1276: Tighten inspect help guard conditional.
args.subparser in 'inspect'relies on substring matching; use explicit tuple to avoid surprises.- elif (args.subparser in 'inspect') and (args.subparsercmd in ('raw', 'dt', 'glc', 'gitlab')) and (args.subparser_subcmd is None): + elif (args.subparser in ('inspect', 'insp', 'ins')) and (args.subparsercmd in ('raw', 'dt', 'glc', 'gitlab')) and (args.subparser_subcmd is None):
1696-1701: Nit: brand casing.Prefer “GitLab” in user messages.
- if not args.quiet: - print_stderr('Producing Gitlab code quality report...') + if not args.quiet: + print_stderr('Producing GitLab Code Quality report...')
1976-2003: Docstring grammar and clarity.Tighten phrasing and align with command name.
- Handle GitLab matches the summary inspection command. + Handle GitLab matches summary inspection command. @@ - This command processes SCANOSS scan output and creates human-readable Markdown. + This command processes SCANOSS scan output and creates a human‑readable Markdown report.src/scanoss/gitlabqualityreport.py (2)
92-97: Normalize spacing and populate end for file-level matches.Also set a sensible end (0 or 1) consistently.
- location=Location( - path=file_name, - lines = Lines( - begin= 0 - ) - ) + location=Location( + path=file_name, + lines=Lines( + begin=0, + end=0 + ) + )
149-150: Minor: spacing around arguments.PEP 8: add a space after the comma.
- self._write_output(data=code_quality,output_file=output_file) + self._write_output(data=code_quality, output_file=output_file)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/scanoss/cli.py(9 hunks)src/scanoss/gitlabqualityreport.py(1 hunks)src/scanoss/inspection/raw/component_summary.py(2 hunks)src/scanoss/inspection/raw/license_summary.py(3 hunks)src/scanoss/inspection/raw/match_summary.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/scanoss/inspection/raw/match_summary.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/scanoss/inspection/raw/license_summary.py (3)
src/scanoss/inspection/raw/component_summary.py (3)
_json(33-50)_markdown(52-62)_jira_markdown(64-74)src/scanoss/inspection/policy_check.py (3)
_json(101-112)_markdown(115-125)_jira_markdown(128-138)src/scanoss/scanossbase.py (1)
print_to_file_or_stdout(83-94)
src/scanoss/cli.py (2)
src/scanoss/inspection/raw/match_summary.py (2)
MatchSummary(63-287)run(267-287)src/scanoss/gitlabqualityreport.py (2)
GitLabQualityReport(66-185)produce_from_file(170-185)
src/scanoss/inspection/raw/component_summary.py (5)
src/scanoss/inspection/raw/license_summary.py (3)
_json(41-55)_markdown(57-67)_jira_markdown(69-79)src/scanoss/inspection/policy_check.py (3)
_json(101-112)_markdown(115-125)_jira_markdown(128-138)src/scanoss/inspection/raw/copyleft.py (3)
_json(91-106)_markdown(108-115)_jira_markdown(117-124)src/scanoss/inspection/raw/undeclared_component.py (3)
_json(166-181)_markdown(183-199)_jira_markdown(201-217)src/scanoss/scanossbase.py (1)
print_to_file_or_stdout(83-94)
src/scanoss/gitlabqualityreport.py (2)
src/scanoss/scanossbase.py (1)
ScanossBase(28-107)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
🔇 Additional comments (1)
src/scanoss/cli.py (1)
288-289: Format choice LGTM.New choice 'glc-codequality' is correctly wired.
1ff52e8 to
66938e5
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
66938e5 to
f72e7fb
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
f72e7fb to
26eb9b8
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
♻️ Duplicate comments (8)
src/scanoss/inspection/raw/component_summary.py (1)
33-33: Type annotation spacing issue (already flagged).The type annotation still has the spacing issue previously identified:
dict[str,Any]should bedict[str, Any](space after comma, no space before bracket).Apply this diff to fix the spacing:
- def _json(self, data: dict[str,Any]) -> dict[str,Any]: + def _json(self, data: dict[str, Any]) -> dict[str, Any]:src/scanoss/inspection/raw/license_summary.py (1)
41-41: Fix inconsistent type annotation spacing within the same signature.The parameter type uses
dict[str,Any](no space after comma) while the return type usesdict[str, Any](with space). This inconsistency violates PEP 8 style guidelines.Apply this diff to fix the spacing:
- def _json(self, data: dict[str,Any]) -> dict[str, Any]: + def _json(self, data: dict[str, Any]) -> dict[str, Any]:src/scanoss/inspection/raw/match_summary.py (3)
127-135: Guard file URL when prefix is absent and normalize slash.Avoid emitting “None/…” and double slashes.
- return MatchSummaryItem( - file=file_name, - file_url=f"{self.line_range_prefix}/{file_name}", + prefix = (self.line_range_prefix.rstrip('/') if self.line_range_prefix else '') + return MatchSummaryItem( + file=file_name, + file_url=f"{prefix}/{file_name}" if prefix else file_name, license=result.get('licenses')[0].get('name'), similarity=result.get('matched'), purl=result.get('purl')[0], purl_url=result.get('url'), version=result.get('version'), lines="all" )
111-125: Robust snippet URL and lines: guard prefix, anchor first range only, preserve original ranges, handle malformed lines.Avoid “None/…” URLs and invalid anchors across disjoint ranges; keep the original ranges string for display and tolerate malformed input.
- if result.get('id') == "snippet": - # Snippet match: create URL with line range anchor - lines = scanoss_scan_results_utils.get_lines(result.get('lines')) - end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] - file_url = f"{self.line_range_prefix}/{file_name}#L{lines[0]}-L{end_line}" - return MatchSummaryItem( - file_url=file_url, - file=file_name, - license=result.get('licenses')[0].get('name'), - similarity=result.get('matched'), - purl=result.get('purl')[0], - purl_url=result.get('url'), - version=result.get('version'), - lines=f"{lines[0]}-{lines[len(lines) - 1] if len(lines) > 1 else lines[0]}" - ) + if result.get('id') == "snippet": + # Build a URL anchoring only the first contiguous range; preserve the original ranges string for display + ranges_str = str(result.get('lines') or '').strip() + first_range = ranges_str.split(',', 1)[0].strip() if ranges_str else '' + try: + if first_range and '-' in first_range: + start_s, end_s = (p.strip() for p in first_range.split('-', 1)) + start, end = int(start_s), int(end_s) + anchor = f"#L{start}-L{end}" + elif first_range: + start = int(first_range) + anchor = f"#L{start}" + else: + start = 1 + anchor = f"#L{start}" + except ValueError as e: + self.print_debug(f'ERROR: Malformed line range "{ranges_str}" for file {file_name}: {e}') + start, anchor = 1, "#L1" + prefix = (self.line_range_prefix.rstrip('/') if self.line_range_prefix else '') + file_url = f"{prefix}/{file_name}{anchor}" if prefix else file_name + return MatchSummaryItem( + file_url=file_url, + file=file_name, + license=result.get('licenses')[0].get('name'), + similarity=result.get('matched'), + purl=result.get('purl')[0], + purl_url=result.get('url'), + version=result.get('version'), + lines=ranges_str or str(start) + )
146-169: Do not require ‘lines’ for file matches; validate it only for snippet. Also avoid truthiness checks for numeric fields.Current validation drops valid file matches and may reject matched=0.
- validations = [ - ('id', 'No id found'), - ('lines', 'No lines found'), - ('purl', 'No purl found'), - ('licenses', 'No licenses found'), - ('version', 'No version found'), - ('matched', 'No matched found'), - ('url', 'No url found'), - ] + validations = [ + ('id', 'No id found'), + ('purl', 'No purl found'), + ('licenses', 'No licenses found'), + ('version', 'No version found'), + ('matched', 'No matched found'), + ('url', 'No url found'), + ] - for field, error_msg in validations: - if not result.get(field): + for field, error_msg in validations: + if field not in result or result.get(field) is None: self.print_debug(f'ERROR: {error_msg} for file {file_name}') return False + # 'lines' required only for snippet matches + if result.get('id') == "snippet" and not result.get('lines'): + self.print_debug(f'ERROR: No lines found for snippet match in file {file_name}') + return False # Additional validation for non-empty lists if len(result.get('purl')) == 0: self.print_debug(f'ERROR: No purl found for file {file_name}') return False if len(result.get('licenses')) == 0: self.print_debug(f'ERROR: Empty licenses list for file {file_name}') return Falsesrc/scanoss/gitlabqualityreport.py (2)
34-64: Include ‘end’ in Lines and emit it in JSON.Improves fidelity with Code Climate/GitLab expectations.
@dataclass class Lines: - begin: int + begin: int + end: int @@ return { "description": self.description, "check_name": self.check_name, "fingerprint": self.fingerprint, "severity": self.severity, "location": { "path": self.location.path, - "lines": { - "begin": self.location.lines.begin - } + "lines": { + "begin": self.location.lines.begin, + "end": self.location.lines.end + } } }
99-119: Handle malformed line strings; include end in snippet location.Avoid crashes on bad ranges and improve location detail.
- if not result.get('lines'): + if not result.get('lines'): self.print_debug(f"Warning: No lines found for result: {result}") return None - lines = scanoss_scan_results_utils.get_lines(result.get('lines')) + try: + lines = scanoss_scan_results_utils.get_lines(result.get('lines')) + except Exception as e: + self.print_debug(f"Warning: Malformed line range for {file_name}: {e} ({result.get('lines')})") + return None if len(lines) == 0: self.print_debug(f"Warning: empty lines for result: {result}") return None end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] description = f"Snippet found in: {file_name} - lines {lines[0]}-{end_line}" return CodeQuality( description=description, check_name=file_name, fingerprint=result.get('file_hash'), severity="info", location=Location( path=file_name, - lines=Lines( - begin=lines[0] - ) + lines=Lines(begin=lines[0], end=end_line) ) )src/scanoss/cli.py (1)
1978-2040: Validate input path and show nested help; keep user-facing errors friendly.Add existence check and display the correct subcommand help.
- if args.input is None: - parser.parse_args([args.subparser, '-h']) - sys.exit(1) + if not args.input: + print_stderr('ERROR: Input file is required for GitLab matches summary') + parser.parse_args([args.subparser, args.subparsercmd, 'matches', '-h']) + sys.exit(1) + if not os.path.exists(args.input) or not os.path.isfile(args.input): + print_stderr(f'ERROR: Input file does not exist or is not a file: {args.input}') + sys.exit(1) - if args.line_range_prefix is None: - parser.parse_args([args.subparser, '-h']) - sys.exit(1) + # line_range_prefix is optional; MatchSummary will handle absence (links without prefix) + # if you want to enforce it, keep the next block and adjust help-path as below + # if args.line_range_prefix is None: + # print_stderr('ERROR: --line-range-prefix is required') + # parser.parse_args([args.subparser, args.subparsercmd, 'matches', '-h']) + # sys.exit(1)To verify repo-wide, run:
#!/bin/bash # Check all CLI help fallbacks and input validations related to gitlab matches rg -nC2 -e 'inspect_gitlab_matches\(' -e 'subparser.*glc|gitlab' --type=py
🧹 Nitpick comments (3)
src/scanoss/inspection/raw/component_summary.py (1)
52-74: Verify parameter types when implementing these methods.The
_markdownand_jira_markdownmethods acceptlist[T]parameters, while_jsonacceptsdict[str, Any]. This inconsistency suggests these methods may expect different data structures (perhaps extracting thecomponentslist from the summary dict). Ensure the parameter types align with the intended usage when these placeholder methods are implemented.src/scanoss/inspection/raw/license_summary.py (1)
26-28: Consider defining TypeVar T locally instead of importing from policy_check.Importing
Tfrom..policy_checkcreates an unusual dependency. TypeVars are typically defined within the module that uses them or in a dedicated types module. This approach creates tight coupling topolicy_checksolely for a type annotation.Consider defining
Tlocally:from typing import Any -from ..policy_check import T from .raw_base import RawBase + +from typing import TypeVar +T = TypeVar('T')src/scanoss/cli.py (1)
832-858: Make --line-range-prefix optional to match docs and broader use-cases.Docs say links are included “when a line range prefix is provided.” Keep optional in CLI and handle None in MatchSummary (guard already suggested).
- p_gl_inspect_matches.add_argument( + p_gl_inspect_matches.add_argument( '-lpr', '--line-range-prefix', - required=True, + required=False, type=str, help='Base URL prefix for GitLab file links with line ranges (e.g., https://gitlab.com/org/project/-/blob/main)' )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/scanoss/cli.py(9 hunks)src/scanoss/gitlabqualityreport.py(1 hunks)src/scanoss/inspection/raw/component_summary.py(2 hunks)src/scanoss/inspection/raw/license_summary.py(3 hunks)src/scanoss/inspection/raw/match_summary.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
src/scanoss/inspection/raw/match_summary.py (4)
src/scanoss/scanossbase.py (3)
ScanossBase(28-107)print_stdout(73-81)print_to_file_or_stdout(83-94)src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)src/scanoss/inspection/utils/markdown_utils.py (1)
generate_table(25-50)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
src/scanoss/cli.py (3)
src/scanoss/inspection/raw/match_summary.py (2)
MatchSummary(63-287)run(267-287)src/scanoss/gitlabqualityreport.py (2)
GitLabQualityReport(66-185)produce_from_file(170-185)src/scanoss/scanossbase.py (1)
print_stderr(45-49)
src/scanoss/inspection/raw/license_summary.py (3)
src/scanoss/inspection/raw/component_summary.py (6)
_json(33-50)_markdown(52-62)_jira_markdown(64-74)_format(132-135)run(137-142)_get_components(113-130)src/scanoss/inspection/policy_check.py (4)
_json(101-112)_markdown(115-125)_jira_markdown(128-138)run(83-98)src/scanoss/scanossbase.py (1)
print_to_file_or_stdout(83-94)
src/scanoss/gitlabqualityreport.py (2)
src/scanoss/scanossbase.py (1)
ScanossBase(28-107)src/scanoss/utils/scanoss_scan_results_utils.py (1)
get_lines(25-41)
src/scanoss/inspection/raw/component_summary.py (3)
src/scanoss/inspection/raw/raw_base.py (2)
RawBase(54-424)_get_components(83-102)src/scanoss/inspection/raw/license_summary.py (6)
_json(41-55)_markdown(57-67)_jira_markdown(69-79)_format(176-179)run(181-186)_get_components(156-174)src/scanoss/scanossbase.py (1)
print_to_file_or_stdout(83-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (8)
src/scanoss/inspection/raw/component_summary.py (3)
24-27: Imports look good.The new imports of
AnyandTare appropriate and necessary for the type annotations in the new formatting methods.
132-135: Good refactoring for extensibility.The
_formatmethod centralizes the formatting logic and sets up a good foundation for supporting multiple output formats in the future. The TODO comment appropriately tracks the remaining work.
137-142: LGTM - Clean refactoring.The refactoring to use
_formatmethod improves code organization and prepares the class for supporting multiple output formats. The change maintains existing behavior while improving maintainability.src/scanoss/inspection/raw/license_summary.py (4)
57-67: LGTM!The stub method with clear documentation is appropriate for future implementation.
69-79: LGTM!The stub method with clear documentation is appropriate for future implementation.
176-179: LGTM!The
_formatmethod provides a clean abstraction layer for future format support. The TODO comment appropriately signals the planned extension for additional output formats.
181-186: LGTM!The refactoring to use
_formatimproves consistency with other summary classes and provides better extensibility for supporting multiple output formats in the future.src/scanoss/cli.py (1)
288-291: Add glc-codequality format: LGTM.New convert choice is correctly wired.
…ct command Implements GitLab-compatible Markdown summary reports for SCANOSS matches with the following changes: - Add new 'inspect gitlab matches' CLI command to generate match summaries - Create MatchSummary class to process and format scan results with collapsible sections - Extract table generation utilities into shared markdown_utils module - Extract JSON file loading into shared file_utils module - Add support for GitLab file links with line range anchors
…at and GitLab matches summary features in CLIENT_HELP.md
26eb9b8 to
38aebf0
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
38aebf0 to
4b6597a
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
4b6597a to
585bdde
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
| This method returns the component summary data in its original JSON structure | ||
| without any transformation. The data can be directly serialized to JSON format. | ||
| :param data: Dictionary containing component summary information including: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory we are moving towards google's way of writing python docstrings
You can see examples in folder_hasher.py
def hash_directory(self, path: str) -> dict:
"""
Generate the folder hashing request structure from a directory path.
This method builds a directory tree (DirectoryNode) and computes the associated
hash data for the folder.
Args:
path (str): The root directory path.
Returns:
dict: The folder hash request structure.
"""
root_node = self._build_root_node(path)
tree = self._hash_calc_from_node(root_node)
self.tree = tree
return tree| analysis and license statistics. | ||
| :param data: List of license summary items to format | ||
| :return: Dictionary containing license summary information including: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we know the output schema of this, maybe could be nice to add typings
| file_url=f"{self.line_range_prefix}/{file_name}", | ||
| license=result.get('licenses')[0].get('name'), | ||
| similarity=result.get('matched'), | ||
| purl=result.get('purl')[0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check if licenses is not an empty list before trying to access [0] otherwise we'll have a runtime error
Same thing for result.get('purl')[0] which can be an empty list
| if result.get('id') == "snippet": | ||
| # Snippet match: create URL with line range anchor | ||
| lines = scanoss_scan_results_utils.get_lines(result.get('lines')) | ||
| end_line = lines[len(lines) - 1] if len(lines) > 1 else lines[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't we need to point to the first line matched?
| Skips invalid or incomplete results with debug messages. | ||
| """ | ||
| # Load scan results from JSON file | ||
| scan_results = load_json_file(self.scanoss_results_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to handle possible ValueError here if the path doesn't exist
| """ | ||
|
|
||
| if len(gitlab_matches_summary.files) == 0 and len(gitlab_matches_summary.snippet) == 0: | ||
| return "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could add a debug log here
| # Format matches as GitLab-compatible Markdown | ||
| matches_md = self._markdown(matches) | ||
| if matches_md == "": | ||
| self.print_stdout("No matches found.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there are no matches, we're just presenting a plain text. Wouldn't be better to also give it md format? So it looks nicer when displaying on the web
| import os | ||
|
|
||
|
|
||
| def load_json_file(file_path: str) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a util for this in src/scanoss/utils/file.py that handles different types of errors, and returns data already parsed as dict
| '-f', | ||
| type=str, | ||
| choices=['cyclonedx', 'spdxlite', 'csv'], | ||
| choices=['cyclonedx', 'spdxlite', 'csv', 'glc-codequality'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for me it's not clear what glc-codequality is unless i have context, but even so if you search glc code quality in google, it shows results about mercedes benz glc model
i would use something like gitlab-codequality or something clearer
| '--line-range-prefix', | ||
| required=True, | ||
| type=str, | ||
| help='Base URL prefix for GitLab file links with line ranges (e.g., https://gitlab.com/org/project/-/blob/main)' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds confusing --line-range-prefix for the actual base url
What's Changed
Added
glc-codequalityformat to convert subcomandinspect gitlab matchessubcommand to generate GitLab-compatible Markdown match summary from SCANOSS scan resultsmarkdown_utils.pyandfile_utils.py)Changed
markdown_utilsmodulefile_utilsmoduleSummary by CodeRabbit
New Features
Documentation
Refactor
Bug Fixes
Chores