@oscarvalenzuelab
Contributor

Security Issue

Domain and protocol checks used substring matching (the in operator), which matches at any position in a URL rather than only in the scheme or host. A crafted URL could therefore pass these checks:

Examples of vulnerable patterns:

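  • evil.com/git+malicious → contains "git+" in the path, so it passes the protocol check
  • evil-github.com-fake.ru → contains "github.com" inside a different hostname, so it passes the domain check
  • https://malicious.com?redirect=github.com/evil → contains "github.com" only in the query string, yet it passes the domain check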
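To make the failure mode concrete, here is a minimal sketch of the vulnerable pattern next to what urlparse actually reports for the same inputs. The function names are hypothetical reconstructions for illustration, not the project's original code.

```python
from urllib.parse import urlparse


def naive_is_github(url: str) -> bool:
    # Hypothetical vulnerable check: matches "github.com" anywhere in the string.
    return "github.com" in url


def naive_is_git_protocol(url: str) -> bool:
    # Hypothetical vulnerable check: matches "git+" anywhere in the string.
    return "git+" in url


def host_of(url: str) -> str:
    # urlparse only fills in the host when the URL has a scheme (or starts
    # with "//"); otherwise the whole string is treated as a path.
    return (urlparse(url).hostname or "").lower()


examples = [
    "evil.com/git+malicious",
    "evil-github.com-fake.ru",
    "https://malicious.com?redirect=github.com/evil",
]

for url in examples:
    accepted = naive_is_github(url) or naive_is_git_protocol(url)
    print(f"{url!r}: naive check accepts={accepted}, parsed host={host_of(url)!r}")
```

Every example is accepted by a substring check, while the parsed host is either empty or a domain other than github.com.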

Changes

1. src/ossval/parsers/spdx.py

  • ✅ Add _is_git_url() helper method for safe URL validation
  • ✅ Properly parse URLs and check domain via urlparse().netloc
  • ✅ Check protocol prefixes with startswith() instead of in operator
  • ✅ Fix 3 vulnerable checks in JSON and tag-value parsing
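A helper along these lines would satisfy the points above. This is a sketch only: the function body, the prefix tuple, and the host allowlist are assumptions for illustration and may differ from the merged _is_git_url().

```python
from urllib.parse import urlparse

# Assumed values for illustration; the real lists in spdx.py may differ.
GIT_PREFIXES = ("git+", "git://", "git@")
GIT_HOSTS = {"github.com", "gitlab.com", "bitbucket.org"}


def is_git_url(url: str) -> bool:
    """Return True if url looks like a git repository URL."""
    if not url:
        return False
    url = url.strip()
    # Protocol markers must appear at the start, not anywhere in the string.
    if url.startswith(GIT_PREFIXES):
        return True
    # Compare the parsed host exactly instead of searching the whole URL.
    return urlparse(url).netloc.lower() in GIT_HOSTS
```

With this shape, "git+https://example.org/repo.git" and "https://github.com/org/repo" are accepted, while "evil.com/git+malicious" and "https://github.com.evil.ru/x" are rejected.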

2. src/ossval/analyzers/repo_finder.py

  • ✅ Fix _is_valid_git_url() to parse URLs and check netloc
  • ✅ Fix _normalize_git_url() to safely check domains
  • ✅ Use urlparse to extract and verify domain names
  • ✅ Check protocol prefixes with startswith()
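As a rough illustration of safe normalization, the sketch below strips a leading "git+" with startswith(), parses the remainder, and returns a canonical URL only when the host is on an allowlist. The allowlist, the return convention (None on rejection), and the canonical form are assumptions, not the behavior of the project's _normalize_git_url().

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed allowlist for illustration only.
ALLOWED_HOSTS = {"github.com", "gitlab.com"}


def normalize_git_url(url: str) -> Optional[str]:
    """Return a canonical https URL, or None if the host is not trusted."""
    url = url.strip()
    # Only strip the marker when it is a true prefix.
    if url.startswith("git+"):
        url = url[len("git+"):]
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if parsed.scheme not in ("http", "https") or host not in ALLOWED_HOSTS:
        return None
    path = parsed.path.rstrip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"https://{host}{path}"
```

For example, "git+https://github.com/org/repo.git" normalizes to "https://github.com/org/repo", while "https://evil.com/github.com/org/repo" yields None because the trusted name appears only in the path.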

3. src/ossval/core.py

  • ✅ Fix GitHub health check to parse URL and validate netloc
  • ✅ Replace unsafe in check with urlparse validation
  • ✅ Only analyze health for actual github.com domain
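The health-check gate can be sketched as a single exact comparison on the parsed netloc. The function name is hypothetical; only the netloc comparison is taken from the description above.

```python
from urllib.parse import urlparse


def should_check_github_health(repo_url: str) -> bool:
    # Hypothetical gate: run the GitHub health analysis only when the parsed
    # network location is exactly github.com.
    return urlparse(repo_url).netloc.lower() == "github.com"
```

One property worth noting: netloc includes any userinfo and port, so an exact comparison also rejects "https://github.com@evil.com/x". It rejects "https://github.com:443/org/repo" as well, which is strict but safe; comparing urlparse(...).hostname instead would accept the explicit port.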

Security Improvements

✅ All domain checks now use urlparse().netloc == "domain.com"
✅ Protocol checks use startswith() instead of substring matching
✅ Prevents bypasses that embed a trusted domain name in the path, the query string, or a subdomain of another host
✅ All tests passing (94 tests)

Testing

pytest tests/ -v
# 94 passed, 13 warnings in 8.10s

Impact

  • Severity: Medium - Could allow URL spoofing/bypass
  • Affected versions: All versions prior to this fix
  • Mitigation: Proper URL parsing and domain validation

Release 1.2.2:
- Update version in pyproject.toml and __init__.py
- Add 1.2.2 release notes to CHANGELOG.md
- Document security fixes for URL validation vulnerabilities
@oscarvalenzuelab oscarvalenzuelab merged commit f21018f into main Dec 9, 2025
14 checks passed
@oscarvalenzuelab oscarvalenzuelab deleted the fix-url-security-vulnerabilities branch December 9, 2025 07:30