
Conversation

@richbodo (Owner) commented Nov 16, 2025

Contributor License Statement

By submitting this pull request, I confirm that:

  • I have the rights to contribute this code.
  • I license this contribution under the GPLv3 license.
  • I grant a patent license consistent with the GPLv3 terms.

Signed-off-by: (type your GitHub handle)


Note

Updates default LLM selection to prefer gpt-oss:20b and supported models, expands CLI/TUI usage docs, and adds a spec plus a repro test for the llama3-8b-local communication issue.

  • LLM selection:
    • Update prt_src/llm_model_registry.get_default_model() to prioritize gpt-oss:20b, then officially supported models, then any available model.
  • Docs:
    • README and docs/DEV_SETUP.md significantly expand CLI/TUI usage, debugging commands, and daily workflow, clarifying TUI as default and adding AI testing commands (prt-debug-info, list-models, chat examples).
  • Testing/Specs:
    • Add specs/bug_llama3_8b_local_communication_error.md detailing an Ollama 400 error with llama3-8b-local and steps to fix/validate.
    • Introduce test_llama3_communication.py to reproduce and compare model communication behavior.
  • Scripts:
    • Minor cleanup in init.sh (mac/Linux setup messaging).

Written by Cursor Bugbot for commit 9efc8ed. This will update automatically on new commits.

- Replace vague 'ALL SQL queries require confirm=true' with explicit parameter specification
- Add concrete example showing exact execute_sql tool call syntax (see the sketch after this list)
- Clarify that the tool rejects ANY SQL query without the confirm=true parameter
- Specify boolean data type and automatic execution behavior
- Eliminate ambiguity for different LLM providers about confirmation requirements
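
For reference, a minimal sketch of what such a call could look like, using the common OpenAI-style function-calling payload; the exact argument schema PRT's execute_sql tool expects is an assumption here:

# Illustrative only: an execute_sql tool call in OpenAI-style function-calling form.
# The surrounding schema is an assumption; the key point is the explicit confirm flag.
execute_sql_call = {
    "name": "execute_sql",
    "arguments": {
        "query": "SELECT COUNT(*) FROM contacts",
        "confirm": True,  # boolean; the tool rejects any query without confirm=true
    },
}
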
Phase 1: Extract Help System
- Create centralized help system in cli_modules/help.py (sketched below)
- Move help content to markdown file for single source of truth
- Update cli.py to use new help module
- Maintain backward compatibility
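
As a rough illustration of this extraction (not the project's actual code), a centralized help module might look like the following; the help.md filename and function names are assumptions:

# cli_modules/help.py -- minimal sketch; filenames and function names are assumptions
from pathlib import Path

HELP_FILE = Path(__file__).parent / "help.md"  # single markdown source of truth


def get_help_text() -> str:
    """Return the CLI help content loaded from the markdown file."""
    return HELP_FILE.read_text(encoding="utf-8")


def print_help() -> None:
    """Print the help content, falling back to Typer's built-in --help hint."""
    try:
        print(get_help_text())
    except FileNotFoundError:
        print("Help file missing; use --help for Typer's generated usage.")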

Phase 2: Extract Services
- Extract 5 service modules with zero UI dependencies:
  * export.py: Export and JSON serialization services
  * images.py: Profile image export services
  * directory.py: Interactive directory generation
  * import_google.py: Google import services
  * llm.py: LLM chat integration services
- Update imports in cli.py to use extracted services
- Remove duplicate function definitions
- All services are now independently testable

This refactoring improves maintainability and separation of concerns
while preserving all existing functionality.
… fixes

This commit completes the CLI modularization refactoring and resolves critical
compatibility issues that were causing test failures.

CLI Modularization (Phases 1-3):
- Split monolithic CLI into specialized modules
- Maintained backward compatibility layer
- Added enhanced functionality with new flags
- Fixed missing imports and test compatibility issues

Key Features:
- --prt-debug-info flag for system diagnostics
- --classic flag for forced CLI mode
- Complete modular architecture with 26+ new files
- Zero test regressions, all functionality preserved

Validation: All CLI tests pass (8/8), debug info tests pass (13/13)
- Remove custom help flag handling in favor of Typer's native --help
- Eliminate print_custom_help() import and manual help processing
- Streamline command interface by letting Typer handle help automatically
- Reduces code complexity while maintaining same functionality

This is a cleanup following the CLI modularization in PR #153.
- Add specific RuntimeError handling for critical errors (permissions, disk space)
- Provide informative stderr warnings when credential setup fails
- Differentiate between critical errors and unexpected errors
- Improve user guidance about potential system issues
- Continue gracefully while preserving error context

This improves robustness of the credential system introduced in the CLI
modularization, ensuring users get clear feedback about configuration issues.
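
A hedged sketch of the error-handling pattern described above; the helper name and messages are illustrative, not the project's actual code:

import sys


def run_credential_setup(setup_fn) -> None:
    """Run credential setup, warning on stderr instead of aborting (hypothetical helper)."""
    try:
        setup_fn()
    except RuntimeError as exc:
        # Critical but recognizable failures: permissions, disk space, etc.
        print(
            f"Warning: credential setup failed ({exc}). "
            "Check file permissions and available disk space.",
            file=sys.stderr,
        )
    except Exception as exc:  # unexpected errors: continue gracefully, keep context
        print(f"Warning: unexpected error during credential setup: {exc!r}", file=sys.stderr)
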
Security improvements:
- Replace random.choices() with secrets.choice() for Mistral tool call ID generation
- Use cryptographically secure random number generation for security-sensitive operations

Robustness improvements:
- Improve base URL processing to avoid corrupting domains like v1.example.com
- Use safer string manipulation for /v1 suffix removal
- Add explicit imports for security modules

These changes strengthen the security posture of the Mistral LLM integration
added in PR #153 and improve reliability of URL handling.
Use removesuffix() instead of rstrip() for URL handling to fix potential
issues with domains containing 'v1' in their name. This aligns with the
URL handling improvements from main branch.
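
A sketch of both changes, assuming a 9-character alphanumeric ID format and a simple base-URL normalizer (names and lengths are assumptions):

import secrets
import string


def generate_tool_call_id(length: int = 9) -> str:
    """Cryptographically secure tool call ID (alphabet and length are assumptions)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def normalize_base_url(base_url: str) -> str:
    """Strip a trailing /v1 without corrupting hosts such as v1.example.com."""
    return base_url.rstrip("/").removesuffix("/v1")  # removesuffix needs Python 3.9+
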
@chatgpt-codex-connector

💡 Codex Review

class LlamaCppLLM(BaseLLM):
    """LlamaCpp LLM client with tool calling support."""

    def __init__(
        self,
        api: PRTAPI,
        model_path: Optional[str] = None,
        n_ctx: int = 4096,
        n_gpu_layers: int = 0,
        n_threads: Optional[int] = None,
        timeout: Optional[int] = None,
        config_manager: Optional[LLMConfigManager] = None,
    ):
        """Initialize LlamaCpp LLM client.

        Args:
            api: PRTAPI instance for database operations
            model_path: Path to .gguf model file
            n_ctx: Context window size (default: 4096)
            n_gpu_layers: Number of layers to offload to GPU (default: 0 = CPU only)
            n_threads: Number of CPU threads (default: None = auto-detect)
            timeout: Request timeout in seconds
            config_manager: LLMConfigManager instance. If None, loads config automatically.
        """
        # Load configuration
        if config_manager is None:
            config_manager = LLMConfigManager()

        # Call parent constructor (gets tools automatically from registry)
        super().__init__(api, config_manager)

        # Use config values, falling back to explicit parameters
        self.model_path = model_path or config_manager.llm.model_path
        self.n_ctx = n_ctx if n_ctx != 4096 else getattr(config_manager.llm, "n_ctx", 4096)
        self.n_gpu_layers = (
            n_gpu_layers if n_gpu_layers != 0 else getattr(config_manager.llm, "n_gpu_layers", 0)
        )
        self.n_threads = n_threads or getattr(config_manager.llm, "n_threads", None)
        self.timeout = timeout if timeout is not None else config_manager.llm.timeout
        self.temperature = config_manager.llm.temperature

        # Validate model path
        if not self.model_path:
            raise ValueError("model_path is required for LlamaCpp provider")

        model_file = Path(self.model_path)
        if not model_file.exists():
            raise FileNotFoundError(f"Model file not found: {self.model_path}")

        # Extract friendly model name for display (filename without extension)
        self.model = model_file.stem  # e.g., "Meta-Llama-3-8B-Instruct-Q4_K_M"

        # Initialize llama-cpp-python

P1: Implement _get_model_name in LlamaCppLLM

After inheriting from BaseLLM, LlamaCppLLM no longer provides an implementation of _get_model_name. Because BaseLLM marks this method as @abstractmethod, attempting to instantiate LlamaCppLLM now raises TypeError: Can't instantiate abstract class ... with abstract method _get_model_name, so the llama-cpp backend becomes unusable. Add a concrete _get_model_name implementation that returns the loaded model’s identifier.
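
A minimal sketch of the missing override, assuming BaseLLM._get_model_name takes no arguments and that the display name derived in __init__ is the right identifier to return:

    # Sketch only: to be added inside LlamaCppLLM
    def _get_model_name(self) -> str:
        """Return the loaded model's identifier (filename stem, e.g. Meta-Llama-3-8B-Instruct-Q4_K_M)."""
        return self.model
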


prt/prt_src/llm_tools.py

Lines 324 to 338 in ab7ad25

def get_write_tool_names() -> Set[str]:
    """Get set of tool names that perform write operations."""
    return {
        "add_tag_to_contact",
        "remove_tag_from_contact",
        "create_tag",
        "delete_tag",
        "add_note_to_contact",
        "remove_note_from_contact",
        "create_note",
        "update_note",
        "delete_note",
        "add_contact_relationship",
        "remove_contact_relationship",
        "execute_sql",  # Provider-specific implementation

P1: Register execute_sql tool in shared registry

LLMToolRegistry.get_write_tool_names() advertises an execute_sql operation, but neither _create_read_tools nor _create_write_tools define a corresponding Tool. With the new BaseLLM initialization relying solely on this registry, any model that tries to call execute_sql receives Tool 'execute_sql' not found and SQL queries can no longer be executed via tool calling. Add a Tool definition for execute_sql (or remove it from the set) to restore the previous capability.
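
A hedged sketch of one way to register the tool; the Tool constructor's actual fields in llm_tools.py are not shown above, so name/description/parameters/handler are assumptions:

# Sketch only: field names assume a Tool type with name/description/parameters/handler,
# and api.execute_sql stands in for whatever handler the registry actually wires up.
execute_sql_tool = Tool(
    name="execute_sql",
    description="Execute a SQL query against the PRT database. Requires confirm=true.",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "SQL statement to run"},
            "confirm": {"type": "boolean", "description": "Must be true to execute"},
        },
        "required": ["query", "confirm"],
    },
    handler=lambda query, confirm=False: api.execute_sql(query) if confirm else None,
)
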

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- Fix README.md CLI examples from outdated 'python -m prt_src.cli' to current 'python -m prt_src' structure
- Add comprehensive CLI testing and debugging examples including:
  * prt-debug-info for system health checks
  * CLI chat mode for testing AI functionality
  * Model testing and selection commands
- Update docs/DEV_SETUP.md with current CLI patterns and troubleshooting workflows
- Add dedicated CLI Testing & Debugging Commands section with practical examples
- Modernize interface descriptions to reflect TUI as default with CLI as power-user tool
- Include specific examples mentioned for development: chat testing, directory generation, system diagnostics

The CLI is now properly documented as an excellent testing and debugging tool for developers.
Changes the model selection logic in OllamaModelRegistry.get_default_model()
to prioritize gpt-oss:20b as the primary default, followed by any officially
supported models, then any available models.

New priority order:
1. gpt-oss:20b (preferred default for tool calling support)
2. Any officially supported model (mistral:7b-instruct, llama3:8b, etc.)
3. Any available model (fallback)

This ensures users get the recommended model for PRT's tool calling features
when no specific model is requested via command line.
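
A self-contained sketch of that priority order (function and parameter names here are illustrative; the real logic lives in OllamaModelRegistry.get_default_model()):

from typing import Iterable, Optional

PREFERRED_DEFAULT = "gpt-oss:20b"


def pick_default_model(available: Iterable[str], supported: Iterable[str]) -> Optional[str]:
    """Sketch of the priority: gpt-oss:20b, then any supported model, then any available."""
    names = set(available)
    if PREFERRED_DEFAULT in names:
        return PREFERRED_DEFAULT
    for name in supported:  # e.g. mistral:7b-instruct, llama3:8b
        if name in names:
            return name
    return next(iter(names), None)  # any installed model, else None
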
- Remove outdated SQLCipher comments from init.sh (Issue #41 migration complete)
- Add bug spec for llama3-8b-local communication error with Ollama
- Add test script to reproduce llama3 communication issues for debugging

These changes support the ongoing LLM model registry and Ollama integration work.
claude bot commented Nov 16, 2025

Code Review - PR #156

I've reviewed this pull request and have the following feedback:

Strengths

  1. Clear Documentation Improvements: The README and DEV_SETUP updates significantly improve discoverability of CLI commands and testing workflows. The new command tables are well-organized and helpful.

  2. Logical Model Priority: Changing the default model selection to prefer gpt-oss:20b and then fall back to supported models is a sensible approach that aligns with the existing supported models registry.

  3. Consistent Docstring Updates: The docstring for get_default_model() has been properly updated to reflect the new priority order.

⚠️ Concerns & Recommendations

1. Potential Test Failures

The existing test test_get_default_model_first_available in tests/unit/test_llm_model_registry.py may need updating. The test fixture creates models in this order:

  • llama3:8b
  • gpt-oss:20b
  • mistral:7b

With the new logic, the test should expect gpt-oss:20b to be returned (priority 1), not just "any cached model." The test assertion at line 517 is too loose:

# Current (may be incorrect now):
assert default in cached_models

# Should probably be:
assert default == "gpt-oss:20b"

Recommendation: Add a specific test case that verifies the priority order works correctly when gpt-oss:20b is available.

2. Dictionary Iteration Order

At line 419, the code iterates over supported_models dictionary:

for model_name in supported_models:

While Python 3.7+ guarantees dict insertion order, the actual priority might not be what you expect. The SUPPORTED_MODELS dict in llm_supported_models.py has gpt-oss:20b first, but if you want guaranteed priority, consider using a list or explicitly ordering by support_status.

Recommendation: Either document that the dictionary order matters, or be more explicit about prioritization logic (e.g., prefer "official" status models over "experimental").

3. Import Placement

The import of get_supported_models is placed inside the function at line 404:

def get_default_model(self) -> Optional[str]:
    from prt_src.llm_supported_models import get_supported_models

While this avoids circular imports, it's inconsistent with the module's import style (see lines 8-19 where all imports are at the top).

Recommendation: If this is intentional to avoid circular imports, add a comment explaining why. Otherwise, move it to the top-level imports.

4. Missing Test Coverage for New Behavior

I don't see tests that verify:

  • The new priority logic selects gpt-oss:20b when available
  • The fallback to supported models works correctly
  • The integration with get_supported_models() functions properly

Recommendation: Add test cases like:

def test_get_default_model_prefers_gpt_oss():
    """Test that gpt-oss:20b is preferred when available."""
    ...

def test_get_default_model_fallback_to_supported():
    """Test fallback to supported models when gpt-oss:20b unavailable."""
    ...

📝 Minor Documentation Issues

  1. README.md Line 91: "AI-Powered Chat (Great for Testing & Development)" - Consider adding a note that Ollama must be running with models installed for these commands to work.

  2. DEV_SETUP.md Line 70: The "Notes" column could clarify that prt-debug-info is the first command to run when troubleshooting (it's mentioned in Pro Tips but could be in the table too).

🔒 Security & Performance

  • ✅ No security concerns identified
  • ✅ No performance issues (caching is already in place)
  • ✅ No breaking changes to public APIs

Overall Assessment

This is a solid improvement to the codebase. The documentation updates are valuable, and the model selection logic is clearer. However, the existing tests may need updating to match the new behavior, and additional test coverage would strengthen confidence in the priority fallback logic.

Recommendation:

  1. Update/add tests for the new default model selection logic
  2. Consider clarifying the dictionary iteration order behavior
  3. Run the full test suite to verify no regressions

Would you like me to help write the additional test cases?

@richbodo merged commit 113a030 into main on Nov 16, 2025 (3 of 7 checks passed)
claude bot commented Nov 16, 2025

Code Review: Feature/mistral support and cli enhancements

Overall Assessment

This PR makes several valuable improvements to LLM model selection, documentation, and testing infrastructure. The changes are well-structured and align with the project's goals. However, there are some concerns around code organization, test coverage, and potential issues that should be addressed.

Summary:

  • Documentation improvements are excellent and significantly improve usability
  • Model selection logic is improved with better defaults
  • ⚠️ Test file placement violates project structure conventions
  • ⚠️ Spec file is extensive but describes unimplemented work
  • ⚠️ Import optimization in model registry could cause issues

Detailed Feedback

1. Code Quality & Best Practices

Good: Model Selection Logic (prt_src/llm_model_registry.py:393-426)

The new get_default_model() implementation is a clear improvement:

  • Well-documented priority system (gpt-oss:20b → supported models → any model)
  • Good use of sets for O(1) lookups (model_names = {model.name for model in models})
  • Appropriate logging at each decision point

Minor suggestion: The import at line 404 is inside the method:

from prt_src.llm_supported_models import get_supported_models

While this avoids circular imports, it's unconventional. Consider:

  1. Moving to module-level if no circular dependency exists
  2. If circular dependency is unavoidable, add a comment explaining why:
# Import here to avoid circular dependency between registry and supported_models
from prt_src.llm_supported_models import get_supported_models

⚠️ Concern: llama3:8b Still Experimental (prt_src/llm_supported_models.py:87-102)

The PR description mentions "mistral support" but doesn't actually change llama3:8b from "experimental" to "official". The spec file specs/bug_llama3_8b_local_communication_error.md describes a plan to fix llama3-8b-local, but no implementation is included.

Recommendation: Either:

  1. Implement the llama3 fixes described in the spec, OR
  2. Remove the spec file and test file from this PR and create a separate issue/PR for llama3 support

2. Test Coverage & Testing Concerns

Critical: Test File in Wrong Location (test_llama3_communication.py)

The file test_llama3_communication.py is placed at the repository root instead of the tests/ directory. This violates the project structure as documented in CLAUDE.md.

Per CLAUDE.md testing guidelines:

# Run all tests
./prt_env/bin/pytest tests/ -v

Required fix:

  • Move to tests/integration/test_llama3_communication.py
  • Add proper pytest markers (@pytest.mark.integration)
  • Follow existing test patterns in tests/integration/

⚠️ Concern: Test File Tests Broken Functionality

The test file appears to be a reproduction script for a bug, not a proper test:

  • It prints to stdout instead of using assertions
  • It returns True/False instead of asserting expected behavior
  • The llama3-8b-local model is not in the supported models list

Expected behavior:
Either this should be:

  1. A test that validates the fix (but the fix isn't in this PR), OR
  2. A diagnostic script in utils/ or tools/ folder

⚠️ Missing Test Coverage

No tests were added for the new get_default_model() logic. Given that this is critical functionality, tests should verify:

  • Returns gpt-oss:20b when available
  • Falls back to supported models
  • Falls back to any model
  • Returns None when no models exist

Example test structure:

def test_get_default_model_prefers_gpt_oss_20b(mock_ollama_models):
    registry = OllamaModelRegistry()
    # Mock ollama to return gpt-oss:20b and other models
    ...
    assert registry.get_default_model() == "gpt-oss:20b"

3. Documentation Quality

Excellent: README.md Updates (README.md:43-141)

The documentation improvements are outstanding:

  • Clear examples of TUI vs CLI usage
  • Practical debugging commands (prt-debug-info, list-models)
  • Good structure with purpose-driven organization
  • Helpful for both users and developers

Minor suggestion: The AI chat examples could specify when to use which model:

# For general queries (best tool calling)
python -m prt_src --model gpt-oss-20b --chat "find friends"

# For lower resource systems
python -m prt_src --model mistral-7b-instruct --chat "count contacts"

Excellent: DEV_SETUP.md Enhancements (docs/DEV_SETUP.md:43-226)

The developer workflow table is very helpful. Great addition for onboarding and daily development.


4. Spec File Concerns

⚠️ Large Spec for Unimplemented Work (specs/bug_llama3_8b_local_communication_error.md)

This 157-line spec describes:

  • A bug analysis
  • Root cause investigation
  • Step-by-step implementation tasks
  • Validation commands

However, none of this work is implemented in the PR.

Recommendation:

  1. Option A (Preferred): Remove the spec file from this PR. Create a separate issue referencing the spec, then implement in a future PR.
  2. Option B: Implement the spec in this PR (but this would make it much larger and change the scope significantly).

Per CLAUDE.md:

Do what has been asked; nothing more, nothing less.

The spec describes work that hasn't been done yet, creating confusion about what this PR actually delivers.


5. Security & Performance

No Security Concerns

The changes don't introduce security issues:

  • No new external dependencies
  • No unsafe data handling
  • Documentation-only changes are safe

Performance: Model Selection Optimization

The use of sets for model name lookups is good:

model_names = {model.name for model in models}  # O(1) lookups

However, get_supported_models() returns a Dict.copy() which is unnecessary if we only need the keys. Minor optimization opportunity:

supported_model_names = get_supported_models().keys()
for model_name in supported_model_names:
    if model_name in model_names:
        ...

6. Code Organization

Good: Cleanup in init.sh

Removing obsolete SQLCipher comments (lines removed at 35 and 49) is good housekeeping.


Recommended Actions

Must Fix (Blocking)

  1. Move test file: test_llama3_communication.py → tests/integration/test_llama3_communication.py
  2. Clarify scope: Either implement llama3 fixes OR remove spec/test files and create separate issue

Should Fix (Recommended)

  1. Add tests for get_default_model() priority logic
  2. Document import placement in llm_model_registry.py line 404 if it's intentional to avoid circular imports
  3. Update test file to use pytest patterns (assertions, markers) instead of print statements

Nice to Have

  1. Add model selection guidance to README examples
  2. Minor performance optimization in get_default_model()

Testing Checklist

Before merging, verify:

  • All existing tests pass: ./prt_env/bin/pytest tests/ -v
  • New test file runs successfully in tests/ directory
  • Documentation examples work as written
  • Default model selection works with various Ollama configurations
  • No regressions in existing model selection

Conclusion

This PR delivers valuable improvements to documentation and model selection. The main issues are:

  1. Scope confusion: The spec describes llama3 fixes that aren't implemented
  2. Test organization: Test file in wrong location and doesn't follow pytest conventions
  3. Missing coverage: No tests for the new model selection logic

Recommendation: Address the test file placement and either implement the llama3 work or remove those files. The documentation improvements are excellent and ready to merge once these issues are resolved.

for model_name in supported_models:
    if model_name in model_names:
        logger.debug(f"Found supported default model: {model_name}")
        return model_name

Bug: Default Model Selection Ignores Documented Priority

The get_default_model() method claims to prioritize officially supported models but actually iterates through all models in the registry regardless of support status. The loop at line 419 iterates over supported_models dictionary keys without filtering by support_status, so it can return experimental or deprecated models before checking other official ones. This contradicts the documented priority and could select a lower-quality experimental model when better official alternatives exist.
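
One possible fix, sketched as a drop-in replacement for the loop quoted above, under the assumption that each supported-model entry exposes a support_status field:

# Sketch only: filter to officially supported entries before falling back further.
official_names = [
    name for name, info in get_supported_models().items()
    if getattr(info, "support_status", None) == "official"
]
for model_name in official_names:
    if model_name in model_names:
        logger.debug(f"Found supported default model: {model_name}")
        return model_name
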


import traceback

# Add prt_src to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "prt_src"))

Bug: Module Import Path Conflict

The path manipulation adds the prt_src subdirectory to sys.path, but the imports use from prt_src.api and from prt_src.llm_factory, which would look for modules at prt_src/prt_src/api.py. The correct fix is either to add the parent directory instead (sys.path.insert(0, os.path.dirname(__file__))) or change imports to from api import PRTAPI and from llm_factory import create_llm.


