| 1 | +# Grafana Loki MCP - Architecture Patterns & Design Decisions |
| 2 | + |
| 3 | +## Core Architecture Pattern: Layered Service Architecture |
| 4 | + |
| 5 | +``` |
| 6 | +┌─────────────────────────┐ |
| 7 | +│ MCP Tools Layer         │ ← @mcp.tool() decorators, FastMCP framework |
| 8 | +├─────────────────────────┤ |
| 9 | +│ Business Logic Layer    │ ← GrafanaClient, time parsing, error handling |
| 10 | +├─────────────────────────┤ |
| 11 | +│ HTTP Client Layer       │ ← requests library, connection management |
| 12 | +├─────────────────────────┤ |
| 13 | +│ Grafana API             │ ← External Grafana/Loki services |
| 14 | +└─────────────────────────┘ |
| 15 | +``` |
| 16 | + |
| 17 | +## Key Design Patterns Identified |
| 18 | + |
| 19 | +### 1. Client-Server Abstraction Pattern |
| 20 | +- **GrafanaClient**: Encapsulates all Grafana API interactions |
| 21 | +- **Clean Interface**: Simple method signatures hiding HTTP complexity |
| 22 | +- **Error Translation**: HTTP errors → domain-specific exceptions |
| 23 | + |
| 24 | +### 2. Configuration Strategy Pattern |
| 25 | +- **Environment Variables**: Primary configuration source |
| 26 | +- **Command Line Args**: Override mechanism via argparse |
| 27 | +- **Sensible Defaults**: Fallback values for optional settings |
| 28 | + |
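The precedence described above (command-line flag over environment variable over default) can be sketched as follows. This is a minimal illustration, not the project's actual code: the variable names `GRAFANA_URL` and `GRAFANA_API_KEY`, the flag names, and the default URL are assumptions chosen for the example.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence

DEFAULT_URL = "http://localhost:3000"  # assumed default, not taken from the project


def resolve_config(
    argv: Sequence[str],
    env: Optional[Mapping[str, str]] = None,
) -> argparse.Namespace:
    """Resolve settings with precedence: CLI flag > environment variable > default."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="configuration resolution sketch")
    # Environment values become the argparse defaults, so an explicit flag always wins.
    parser.add_argument("--url", default=env.get("GRAFANA_URL", DEFAULT_URL))
    parser.add_argument("--api-key", default=env.get("GRAFANA_API_KEY"))
    return parser.parse_args(list(argv))
```

Feeding environment values in as argparse defaults keeps the precedence rule in one place instead of spreading `if` checks across the code base.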
| 29 | +### 3. Time Format Adapter Pattern |
| 30 | +- **Multiple Input Formats**: Grafana relative, ISO8601, Unix timestamps |
| 31 | +- **Unified Output**: All formats → Unix nanoseconds for Loki |
| 32 | +- **Graceful Degradation**: Invalid formats → current time |
| 33 | + |
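The adapter idea can be sketched as a single function that accepts the three input families and returns Unix nanoseconds. This is a simplified stand-in, not the project's `parse_grafana_time()`: the function name `to_unix_ns`, the supported relative units, second-level precision, and the treatment of naive ISO timestamps as UTC are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone
from typing import Optional

NS = 1_000_000_000
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_RELATIVE = re.compile(r"^now(?:-(\d+)([smhd]))?$")


def to_unix_ns(value: str, now: Optional[datetime] = None) -> int:
    """Convert a relative, ISO 8601, or Unix-seconds string to Unix nanoseconds."""
    now = now or datetime.now(timezone.utc)
    now_ns = int(now.timestamp()) * NS  # second precision is enough for this sketch
    text = value.strip()

    match = _RELATIVE.match(text)
    if match:  # "now" or "now-<n><unit>"
        if match.group(1) is None:
            return now_ns
        return now_ns - int(match.group(1)) * _UNIT_SECONDS[match.group(2)] * NS

    if re.fullmatch(r"\d+", text):  # plain Unix timestamp in seconds
        return int(text) * NS

    try:  # ISO 8601; a trailing "Z" is rewritten for older Python versions
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        return now_ns  # fallback named above: invalid input maps to the current time
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return int(parsed.timestamp()) * NS
```

Passing `now` explicitly makes the relative branch deterministic in tests. Falling back silently to the current time is convenient for callers, but it can hide typos in a query, so logging the fallback may be worth considering.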
| 34 | +### 4. Lazy Initialization Pattern |
| 35 | +- **Datasource Discovery**: UID cached after first lookup |
| 36 | +- **Description Generation**: Dynamic tool descriptions on server start |
| 37 | +- **Client Creation**: Instantiated only when needed |
| 38 | + |
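The UID caching mentioned above amounts to a memoized lookup. The sketch below assumes a hypothetical `DatasourceResolver` class and an injected `lookup` callable standing in for the real API call; neither name comes from the project.

```python
from typing import Callable, Optional


class DatasourceResolver:
    """Caches a datasource UID after the first successful lookup."""

    def __init__(self, lookup: Callable[[], str]) -> None:
        self._lookup = lookup  # hypothetical callable that would query the API
        self._uid: Optional[str] = None

    def uid(self) -> str:
        if self._uid is None:  # only the first call pays for the lookup
            self._uid = self._lookup()
        return self._uid
```

Because a failed lookup raises before `_uid` is assigned, the next call retries instead of caching the failure.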
| 39 | +## SOLID Principles Adherence |
| 40 | + |
| 41 | +### Single Responsibility Principle ✅ |
| 42 | +- **GrafanaClient**: Only Grafana API interactions |
| 43 | +- **parse_grafana_time()**: Only time format conversion |
| 44 | +- **MCP tools**: Each tool has single, focused purpose |
| 45 | + |
| 46 | +### Open/Closed Principle ✅ |
| 47 | +- **Extensible**: New MCP tools easily added via decorators |
| 48 | +- **Configurable**: Behavior modification through environment variables |
| 49 | +- **Plugin-ready**: MCP framework supports additional capabilities |
| 50 | + |
| 51 | +### Liskov Substitution Principle ✅ |
| 52 | +- **Interface Consistency**: All MCP tools follow same pattern |
| 53 | +- **Error Handling**: Consistent exception behavior across methods |
| 54 | + |
| 55 | +### Interface Segregation Principle ✅ |
| 56 | +- **Focused Interfaces**: Each MCP tool exposes only necessary parameters |
| 57 | +- **Optional Parameters**: Non-required functionality clearly separated |
| 58 | + |
| 59 | +### Dependency Inversion Principle ✅ |
| 60 | +- **Abstraction Layers**: Business logic depends on interfaces, not implementations |
| 61 | +- **HTTP Library**: Could swap requests for httpx without business logic changes |
| 62 | + |
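One way to make the "swap requests for httpx" claim concrete is a small protocol that the business logic depends on. The names `Transport`, `LabelService`, and `FakeTransport` are hypothetical, and the sketch does not claim the project is structured this way. The path shown is Loki's label endpoint, used here only as an example payload shape.

```python
from typing import Any, Dict, List, Protocol


class Transport(Protocol):
    """Minimal interface the business logic needs from an HTTP library."""

    def get_json(self, path: str, params: Dict[str, Any]) -> Dict[str, Any]: ...


class LabelService:
    """Business logic that depends only on the Transport protocol."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def label_names(self) -> List[str]:
        payload = self._transport.get_json("/loki/api/v1/labels", {})
        return sorted(payload.get("data", []))


class FakeTransport:
    """Stand-in transport; a requests- or httpx-backed class would fit the same slot."""

    def get_json(self, path: str, params: Dict[str, Any]) -> Dict[str, Any]:
        return {"status": "success", "data": ["job", "app"]}
```

With this shape, `LabelService(FakeTransport()).label_names()` runs without any HTTP library installed, which also makes the service easy to unit test.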
| 63 | +## Error Handling Strategy |
| 64 | + |
| 65 | +### 3-Layer Error Handling |
| 66 | +1. **HTTP Layer**: requests.exceptions.RequestException |
| 67 | +2. **Translation Layer**: Convert to ValueError with context |
| 68 | +3. **MCP Layer**: Framework handles final error serialization |
| 69 | + |
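The three layers can be sketched without a network dependency. In the example below, `TransportError` is a stand-in for `requests.exceptions.RequestException`, and `describe` and `call_api` are hypothetical helper names. Only the translation to `ValueError` with added context is taken from the description above.

```python
import json
from typing import Any, Callable, Dict, Optional


class TransportError(Exception):
    """Stand-in for requests.exceptions.RequestException in this sketch."""

    def __init__(self, message: str, body: Optional[str] = None) -> None:
        super().__init__(message)
        self.body = body  # raw response body, if the server sent one


def describe(error: TransportError) -> str:
    """Build a message that keeps whatever detail the response carried."""
    detail = str(error)
    if not error.body:
        return detail
    try:
        return f"{detail} - Details: {json.dumps(json.loads(error.body))}"
    except ValueError:  # body was not JSON, so fall back to the raw text
        return f"{detail} - Response: {error.body}"


def call_api(send: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Translation layer: transport failures surface as ValueError with context."""
    try:
        return send()
    except TransportError as exc:
        raise ValueError(f"Grafana request failed: {describe(exc)}") from exc
```

Using `raise ... from exc` keeps the original exception attached as `__cause__`, so the transport-level traceback is still available when debugging.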
| 70 | +### Error Context Preservation |
| 71 | +```python |
| 72 | +# Pattern: enrich the error message with response details (`e` is the exception caught in the HTTP layer) |
| 73 | +try: |
| 74 | +    error_json = e.response.json() |
| 75 | +    error_detail = f"{error_detail} - Details: {json.dumps(error_json)}" |
| 76 | +except Exception: |
| 77 | +    if e.response.text: |
| 78 | +        error_detail = f"{error_detail} - Response: {e.response.text}" |
| 79 | +``` |
| 80 | + |
| 81 | +## Testing Architecture |
| 82 | + |
| 83 | +### Test Strategy: Comprehensive Mocking |
| 84 | +- **Session-scoped fixtures**: Prevent external API calls during tests |
| 85 | +- **Mock Grafana responses**: Controlled test environment |
| 86 | +- **Edge case coverage**: Invalid inputs, network failures, malformed responses |
| 87 | + |
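The mocking approach can be illustrated with `unittest.mock` alone. The `Client` class below is a tiny stand-in rather than the project's `GrafanaClient`, and `_get` is a hypothetical method name for its HTTP boundary. In a pytest suite the same patch would typically live in a fixture.

```python
from typing import Any, Dict, List
from unittest import mock


class Client:
    """Tiny stand-in for a client whose _get method would perform HTTP calls."""

    def _get(self, path: str) -> List[Dict[str, Any]]:
        raise RuntimeError("network access is not allowed in tests")

    def datasource_names(self) -> List[str]:
        return [item["name"] for item in self._get("/api/datasources")]


def run_with_mocked_http() -> List[str]:
    """Patch the HTTP boundary so the logic under test never touches the network."""
    canned = [{"name": "Loki", "type": "loki"}, {"name": "Prom", "type": "prometheus"}]
    with mock.patch.object(Client, "_get", return_value=canned) as fake_get:
        names = Client().datasource_names()
        fake_get.assert_called_once_with("/api/datasources")
    return names
```

Patching at the narrowest boundary (the method that would issue the request) leaves the parsing logic fully exercised by the test.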
| 88 | +### Test Organization |
| 89 | +- **Unit Tests**: Individual function behavior |
| 90 | +- **Integration Tests**: End-to-end MCP tool functionality |
| 91 | +- **Time Parsing Tests**: Comprehensive format validation |
| 92 | + |
| 93 | +## Performance Considerations |
| 94 | + |
| 95 | +### Optimization Strategies |
| 96 | +1. **Connection Reuse**: requests reuses pooled connections when calls share a `requests.Session` |
| 97 | +2. **Lazy Loading**: Datasource UID cached to avoid repeated lookups |
| 98 | +3. **Result Limiting**: Configurable log line limits prevent memory issues |
| 99 | +4. **String Truncation**: max_per_line parameter controls memory usage |
| 100 | + |
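Result limiting and per-line truncation can be sketched in a few lines. The parameter name `max_per_line` comes from the list above. The function name `limit_lines`, the default values, the `...` suffix, and the rule that a non-positive `max_per_line` disables truncation are assumptions made for this example.

```python
from itertools import islice
from typing import Iterable, List


def limit_lines(lines: Iterable[str], limit: int = 100, max_per_line: int = 200) -> List[str]:
    """Keep at most `limit` lines and cap each one at `max_per_line` characters."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    result = []
    for line in islice(lines, limit):  # stops early, so large iterables are not drained
        if max_per_line > 0 and len(line) > max_per_line:
            line = line[:max_per_line] + "..."  # mark the cut so readers can see it
        result.append(line)
    return result
```

For instance, `limit_lines(["a" * 10, "b", "c"], limit=2, max_per_line=4)` keeps the first two entries and shortens the first one to four characters plus the marker.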
| 101 | +### Scalability Patterns |
| 102 | +- **Stateless Design**: No per-request state is shared between tool calls, apart from the cached datasource UID |
| 103 | +- **Resource Limits**: Built-in limits prevent resource exhaustion |
| 104 | +- **Graceful Degradation**: Fallback behaviors for edge cases |
| 105 | + |
| 106 | +## Future Architecture Considerations |
| 107 | + |
| 108 | +### Enhancement Opportunities |
| 109 | +1. **Caching Layer**: Redis/memory cache for datasource metadata |
| 110 | +2. **Rate Limiting**: Token bucket pattern for API calls |
| 111 | +3. **Observability**: Structured logging with correlation IDs |
| 112 | +4. **Connection Pooling**: Custom pool configuration for high throughput |
| 113 | + |
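The token bucket idea listed above could look roughly like this. It is a sketch of the general technique, not something the project ships. The class name, parameters, and injectable clock are choices made for the example, and the class is not thread-safe as written.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so the first burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the call may proceed, else False."""
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A caller would check `bucket.allow()` before each API request and either wait or reject when it returns `False`. Injecting the clock keeps tests deterministic.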
| 114 | +### Extensibility Points |
| 115 | +- **Additional MCP Tools**: New Loki/Grafana endpoints |
| 116 | +- **Format Support**: Additional time formats or output formats |
| 117 | +- **Authentication**: Support for other Grafana auth methods |
| 118 | +- **Multi-tenant**: Support for multiple Grafana instances |
| 119 | + |
| 120 | +## Code Quality Patterns |
| 121 | + |
| 122 | +### Type Safety Strategy |
| 123 | +- **Comprehensive Type Hints**: All functions properly annotated |
| 124 | +- **Generic Types**: Dict[str, Any] for JSON responses |
| 125 | +- **Optional Parameters**: Proper Optional[] usage |
| 126 | +- **Cast Operations**: Safe type assertions with cast() |
| 127 | + |
| 128 | +### Documentation Strategy |
| 129 | +- **Function Docstrings**: Args, returns, exceptions documented |
| 130 | +- **Inline Comments**: Minimal but strategic placement |
| 131 | +- **Example Usage**: examples/ directory with working code |
| 132 | + |
| 133 | +Taken together, these patterns describe a layered, type-annotated, mock-tested implementation that follows common Python and software engineering practices. |