⚡ A high-performance semantic code search tool for C# projects using the Model Context Protocol (MCP). Built with .NET 9 and optimized with AVX-512 SIMD acceleration.
🎯 Search code by meaning, not text - Find implementations using natural language descriptions instead of exact string matching.
Important: This tool should run on your Windows host machine, not in WSL. Claude Code instances running in WSL can connect to the Windows-hosted server.
Why? File monitoring (FileSystemWatcher) doesn't work reliably in WSL when watching Windows filesystems. Running on Windows ensures real-time index updates work correctly.
WSL Users: If you must run in WSL, disable file monitoring (`"enableFileWatching": false`) and enable periodic rescan (`"enablePeriodicRescan": true`) to check for changes every 30 minutes.
Super simple setup - just select your embedding provider and run! The server runs as a background service on Windows, accessible from any Claude Code instance.
| Feature | Semantic Search | Traditional grep |
|---|---|---|
| Natural language queries | ✅ Yes | ❌ No |
| Typo tolerance | ✅ Yes | ❌ No |
| Relevance scoring | ✅ Yes | ❌ No |
| Conceptual matching | ✅ Yes | ❌ No |
| 100% completeness | ❌ No | ✅ Yes |
| Setup required | ✅ Indexing | ❌ None |
| Speed | ~50-200ms | ~20-50ms |
Tested embedding models:
- vLLM Qwen3-8B (recommended) - Best overall performance, open weights
- VoyageAI voyage-code-3 - Excellent cloud option when local models aren't viable
- Snowflake Arctic Embed2 - Good baseline option via Ollama
Smaller models often compromise too much on quality. For production use, we recommend 8B+ parameter models.
Enables natural language search over C# codebases. Instead of exact string matching, find code by describing what it does:
- "email notifications" → finds SendGrid implementation
- "rate limiting" → finds anti-flood protection code
- "user authentication" → finds login controllers and auth services
- Semantic search with relevance scoring (0.0-1.0)
- Multi-project support - index multiple C# projects separately
- Smart incremental updates - only re-indexes changed files (see the sketch after this list)
- Extended context viewing - see surrounding code lines
- Code structure filtering - search by class, method, interface, etc.
- Fast performance - ~50ms average search time on 25k vectors
- Hardware acceleration - Uses .NET 9 TensorPrimitives with AVX-512 support
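As a sketch of how the incremental-update check might work (hypothetical names; the server's actual implementation may differ), a content hash per file is enough to detect which files need re-indexing:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical sketch of hash-based change detection, not the server's code.
// Only files whose content hash changed since the last indexing pass are returned.
static IEnumerable<string> FilesNeedingReindex(
    string projectRoot, IDictionary<string, string> storedHashes)
{
    foreach (var path in Directory.EnumerateFiles(projectRoot, "*.cs", SearchOption.AllDirectories))
    {
        using var stream = File.OpenRead(path);
        var hash = Convert.ToHexString(SHA256.HashData(stream));

        // New file, or content differs from what was indexed last time.
        if (!storedHashes.TryGetValue(path, out var previous) || previous != hash)
            yield return path;
    }
}
```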
- .NET 9.0 SDK
- Embedding API (OpenAI, Ollama, or compatible)
- Claude Code for MCP SSE (http) integration
- Clone and build:
```bash
git clone https://github.com/jvadura/SimilarityAVX.MCP.NET
cd SimilarityAVX.MCP.NET/SimilarityAVX
dotnet build -c Release
```
- Configure the embedding API in `config.json`:
```json
{
  "embedding": {
    "provider": "VoyageAI", // or "OpenAI" for OpenAI-compatible APIs
    "apiUrl": "https://api.voyageai.com/v1/",
    "apiKey": "", // Set via EMBEDDING_API_KEY env var
    "model": "voyage-code-3",
    "dimension": 2048,
    "precision": "Float32",
    "batchSize": 50,
    "maxRetries": 6,
    "retryDelayMs": 1000
  },
  "security": {
    "allowedDirectories": ["E:\\"], // Whitelist directories for indexing
    "enablePathValidation": true // Enforce directory restrictions
  },
  "monitoring": {
    "enableAutoReindex": true, // Auto-sync with code changes
    "verifyOnStartup": true, // Check for changes on startup
    "debounceDelaySeconds": 60, // Wait after last file change
    "enableFileWatching": true, // Real-time monitoring (Windows only)
    "enablePeriodicRescan": false, // Enable for WSL users
    "periodicRescanMinutes": 30 // How often to check for changes
  }
}
```
Supported embedding providers:
- VoyageAI - Optimized for code search (voyage-code-3)
- Ollama - For local models (use provider: "OpenAI")
- OpenAI - text-embedding-3-small/large
- Any OpenAI-compatible API - vLLM, TEI, etc.
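Because all of these expose the OpenAI embeddings protocol, a client call looks the same regardless of provider. A minimal C# sketch against a local Ollama endpoint (URL and model are placeholders, not defaults of this project):

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;

// Sketch of a call against any OpenAI-compatible /embeddings endpoint.
// Substitute your provider's base URL and model name.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434/v1/") };
http.DefaultRequestHeaders.Add("Authorization",
    $"Bearer {Environment.GetEnvironmentVariable("EMBEDDING_API_KEY")}");

var response = await http.PostAsJsonAsync("embeddings", new
{
    model = "snowflake-arctic-embed2",
    input = new[] { "rate limiting middleware" }
});
response.EnsureSuccessStatusCode();

// Response shape: { "data": [ { "embedding": [ ... ] } ] }
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
var embedding = doc.RootElement.GetProperty("data")[0].GetProperty("embedding");
Console.WriteLine($"Dimension: {embedding.GetArrayLength()}");
```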
- Set your API key and start the server:
```bash
export EMBEDDING_API_KEY="your-api-key"
dotnet run
```
The server will start on http://0.0.0.0:5001.
- Add to Claude Code:
```bash
claude mcp add cstools --transport sse http://localhost:5001/sse
```
For remote access (if running on a different machine):
```bash
claude mcp add cstools --transport sse http://YOUR_IP:5001/sse
```
Search tools:
- `mcp__cstools__code_search` - Basic semantic search with relevance scoring ⭐
- `mcp__cstools__code_search_context` - Extended context viewing (15-20 lines recommended) ⭐
- `mcp__cstools__code_search_filtered` - Filter by file types and code structures ⭐
- `mcp__cstools__code_batch_search` - Multiple queries at once (3-5 optimal) ⭐
- `mcp__cstools__code_get_filter_help` - Comprehensive help for search filters
Index management tools:
- `mcp__cstools__code_index` - Index or update projects (use `force: true` for reindexing)
- `mcp__cstools__code_list_projects` - Show all indexed projects
- `mcp__cstools__code_get_stats` - Memory and performance info
- `mcp__cstools__code_clear_index` - Remove project index
- `mcp__cstools__code_get_directory` - Get project root directory
Memory tools:
- `mcp__cstools__memory_add` - Store persistent memories with tags and metadata
- `mcp__cstools__memory_get` - Retrieve full memory content with parent/child context
- `mcp__cstools__memory_search` - Semantic search with relevance scores
- `mcp__cstools__memory_list` - List all memories with tag filtering
- `mcp__cstools__memory_delete` - Remove memories by ID or alias
- `mcp__cstools__memory_update` - Update existing memory content, name, or tags
- `mcp__cstools__memory_append` - Append child memories with automatic tag inheritance
- `mcp__cstools__memory_get_stats` - Memory system statistics
- `mcp__cstools__memory_get_tree` - ASCII tree visualization
- `mcp__cstools__memory_export_tree` - Export memory hierarchies as markdown/JSON
- `mcp__cstools__memory_import_markdown` - Import markdown files as memory hierarchies
Tested on a 761-file enterprise Blazor application:
- Index size: 73.6 MB for 5,575 code chunks
- Search time: ~176ms for the first search (includes embedding generation), ~60ms for subsequent searches
- Memory usage: Efficient (1MB per ~220 chunks)
- Hardware acceleration: AVX-512 SIMD support for maximum speed
Synthetic benchmark comparing custom AVX-512 vs .NET 9 TensorPrimitives (25k vectors, 4096 dimensions):
| Implementation | Time per Search | Throughput | GFLOPS | Notes |
|---|---|---|---|---|
| AVX-512 (Custom) | 13.7ms | 1.82M/sec | 29.9 | Lower variance, predictable |
| TensorPrimitives | 10.6ms | 2.35M/sec | 38.5 | .NET 9 with AVX-512 support |
Key findings:
- Both implementations achieve perfect numerical accuracy (identical cosine similarity scores)
- TensorPrimitives is ~22% faster in synthetic benchmarks with .NET 9's AVX-512 optimizations
- Custom implementation offers more predictable latency (lower variance)
- Update: Now using TensorPrimitives by default for best performance
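In .NET 9 the hot loop reduces to one call per vector into `TensorPrimitives.CosineSimilarity`, which dispatches to AVX-512 where available. A minimal brute-force search sketch (illustrative, not the server's actual code):

```csharp
using System.Numerics.Tensors; // System.Numerics.Tensors NuGet package

// Sketch of a brute-force similarity search over an in-memory vector store.
// TensorPrimitives.CosineSimilarity picks AVX-512/AVX2 paths automatically.
static int FindBestMatch(float[][] vectors, float[] query)
{
    var bestIndex = -1;
    var bestScore = float.MinValue;
    for (int i = 0; i < vectors.Length; i++)
    {
        float score = TensorPrimitives.CosineSimilarity(vectors[i], query);
        if (score > bestScore) { bestScore = score; bestIndex = i; }
    }
    return bestIndex;
}
```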
To run benchmarks:
```bash
dotnet run -c Release -- bench [dimension] [vectors] [iterations] [searches]
# Example: dotnet run -c Release -- bench 4096 25000 50 5
```
Excellent for:
- Finding code by concept rather than exact text
- Understanding unfamiliar codebases
- UI component searches (Qwen3: 0.70-0.83, Voyage: 0.61-0.62)
- Business logic discovery
- Czech domain terminology (superior performance)
Use traditional tools for:
- Finding ALL instances (100% completeness)
- Cross-cutting concerns
- Known exact patterns
- Security audits requiring exhaustive search
The server includes built-in security features:
- Path validation - Projects are restricted to whitelisted directories (default: `E:\`)
- Directory traversal protection - Project names are sanitized to prevent path escaping
- Configurable whitelist - Add allowed paths via `security.allowedDirectories` in config.json
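As an illustration (not the server's actual code), a whitelist check of this kind typically canonicalizes the path first so `..` segments cannot escape an allowed root:

```csharp
using System;
using System.IO;
using System.Linq;

// Sketch of a directory whitelist check. Path.GetFullPath resolves "..\"
// segments, so traversal attempts are compared in canonical form.
static bool IsPathAllowed(string requestedPath, string[] allowedDirectories)
{
    var fullPath = Path.GetFullPath(requestedPath);
    return allowedDirectories.Any(dir =>
        fullPath.StartsWith(Path.GetFullPath(dir), StringComparison.OrdinalIgnoreCase));
}

// IsPathAllowed(@"E:\projects\myapp", new[] { @"E:\" }) -> true
// IsPathAllowed(@"C:\Windows",        new[] { @"E:\" }) -> false
```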
The server automatically keeps your search index synchronized:
- Startup verification - Checks all projects for changes when the server starts
- Real-time monitoring - Watches for code changes and reindexes automatically (Windows only)
- Periodic rescan - Optional scheduled rescan for WSL users (every 30 minutes by default)
- Smart debouncing - Waits 60 seconds after last change to avoid excessive reindexing
- Configurable - Control monitoring behavior via the `monitoring` section in config.json
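The watch-then-debounce pattern described above can be sketched in a few lines; the snippet below is illustrative (hypothetical project path), not the server's implementation:

```csharp
using System;
using System.IO;
using System.Threading;

// Each file change resets the timer, so reindexing fires once,
// 60 seconds after the last event, rather than once per change.
var debounce = new Timer(_ => Console.WriteLine("Reindexing project..."),
                         null, Timeout.Infinite, Timeout.Infinite);

using var watcher = new FileSystemWatcher(@"E:\myproject", "*.cs")
{
    IncludeSubdirectories = true,
    EnableRaisingEvents = true
};
// A fuller version would also hook Created, Deleted, and Renamed.
watcher.Changed += (_, e) =>
    debounce.Change(TimeSpan.FromSeconds(60), Timeout.InfiniteTimeSpan);

Console.ReadLine(); // keep the process alive while watching
```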
Qwen3-8B (vLLM):
- 0.45+ - Comprehensive results (recommended)
- 0.60+ - More focused results
- 0.70+ - High confidence only
Voyage AI (voyage-code-3):
- 0.40+ - Comprehensive results (recommended)
- 0.50+ - More focused results
- 0.60+ - High confidence only
Note: Voyage AI scores run ~0.10-0.20 points lower than Qwen3 with similar relevance.
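If you post-filter results on the client side, the cutoff should therefore be chosen per model. A tiny sketch, with a hypothetical result shape (the actual MCP response format may differ):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical result record, for illustration only.
record SearchResult(string File, int Line, float Score);

// Apply the model-appropriate "comprehensive results" threshold from above.
static IEnumerable<SearchResult> FilterByModel(
    IEnumerable<SearchResult> results, string model) =>
    results.Where(r => r.Score >= model switch
    {
        "voyage-code-3" => 0.40f, // Voyage scores run ~0.10-0.20 lower
        _ => 0.45f                // Qwen3-8B and similar
    });
```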
See config/examples/ for provider-specific configurations:
- `config-voyageai.json` - VoyageAI setup
- `config-vllm.json` - vLLM with Qwen3-8B
- `config_snowflake.json` - Ollama with Snowflake Arctic
- `config-ollama.json` - Generic Ollama setup
The server includes a comprehensive memory management system for storing and retrieving contextual knowledge during development sessions. This is completely separate from code search and uses its own optimized embedding model.
- Per-project isolation - Each project has its own memory database
- Hierarchical organization - Parent-child relationships with tag inheritance
- Semantic search - Natural language queries with relevance scoring
- Human-friendly aliases - Reference memories as `@api-design` instead of numeric IDs
- Import/Export - Convert documentation to/from searchable memory hierarchies
- Hardware acceleration - Uses TensorPrimitives for parallel SIMD operations
The memory system uses a separate embedding model optimized for general text:
```json
{
  "memory": {
    "embedding": {
      "model": "voyage-3-large", // Default for memories
      "dimension": 2048, // Auto-detected if not specified
      "provider": "VoyageAI" // Inherits from main config
    }
  }
}
```
```bash
# Store a memory
mcp__cstools__memory_add --project myproject --memoryName "API Design Decisions" --content "We chose REST over GraphQL because..." --tags "architecture,api,decisions"
# Search memories
mcp__cstools__memory_search --project myproject --query "authentication patterns" --topK 5
# View memory hierarchy
mcp__cstools__memory_get_tree --project myproject --includeContent true
# Import documentation
mcp__cstools__memory_import_markdown --project myproject --filePath "/path/to/docs.md" --tags "documentation,imported"
```
The server stores SQLite databases in platform-specific locations:
Code search indexes:
- Windows: `%LOCALAPPDATA%\csharp-mcp-server\`
- Linux/WSL: `~/.local/share/csharp-mcp-server/`
- Project-specific files: `codesearch-{project}.db`

Memory databases:
- Windows: `%LOCALAPPDATA%\csharp-mcp-server\memories\`
- Linux/WSL: `~/.local/share/csharp-mcp-server/memories/`
- Per-project files: `memory-{project}.db`

Embedding cache:
- Shared cache: `embedding_cache.db` (in the main directory)
- Purpose: speeds up repeated queries by caching embeddings
All databases are created automatically when first accessed. Each project maintains complete isolation with its own database files.
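These locations follow .NET's standard per-user data folder: `Environment.SpecialFolder.LocalApplicationData` resolves to `%LOCALAPPDATA%` on Windows and `~/.local/share` on Linux. A sketch of how the paths line up (project name is hypothetical):

```csharp
using System;
using System.IO;

// Resolves to %LOCALAPPDATA%\csharp-mcp-server on Windows
// and ~/.local/share/csharp-mcp-server on Linux/WSL.
var baseDir = Path.Combine(
    Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
    "csharp-mcp-server");

var project = "myproject"; // hypothetical project name
Console.WriteLine(Path.Combine(baseDir, $"codesearch-{project}.db"));
Console.WriteLine(Path.Combine(baseDir, "memories", $"memory-{project}.db"));
```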
- Requires indexing before searching
- Results are probabilistic, not exhaustive
- Score thresholds vary by embedding model
- Best combined with traditional search tools
Contributions welcome! Please open an issue or submit a pull request.
MIT License - see LICENSE file for details.
- Model Context Protocol by Anthropic
- VoyageAI for excellent code embeddings
- Roslyn for C# syntax analysis
- Community testers for invaluable feedback
Built with ❤️ using Claude AI