# Ultimate Memory MCP Server - Ollama Edition Structure

```
mcp-ultimate-memory/
├── memory_mcp_server.py      # 🦙 Main Ollama-powered server (841 lines)
├── requirements.txt          # 📦 Minimal dependencies (no OpenAI)
├── .env.example              # ⚙️ Ollama-focused configuration
├── schema.cypher             # 🕸️ Kuzu graph database schema
├── setup.sh                  # 🚀 Ollama-specific setup script
├── test_server.py            # 🧪 Ollama-focused test suite
├── examples.py               # 📚 Ollama usage examples & patterns
├── mcp_config_example.json   # 🔧 MCP client configuration
├── README.md                 # 📖 Ollama-focused documentation
├── OLLAMA_SETUP.md           # 🦙 Detailed Ollama setup guide
└── PROJECT_STRUCTURE.md      # 📋 This file
```

## File Descriptions

### Core Server Files

- **`memory_mcp_server.py`** - FastMCP server with OllamaProvider integration
- **`schema.cypher`** - Kuzu graph database schema (unchanged)
- **`requirements.txt`** - Minimal dependencies (fastmcp, kuzu, numpy, requests)

### Configuration & Setup

- **`.env.example`** - Ollama-focused environment variables
- **`setup.sh`** - Interactive Ollama setup with model downloading
- **`mcp_config_example.json`** - MCP client configuration for Ollama

### Testing & Examples

- **`test_server.py`** - Comprehensive Ollama testing suite
- **`examples.py`** - Ollama-specific usage patterns and tips

### Documentation

- **`README.md`** - Complete Ollama-focused documentation
- **`OLLAMA_SETUP.md`** - Detailed Ollama installation and configuration guide

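Under the hood, the OllamaProvider generates embeddings by calling Ollama's HTTP embedding endpoint with the `requests` dependency. A minimal sketch of that call (function and constant names here are illustrative, not the server's actual interface):

```python
import requests

OLLAMA_BASE_URL = "http://localhost:11434"
EMBEDDING_MODEL = "nomic-embed-text"

def build_embedding_request(text: str) -> tuple[str, dict]:
    """Return the URL and JSON payload for Ollama's /api/embeddings endpoint."""
    return (
        f"{OLLAMA_BASE_URL}/api/embeddings",
        {"model": EMBEDDING_MODEL, "prompt": text},
    )

def get_embedding(text: str) -> list[float]:
    """POST the text to the local Ollama server and return the embedding vector."""
    url, payload = build_embedding_request(text)
    resp = requests.post(url, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()["embedding"]
```

Because everything goes to `localhost:11434`, no text ever leaves the machine.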
## Key Changes from Multi-Provider Version

### Removed Components

- ❌ OpenAI provider class and dependencies
- ❌ Sentence Transformers provider
- ❌ Provider factory pattern
- ❌ Multi-provider configuration options
- ❌ OpenAI-specific documentation

### Simplified Architecture

- ✅ Single `OllamaProvider` class
- ✅ Direct integration with the memory server
- ✅ Simplified configuration (only Ollama settings)
- ✅ Streamlined error handling
- ✅ Focused testing and setup

### Enhanced Ollama Features

- ✅ Connection health checking
- ✅ Model availability verification
- ✅ Server status monitoring tool
- ✅ Ollama-specific troubleshooting
- ✅ Performance optimization tips

## Quick Commands

```bash
# Complete setup (interactive)
./setup.sh

# Test the Ollama connection only
python test_server.py --connection-only

# Test the full system
python test_server.py

# View examples and patterns
python examples.py

# Start the server
python memory_mcp_server.py
```

## Configuration Files

### `.env` Configuration

```env
KUZU_DB_PATH=./memory_graph_db
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
```

### MCP Client Configuration

```json
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["/path/to/memory_mcp_server.py"],
      "env": {
        "KUZU_DB_PATH": "/path/to/memory_graph_db",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "OLLAMA_EMBEDDING_MODEL": "nomic-embed-text"
      }
    }
  }
}
```

## Dependencies

### Required Python Packages

```
fastmcp>=2.8.1        # MCP framework
kuzu>=0.4.0           # Graph database
numpy>=1.26.0         # Vector operations
python-dotenv>=1.0.0  # Environment loading
requests>=2.28.0      # HTTP requests to Ollama
```

### System Requirements

- **Python 3.11+** (for modern type hints)
- **Ollama** (latest version from ollama.ai)
- **nomic-embed-text model** (or an alternative embedding model)

### Optional Components

- **llama3.2:1b model** (for AI summaries)
- **systemd** (for service deployment)

## Database Structure

The Kuzu graph database creates:

- **Memory nodes** with embeddings from Ollama
- **Relationship edges** with metadata and strengths
- **Conversation nodes** for context grouping
- **Topic and Cluster nodes** for organization

See `schema.cypher` for the complete schema definition.

## Performance Characteristics

### Ollama-Specific Performance

- **First request**: ~2-3 seconds (model loading)
- **Subsequent requests**: ~500-800 ms per embedding
- **Memory usage**: ~1.5 GB RAM for nomic-embed-text
- **Storage**: ~2 GB for models and database

### Optimization Features

- ✅ Connection pooling and reuse
- ✅ Model persistence across requests
- ✅ Batch operation support
- ✅ Efficient vector similarity calculations

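The vector similarity step above is typically plain cosine similarity over the stored embeddings. A minimal sketch using the `numpy` dependency (not the server's exact code):

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Relationship detection can then be as simple as linking memories whose similarity exceeds a threshold.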
## Security & Privacy

### Complete Local Processing

- ✅ No external API calls
- ✅ No data transmission
- ✅ Full user control
- ✅ Audit trail available

### Recommended Practices

- 🔒 Firewall the Ollama port (11434)
- 🔄 Regular database backups
- 📊 Resource monitoring
- 🔐 Access control for the server

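For the backup recommendation, archiving the Kuzu database directory is usually sufficient. A sketch assuming the default `KUZU_DB_PATH` from `.env.example` (stop the server first so the snapshot is not taken mid-write):

```shell
# Archive the Kuzu database directory into a dated tarball.
DB_PATH=./memory_graph_db
BACKUP="memory_graph_backup_$(date +%Y%m%d).tar.gz"
mkdir -p "$DB_PATH"            # no-op when the database already exists
tar -czf "$BACKUP" "$DB_PATH"
```

Restoring is the reverse: stop the server and extract the tarball over `KUZU_DB_PATH`.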
## Monitoring & Health

### Built-in Health Checks

- `check_ollama_status` - Server and model status
- `analyze_memory_patterns` - Graph health metrics
- Connection verification at startup
- Model availability checking

### Debug Commands

```bash
# Check Ollama directly
curl http://localhost:11434/api/tags

# Test embedding generation
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "test"}'

# Verify Python integration
python test_server.py --help-setup
```

---

**🦙 Simplified, Focused, Self-Hosted**

This Ollama edition provides a streamlined, privacy-first memory system without the complexity of multiple providers. Perfect for environments where data control and simplicity are priorities.