# 🚀 LLM Fusion MCP Server

> A comprehensive Model Context Protocol (MCP) server providing unified access to multiple major LLM providers through a single interface.

[![MCP](https://img.shields.io/badge/MCP-Compatible-blue)](https://modelcontextprotocol.io)
[![FastMCP](https://img.shields.io/badge/FastMCP-2.12.2-blue)](https://gofastmcp.com)
[![Python](https://img.shields.io/badge/Python-3.10+-green)](https://python.org)
[![License](https://img.shields.io/badge/License-MIT-brightgreen)](https://opensource.org/licenses/MIT)

This server enables AI assistants to interact with multiple LLM providers simultaneously through the standardized Model Context Protocol interface. Built for the MCP ecosystem, it provides seamless access to Gemini, OpenAI, Anthropic, and Grok models with advanced features like streaming, multimodal processing, and intelligent document handling.

---

## ⚡ **Why This Server Rocks**

🎯 **Universal LLM Access** - One API to rule them all
🌊 **Always Streaming** - Real-time responses with beautiful progress
🧠 **Intelligent Document Processing** - Handle files of any size with smart chunking
🎨 **Multimodal AI** - Text, image, and audio understanding
🔧 **OpenAI-Specific Tools** - Assistants API, DALL-E, Whisper integration
⚡ **Lightning Fast** - Built with modern Python tooling (uv, ruff, FastMCP)
🔒 **Production Grade** - Comprehensive error handling and health monitoring

---

## 🔧 **Quick Start for MCP Clients**

### **Claude Desktop Integration**

```bash
# 1. Clone the repository
git clone https://github.com/MCP/llm-fusion-mcp.git
cd llm-fusion-mcp

# 2. Configure API keys
cp .env.example .env
# Edit .env with your API keys

# 3. Add to Claude Desktop
claude mcp add -s local -- llm-fusion-mcp /path/to/llm-fusion-mcp/run_server.sh
```

### **Manual Launch**

```bash
# Install dependencies and start server
./run_server.sh
```

The launcher script will:
- ✅ Validate dependencies and install if needed
- ✅ Check API key configuration
- ✅ Start the server with proper error handling
- ✅ Provide colored logs for easy debugging

---

## 🤖 **Supported AI Providers**

| Provider | Models | Context Window | Status | Special Features |
|----------|--------|----------------|--------|------------------|
| **🟢 Gemini** | 64+ models | **1M tokens** | ✅ Production Ready | Video, thinking modes, native audio |
| **🔵 OpenAI** | 90+ models | **1M tokens** | ✅ Production Ready | GPT-5, O3, Assistants API, DALL-E |
| **🟣 Anthropic** | Claude 3.5/4 | **200K tokens** | ✅ Production Ready | Advanced reasoning, code analysis |
| **⚫ Grok** | Latest models | **100K tokens** | ✅ Production Ready | Real-time data, conversational AI |

---

## 🎯 **Key Features**

### 🚀 **Core Capabilities**

- **🌐 Universal LLM API** - Switch between providers seamlessly
- **📡 Real-time Streaming** - Token-by-token generation across all providers
- **📚 Large File Analysis** - Intelligent document processing up to millions of tokens
- **🖼️ Multimodal AI** - Image analysis and audio transcription
- **🔧 OpenAI Integration** - Full Assistants API, DALL-E, Whisper support
- **🎛️ Session Management** - Dynamic API key switching without server restart

### ⚡ **Advanced Features**

- **🧠 Smart Chunking** - Semantic, hierarchical, fixed, and auto strategies
- **🔍 Provider Auto-Selection** - Optimal model choice based on task and context (see the sketch after this list)
- **📊 Vector Embeddings** - Semantic similarity and text analysis
- **🛠️ Function Calling** - OpenAI-compatible tool integration
- **💾 Caching Support** - Advanced caching for performance
- **🏥 Health Monitoring** - Real-time provider status and diagnostics
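To make the auto-selection idea concrete, here is a minimal sketch of picking a provider by context-window fit. The window sizes mirror the provider table above; the function and its fallback policy are illustrative assumptions, not the server's actual logic:

```python
# Context-window sizes (tokens) from the provider table above.
CONTEXT_WINDOWS = {
    "gemini": 1_000_000,
    "openai": 1_000_000,
    "anthropic": 200_000,
    "grok": 100_000,
}

def pick_provider(estimated_tokens: int, preferred: str = "gemini") -> str:
    """Pick a provider whose context window fits the request, preferring `preferred`."""
    if CONTEXT_WINDOWS.get(preferred, 0) >= estimated_tokens:
        return preferred
    # Fall back to the smallest window that still fits,
    # keeping the largest-context models in reserve.
    candidates = [(window, name) for name, window in CONTEXT_WINDOWS.items()
                  if window >= estimated_tokens]
    if not candidates:
        raise ValueError("No provider fits this input; chunk it first")
    return min(candidates)[1]
```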
---

## 🚦 **Quick Start**

### 1️⃣ **Installation**

```bash
# Clone and setup
git clone https://github.com/MCP/llm-fusion-mcp.git
cd llm-fusion-mcp
uv sync
```

### 2️⃣ **Configure API Keys**

```bash
# Copy template and add your keys
cp .env.example .env

# Edit .env with your API keys
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here        # Optional
ANTHROPIC_API_KEY=your_anthropic_api_key_here  # Optional
XAI_API_KEY=your_xai_api_key_here              # Optional
```

### 3️⃣ **Launch Server**

```bash
# Method 1: Direct execution
uv run python src/llm_fusion_mcp/server.py

# Method 2: Using run script (recommended)
./run_server.sh
```

### 4️⃣ **Connect with Claude Code**

```bash
# Add to Claude Code MCP
claude mcp add -s local -- llm-fusion-mcp /path/to/llm-fusion-mcp/run_server.sh
```

---

## 🛠️ **Available Tools**

### 🎯 **Universal LLM Tools**

#### 🔑 **Provider & Key Management**

```python
llm_set_provider("gemini")        # Switch default provider
llm_get_provider()                # Get current provider info
llm_list_providers()              # See all providers + models
llm_health_check()                # Provider health status
llm_set_api_key("openai", "key")  # Set session API key
llm_list_api_keys()               # Check key configuration
llm_remove_api_key("openai")      # Remove session key
```

#### 💬 **Text Generation**

```python
llm_generate(                      # 🌟 UNIVERSAL GENERATION
    prompt="Write a haiku about AI",
    provider="gemini",             # Override provider
    model="gemini-2.5-flash",      # Specific model
    stream=True                    # Real-time streaming
)

llm_analyze_large_file(            # 📚 SMART DOCUMENT ANALYSIS
    file_path="/path/to/document.pdf",
    prompt="Summarize key findings",
    chunk_strategy="auto",         # Auto-select best strategy
    max_chunks=10                  # Control processing scope
)
```

#### 🎨 **Multimodal AI**

```python
llm_analyze_image(                 # 🖼️ IMAGE UNDERSTANDING
    image_path="/path/to/image.jpg",
    prompt="What's in this image?",
    provider="gemini"              # Best for multimodal
)

llm_analyze_audio(                 # 🎵 AUDIO PROCESSING
    audio_path="/path/to/audio.mp3",
    prompt="Transcribe this audio",
    provider="gemini"              # Native audio support
)
```

#### 📊 **Embeddings & Similarity**

```python
llm_embed_text(                    # 🧮 VECTOR EMBEDDINGS
    text="Your text here",
    provider="openai",             # Multiple providers
    model="text-embedding-3-large"
)

llm_similarity(                    # 🔍 SEMANTIC SIMILARITY
    text1="AI is amazing",
    text2="Artificial intelligence rocks"
)
```
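Semantic similarity of this kind typically boils down to comparing embedding vectors. Here is a minimal sketch of cosine similarity over two embeddings; the helper and the commented usage are illustrative, not the server's internals:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # degenerate input: no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical usage: embed both texts, then compare the vectors.
# vec1 = llm_embed_text(text="AI is amazing", provider="openai")
# vec2 = llm_embed_text(text="Artificial intelligence rocks", provider="openai")
# print(cosine_similarity(vec1, vec2))
```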
### 🔧 **OpenAI-Specific Tools**

#### 🤖 **Assistants API**

```python
openai_create_assistant(           # 🎭 CREATE AI ASSISTANT
    name="Code Review Bot",
    instructions="Expert code reviewer",
    model="gpt-4o"
)

openai_test_connection()           # 🔌 CONNECTION TEST
# Returns: 90 available models, connection status
```

#### 🎨 **DALL-E Image Generation**

```python
openai_generate_image(             # 🎨 AI IMAGE CREATION
    prompt="Futuristic robot coding",
    model="dall-e-3",
    size="1024x1024"
)
```

#### 🎵 **Audio Processing**

```python
openai_transcribe_audio(           # 🎤 WHISPER TRANSCRIPTION
    audio_path="/path/to/speech.mp3",
    model="whisper-1"
)

openai_generate_speech(            # 🔊 TEXT-TO-SPEECH
    text="Hello, world!",
    voice="alloy"
)
```

---

## 📊 **System Testing Results**

| Component | Status | Details |
|-----------|--------|---------|
| 🟢 **Gemini Provider** | ✅ Perfect | 64 models, 1M tokens, excellent streaming |
| 🔵 **OpenAI Provider** | ✅ Working | 90 models, API functional, quota management |
| 🟣 **Anthropic Provider** | ⚠️ Ready | Needs API key configuration |
| ⚫ **Grok Provider** | ✅ Perfect | Excellent streaming, fast responses |
| 📡 **Streaming** | ✅ Excellent | Real-time across all providers |
| 📚 **Large Files** | ✅ Perfect | Auto provider selection, intelligent chunking |
| 🔧 **OpenAI Tools** | ✅ Working | Assistants, DALL-E, connection verified |
| 🔑 **Key Management** | ✅ Perfect | Session override, health monitoring |

---

## 🎛️ **Configuration**

### 📝 **API Key Setup Options**

#### Option 1: Environment Variables (System-wide)

```bash
export GOOGLE_API_KEY="your_google_api_key"
export OPENAI_API_KEY="your_openai_api_key"
export ANTHROPIC_API_KEY="your_anthropic_api_key"
export XAI_API_KEY="your_xai_api_key"
```

#### Option 2: .env File (Project-specific)

```env
# .env file
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
XAI_API_KEY=your_xai_api_key_here
```

#### Option 3: Session Keys (Dynamic)

```python
# Override keys during MCP session
llm_set_api_key("openai", "temporary_key_here")
llm_set_api_key("anthropic", "another_temp_key")
```

### 🔗 **Claude Code Integration**

#### Recommended: Command Line Setup

```bash
claude mcp add -s local -- llm-fusion-mcp /path/to/llm-fusion-mcp/run_server.sh
```

#### Alternative: JSON Configuration

```json
{
  "mcpServers": {
    "llm-fusion-mcp": {
      "command": "/path/to/llm-fusion-mcp/run_server.sh",
      "env": {
        "GOOGLE_API_KEY": "${GOOGLE_API_KEY}",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}",
        "ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}",
        "XAI_API_KEY": "${XAI_API_KEY}"
      }
    }
  }
}
```

---

## 🔧 **Development & Testing**

### 🧪 **Test Suite**

```bash
# Comprehensive testing
uv run python test_all_tools.py            # All tools
uv run python test_providers_direct.py     # Provider switching
uv run python test_streaming_direct.py     # Streaming functionality
uv run python test_large_file_analysis.py  # Document processing

# Code quality
uv run ruff format   # Format code
uv run ruff check    # Lint code
uv run mypy src/     # Type checking
```

### 📋 **Requirements**

- **Python**: 3.10+
- **Dependencies**: FastMCP, OpenAI, Pydantic, python-dotenv
- **API Keys**: At least one provider (Gemini recommended)

---

## 🏗️ **Architecture**

### 🎨 **Design Philosophy**

- **🌐 Provider Agnostic** - OpenAI-compatible APIs for universal access
- **📡 Streaming First** - Real-time responses across all operations
- **🧠 Intelligent Processing** - Smart chunking, auto provider selection
- **🔧 Production Ready** - Comprehensive error handling, health monitoring
- **⚡ Modern Python** - Built with uv, ruff, FastMCP toolchain

### 📊 **Performance Features**

- **Dynamic Model Discovery** - 5-minute cache refresh from provider APIs (see the sketch after this list)
- **Intelligent Chunking** - Semantic, hierarchical, fixed, auto strategies
- **Provider Auto-Selection** - Optimal choice based on context windows
- **Session Management** - Hot-swap API keys without server restart
- **Health Monitoring** - Real-time provider status and diagnostics
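As a rough illustration of the model-discovery cache described above, here is a minimal TTL-cache sketch. The fetch callback and cache shape are assumptions for illustration, not the server's actual internals:

```python
import time
from typing import Callable

CACHE_TTL_SECONDS = 300  # 5-minute refresh, matching the feature list above

class ModelCache:
    """Caches each provider's model list and refetches it once the TTL expires."""

    def __init__(self, fetch_models: Callable[[str], list[str]]):
        self._fetch_models = fetch_models  # e.g. a call to the provider's models API
        self._cache: dict[str, tuple[float, list[str]]] = {}

    def get(self, provider: str) -> list[str]:
        now = time.monotonic()
        entry = self._cache.get(provider)
        if entry is None or now - entry[0] > CACHE_TTL_SECONDS:
            # Cache miss or stale entry: hit the provider API and store a fresh copy.
            self._cache[provider] = (now, self._fetch_models(provider))
        return self._cache[provider][1]
```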
---

## 🚨 **Troubleshooting**

### Common Issues

#### 🔑 **API Key Issues**

```python
# Check configuration
llm_list_api_keys()   # Shows key status for all providers
llm_health_check()    # Tests actual API connectivity

# Fix missing keys
llm_set_api_key("provider", "your_key")
```

#### 🔄 **Server Issues**

```bash
# Kill existing servers
pkill -f "python src/llm_fusion_mcp/server.py"

# Restart fresh
./run_server.sh
```

#### 📚 **Large File Issues**

- Files are automatically chunked when they exceed a provider's context window (see the sketch below)
- Use the `max_chunks` parameter to control processing scope
- Check provider context limits in the health check output
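For intuition, here is a minimal sketch of fixed-size chunking with a crude token estimate (roughly 4 characters per token for English text). The function name and heuristic are illustrative assumptions, not the server's chunking code:

```python
def chunk_text(text: str, context_window_tokens: int,
               overlap_tokens: int = 200) -> list[str]:
    """Split text into overlapping chunks that each fit a model's context window."""
    chars_per_token = 4  # rough heuristic for English text
    chunk_chars = context_window_tokens * chars_per_token
    step = max(chunk_chars - overlap_tokens * chars_per_token, 1)
    return [text[i:i + chunk_chars] for i in range(0, len(text), step)]

# Hypothetical usage: a 100K-token window takes roughly 400KB of text per chunk.
# chunks = chunk_text(open("document.txt").read(), context_window_tokens=100_000)
```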
---

## 🎉 **What's New**

### ✨ **Latest Features**

- 🔧 **OpenAI Integration** - Full Assistants API, DALL-E, Whisper support
- 📊 **Health Monitoring** - Real-time provider diagnostics
- 🎛️ **Session Keys** - Dynamic API key management
- 📡 **Enhanced Streaming** - Beautiful real-time progress across all tools
- 🧠 **Smart Processing** - Intelligent provider and strategy selection

### 🔮 **Coming Soon**

- 🎬 **Video Understanding** - Gemini video analysis
- 🌐 **More Providers** - Cohere, Mistral, and others
- 📊 **Vector Databases** - Pinecone, Weaviate integration
- 🔗 **Workflow Chains** - Multi-step AI operations

---

## 📞 **Get Help**

- 📖 **Documentation**: Check `INTEGRATION.md` for advanced setup
- 🧪 **Testing**: Run the test suite to verify functionality
- 🔍 **Health Check**: Use `llm_health_check()` for diagnostics
- ⚡ **Performance**: Check provider context windows and rate limits

---

## 🌟 **Ready to Launch?**

**Experience the future of LLM integration with LLM Fusion MCP!**

*Built with ❤️ using FastMCP, modern Python tooling, and a passion for AI excellence.*