🚀 LLM Fusion MCP Server
A comprehensive Model Context Protocol (MCP) server providing unified access to multiple major LLM providers through a single interface.
This server enables AI assistants to interact with multiple LLM providers simultaneously through the standardized Model Context Protocol interface. Built for the MCP ecosystem, it provides seamless access to Gemini, OpenAI, Anthropic, and Grok models with advanced features like streaming, multimodal processing, and intelligent document handling.
⚡ Why This Server Rocks
🎯 Universal LLM Access - One API to rule them all
🌊 Always Streaming - Real-time responses with beautiful progress
🧠 Intelligent Document Processing - Handle files of any size with smart chunking
🎨 Multimodal AI - Text, images, audio understanding
🔧 OpenAI-Specific Tools - Assistants API, DALL-E, Whisper integration
⚡ Lightning Fast - Built with modern Python tooling (uv, ruff, FastMCP)
🔒 Production Grade - Comprehensive error handling and health monitoring
🔧 Quick Start for MCP Clients
Claude Desktop Integration
# 1. Clone the repository
git clone https://github.com/MCP/llm-fusion-mcp.git
cd llm-fusion-mcp
# 2. Configure API keys
cp .env.example .env
# Edit .env with your API keys
# 3. Add to Claude Desktop
claude mcp add -s local -- llm-fusion-mcp /path/to/llm-fusion-mcp/run_server.sh
Manual Launch
# Install dependencies and start server
./run_server.sh
The launcher script will:
- ✅ Validate dependencies and install if needed
- ✅ Check API key configuration
- ✅ Start the server with proper error handling
- ✅ Provide colored logs for easy debugging
🤖 Supported AI Providers
Provider | Models | Context Window | Status | Special Features |
---|---|---|---|---|
🟢 Gemini | 64+ models | 1M tokens | ✅ Production Ready | Video, thinking modes, native audio |
🔵 OpenAI | 90+ models | 1M tokens | ✅ Production Ready | GPT-5, O3, Assistants API, DALL-E |
🟣 Anthropic | Claude 3.5/4 | 200K tokens | ✅ Production Ready | Advanced reasoning, code analysis |
⚫ Grok | Latest models | 100K tokens | ✅ Production Ready | Real-time data, conversational AI |
🎯 Key Features
🚀 Core Capabilities
- 🌐 Universal LLM API - Switch between providers seamlessly
- 📡 Real-time Streaming - Token-by-token generation across all providers
- 📚 Large File Analysis - Intelligent document processing up to millions of tokens
- 🖼️ Multimodal AI - Image analysis and audio transcription
- 🔧 OpenAI Integration - Full Assistants API, DALL-E, Whisper support
- 🎛️ Session Management - Dynamic API key switching without server restart
⚡ Advanced Features
- 🧠 Smart Chunking - Semantic, hierarchical, fixed, and auto strategies
- 🔍 Provider Auto-Selection - Optimal model choice based on task and context (see the sketch after this list)
- 📊 Vector Embeddings - Semantic similarity and text analysis
- 🛠️ Function Calling - OpenAI-compatible tool integration
- 💾 Caching Support - Advanced caching for performance
- 🏥 Health Monitoring - Real-time provider status and diagnostics
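To make provider auto-selection concrete, here is a minimal sketch of picking a provider by estimated token count against each provider's context window (values from the table above). The function names and the 4-characters-per-token heuristic are illustrative assumptions, not the server's internal API:

```python
# Illustrative sketch only -- names and the token heuristic are assumptions.
CONTEXT_WINDOWS = {
    "gemini": 1_000_000,   # values from the provider table above
    "openai": 1_000_000,
    "anthropic": 200_000,
    "grok": 100_000,
}

def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token for English text."""
    return len(text) // 4

def auto_select_provider(text: str, preferred: str = "gemini") -> str:
    """Keep the preferred provider if the text fits its context window,
    otherwise fall back to the provider with the largest window."""
    if estimate_tokens(text) <= CONTEXT_WINDOWS[preferred]:
        return preferred
    return max(CONTEXT_WINDOWS, key=CONTEXT_WINDOWS.get)
```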
🚦 Quick Start
1️⃣ Installation
# Clone and setup
git clone <repository>
cd llm-fusion-mcp
uv sync
2️⃣ Configure API Keys
# Copy template and add your keys
cp .env.example .env
# Edit .env with your API keys
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here # Optional
ANTHROPIC_API_KEY=your_anthropic_api_key_here # Optional
XAI_API_KEY=your_xai_api_key_here # Optional
3️⃣ Launch Server
# Method 1: Direct execution
uv run python src/llm_fusion_mcp/server.py
# Method 2: Using run script (recommended)
./run_server.sh
4️⃣ Connect with Claude Code
# Add to Claude Code MCP
claude mcp add -s local -- llm-fusion-mcp /path/to/llm-fusion-mcp/run_server.sh
🛠️ Available Tools
🎯 Universal LLM Tools
🔑 Provider & Key Management
llm_set_provider("gemini") # Switch default provider
llm_get_provider() # Get current provider info
llm_list_providers() # See all providers + models
llm_health_check() # Provider health status
llm_set_api_key("openai", "key") # Set session API key
llm_list_api_keys() # Check key configuration
llm_remove_api_key("openai") # Remove session key
💬 Text Generation
llm_generate( # 🌟 UNIVERSAL GENERATION
prompt="Write a haiku about AI",
provider="gemini", # Override provider
model="gemini-2.5-flash", # Specific model
stream=True # Real-time streaming
)
llm_analyze_large_file( # 📚 SMART DOCUMENT ANALYSIS
file_path="/path/to/document.pdf",
prompt="Summarize key findings",
chunk_strategy="auto", # Auto-select best strategy
max_chunks=10 # Control processing scope
)
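Conceptually, large-file analysis follows a map-reduce pattern: split the document, answer the prompt against each chunk, then merge the partial answers. A simplified sketch of that flow, using a naive fixed-size chunker and the llm_generate tool shown above (an illustration of the idea, not the server's actual implementation -- the real server also offers semantic, hierarchical, and auto strategies):

```python
# Conceptual sketch of chunked analysis -- not the server's actual code.
def split_fixed(text: str, size: int = 8000) -> list[str]:
    """Naive fixed-size chunker; the server's 'auto' strategy is smarter."""
    return [text[i : i + size] for i in range(0, len(text), size)]

def analyze_large_file(path: str, prompt: str, max_chunks: int = 10) -> str:
    with open(path, encoding="utf-8") as f:
        text = f.read()
    chunks = split_fixed(text)[:max_chunks]
    # Map: answer the prompt against each chunk independently.
    partials = [llm_generate(prompt=f"{prompt}\n\n{chunk}") for chunk in chunks]
    # Reduce: ask the model to merge the per-chunk answers.
    merged = "\n---\n".join(partials)
    return llm_generate(prompt=f"Combine these partial answers into one:\n{merged}")
```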
🎨 Multimodal AI
llm_analyze_image( # 🖼️ IMAGE UNDERSTANDING
image_path="/path/to/image.jpg",
prompt="What's in this image?",
provider="gemini" # Best for multimodal
)
llm_analyze_audio( # 🎵 AUDIO PROCESSING
audio_path="/path/to/audio.mp3",
prompt="Transcribe this audio",
provider="gemini" # Native audio support
)
📊 Embeddings & Similarity
llm_embed_text( # 🧮 VECTOR EMBEDDINGS
text="Your text here",
provider="openai", # Multiple providers
model="text-embedding-3-large"
)
llm_similarity( # 🔍 SEMANTIC SIMILARITY
text1="AI is amazing",
text2="Artificial intelligence rocks"
)
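Under the hood, semantic similarity is typically the cosine similarity of the two texts' embedding vectors. A minimal, self-contained sketch of that math, assuming you already have two vectors (e.g. from llm_embed_text) -- this is the standard formula, not necessarily the server's exact code path:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional example:
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0]))  # 0.5
```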
🔧 OpenAI-Specific Tools
🤖 Assistants API
openai_create_assistant( # 🎭 CREATE AI ASSISTANT
name="Code Review Bot",
instructions="Expert code reviewer",
model="gpt-4o"
)
openai_test_connection() # 🔌 CONNECTION TEST
# Returns: 90 available models, connection status
🎨 DALL-E Image Generation
openai_generate_image( # 🎨 AI IMAGE CREATION
prompt="Futuristic robot coding",
model="dall-e-3",
size="1024x1024"
)
🎵 Audio Processing
openai_transcribe_audio( # 🎤 WHISPER TRANSCRIPTION
audio_path="/path/to/speech.mp3",
model="whisper-1"
)
openai_generate_speech( # 🔊 TEXT-TO-SPEECH
text="Hello, world!",
voice="alloy"
)
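For reference, these tools map onto the official OpenAI Python SDK (v1.x). A direct equivalent of the transcription call looks roughly like this, assuming the openai package is installed and OPENAI_API_KEY is set:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Whisper transcription -- roughly what openai_transcribe_audio wraps
with open("/path/to/speech.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
```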
📊 System Testing Results
Component | Status | Details |
---|---|---|
🟢 Gemini Provider | ✅ Perfect | 64 models, 1M tokens, streaming excellent |
🔵 OpenAI Provider | ✅ Working | 90 models, API functional, quota management |
🟣 Anthropic Provider | ⚠️ Ready | Needs API key configuration |
⚫ Grok Provider | ✅ Perfect | Excellent streaming, fast responses |
📡 Streaming | ✅ Excellent | Real-time across all providers |
📚 Large Files | ✅ Perfect | Auto provider selection, intelligent chunking |
🔧 OpenAI Tools | ✅ Working | Assistants, DALL-E, connection verified |
🔑 Key Management | ✅ Perfect | Session override, health monitoring |
🎛️ Configuration
📁 API Key Setup Options
Option 1: Environment Variables (System-wide)
export GOOGLE_API_KEY="your_google_api_key"
export OPENAI_API_KEY="your_openai_api_key"
export ANTHROPIC_API_KEY="your_anthropic_api_key"
export XAI_API_KEY="your_xai_api_key"
Option 2: .env File (Project-specific)
# .env file
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
XAI_API_KEY=your_xai_api_key_here
Option 3: Session Keys (Dynamic)
# Override keys during MCP session
llm_set_api_key("openai", "temporary_key_here")
llm_set_api_key("anthropic", "another_temp_key")
🔗 Claude Code Integration
Recommended: Command Line Setup
claude mcp add -s local -- llm-fusion-mcp /path/to/llm-fusion-mcp/run_server.sh
Alternative: JSON Configuration
{
"mcpServers": {
"llm-fusion-mcp": {
"command": "/path/to/llm-fusion-mcp/run_server.sh",
"env": {
"GOOGLE_API_KEY": "${GOOGLE_API_KEY}",
"OPENAI_API_KEY": "${OPENAI_API_KEY}",
"ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}",
"XAI_API_KEY": "${XAI_API_KEY}"
}
}
}
}
🔧 Development & Testing
🧪 Test Suite
# Comprehensive testing
uv run python test_all_tools.py # All tools
uv run python test_providers_direct.py # Provider switching
uv run python test_streaming_direct.py # Streaming functionality
uv run python test_large_file_analysis.py # Document processing
# Code quality
uv run ruff format # Format code
uv run ruff check # Lint code
uv run mypy src/ # Type checking
📋 Requirements
- Python: 3.10+
- Dependencies: FastMCP, OpenAI, Pydantic, python-dotenv
- API Keys: At least one provider (Gemini recommended)
🏗️ Architecture
🎨 Design Philosophy
- 🌐 Provider Agnostic - OpenAI-compatible APIs for universal access
- 📡 Streaming First - Real-time responses across all operations
- 🧠 Intelligent Processing - Smart chunking, auto provider selection
- 🔧 Production Ready - Comprehensive error handling, health monitoring
- ⚡ Modern Python - Built with uv, ruff, FastMCP toolchain
📊 Performance Features
- Dynamic Model Discovery - 5-minute cache refresh from provider APIs (see the sketch after this list)
- Intelligent Chunking - Semantic, hierarchical, fixed, auto strategies
- Provider Auto-Selection - Optimal choice based on context windows
- Session Management - Hot-swap API keys without server restart
- Health Monitoring - Real-time provider status and diagnostics
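As an illustration of the model-discovery cache mentioned above, here is a minimal time-based (TTL) cache sketch; fetch_models_from_api is a hypothetical placeholder for the actual provider call:

```python
import time

CACHE_TTL_SECONDS = 5 * 60  # refresh model lists every 5 minutes
_model_cache: dict[str, tuple[float, list[str]]] = {}

def get_models(provider: str) -> list[str]:
    """Return the cached model list, refetching once the TTL expires."""
    now = time.monotonic()
    cached = _model_cache.get(provider)
    if cached and now - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]
    models = fetch_models_from_api(provider)  # hypothetical provider call
    _model_cache[provider] = (now, models)
    return models
```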
🚨 Troubleshooting
Common Issues
🔑 API Key Issues
# Check configuration
llm_list_api_keys() # Shows key status for all providers
llm_health_check() # Tests actual API connectivity
# Fix missing keys
llm_set_api_key("provider", "your_key")
🔄 Server Issues
# Kill existing servers
pkill -f "python src/llm_fusion_mcp/server.py"
# Restart fresh
./run_server.sh
📚 Large File Issues
- Files are automatically chunked when they exceed provider context windows
- Use the max_chunks parameter to control processing scope
- Check provider context limits in the health check
🎉 What's New
✨ Latest Features
- 🔧 OpenAI Integration - Full Assistants API, DALL-E, Whisper support
- 📊 Health Monitoring - Real-time provider diagnostics
- 🎛️ Session Keys - Dynamic API key management
- 📡 Enhanced Streaming - Beautiful real-time progress across all tools
- 🧠 Smart Processing - Intelligent provider and strategy selection
🔮 Coming Soon
- 🎬 Video Understanding - Gemini video analysis
- 🌐 More Providers - Cohere, Mistral, and others
- 📊 Vector Databases - Pinecone, Weaviate integration
- 🔗 Workflow Chains - Multi-step AI operations
📞 Get Help
- 📖 Documentation: Check INTEGRATION.md for advanced setup
- 🧪 Testing: Run the test suite to verify functionality
- 🔍 Health Check: Use llm_health_check() for diagnostics
- ⚡ Performance: Check provider context windows and rate limits
🌟 Ready to Launch?
Experience the future of LLM integration with LLM Fusion MCP!
Built with ❤️ using FastMCP, modern Python tooling, and a passion for AI excellence.