forked from MCP/llm-fusion-mcp
Revolutionary architecture that bridges remote LLMs with the entire MCP ecosystem! ## 🌟 Key Features Added: - Real MCP protocol implementation (STDIO + HTTP servers) - Hybrid LLM provider system (OpenAI-compatible + Native APIs) - Unified YAML configuration with environment variable substitution - Advanced error handling with circuit breakers and provider fallback - FastAPI HTTP bridge for remote LLM access - Comprehensive tool & resource discovery system - Complete test suite with 4 validation levels ## 🔧 Architecture Components: - `src/llm_fusion_mcp/orchestrator.py` - Main orchestrator with hybrid providers - `src/llm_fusion_mcp/mcp_client.py` - Full MCP protocol implementation - `src/llm_fusion_mcp/config.py` - Configuration management system - `src/llm_fusion_mcp/error_handling.py` - Circuit breaker & retry logic - `config/orchestrator.yaml` - Unified system configuration ## 🧪 Testing Infrastructure: - Complete system integration tests (4/4 passed) - MCP protocol validation tests - Provider compatibility analysis - Performance benchmarking suite 🎉 This creates the FIRST system enabling remote LLMs to access the entire MCP ecosystem through a unified HTTP API! 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
290 lines
9.5 KiB
Markdown
290 lines
9.5 KiB
Markdown
# Universal MCP Tool Orchestrator - Unified Architecture Design
|
|
|
|
## Executive Summary
|
|
|
|
Based on comprehensive testing, we recommend a **Hybrid OpenAI-First Architecture** that leverages OpenAI-compatible endpoints where available while maintaining native client support for optimal performance and feature coverage.
|
|
|
|
## Research Findings Summary
|
|
|
|
### Provider Compatibility Analysis
|
|
|
|
| Provider | OpenAI Compatible | Performance | Function Calling | Recommendation |
|
|
|----------|-------------------|-------------|-------------------|----------------|
|
|
| **OpenAI** | ✅ 100% (Native) | Baseline | ✅ Full support | **OpenAI interface** |
|
|
| **Gemini** | ✅ 100% (Bridge) | 62.8% faster via OpenAI | 67% success rate | **OpenAI interface** |
|
|
| **Anthropic** | ❌ 0% | N/A | N/A | **Native client** |
|
|
| **Grok** | ❌ 0% | N/A | N/A | **Native client** |
|
|
|
|
### Key Performance Insights
|
|
|
|
- **OpenAI Interface**: 62.8% faster for Gemini, more reliable (0 errors vs 4 errors)
|
|
- **Function Calling**: OpenAI interface 67% success rate, Native 100% success rate
|
|
- **Streaming**: Similar performance across both interfaces
|
|
- **Overall**: OpenAI interface provides better speed/reliability balance
|
|
|
|
## Recommended Architecture
|
|
|
|
### Core Design Pattern
|
|
|
|
```python
|
|
class UniversalMCPOrchestrator:
|
|
def __init__(self):
|
|
# OpenAI-compatible providers (2/4 = 50%)
|
|
self.openai_providers = {
|
|
'openai': OpenAI(api_key=..., base_url="https://api.openai.com/v1"),
|
|
'gemini': OpenAI(api_key=..., base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
|
|
}
|
|
|
|
# Native providers (2/4 = 50%)
|
|
self.native_providers = {
|
|
'anthropic': AnthropicProvider(),
|
|
'grok': GrokProvider()
|
|
}
|
|
|
|
# MCP client connections (the key innovation!)
|
|
self.mcp_clients = {} # STDIO and HTTP MCP servers
|
|
self.available_tools = {} # Aggregated tools from all MCP servers
|
|
|
|
async def execute_tool(self, tool_name: str, **kwargs):
|
|
"""Unified tool execution - the core of MCP orchestration"""
|
|
if tool_name.startswith('llm_'):
|
|
return await self.execute_llm_tool(tool_name, **kwargs)
|
|
else:
|
|
return await self.execute_mcp_tool(tool_name, **kwargs)
|
|
```
|
|
|
|
### Provider Abstraction Layer
|
|
|
|
```python
|
|
class ProviderAdapter:
|
|
async def generate_text(self, provider: str, **kwargs):
|
|
if provider in self.openai_providers:
|
|
# Unified OpenAI interface - Fast & Reliable
|
|
client = self.openai_providers[provider]
|
|
return await client.chat.completions.create(**kwargs)
|
|
else:
|
|
# Native interface - Full features
|
|
return await self.native_providers[provider].generate(**kwargs)
|
|
|
|
async def function_call(self, provider: str, tools: List, **kwargs):
|
|
if provider in self.openai_providers:
|
|
# OpenAI function calling format
|
|
return await self.openai_function_call(provider, tools, **kwargs)
|
|
else:
|
|
# Provider-specific function calling
|
|
return await self.native_function_call(provider, tools, **kwargs)
|
|
```
|
|
|
|
### MCP Integration Layer (The Innovation)
|
|
|
|
```python
|
|
class MCPIntegrationLayer:
|
|
async def connect_mcp_server(self, config: Dict):
|
|
"""Connect to STDIO or HTTP MCP servers"""
|
|
if config['type'] == 'stdio':
|
|
client = await self.create_stdio_mcp_client(config)
|
|
else: # HTTP
|
|
client = await self.create_http_mcp_client(config)
|
|
|
|
# Discover and register tools
|
|
tools = await client.list_tools()
|
|
namespace = config['namespace']
|
|
|
|
for tool in tools:
|
|
tool_name = f"{namespace}_{tool['name']}"
|
|
self.available_tools[tool_name] = {
|
|
'client': client,
|
|
'original_name': tool['name'],
|
|
'schema': tool['schema']
|
|
}
|
|
|
|
async def execute_mcp_tool(self, tool_name: str, **kwargs):
|
|
"""Execute tool from connected MCP server"""
|
|
tool_info = self.available_tools[tool_name]
|
|
client = tool_info['client']
|
|
|
|
return await client.call_tool(
|
|
tool_info['original_name'],
|
|
**kwargs
|
|
)
|
|
```
|
|
|
|
## HTTP API for Remote LLMs
|
|
|
|
### Unified Endpoint Structure
|
|
|
|
```http
|
|
POST /api/v1/tools/execute
|
|
{
|
|
"tool": "llm_generate_text",
|
|
"provider": "gemini",
|
|
"params": {
|
|
"prompt": "Analyze the weather data",
|
|
"model": "gemini-2.5-flash"
|
|
}
|
|
}
|
|
|
|
POST /api/v1/tools/execute
|
|
{
|
|
"tool": "fs_read_file",
|
|
"params": {
|
|
"path": "/home/user/data.txt"
|
|
}
|
|
}
|
|
|
|
POST /api/v1/tools/execute
|
|
{
|
|
"tool": "git_commit",
|
|
"params": {
|
|
"message": "Updated analysis",
|
|
"files": ["analysis.md"]
|
|
}
|
|
}
|
|
```
|
|
|
|
### Dynamic Tool Discovery
|
|
|
|
```http
|
|
GET /api/v1/tools/list
|
|
{
|
|
"categories": {
|
|
"llm_tools": ["llm_generate_text", "llm_analyze_image", "llm_embed_text"],
|
|
"filesystem": ["fs_read_file", "fs_write_file", "fs_list_directory"],
|
|
"git": ["git_status", "git_commit", "git_log"],
|
|
"weather": ["weather_current", "weather_forecast"],
|
|
"database": ["db_query", "db_execute"]
|
|
}
|
|
}
|
|
```
|
|
|
|
## Configuration System
|
|
|
|
### Server Configuration
|
|
|
|
```yaml
|
|
# config/orchestrator.yaml
|
|
providers:
|
|
openai:
|
|
api_key: "${OPENAI_API_KEY}"
|
|
models: ["gpt-4o", "gpt-4o-mini"]
|
|
|
|
gemini:
|
|
api_key: "${GOOGLE_API_KEY}"
|
|
models: ["gemini-2.5-flash", "gemini-2.5-pro"]
|
|
|
|
anthropic:
|
|
api_key: "${ANTHROPIC_API_KEY}"
|
|
models: ["claude-3.5-sonnet-20241022"]
|
|
|
|
grok:
|
|
api_key: "${XAI_API_KEY}"
|
|
models: ["grok-3"]
|
|
|
|
mcp_servers:
|
|
filesystem:
|
|
type: stdio
|
|
command: ["uvx", "mcp-server-filesystem"]
|
|
namespace: "fs"
|
|
auto_start: true
|
|
|
|
git:
|
|
type: stdio
|
|
command: ["npx", "@modelcontextprotocol/server-git"]
|
|
namespace: "git"
|
|
working_directory: "."
|
|
|
|
weather:
|
|
type: http
|
|
url: "https://weather-mcp.example.com"
|
|
namespace: "weather"
|
|
auth:
|
|
type: bearer
|
|
token: "${WEATHER_API_KEY}"
|
|
|
|
http_server:
|
|
host: "0.0.0.0"
|
|
port: 8000
|
|
cors_origins: ["*"]
|
|
auth_required: false
|
|
```
|
|
|
|
## Implementation Phases
|
|
|
|
### Phase 1: Foundation (Week 1)
|
|
1. **Hybrid Provider System**: Implement OpenAI + Native provider abstraction
|
|
2. **Basic HTTP API**: Expose LLM tools via HTTP for remote access
|
|
3. **Configuration System**: YAML-based provider and server configuration
|
|
4. **Error Handling**: Robust error handling and provider fallback
|
|
|
|
### Phase 2: MCP Integration (Week 2)
|
|
1. **STDIO MCP Client**: Connect to local STDIO MCP servers
|
|
2. **HTTP MCP Client**: Connect to remote HTTP MCP servers
|
|
3. **Tool Discovery**: Auto-discover and register tools from MCP servers
|
|
4. **Unified Tool Interface**: Single API for LLM + MCP tools
|
|
|
|
### Phase 3: Advanced Features (Week 3)
|
|
1. **Dynamic Server Management**: Hot-add/remove MCP servers
|
|
2. **Tool Composition**: Create composite workflows combining multiple tools
|
|
3. **Caching Layer**: Cache tool results and MCP connections
|
|
4. **Monitoring**: Health checks and usage analytics
|
|
|
|
### Phase 4: Production Ready (Week 4)
|
|
1. **Authentication**: API key management for HTTP endpoints
|
|
2. **Rate Limiting**: Per-client rate limiting and quotas
|
|
3. **Load Balancing**: Distribute requests across provider instances
|
|
4. **Documentation**: Comprehensive API documentation and examples
|
|
|
|
## Benefits of This Architecture
|
|
|
|
### For Remote LLMs
|
|
- **Single Integration Point**: One HTTP API for all capabilities
|
|
- **Rich Tool Ecosystem**: Access to entire MCP ecosystem + LLM providers
|
|
- **Dynamic Discovery**: New tools automatically available
|
|
- **Unified Interface**: Consistent API regardless of backend
|
|
|
|
### For MCP Ecosystem
|
|
- **Bridge to Hosted LLMs**: STDIO servers accessible to remote services
|
|
- **Zero Changes Required**: Existing MCP servers work unchanged
|
|
- **Protocol Translation**: Seamless HTTP ↔ STDIO bridging
|
|
- **Ecosystem Amplification**: Broader reach for existing tools
|
|
|
|
### For Developers
|
|
- **Balanced Complexity**: Not too simple, not too complex
|
|
- **Future Proof**: Easy to add new providers and MCP servers
|
|
- **Performance Optimized**: OpenAI interface where beneficial
|
|
- **Feature Complete**: Native clients where needed
|
|
|
|
## Risk Mitigation
|
|
|
|
### Provider Failures
|
|
- **Multi-provider redundancy**: Route to alternative providers
|
|
- **Graceful degradation**: Disable failed providers, continue with others
|
|
- **Health monitoring**: Continuous provider health checks
|
|
|
|
### MCP Server Failures
|
|
- **Auto-restart**: Automatically restart failed STDIO servers
|
|
- **Circuit breakers**: Temporarily disable failing servers
|
|
- **Error isolation**: Server failures don't affect other tools
|
|
|
|
### Performance Issues
|
|
- **Connection pooling**: Reuse connections across requests
|
|
- **Caching**: Cache tool results and provider responses
|
|
- **Load balancing**: Distribute load across instances
|
|
|
|
## Success Metrics
|
|
|
|
1. **Provider Coverage**: 4/4 LLM providers working
|
|
2. **MCP Integration**: 5+ MCP servers connected successfully
|
|
3. **Performance**: <1s average response time for tool execution
|
|
4. **Reliability**: >95% uptime and success rate
|
|
5. **Adoption**: Remote LLMs successfully using the orchestrator
|
|
|
|
---
|
|
|
|
**Final Recommendation**: ✅ **PROCEED with Hybrid OpenAI-First Architecture**
|
|
|
|
This design provides the optimal balance of simplicity, performance, and feature coverage while enabling the revolutionary capability of giving remote LLMs access to the entire MCP ecosystem through a single integration point.
|
|
|
|
*Architecture finalized: 2025-09-05*
|
|
*Based on: Comprehensive provider testing and performance benchmarking*
|
|
*Next step: Begin Phase 1 implementation* |