# 🚀 LLM Fusion MCP - Production Deployment Guide

This guide covers deploying **LLM Fusion MCP** in production environments with Docker, cloud platforms, and enterprise setups.

---

## 📋 **Quick Start**

### **1. Prerequisites**
- Docker & Docker Compose
- At least 2GB RAM
- Internet connection for AI provider APIs
- One or more LLM provider API keys

### **2. One-Command Deployment**
```bash
# Clone and deploy
git clone <repository-url>
cd llm-fusion-mcp

# Configure environment
cp .env.production .env
# Edit .env with your API keys

# Deploy with Docker
./deploy.sh production
```
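
At minimum, `.env` needs one provider key. A minimal sketch — variable names come from the configuration table later in this guide; the values are placeholders:

```bash
# .env — minimal example
GOOGLE_API_KEY=your_gemini_key   # required
OPENAI_API_KEY=your_openai_key   # optional
SERVER_MODE=production
LOG_LEVEL=INFO
```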

---

## 🐳 **Docker Deployment**

### **Method 1: Docker Compose (Recommended)**
```bash
# Start services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```
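
The commands above assume a `docker-compose.yml` in the repository root. If you need to write one from scratch, a minimal sketch might look like this (the port mapping and volume path are assumptions — adjust to your setup):

```yaml
# docker-compose.yml — minimal sketch
version: '3.8'
services:
  llm-fusion-mcp:
    build: .
    env_file: .env
    ports:
      - "8000:8000"      # assumed server port, per the health-check examples below
    volumes:
      - ./logs:/app/logs
    restart: unless-stopped
```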

### **Method 2: Standalone Docker**
```bash
# Build image
docker build -t llm-fusion-mcp:latest .

# Run container ($(pwd) makes the host path absolute, as docker run -v expects)
docker run -d \
  --name llm-fusion-mcp \
  --restart unless-stopped \
  -e GOOGLE_API_KEY="your_key" \
  -e OPENAI_API_KEY="your_key" \
  -v "$(pwd)/logs:/app/logs" \
  llm-fusion-mcp:latest
```

### **Method 3: Pre-built Images**
```bash
# Pull from GitHub Container Registry
docker pull ghcr.io/username/llm-fusion-mcp:latest

# Run with your environment
docker run -d \
  --name llm-fusion-mcp \
  --env-file .env \
  ghcr.io/username/llm-fusion-mcp:latest
```

---

## ☁️ **Cloud Platform Deployment**

### **🔵 AWS Deployment**

#### **AWS ECS with Fargate**
```jsonc
// ecs-task-definition.json
{
  "family": "llm-fusion-mcp",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "llm-fusion-mcp",
      "image": "ghcr.io/username/llm-fusion-mcp:latest",
      "essential": true,
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/llm-fusion-mcp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "environment": [
        {"name": "GOOGLE_API_KEY", "value": "your_key"},
        {"name": "SERVER_MODE", "value": "production"}
      ]
    }
  ]
}
```
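
The task definition alone doesn't run anything; you'd register it and start a long-running service with something like the following (the cluster name, subnet, and security-group IDs are placeholders):

```bash
# Register the task definition
aws ecs register-task-definition \
  --cli-input-json file://ecs-task-definition.json

# Launch it as a Fargate service
aws ecs create-service \
  --cluster my-cluster \
  --service-name llm-fusion-mcp \
  --task-definition llm-fusion-mcp \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxx],securityGroups=[sg-xxxxxxx],assignPublicIp=ENABLED}"
```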

#### **AWS Lambda (Serverless)**
```bash
# Package for Lambda
zip -r llm-fusion-mcp-lambda.zip src/ requirements.txt

# Deploy with AWS CLI
aws lambda create-function \
  --function-name llm-fusion-mcp \
  --runtime python3.12 \
  --role arn:aws:iam::account:role/lambda-execution-role \
  --handler src.llm_fusion_mcp.lambda_handler \
  --zip-file fileb://llm-fusion-mcp-lambda.zip \
  --timeout 300 \
  --memory-size 1024
```

### **🔷 Azure Deployment**

#### **Azure Container Instances**
```bash
# Deploy to Azure
az container create \
  --resource-group myResourceGroup \
  --name llm-fusion-mcp \
  --image ghcr.io/username/llm-fusion-mcp:latest \
  --cpu 2 --memory 4 \
  --restart-policy Always \
  --environment-variables \
    GOOGLE_API_KEY="your_key" \
    SERVER_MODE="production"
```
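
Values passed via `--environment-variables` are visible in the container group's properties. For API keys, `az container create` also accepts `--secure-environment-variables`, which keeps the values hidden:

```bash
# Same deployment, but with the key hidden from container properties
az container create \
  --resource-group myResourceGroup \
  --name llm-fusion-mcp \
  --image ghcr.io/username/llm-fusion-mcp:latest \
  --cpu 2 --memory 4 \
  --restart-policy Always \
  --secure-environment-variables \
    GOOGLE_API_KEY="your_key"
```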

#### **Azure App Service**
```bash
# Deploy as Web App
az webapp create \
  --resource-group myResourceGroup \
  --plan myAppServicePlan \
  --name llm-fusion-mcp \
  --deployment-container-image-name ghcr.io/username/llm-fusion-mcp:latest

# Configure environment
az webapp config appsettings set \
  --resource-group myResourceGroup \
  --name llm-fusion-mcp \
  --settings \
    GOOGLE_API_KEY="your_key" \
    SERVER_MODE="production"
```

### **🟢 Google Cloud Deployment**

#### **Cloud Run**
```bash
# Deploy to Cloud Run
gcloud run deploy llm-fusion-mcp \
  --image ghcr.io/username/llm-fusion-mcp:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars GOOGLE_API_KEY="your_key",SERVER_MODE="production" \
  --memory 2Gi \
  --cpu 2
```
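
For production, consider Secret Manager instead of plain env vars: `gcloud run deploy` can mount secrets as environment variables via `--set-secrets`. The secret name `google-api-key` below is a placeholder you'd create first, and the service's account needs `roles/secretmanager.secretAccessor` on it:

```bash
# Store the key once in Secret Manager
printf '%s' "$GOOGLE_API_KEY" | gcloud secrets create google-api-key --data-file=-

# Reference it at deploy time
gcloud run deploy llm-fusion-mcp \
  --image ghcr.io/username/llm-fusion-mcp:latest \
  --region us-central1 \
  --set-secrets GOOGLE_API_KEY=google-api-key:latest
```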

#### **GKE (Kubernetes)**
```yaml
# kubernetes-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-fusion-mcp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: llm-fusion-mcp
  template:
    metadata:
      labels:
        app: llm-fusion-mcp
    spec:
      containers:
        - name: llm-fusion-mcp
          image: ghcr.io/username/llm-fusion-mcp:latest
          ports:
            - containerPort: 8000
          env:
            - name: GOOGLE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: llm-fusion-secrets
                  key: google-api-key
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
---
apiVersion: v1
kind: Service
metadata:
  name: llm-fusion-mcp-service
spec:
  selector:
    app: llm-fusion-mcp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer
```
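
The Deployment references a secret, so create it first (see Security Hardening below), then apply the manifest:

```bash
kubectl create secret generic llm-fusion-secrets \
  --from-literal=google-api-key="$GOOGLE_API_KEY"
kubectl apply -f kubernetes-deployment.yml
```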

---

## 🏢 **Enterprise Deployment**

### **🔐 Security Hardening**

#### **1. API Key Security**
```bash
# Use encrypted secrets
kubectl create secret generic llm-fusion-secrets \
  --from-literal=google-api-key="$GOOGLE_API_KEY" \
  --from-literal=openai-api-key="$OPENAI_API_KEY"

# Enable key rotation
export ENABLE_KEY_ROTATION=true
export KEY_ROTATION_INTERVAL=24
```
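
For plain Docker Compose deployments (no Kubernetes), file-based secrets keep keys out of `docker inspect` output. A sketch, assuming the server (or an entrypoint wrapper) can read keys from files under `/run/secrets/`:

```yaml
# docker-compose.yml fragment — file-based secrets (sketch)
services:
  llm-fusion-mcp:
    image: llm-fusion-mcp:latest
    secrets:
      - google_api_key   # mounted at /run/secrets/google_api_key
secrets:
  google_api_key:
    file: ./secrets/google_api_key.txt
```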

#### **2. Network Security**
```bash
# Firewall rules (example for AWS)
aws ec2 create-security-group \
  --group-name llm-fusion-mcp-sg \
  --description "LLM Fusion MCP Security Group"

# Allow only necessary ports
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxxxxxx \
  --protocol tcp \
  --port 8000 \
  --source-group sg-frontend
```

#### **3. Resource Limits**
```yaml
# Docker Compose with limits
version: '3.8'
services:
  llm-fusion-mcp:
    image: llm-fusion-mcp:latest
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G
    restart: unless-stopped
```

### **📊 Monitoring & Observability**

#### **1. Health Checks**
```bash
# Built-in health endpoint
curl http://localhost:8000/health

# Docker health check
docker run --health-cmd="curl -f http://localhost:8000/health" \
  --health-interval=30s \
  --health-retries=3 \
  --health-start-period=40s \
  --health-timeout=10s \
  llm-fusion-mcp:latest
```
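
The same check can be baked into the image so every container gets it without extra flags — a sketch, assuming `curl` is available inside the image:

```dockerfile
# Dockerfile fragment — equivalent built-in health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1
```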

#### **2. Prometheus Metrics**
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'llm-fusion-mcp'
    static_configs:
      - targets: ['llm-fusion-mcp:9090']
    metrics_path: /metrics
    scrape_interval: 15s
```

#### **3. Centralized Logging**
```bash
# ELK Stack integration
docker run -d \
  --name llm-fusion-mcp \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag="docker.llm-fusion-mcp" \
  llm-fusion-mcp:latest
```

### **🔄 High Availability Setup**

#### **1. Load Balancing**
```nginx
# nginx.conf
upstream llm_fusion_backend {
    server llm-fusion-mcp-1:8000;
    server llm-fusion-mcp-2:8000;
    server llm-fusion-mcp-3:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://llm_fusion_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

#### **2. Auto-scaling**
```yaml
# Kubernetes HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llm-fusion-mcp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llm-fusion-mcp
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
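
The same HPA can also be created imperatively, which is handy for quick tests:

```bash
kubectl autoscale deployment llm-fusion-mcp \
  --cpu-percent=70 --min=3 --max=10
```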

---

## 🔧 **Configuration Management**

### **Environment Variables**

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `GOOGLE_API_KEY` | ✅ | - | Google Gemini API key |
| `OPENAI_API_KEY` | ❌ | - | OpenAI API key |
| `ANTHROPIC_API_KEY` | ❌ | - | Anthropic API key |
| `XAI_API_KEY` | ❌ | - | xAI Grok API key |
| `SERVER_MODE` | ❌ | `production` | Server mode |
| `LOG_LEVEL` | ❌ | `INFO` | Logging level |
| `MAX_FILE_SIZE_MB` | ❌ | `50` | Max file size for analysis |
| `REQUEST_TIMEOUT` | ❌ | `300` | Request timeout in seconds |

### **Volume Mounts**
```bash
# Data persistence
-v ./data:/app/data     # Persistent data
-v ./logs:/app/logs     # Log files
-v ./config:/app/config # Configuration files
-v ./cache:/app/cache   # Model cache
```
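
Put together, a fully persistent standalone container might look like this (`$(pwd)` makes the host paths absolute, which `docker run -v` expects):

```bash
docker run -d \
  --name llm-fusion-mcp \
  --restart unless-stopped \
  --env-file .env \
  -v "$(pwd)/data:/app/data" \
  -v "$(pwd)/logs:/app/logs" \
  -v "$(pwd)/config:/app/config" \
  -v "$(pwd)/cache:/app/cache" \
  llm-fusion-mcp:latest
```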

---

## 🚨 **Troubleshooting**

### **Common Issues**

#### **Container Won't Start**
```bash
# Check logs
docker-compose logs llm-fusion-mcp

# Common causes:
# 1. API key not configured
# 2. Port already in use
# 3. Insufficient memory

# Debug interactively
docker-compose run --rm llm-fusion-mcp bash
```

#### **API Connection Issues**
```bash
# Test API connectivity (Gemini uses an API-key header, not a Bearer token)
curl -H "x-goog-api-key: $GOOGLE_API_KEY" \
  https://generativelanguage.googleapis.com/v1beta/models

# Check firewall/network
telnet api.openai.com 443
```

#### **Performance Issues**
```bash
# Monitor resource usage
docker stats llm-fusion-mcp

# Scale horizontally
docker-compose up --scale llm-fusion-mcp=3
```

### **Health Checks**
```bash
# Built-in health check
curl http://localhost:8000/health

# Provider status
curl http://localhost:8000/health/providers

# System metrics
curl http://localhost:8000/metrics
```

---

## 📞 **Support**

### **Getting Help**
- 📖 **Documentation**: Check README.md and INTEGRATION.md
- 🧪 **Testing**: Run health checks and the test suite
- 🔍 **Debugging**: Enable the `DEBUG` log level
- 📊 **Monitoring**: Check metrics and logs

### **Performance Tuning**
- **Memory**: Increase container memory for large file processing
- **CPU**: Scale horizontally for high throughput
- **Cache**: Tune the model cache timeout for your usage patterns
- **Network**: Use a CDN for static assets, optimize API endpoints

---

<div align="center">

## 🎉 **Ready for Production!**

**Your LLM Fusion MCP server is now deployed and ready to handle production workloads!**

*Built with ❤️ for enterprise-grade AI integration*

</div>