diff --git a/OAUTH-ARCHITECTURE.md b/OAUTH-ARCHITECTURE.md
index dad0373..cd7d888 100644
--- a/OAUTH-ARCHITECTURE.md
+++ b/OAUTH-ARCHITECTURE.md
@@ -1,17 +1,17 @@
-# OAuth Architecture for vSphere MCP Server
+# OAuth & RBAC Architecture for mcvsphere
 
-## The Problem
+## Overview
 
-We need to add authentication to the MCP server so that:
-1. Users authenticate via **OAuth 2.1 / OIDC** (using Authentik as IdP)
-2. The MCP server knows WHO is making requests (for audit logging)
-3. vCenter permissions are respected per-user
+mcvsphere supports multi-user OAuth 2.1 authentication with Role-Based Access Control (RBAC). This enables:
 
-**Challenge:** vCenter 7.0.3 doesn't support OAuth token exchange (RFC 8693), so we can't pass OAuth tokens directly to vCenter.
+1. **Single Sign-On** via any OIDC provider (Authentik, Keycloak, Auth0, etc.)
+2. **User Identity** for audit logging - know WHO made each request
+3. **Group-Based Permissions** - control what users can do based on OAuth groups
+4. **Audit Trail** - every tool invocation logged with user identity and timing
 
 ---
 
-## Architecture Overview
+## Architecture
 
 ```
 ┌─────────────────────────────────────────────────────────────────┐
@@ -22,360 +22,352 @@ We need to add authentication to the MCP server so that:
                              │    (browser opens for login)
                              │
 ┌────────────────────────────▼────────────────────────────────────┐
-│                            Authentik                            │
-│                     (Self-hosted OIDC IdP)                      │
+│                          OIDC Provider                          │
+│               (Authentik, Keycloak, Auth0, etc.)                │
 │                                                                 │
 │  - Issues JWT access tokens                                     │
 │  - Validates user credentials                                   │
-│  - Includes user identity in token (sub, email, groups)         │
+│  - Includes groups claim in token                               │
 └────────────────────────────┬────────────────────────────────────┘
                              │
                              │ 2. JWT Bearer token
                              │    Authorization: Bearer
                              │
 ┌────────────────────────────▼────────────────────────────────────┐
-│                       vSphere MCP Server                        │
+│                            mcvsphere                            │
 │                       (FastMCP + pyvmomi)                       │
 │                                                                 │
 │   ┌─────────────────────────────────────────────────────────┐   │
 │   │                   OIDCProxy (FastMCP)                   │   │
-│   │ - Validates JWT signature via Authentik JWKS            │   │
-│   │ - Extracts user identity (preferred_username)           │   │
-│   │ - Makes user available via ctx.request_context.user     │   │
+│   │ - Validates JWT signature via JWKS endpoint             │   │
+│   │ - Extracts user identity (preferred_username, email)    │   │
+│   │ - Extracts groups from token claims                     │   │
 │   └─────────────────────────────────────────────────────────┘   │
 │                                │                                │
 │                                ▼                                │
 │   ┌─────────────────────────────────────────────────────────┐   │
-│   │                    Credential Broker                    │   │
-│   │ - Maps OAuth user → vCenter credentials                 │   │
-│   │ - Caches pyvmomi connections per-user                   │   │
-│   │ - Retrieves passwords from Vault / env vars             │   │
+│   │                     RBACMiddleware                      │   │
+│   │ - Intercepts ALL tool calls via on_call_tool()          │   │
+│   │ - Maps OAuth groups → Permission levels                 │   │
+│   │ - Denies access if user lacks required permission       │   │
+│   │ - Logs audit events with user identity                  │   │
 │   └─────────────────────────────────────────────────────────┘   │
 │                                │                                │
 │                                ▼                                │
 │   ┌─────────────────────────────────────────────────────────┐   │
-│   │                      Audit Logger                       │   │
-│   │ - Logs all tool invocations with OAuth identity         │   │
-│   │ - "User ryan@example.com powered on VM web-server"      │   │
+│   │                    VMware Tools (94)                    │   │
+│   │ - Execute vCenter/ESXi operations via pyvmomi           │   │
+│   │ - Single service account connection to vCenter          │   │
 │   └─────────────────────────────────────────────────────────┘   │
 └────────────────────────────┬────────────────────────────────────┘
                              │
-                             │ 3. pyvmomi (as mapped user)
+                             │ 3. pyvmomi (service account)
                              │
 ┌────────────────────────────▼────────────────────────────────────┐
-│                          vCenter 7.0.3                          │
-│  - Receives API calls as the actual user                        │
-│  - Native audit logs show real user identity                    │
-│  - vCenter permissions apply naturally                          │
+│                         vCenter / ESXi                          │
+│  - Receives API calls as service account                        │
+│  - mcvsphere audit logs show real user identity                 │
 └─────────────────────────────────────────────────────────────────┘
 ```
 
 ---
 
-## User Mapping Strategies
+## RBAC Permission Model
 
-Since we can't exchange OAuth tokens for vCenter tokens, we need a "credential broker":
+### Permission Levels
 
-| Strategy | How it Works | Security | Use Case |
-|----------|--------------|----------|----------|
-| **Service Account** | All requests use one vCenter admin account | Medium | Simple/dev |
-| **Per-User Mapping** | Map OAuth username → vCenter credentials from Vault | High | Production |
-| **LDAP Sync** | Same username/password in Authentik and vCenter SSO | Medium | AD environments |
+mcvsphere defines 5 permission levels, from least to most privileged:
 
-### Recommended: Per-User Mapping with Fallback
+| Level | Description | Example Tools |
+|-------|-------------|---------------|
+| `READ_ONLY` | View-only operations | `list_vms`, `get_vm_info`, `vm_screenshot` |
+| `POWER_OPS` | Power and snapshot operations | `power_on`, `create_snapshot`, `reboot_guest` |
+| `VM_LIFECYCLE` | Create/delete/modify VMs | `create_vm`, `clone_vm`, `add_disk`, `deploy_ovf` |
+| `HOST_ADMIN` | ESXi host operations | `reboot_host`, `enter_maintenance_mode` |
+| `FULL_ADMIN` | Everything including guest OS ops | `run_command_in_guest`, `restart_service` |
+
+### OAuth Groups → Permissions
+
+Users are granted permissions based on their OAuth group memberships:
+
+| OAuth Group | Permissions Granted |
+|-------------|---------------------|
+| `vsphere-readers` | READ_ONLY |
+| `vsphere-operators` | READ_ONLY, POWER_OPS |
+| `vsphere-admins` | READ_ONLY, POWER_OPS, VM_LIFECYCLE |
+| `vsphere-host-admins` | READ_ONLY, POWER_OPS, VM_LIFECYCLE, HOST_ADMIN |
+| `vsphere-super-admins` | ALL (full access) |
+
+**Security Note:** Users with NO recognized groups are denied ALL access. There is no default permission.
+
+### Tool → Permission Mapping
+
+All 94 tools are mapped to permission levels in `src/mcvsphere/permissions.py`:
 
 ```python
-class CredentialBroker:
-    """Maps OAuth users to vCenter credentials."""
+# READ_ONLY - 32 tools
+"list_vms", "get_vm_info", "list_snapshots", "get_vm_stats", ...
 
-    def __init__(self, vcenter_host: str, fallback_user: str = None, fallback_password: str = None):
-        self.vcenter_host = vcenter_host
-        self.fallback_user = fallback_user # Service account fallback
-        self.fallback_password = fallback_password
-        self._connections: dict[str, ServiceInstance] = {}
+# POWER_OPS - 14 tools
+"power_on", "power_off", "create_snapshot", "revert_to_snapshot", ...
 
-    def get_connection_for_user(self, oauth_user: dict) -> ServiceInstance:
-        """Get pyvmomi connection for this OAuth user."""
-        username = oauth_user.get("preferred_username")
+# VM_LIFECYCLE - 33 tools
+"create_vm", "clone_vm", "delete_vm", "add_disk", "deploy_ovf", ...
 
-        # Try per-user credentials first
-        try:
-            vcenter_creds = self._lookup_credentials(username)
-            return self._get_or_create_connection(
-                vcenter_creds["user"],
-                vcenter_creds["password"]
-            )
-        except KeyError:
-            # Fall back to service account
-            if self.fallback_user:
-                return self._get_or_create_connection(
-                    self.fallback_user,
-                    self.fallback_password
-                )
-            raise ValueError(f"No vCenter credentials for user: {username}")
+# HOST_ADMIN - 6 tools
+"enter_maintenance_mode", "reboot_host", "shutdown_host", ...
 
-    def _lookup_credentials(self, username: str) -> dict:
-        """Look up vCenter credentials for OAuth user."""
-        # Option 1: Environment variable
-        env_key = f"VCENTER_PASSWORD_{username.upper().replace('@', '_').replace('.', '_')}"
-        if password := os.environ.get(env_key):
-            return {"user": f"{username}@vsphere.local", "password": password}
-
-        # Option 2: HashiCorp Vault (production)
-        # return vault_client.read(f"secret/vcenter/users/{username}")
-
-        raise KeyError(f"No credentials found for {username}")
+# FULL_ADMIN - 11 tools
+"run_command_in_guest", "write_guest_file", "restart_service", ...
 ```
 
 ---
 
-## FastMCP OAuth Integration
+## Implementation Details
 
-### 1. Add OIDCProxy to server.py
+### Key Files
+
+| File | Purpose |
+|------|---------|
+| `src/mcvsphere/auth.py` | OIDCProxy configuration |
+| `src/mcvsphere/permissions.py` | Permission levels and tool mappings |
+| `src/mcvsphere/middleware.py` | RBACMiddleware implementation |
+| `src/mcvsphere/audit.py` | Audit logging with user context |
+| `src/mcvsphere/server.py` | Server setup with OAuth + RBAC |
+
+### RBACMiddleware Flow
 
 ```python
-import os
-from fastmcp import FastMCP
-from fastmcp.server.auth import OIDCProxy
+class RBACMiddleware(Middleware):
+    """Intercepts all tool calls to enforce permissions."""
 
-# Configure OAuth with Authentik
-auth = OIDCProxy(
-    # Authentik OIDC Discovery URL
-    config_url=os.environ["AUTHENTIK_OIDC_URL"],
-    # e.g., "https://auth.example.com/application/o/vsphere-mcp/.well-known/openid-configuration"
+    async def on_call_tool(self, context, call_next):
+        # 1. Extract user from OAuth token
+        claims = self._extract_user_from_context(context.fastmcp_context)
+        username = claims.get("preferred_username", "unknown")
+        groups = claims.get("groups", [])
 
-    # Application credentials from Authentik
-    client_id=os.environ["AUTHENTIK_CLIENT_ID"],
-    client_secret=os.environ["AUTHENTIK_CLIENT_SECRET"],
+        # 2. Check permission
+        tool_name = context.message.name
+        if not check_permission(tool_name, groups):
+            required = get_required_permission(tool_name)
+            audit_permission_denied(tool_name, {...}, required.value)
+            raise PermissionDeniedError(username, tool_name, required)
 
-    # MCP Server base URL (for redirects)
-    base_url=os.environ.get("MCP_BASE_URL", "http://localhost:8000"),
+        # 3. Execute tool with timing
+        start = time.perf_counter()
+        result = await call_next(context)
+        duration_ms = (time.perf_counter() - start) * 1000
 
-    # Token validation
-    required_scopes=["openid", "profile", "email"],
-
-    # Allow Claude Code localhost redirects
-    allowed_client_redirect_uris=["http://localhost:*", "http://127.0.0.1:*"],
-)
-
-# Create MCP server with OAuth
-mcp = FastMCP(
-    "vSphere MCP Server",
-    auth=auth,
-    # Use Streamable HTTP transport for OAuth
-)
+        # 4. Audit log
+        audit_log(tool_name, {...}, result="success", duration_ms=duration_ms)
+        return result
 ```
 
-### 2. Access User Identity in Tools
+### Audit Log Format
 
-```python
-from fastmcp import Context
+```json
+{
+  "timestamp": "2025-12-27T08:15:32.123456+00:00",
+  "user": "ryan@example.com",
+  "groups": ["vsphere-admins", "vsphere-operators"],
+  "tool": "power_on",
+  "args": {"vm_name": "web-server"},
+  "duration_ms": 1234.56,
+  "result": "success"
+}
+```
 
-@mcp.tool()
-async def power_on_vm(ctx: Context, vm_name: str) -> str:
-    """Power on a virtual machine."""
-    # Get authenticated user from OAuth token
-    user = ctx.request_context.user
-    username = user.get("preferred_username", user.get("sub"))
-
-    # Get vCenter connection for this user
-    broker = get_credential_broker()
-    connection = broker.get_connection_for_user(user)
-
-    # Execute operation
-    content = connection.RetrieveContent()
-    vm = find_vm(content, vm_name)
-    vm.PowerOnVM_Task()
-
-    # Audit log
-    logger.info(f"User {username} powered on VM {vm_name}")
-
-    return f"VM '{vm_name}' is powering on"
+Permission denied events:
+```json
+{
+  "timestamp": "2025-12-27T08:15:32.123456+00:00",
+  "user": "guest@example.com",
+  "groups": ["vsphere-readers"],
+  "tool": "delete_vm",
+  "args": {"vm_name": "web-server"},
+  "required_permission": "vm_lifecycle",
+  "event": "PERMISSION_DENIED"
+}
 ```
 
 ---
 
-## MCP Transport: Streamable HTTP
+## Configuration
 
-OAuth requires HTTP transport (not stdio). Use Streamable HTTP:
-
-```python
-# In server.py or via environment
-mcp = FastMCP(
-    "vSphere MCP Server",
-    auth=auth,
-)
-
-def main():
-    # Run with HTTP transport for OAuth support
-    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
-```
-
-**Transport characteristics:**
-- Single HTTP endpoint (`/mcp`)
-- POST requests with JSON-RPC body
-- Server-Sent Events (SSE) for streaming responses
-- `Authorization: Bearer ` header on every request
-- `Mcp-Session-Id` header for session continuity
-
----
-
-## Environment Variables
+### Environment Variables
 
 ```bash
-# Authentik OIDC
-AUTHENTIK_OIDC_URL=https://auth.example.com/application/o/vsphere-mcp/.well-known/openid-configuration
-AUTHENTIK_CLIENT_ID=
-AUTHENTIK_CLIENT_SECRET=
+# ═══════════════════════════════════════════════════════════════
+# OAuth Configuration
+# ═══════════════════════════════════════════════════════════════
+OAUTH_ENABLED=true
+OAUTH_ISSUER_URL=https://auth.example.com/application/o/mcvsphere/
+OAUTH_CLIENT_ID=
+OAUTH_CLIENT_SECRET=
+OAUTH_BASE_URL=https://mcp.example.com # Public URL for callbacks
+OAUTH_SCOPES='["openid", "profile", "email", "groups"]'
 
-# MCP Server
-MCP_BASE_URL=https://mcp.example.com # Public URL for OAuth redirects
+# ═══════════════════════════════════════════════════════════════
+# Transport (must be HTTP for OAuth)
+# ═══════════════════════════════════════════════════════════════
 MCP_TRANSPORT=streamable-http
+MCP_HOST=0.0.0.0
+MCP_PORT=8080
 
-# vCenter Connection (service account fallback)
+# ═══════════════════════════════════════════════════════════════
+# vCenter Connection (service account)
+# ═══════════════════════════════════════════════════════════════
 VCENTER_HOST=vcenter.example.com
-VCENTER_USER=svc-mcp@vsphere.local
+VCENTER_USER=svc-mcvsphere@vsphere.local
 VCENTER_PASSWORD=
-VCENTER_INSECURE=true
+VCENTER_INSECURE=false
 
-# Per-user credentials (optional, for testing)
-VCENTER_PASSWORD_RYAN=
-VCENTER_PASSWORD_ALICE=
+# ═══════════════════════════════════════════════════════════════
+# Optional
+# ═══════════════════════════════════════════════════════════════
+LOG_LEVEL=INFO
+```
 
-# User Mapping Mode
-USER_MAPPING_MODE=service_account # or 'per_user', 'ldap_sync'
+### Server Startup Banner
+
+When OAuth is enabled, the server shows:
+```
+mcvsphere v0.2.1
+────────────────────────────────────────
+Starting HTTP transport on 0.0.0.0:8080
+OAuth: ENABLED via https://auth.example.com/application/o/mcvsphere/
+RBAC: ENABLED - permissions enforced via groups
+────────────────────────────────────────
 ```
 
 ---
 
-## Authentik Setup (Quick Reference)
+## OIDC Provider Setup
+
+### Authentik (Recommended)
 
 1. **Create OAuth2/OIDC Provider:**
-   - Name: `vsphere-mcp`
-   - Client Type: Confidential
+   - Name: `mcvsphere`
+   - Client Type: **Confidential**
    - Redirect URIs:
-     - `http://localhost:*/callback`
+     - `http://localhost:*/callback` (for local dev)
      - `https://mcp.example.com/auth/callback`
-   - Scopes: `openid`, `profile`, `email`
-   - Signing Key: Select or create RS256 key
+   - Signing Key: Select RS256 certificate
 
 2. **Create Application:**
-   - Name: `vSphere MCP Server`
-   - Slug: `vsphere-mcp`
-   - Provider: Select the provider above
-   - Note the **Client ID** and **Client Secret**
+   - Name: `mcvsphere`
+   - Slug: `mcvsphere`
+   - Provider: Select provider from step 1
 
-3. **Configure Groups (optional):**
-   - `vsphere-admins` - Full access
-   - `vsphere-operators` - Limited access
-   - Groups are included in JWT `groups` claim
+3. **Create Groups:**
+   - `vsphere-readers`
+   - `vsphere-operators`
+   - `vsphere-admins`
+   - `vsphere-host-admins`
+   - `vsphere-super-admins`
+
+4. **Add Scope Mapping for Groups:**
+   - Ensure `groups` claim is included in tokens
+   - Authentik includes this by default
+
+5. **Note Credentials:**
+   - Copy Client ID and Client Secret
+   - Discovery URL: `https://auth.example.com/application/o/mcvsphere/.well-known/openid-configuration`
+
+### Other Providers
+
+The same pattern works with Keycloak, Auth0, Okta, etc. Key requirements:
+- OIDC Discovery endpoint (`.well-known/openid-configuration`)
+- JWT access tokens (not opaque)
+- `groups` claim in tokens with group names
 
 ---
 
-## OAuth Flow (End-to-End)
+## OAuth Flow
 
 ```
-1. Claude Code connects to MCP Server
-   → GET /mcp
-   → Server returns 401 Unauthorized
-   → WWW-Authenticate header includes OAuth metadata URL
+1. Client connects to mcvsphere
+   → POST /mcp (no auth)
+   → Server returns 401 + OAuth metadata URL
 
-2. Claude Code fetches OAuth metadata
-   → Discovers Authentik authorization URL
-   → Discovers required scopes
+2. Client initiates OAuth flow
+   → Opens browser to OIDC provider
+   → User logs in
+   → Provider redirects with authorization code
 
-3. Claude Code initiates OAuth flow
-   → Opens browser to Authentik login page
-   → User enters credentials
-   → Authentik redirects back with authorization code
+3. Client exchanges code for tokens
+   → POST to provider token endpoint
+   → Receives JWT access token
 
-4. Claude Code exchanges code for tokens
-   → POST to Authentik token endpoint
-   → Receives JWT access token + refresh token
-
-5. Claude Code reconnects with Bearer token
+4. Client reconnects with token
    → POST /mcp with Authorization: Bearer
-   → Server validates JWT via Authentik JWKS
-   → Server extracts user identity
-   → User can now invoke tools
+   → Server validates JWT via JWKS
+   → Server extracts user + groups
+   → RBACMiddleware checks permissions
+   → User can invoke allowed tools
 
-6. Tool invocation
-   → Client: "power on web-server VM"
-   → Server: Validates token, maps user to vCenter creds
-   → Server: Executes pyvmomi call
-   → Server: Logs "User ryan@example.com powered on web-server"
-   → Client: Receives success response
+5. Tool invocation
+   → Client: "power on web-server"
+   → Middleware: Validate user has POWER_OPS
+   → Tool: Execute pyvmomi call
+   → Audit: Log with user identity
+   → Client: Receive response
 ```
 
 ---
 
-## Implementation Checklist
+## Implementation Status
 
-### Phase 1: Prepare Server
-- [ ] Add `fastmcp[auth]` to dependencies
-- [ ] Create `auth.py` with OIDCProxy configuration
-- [ ] Create `credential_broker.py` for user mapping
-- [ ] Add audit logging to all tools
-- [ ] Update server.py to use HTTP transport
+### Completed
 
-### Phase 2: Deploy Authentik
-- [ ] Docker Compose for Authentik
-- [ ] Create OIDC provider and application
-- [ ] Configure redirect URIs
-- [ ] Note client credentials
-- [ ] Test OIDC flow manually with curl
+- [x] OIDCProxy configuration (`auth.py`)
+- [x] Permission levels and tool mappings (`permissions.py`)
+- [x] RBACMiddleware with FastMCP integration (`middleware.py`)
+- [x] Audit logging with user context (`audit.py`)
+- [x] Server integration with OAuth + RBAC (`server.py`)
+- [x] Startup banner showing OAuth/RBAC status
+- [x] Security fix: deny-by-default for no groups
+- [x] Authentik setup with 5 vsphere-* groups
 
-### Phase 3: Integration
-- [ ] Configure environment variables
-- [ ] Test with `fastmcp dev` (OAuth mode)
-- [ ] Test with Claude Code (`claude mcp add --auth oauth`)
-- [ ] Verify audit logs show correct user identity
+### Future Enhancements
 
-### Phase 4: Production
-- [ ] HTTPS via Caddy reverse proxy
-- [ ] Secrets in Docker secrets / Vault
-- [ ] Service account with minimal vCenter permissions
-- [ ] Log aggregation and monitoring
+- [ ] Per-user vCenter credential mapping (Vault integration)
+- [ ] Rate limiting per user
+- [ ] Session management and token refresh
+- [ ] Admin tools for permission management
+- [ ] Prometheus metrics for RBAC decisions
 
 ---
 
-## Files to Create/Modify
+## Security Considerations
 
-```
-esxi-mcp-server/
-├── src/esxi_mcp_server/
-│   ├── auth.py                  # NEW: OIDCProxy configuration
-│   ├── credential_broker.py     # NEW: OAuth → vCenter credential mapping
-│   ├── server.py                # MODIFY: Add auth, HTTP transport
-│   └── mixins/
-│       └── *.py                 # MODIFY: Add ctx.request_context.user logging
-├── .env.example                 # MODIFY: Add OAuth variables
-└── docker-compose.yml           # MODIFY: Add Authentik services
-```
+1. **Default Deny**: Users without recognized groups get NO access
+2. **Token Validation**: JWTs validated via OIDC provider's JWKS endpoint
+3. **Audit Trail**: All operations logged with user identity
+4. **Secrets**: Client secrets should be stored securely (env vars, Docker secrets, Vault)
+5. **HTTPS**: Production deployments should use TLS (via Caddy, nginx, etc.)
+6. **Service Account**: Use minimal vCenter permissions for the service account
 
 ---
 
-## Key Insight
+## Troubleshooting
 
-The "middleman" role of the MCP server is critical:
+### "401 Unauthorized" on all requests
+- Check `OAUTH_ISSUER_URL` points to valid OIDC discovery endpoint
+- Verify client ID and secret match provider configuration
+- Ensure token hasn't expired
 
-```
-OAuth Token (Authentik) ──┐
-                          │
-                          ▼
-                   ┌─────────────┐
-                   │ MCP Server  │ ← Validates OAuth, maps to vCenter creds
-                   │ (Middleman) │ ← Logs audit trail with OAuth identity
-                   └─────────────┘
-                          │
-                          ▼
-                vCenter API (pyvmomi)
-```
+### "Permission denied" errors
+- Check user's group memberships in OIDC provider
+- Verify groups claim is included in JWT (decode at jwt.io)
+- Confirm group names match exactly (e.g., `vsphere-admins` not `vsphere_admins`)
 
-The MCP server doesn't pass OAuth tokens to vCenter. Instead, it:
-1. **Authenticates** users via OAuth (trusts Authentik)
-2. **Authorizes** by mapping OAuth identity to vCenter credentials
-3. **Audits** by logging all actions with the OAuth user identity
-4. **Executes** vCenter API calls using mapped credentials
+### Token validation fails
+- Ensure OIDC provider issues JWTs (not opaque tokens)
+- Check signing key is configured in provider
+- Verify `OAUTH_BASE_URL` matches redirect URI in provider
 
-This gives you SSO-like experience while working within vCenter 7.0.3's authentication limitations.
+### Audit logs not showing user
+- Check `groups` scope is requested
+- Verify token contains `preferred_username` or `email` claim
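
Several of the troubleshooting checks above come down to looking at the claims inside the access token (`groups`, `preferred_username`, `email`, expiry). As an alternative to pasting a token into a website, the payload can be inspected locally. The following is a minimal sketch using only the Python standard library; the helper names `decode_jwt_claims` and `summarize_claims` are illustrative and are not part of mcvsphere. It does not verify the signature, so it is suitable for debugging only, never for access decisions.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature.

    Hypothetical debugging helper: unverified claims must never be trusted
    for authorization.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def summarize_claims(claims: dict) -> dict:
    """Pick out the claims that the troubleshooting checklist asks about."""
    return {
        "user": claims.get("preferred_username") or claims.get("email"),
        "groups": claims.get("groups", []),
        "exp": claims.get("exp"),
    }
```

Calling `summarize_claims(decode_jwt_claims(token))` on a captured access token shows at a glance whether the `groups` claim is present and whether the group names match the `vsphere-*` names exactly. If `decode_jwt_claims` raises `ValueError`, the provider is probably issuing opaque tokens rather than JWTs.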