docs: update OAuth/RBAC architecture documentation

Rewrites the architecture doc from design proposal to implementation
reference. Documents the complete RBAC system including:

- 5 permission levels (READ_ONLY → FULL_ADMIN)
- 5 OAuth groups with permission mappings
- RBACMiddleware implementation details
- Audit log format with user identity
- Configuration environment variables
- OIDC provider setup (Authentik example)
- Troubleshooting guide for common issues

Updates implementation checklist to reflect completed status.
Ryan Malloy 2025-12-27 08:22:02 -07:00
parent 00857b1840
commit ab83c70c31


@@ -1,17 +1,17 @@
# OAuth Architecture for vSphere MCP Server
# OAuth & RBAC Architecture for mcvsphere
## The Problem
## Overview
We need to add authentication to the MCP server so that:
1. Users authenticate via **OAuth 2.1 / OIDC** (using Authentik as IdP)
2. The MCP server knows WHO is making requests (for audit logging)
3. vCenter permissions are respected per-user
mcvsphere supports multi-user OAuth 2.1 authentication with Role-Based Access Control (RBAC). This enables:
**Challenge:** vCenter 7.0.3 doesn't support OAuth token exchange (RFC 8693), so we can't pass OAuth tokens directly to vCenter.
1. **Single Sign-On** via any OIDC provider (Authentik, Keycloak, Auth0, etc.)
2. **User Identity** for audit logging - know WHO made each request
3. **Group-Based Permissions** - control what users can do based on OAuth groups
4. **Audit Trail** - every tool invocation logged with user identity and timing
---
## Architecture Overview
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
@@ -22,360 +22,352 @@ We need to add authentication to the MCP server so that:
│ (browser opens for login)
┌────────────────────────────▼────────────────────────────────────┐
Authentik
(Self-hosted OIDC IdP)
OIDC Provider
(Authentik, Keycloak, Auth0, etc.)
│ │
│ - Issues JWT access tokens │
│ - Validates user credentials │
│ - Includes user identity in token (sub, email, groups)
│ - Includes groups claim in token
└────────────────────────────┬────────────────────────────────────┘
│ 2. JWT Bearer token
│ Authorization: Bearer <jwt>
┌────────────────────────────▼────────────────────────────────────┐
vSphere MCP Server
mcvsphere
│ (FastMCP + pyvmomi) │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ OIDCProxy (FastMCP) │ │
│ │ - Validates JWT signature via Authentik JWKS │ │
│ │ - Extracts user identity (preferred_username) │ │
│ │ - Makes user available via ctx.request_context.user │ │
│ │ - Validates JWT signature via JWKS endpoint │ │
│ │ - Extracts user identity (preferred_username, email) │ │
│ │ - Extracts groups from token claims │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Credential Broker │ │
│ │ - Maps OAuth user → vCenter credentials │ │
│ │ - Caches pyvmomi connections per-user │ │
│ │ - Retrieves passwords from Vault / env vars │ │
│ │ RBACMiddleware │ │
│ │ - Intercepts ALL tool calls via on_call_tool() │ │
│ │ - Maps OAuth groups → Permission levels │ │
│ │ - Denies access if user lacks required permission │ │
│ │ - Logs audit events with user identity │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Audit Logger │ │
│ │ - Logs all tool invocations with OAuth identity │ │
│ │ - "User ryan@example.com powered on VM web-server" │ │
│ │ VMware Tools (94) │ │
│ │ - Execute vCenter/ESXi operations via pyvmomi │ │
│ │ - Single service account connection to vCenter │ │
│ └─────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│ 3. pyvmomi (as mapped user)
│ 3. pyvmomi (service account)
┌────────────────────────────▼────────────────────────────────────┐
│ vCenter 7.0.3 │
│ - Receives API calls as the actual user │
│ - Native audit logs show real user identity │
│ - vCenter permissions apply naturally │
│ vCenter / ESXi │
│ - Receives API calls as service account │
│ - mcvsphere audit logs show real user identity │
└─────────────────────────────────────────────────────────────────┘
```
---
## User Mapping Strategies
## RBAC Permission Model
Since we can't exchange OAuth tokens for vCenter tokens, we need a "credential broker":
### Permission Levels
| Strategy | How it Works | Security | Use Case |
|----------|--------------|----------|----------|
| **Service Account** | All requests use one vCenter admin account | Medium | Simple/dev |
| **Per-User Mapping** | Map OAuth username → vCenter credentials from Vault | High | Production |
| **LDAP Sync** | Same username/password in Authentik and vCenter SSO | Medium | AD environments |
mcvsphere defines 5 permission levels, from least to most privileged:
### Recommended: Per-User Mapping with Fallback
| Level | Description | Example Tools |
|-------|-------------|---------------|
| `READ_ONLY` | View-only operations | `list_vms`, `get_vm_info`, `vm_screenshot` |
| `POWER_OPS` | Power and snapshot operations | `power_on`, `create_snapshot`, `reboot_guest` |
| `VM_LIFECYCLE` | Create/delete/modify VMs | `create_vm`, `clone_vm`, `add_disk`, `deploy_ovf` |
| `HOST_ADMIN` | ESXi host operations | `reboot_host`, `enter_maintenance_mode` |
| `FULL_ADMIN` | Everything including guest OS ops | `run_command_in_guest`, `restart_service` |
### OAuth Groups → Permissions
Users are granted permissions based on their OAuth group memberships:
| OAuth Group | Permissions Granted |
|-------------|---------------------|
| `vsphere-readers` | READ_ONLY |
| `vsphere-operators` | READ_ONLY, POWER_OPS |
| `vsphere-admins` | READ_ONLY, POWER_OPS, VM_LIFECYCLE |
| `vsphere-host-admins` | READ_ONLY, POWER_OPS, VM_LIFECYCLE, HOST_ADMIN |
| `vsphere-super-admins` | ALL (full access) |
**Security Note:** Users with NO recognized groups are denied ALL access. There is no default permission.
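The group table and the deny-by-default rule can be sketched as a lookup that unions the permissions of every recognized group. This is a minimal illustration, not the contents of `permissions.py`: the `Permission` enum, the `GROUP_PERMISSIONS` table, and `permissions_for_groups` are hypothetical names, and only the `vm_lifecycle` value is confirmed by the audit log example in this document.

```python
from enum import Enum


class Permission(Enum):
    # Enum values are assumed; only "vm_lifecycle" appears in the doc's audit log.
    READ_ONLY = "read_only"
    POWER_OPS = "power_ops"
    VM_LIFECYCLE = "vm_lifecycle"
    HOST_ADMIN = "host_admin"
    FULL_ADMIN = "full_admin"


# Hypothetical table mirroring the "OAuth Groups -> Permissions" section.
GROUP_PERMISSIONS: dict[str, frozenset[Permission]] = {
    "vsphere-readers": frozenset({Permission.READ_ONLY}),
    "vsphere-operators": frozenset({Permission.READ_ONLY, Permission.POWER_OPS}),
    "vsphere-admins": frozenset(
        {Permission.READ_ONLY, Permission.POWER_OPS, Permission.VM_LIFECYCLE}
    ),
    "vsphere-host-admins": frozenset(
        {
            Permission.READ_ONLY,
            Permission.POWER_OPS,
            Permission.VM_LIFECYCLE,
            Permission.HOST_ADMIN,
        }
    ),
    "vsphere-super-admins": frozenset(Permission),  # every level
}


def permissions_for_groups(groups: list[str]) -> set[Permission]:
    """Union the permissions of all recognized groups; unknown groups add nothing."""
    granted: set[Permission] = set()
    for group in groups:
        granted |= GROUP_PERMISSIONS.get(group, frozenset())
    return granted
```

A user whose token carries no recognized group ends up with an empty set, which is how the default-deny behavior falls out of the lookup without a special case.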
### Tool → Permission Mapping
All 94 tools are mapped to permission levels in `src/mcvsphere/permissions.py`:
```python
class CredentialBroker:
    """Maps OAuth users to vCenter credentials."""
# READ_ONLY - 32 tools
"list_vms", "get_vm_info", "list_snapshots", "get_vm_stats", ...
    def __init__(self, vcenter_host: str, fallback_user: str = None, fallback_password: str = None):
        self.vcenter_host = vcenter_host
        self.fallback_user = fallback_user  # Service account fallback
        self.fallback_password = fallback_password
        self._connections: dict[str, ServiceInstance] = {}
# POWER_OPS - 14 tools
"power_on", "power_off", "create_snapshot", "revert_to_snapshot", ...
    def get_connection_for_user(self, oauth_user: dict) -> ServiceInstance:
        """Get pyvmomi connection for this OAuth user."""
        username = oauth_user.get("preferred_username")
# VM_LIFECYCLE - 33 tools
"create_vm", "clone_vm", "delete_vm", "add_disk", "deploy_ovf", ...
        # Try per-user credentials first
        try:
            vcenter_creds = self._lookup_credentials(username)
            return self._get_or_create_connection(
                vcenter_creds["user"],
                vcenter_creds["password"]
            )
        except KeyError:
            # Fall back to service account
            if self.fallback_user:
                return self._get_or_create_connection(
                    self.fallback_user,
                    self.fallback_password
                )
            raise ValueError(f"No vCenter credentials for user: {username}")
# HOST_ADMIN - 6 tools
"enter_maintenance_mode", "reboot_host", "shutdown_host", ...
    def _lookup_credentials(self, username: str) -> dict:
        """Look up vCenter credentials for OAuth user."""
        # Option 1: Environment variable
        env_key = f"VCENTER_PASSWORD_{username.upper().replace('@', '_').replace('.', '_')}"
        if password := os.environ.get(env_key):
            return {"user": f"{username}@vsphere.local", "password": password}
        # Option 2: HashiCorp Vault (production)
        # return vault_client.read(f"secret/vcenter/users/{username}")
        raise KeyError(f"No credentials found for {username}")
# FULL_ADMIN - 11 tools
"run_command_in_guest", "write_guest_file", "restart_service", ...
```
---
## FastMCP OAuth Integration
## Implementation Details
### 1. Add OIDCProxy to server.py
### Key Files
| File | Purpose |
|------|---------|
| `src/mcvsphere/auth.py` | OIDCProxy configuration |
| `src/mcvsphere/permissions.py` | Permission levels and tool mappings |
| `src/mcvsphere/middleware.py` | RBACMiddleware implementation |
| `src/mcvsphere/audit.py` | Audit logging with user context |
| `src/mcvsphere/server.py` | Server setup with OAuth + RBAC |
### RBACMiddleware Flow
```python
import os
from fastmcp import FastMCP
from fastmcp.server.auth import OIDCProxy
class RBACMiddleware(Middleware):
    """Intercepts all tool calls to enforce permissions."""
# Configure OAuth with Authentik
auth = OIDCProxy(
    # Authentik OIDC Discovery URL
    config_url=os.environ["AUTHENTIK_OIDC_URL"],
    # e.g., "https://auth.example.com/application/o/vsphere-mcp/.well-known/openid-configuration"
    async def on_call_tool(self, context, call_next):
        # 1. Extract user from OAuth token
        claims = self._extract_user_from_context(context.fastmcp_context)
        username = claims.get("preferred_username", "unknown")
        groups = claims.get("groups", [])
    # Application credentials from Authentik
    client_id=os.environ["AUTHENTIK_CLIENT_ID"],
    client_secret=os.environ["AUTHENTIK_CLIENT_SECRET"],
        # 2. Check permission
        tool_name = context.message.name
        if not check_permission(tool_name, groups):
            required = get_required_permission(tool_name)
            audit_permission_denied(tool_name, {...}, required.value)
            raise PermissionDeniedError(username, tool_name, required)
    # MCP Server base URL (for redirects)
    base_url=os.environ.get("MCP_BASE_URL", "http://localhost:8000"),
        # 3. Execute tool with timing
        start = time.perf_counter()
        result = await call_next(context)
        duration_ms = (time.perf_counter() - start) * 1000
    # Token validation
    required_scopes=["openid", "profile", "email"],
    # Allow Claude Code localhost redirects
    allowed_client_redirect_uris=["http://localhost:*", "http://127.0.0.1:*"],
)
# Create MCP server with OAuth
mcp = FastMCP(
    "vSphere MCP Server",
    auth=auth,
    # Use Streamable HTTP transport for OAuth
)
        # 4. Audit log
        audit_log(tool_name, {...}, result="success", duration_ms=duration_ms)
        return result
```
### 2. Access User Identity in Tools
### Audit Log Format
```python
from fastmcp import Context
```json
{
"timestamp": "2025-12-27T08:15:32.123456+00:00",
"user": "ryan@example.com",
"groups": ["vsphere-admins", "vsphere-operators"],
"tool": "power_on",
"args": {"vm_name": "web-server"},
"duration_ms": 1234.56,
"result": "success"
}
```
@mcp.tool()
async def power_on_vm(ctx: Context, vm_name: str) -> str:
    """Power on a virtual machine."""
    # Get authenticated user from OAuth token
    user = ctx.request_context.user
    username = user.get("preferred_username", user.get("sub"))
    # Get vCenter connection for this user
    broker = get_credential_broker()
    connection = broker.get_connection_for_user(user)
    # Execute operation
    content = connection.RetrieveContent()
    vm = find_vm(content, vm_name)
    vm.PowerOnVM_Task()
    # Audit log
    logger.info(f"User {username} powered on VM {vm_name}")
    return f"VM '{vm_name}' is powering on"
Permission denied events:
```json
{
"timestamp": "2025-12-27T08:15:32.123456+00:00",
"user": "guest@example.com",
"groups": ["vsphere-readers"],
"tool": "delete_vm",
"args": {"vm_name": "web-server"},
"required_permission": "vm_lifecycle",
"event": "PERMISSION_DENIED"
}
```
---
## MCP Transport: Streamable HTTP
## Configuration
OAuth requires HTTP transport (not stdio). Use Streamable HTTP:
```python
# In server.py or via environment
mcp = FastMCP(
    "vSphere MCP Server",
    auth=auth,
)
def main():
    # Run with HTTP transport for OAuth support
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
```
**Transport characteristics:**
- Single HTTP endpoint (`/mcp`)
- POST requests with JSON-RPC body
- Server-Sent Events (SSE) for streaming responses
- `Authorization: Bearer <token>` header on every request
- `Mcp-Session-Id` header for session continuity
---
## Environment Variables
### Environment Variables
```bash
# Authentik OIDC
AUTHENTIK_OIDC_URL=https://auth.example.com/application/o/vsphere-mcp/.well-known/openid-configuration
AUTHENTIK_CLIENT_ID=<from-authentik-application>
AUTHENTIK_CLIENT_SECRET=<from-authentik-application>
# ═══════════════════════════════════════════════════════════════
# OAuth Configuration
# ═══════════════════════════════════════════════════════════════
OAUTH_ENABLED=true
OAUTH_ISSUER_URL=https://auth.example.com/application/o/mcvsphere/
OAUTH_CLIENT_ID=<from-oidc-provider>
OAUTH_CLIENT_SECRET=<from-oidc-provider>
OAUTH_BASE_URL=https://mcp.example.com # Public URL for callbacks
OAUTH_SCOPES='["openid", "profile", "email", "groups"]'
# MCP Server
MCP_BASE_URL=https://mcp.example.com # Public URL for OAuth redirects
# ═══════════════════════════════════════════════════════════════
# Transport (must be HTTP for OAuth)
# ═══════════════════════════════════════════════════════════════
MCP_TRANSPORT=streamable-http
MCP_HOST=0.0.0.0
MCP_PORT=8080
# vCenter Connection (service account fallback)
# ═══════════════════════════════════════════════════════════════
# vCenter Connection (service account)
# ═══════════════════════════════════════════════════════════════
VCENTER_HOST=vcenter.example.com
VCENTER_USER=svc-mcp@vsphere.local
VCENTER_USER=svc-mcvsphere@vsphere.local
VCENTER_PASSWORD=<service-account-password>
VCENTER_INSECURE=true
VCENTER_INSECURE=false
# Per-user credentials (optional, for testing)
VCENTER_PASSWORD_RYAN=<ryan's-vcenter-password>
VCENTER_PASSWORD_ALICE=<alice's-vcenter-password>
# ═══════════════════════════════════════════════════════════════
# Optional
# ═══════════════════════════════════════════════════════════════
LOG_LEVEL=INFO
```
# User Mapping Mode
USER_MAPPING_MODE=service_account # or 'per_user', 'ldap_sync'
### Server Startup Banner
When OAuth is enabled, the server shows:
```
mcvsphere v0.2.1
────────────────────────────────────────
Starting HTTP transport on 0.0.0.0:8080
OAuth: ENABLED via https://auth.example.com/application/o/mcvsphere/
RBAC: ENABLED - permissions enforced via groups
────────────────────────────────────────
```
---
## Authentik Setup (Quick Reference)
## OIDC Provider Setup
### Authentik (Recommended)
1. **Create OAuth2/OIDC Provider:**
- Name: `vsphere-mcp`
- Client Type: Confidential
- Name: `mcvsphere`
- Client Type: **Confidential**
- Redirect URIs:
- `http://localhost:*/callback`
- `http://localhost:*/callback` (for local dev)
- `https://mcp.example.com/auth/callback`
- Scopes: `openid`, `profile`, `email`
- Signing Key: Select or create RS256 key
- Signing Key: Select RS256 certificate
2. **Create Application:**
- Name: `vSphere MCP Server`
- Slug: `vsphere-mcp`
- Provider: Select the provider above
- Note the **Client ID** and **Client Secret**
- Name: `mcvsphere`
- Slug: `mcvsphere`
- Provider: Select provider from step 1
3. **Configure Groups (optional):**
- `vsphere-admins` - Full access
- `vsphere-operators` - Limited access
- Groups are included in JWT `groups` claim
3. **Create Groups:**
- `vsphere-readers`
- `vsphere-operators`
- `vsphere-admins`
- `vsphere-host-admins`
- `vsphere-super-admins`
4. **Add Scope Mapping for Groups:**
- Ensure `groups` claim is included in tokens
- Authentik includes this by default
5. **Note Credentials:**
- Copy Client ID and Client Secret
- Discovery URL: `https://auth.example.com/application/o/mcvsphere/.well-known/openid-configuration`
### Other Providers
The same pattern works with Keycloak, Auth0, Okta, etc. Key requirements:
- OIDC Discovery endpoint (`.well-known/openid-configuration`)
- JWT access tokens (not opaque)
- `groups` claim in tokens with group names
---
## OAuth Flow (End-to-End)
## OAuth Flow
```
1. Claude Code connects to MCP Server
→ GET /mcp
→ Server returns 401 Unauthorized
→ WWW-Authenticate header includes OAuth metadata URL
1. Client connects to mcvsphere
→ POST /mcp (no auth)
→ Server returns 401 + OAuth metadata URL
2. Claude Code fetches OAuth metadata
→ Discovers Authentik authorization URL
→ Discovers required scopes
2. Client initiates OAuth flow
→ Opens browser to OIDC provider
→ User logs in
→ Provider redirects with authorization code
3. Claude Code initiates OAuth flow
→ Opens browser to Authentik login page
→ User enters credentials
→ Authentik redirects back with authorization code
3. Client exchanges code for tokens
→ POST to provider token endpoint
→ Receives JWT access token
4. Claude Code exchanges code for tokens
→ POST to Authentik token endpoint
→ Receives JWT access token + refresh token
5. Claude Code reconnects with Bearer token
4. Client reconnects with token
→ POST /mcp with Authorization: Bearer <jwt>
→ Server validates JWT via Authentik JWKS
→ Server extracts user identity
→ User can now invoke tools
→ Server validates JWT via JWKS
→ Server extracts user + groups
→ RBACMiddleware checks permissions
→ User can invoke allowed tools
6. Tool invocation
→ Client: "power on web-server VM"
Server: Validates token, maps user to vCenter creds
Server: Executes pyvmomi call
Server: Logs "User ryan@example.com powered on web-server"
→ Client: Receives success response
5. Tool invocation
→ Client: "power on web-server"
Middleware: Validate user has POWER_OPS
Tool: Execute pyvmomi call
Audit: Log with user identity
→ Client: Receive response
```
---
## Implementation Checklist
## Implementation Status
### Phase 1: Prepare Server
- [ ] Add `fastmcp[auth]` to dependencies
- [ ] Create `auth.py` with OIDCProxy configuration
- [ ] Create `credential_broker.py` for user mapping
- [ ] Add audit logging to all tools
- [ ] Update server.py to use HTTP transport
### Completed
### Phase 2: Deploy Authentik
- [ ] Docker Compose for Authentik
- [ ] Create OIDC provider and application
- [ ] Configure redirect URIs
- [ ] Note client credentials
- [ ] Test OIDC flow manually with curl
- [x] OIDCProxy configuration (`auth.py`)
- [x] Permission levels and tool mappings (`permissions.py`)
- [x] RBACMiddleware with FastMCP integration (`middleware.py`)
- [x] Audit logging with user context (`audit.py`)
- [x] Server integration with OAuth + RBAC (`server.py`)
- [x] Startup banner showing OAuth/RBAC status
- [x] Security fix: deny-by-default for no groups
- [x] Authentik setup with 5 vsphere-* groups
### Phase 3: Integration
- [ ] Configure environment variables
- [ ] Test with `fastmcp dev` (OAuth mode)
- [ ] Test with Claude Code (`claude mcp add --auth oauth`)
- [ ] Verify audit logs show correct user identity
### Future Enhancements
### Phase 4: Production
- [ ] HTTPS via Caddy reverse proxy
- [ ] Secrets in Docker secrets / Vault
- [ ] Service account with minimal vCenter permissions
- [ ] Log aggregation and monitoring
- [ ] Per-user vCenter credential mapping (Vault integration)
- [ ] Rate limiting per user
- [ ] Session management and token refresh
- [ ] Admin tools for permission management
- [ ] Prometheus metrics for RBAC decisions
---
## Files to Create/Modify
## Security Considerations
```
esxi-mcp-server/
├── src/esxi_mcp_server/
│ ├── auth.py # NEW: OIDCProxy configuration
│ ├── credential_broker.py # NEW: OAuth → vCenter credential mapping
│ ├── server.py # MODIFY: Add auth, HTTP transport
│ └── mixins/
│ └── *.py # MODIFY: Add ctx.request_context.user logging
├── .env.example # MODIFY: Add OAuth variables
└── docker-compose.yml # MODIFY: Add Authentik services
```
1. **Default Deny**: Users without recognized groups get NO access
2. **Token Validation**: JWTs validated via OIDC provider's JWKS endpoint
3. **Audit Trail**: All operations logged with user identity
4. **Secrets**: Client secrets should be stored securely (env vars, Docker secrets, Vault)
5. **HTTPS**: Production deployments should use TLS (via Caddy, nginx, etc.)
6. **Service Account**: Use minimal vCenter permissions for the service account
---
## Key Insight
## Troubleshooting
The "middleman" role of the MCP server is critical:
### "401 Unauthorized" on all requests
- Check that `OAUTH_ISSUER_URL` is the provider's issuer URL and that its `.well-known/openid-configuration` discovery document is reachable
- Verify client ID and secret match provider configuration
- Ensure token hasn't expired
```
OAuth Token (Authentik) ──┐
┌─────────────┐
│ MCP Server │ ← Validates OAuth, maps to vCenter creds
│ (Middleman) │ ← Logs audit trail with OAuth identity
└─────────────┘
vCenter API (pyvmomi)
```
### "Permission denied" errors
- Check user's group memberships in OIDC provider
- Verify groups claim is included in JWT (decode at jwt.io)
- Confirm group names match exactly (e.g., `vsphere-admins` not `vsphere_admins`)
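To inspect the `groups` claim locally instead of pasting a token into a website, the JWT payload can be decoded with the standard library. This is a debugging sketch only: `decode_jwt_claims` is a hypothetical helper, and it does not verify the signature, so its output must never be used for authorization decisions.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with header.payload.signature")
    payload = parts[1]
    # JWTs use unpadded base64url; restore padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Calling `decode_jwt_claims(token).get("groups")` shows the group names exactly as the provider issued them, which makes mismatches such as `vsphere_admins` versus `vsphere-admins` easy to spot. An opaque (non-JWT) access token fails the three-part check, which also helps diagnose the "Token validation fails" case.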
The MCP server doesn't pass OAuth tokens to vCenter. Instead, it:
1. **Authenticates** users via OAuth (trusts Authentik)
2. **Authorizes** by mapping OAuth identity to vCenter credentials
3. **Audits** by logging all actions with the OAuth user identity
4. **Executes** vCenter API calls using mapped credentials
### Token validation fails
- Ensure OIDC provider issues JWTs (not opaque tokens)
- Check signing key is configured in provider
- Verify `OAUTH_BASE_URL` matches redirect URI in provider
This gives you SSO-like experience while working within vCenter 7.0.3's authentication limitations.
### Audit logs not showing user
- Check `groups` scope is requested
- Verify token contains `preferred_username` or `email` claim