docs: update OAuth/RBAC architecture documentation

Rewrites the architecture doc from a design proposal into an implementation
reference. Documents the complete RBAC system, including:

- 5 permission levels (READ_ONLY → FULL_ADMIN)
- 5 OAuth groups with permission mappings
- RBACMiddleware implementation details
- Audit log format with user identity
- Configuration environment variables
- OIDC provider setup (Authentik example)
- Troubleshooting guide for common issues

Updates implementation checklist to reflect completed status.
Ryan Malloy 2025-12-27 08:22:02 -07:00
parent 00857b1840
commit ab83c70c31


@@ -1,17 +1,17 @@
# OAuth Architecture for vSphere MCP Server # OAuth & RBAC Architecture for mcvsphere
## The Problem ## Overview
We need to add authentication to the MCP server so that: mcvsphere supports multi-user OAuth 2.1 authentication with Role-Based Access Control (RBAC). This enables:
1. Users authenticate via **OAuth 2.1 / OIDC** (using Authentik as IdP)
2. The MCP server knows WHO is making requests (for audit logging)
3. vCenter permissions are respected per-user
**Challenge:** vCenter 7.0.3 doesn't support OAuth token exchange (RFC 8693), so we can't pass OAuth tokens directly to vCenter. 1. **Single Sign-On** via any OIDC provider (Authentik, Keycloak, Auth0, etc.)
2. **User Identity** for audit logging - know WHO made each request
3. **Group-Based Permissions** - control what users can do based on OAuth groups
4. **Audit Trail** - every tool invocation logged with user identity and timing
--- ---
## Architecture Overview ## Architecture
``` ```
┌─────────────────────────────────────────────────────────────────┐ ┌─────────────────────────────────────────────────────────────────┐
@@ -22,360 +22,352 @@ We need to add authentication to the MCP server so that:
│ (browser opens for login) │ (browser opens for login)
┌────────────────────────────▼────────────────────────────────────┐ ┌────────────────────────────▼────────────────────────────────────┐
│ Authentik │ │ OIDC Provider │
│ (Self-hosted OIDC IdP) │ │ (Authentik, Keycloak, Auth0, etc.) │
│ │ │ │
│ - Issues JWT access tokens │ │ - Issues JWT access tokens │
│ - Validates user credentials │ │ - Validates user credentials │
│ - Includes user identity in token (sub, email, groups) │ │ - Includes groups claim in token │
└────────────────────────────┬────────────────────────────────────┘ └────────────────────────────┬────────────────────────────────────┘
│ 2. JWT Bearer token │ 2. JWT Bearer token
│ Authorization: Bearer <jwt> │ Authorization: Bearer <jwt>
┌────────────────────────────▼────────────────────────────────────┐ ┌────────────────────────────▼────────────────────────────────────┐
│ vSphere MCP Server │ │ mcvsphere │
│ (FastMCP + pyvmomi) │ │ (FastMCP + pyvmomi) │
│ │ │ │
│ ┌─────────────────────────────────────────────────────────┐ │ │ ┌─────────────────────────────────────────────────────────┐ │
│ │ OIDCProxy (FastMCP) │ │ │ │ OIDCProxy (FastMCP) │ │
│ │ - Validates JWT signature via Authentik JWKS │ │ │ │ - Validates JWT signature via JWKS endpoint │ │
│ │ - Extracts user identity (preferred_username) │ │ │ │ - Extracts user identity (preferred_username, email) │ │
│ │ - Makes user available via ctx.request_context.user │ │ │ │ - Extracts groups from token claims │ │
│ └─────────────────────────────────────────────────────────┘ │ │ └─────────────────────────────────────────────────────────┘ │
│ │ │ │ │ │
│ ▼ │ │ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │ │ ┌─────────────────────────────────────────────────────────┐ │
│ │ Credential Broker │ │ │ │ RBACMiddleware │ │
│ │ - Maps OAuth user → vCenter credentials │ │ │ │ - Intercepts ALL tool calls via on_call_tool() │ │
│ │ - Caches pyvmomi connections per-user │ │ │ │ - Maps OAuth groups → Permission levels │ │
│ │ - Retrieves passwords from Vault / env vars │ │ │ │ - Denies access if user lacks required permission │ │
│ │ - Logs audit events with user identity │ │
│ └─────────────────────────────────────────────────────────┘ │ │ └─────────────────────────────────────────────────────────┘ │
│ │ │ │ │ │
│ ▼ │ │ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │ │ ┌─────────────────────────────────────────────────────────┐ │
│ │ Audit Logger │ │ │ │ VMware Tools (94) │ │
│ │ - Logs all tool invocations with OAuth identity │ │ │ │ - Execute vCenter/ESXi operations via pyvmomi │ │
│ │ - "User ryan@example.com powered on VM web-server" │ │ │ │ - Single service account connection to vCenter │ │
│ └─────────────────────────────────────────────────────────┘ │ │ └─────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘ └────────────────────────────┬────────────────────────────────────┘
│ 3. pyvmomi (as mapped user) │ 3. pyvmomi (service account)
┌────────────────────────────▼────────────────────────────────────┐ ┌────────────────────────────▼────────────────────────────────────┐
│ vCenter 7.0.3 │ │ vCenter / ESXi │
│ - Receives API calls as the actual user │ │ - Receives API calls as service account │
│ - Native audit logs show real user identity │ │ - mcvsphere audit logs show real user identity │
│ - vCenter permissions apply naturally │
└─────────────────────────────────────────────────────────────────┘ └─────────────────────────────────────────────────────────────────┘
``` ```
--- ---
## User Mapping Strategies ## RBAC Permission Model
Since we can't exchange OAuth tokens for vCenter tokens, we need a "credential broker": ### Permission Levels
| Strategy | How it Works | Security | Use Case | mcvsphere defines 5 permission levels, from least to most privileged:
|----------|--------------|----------|----------|
| **Service Account** | All requests use one vCenter admin account | Medium | Simple/dev |
| **Per-User Mapping** | Map OAuth username → vCenter credentials from Vault | High | Production |
| **LDAP Sync** | Same username/password in Authentik and vCenter SSO | Medium | AD environments |
### Recommended: Per-User Mapping with Fallback | Level | Description | Example Tools |
|-------|-------------|---------------|
| `READ_ONLY` | View-only operations | `list_vms`, `get_vm_info`, `vm_screenshot` |
| `POWER_OPS` | Power and snapshot operations | `power_on`, `create_snapshot`, `reboot_guest` |
| `VM_LIFECYCLE` | Create/delete/modify VMs | `create_vm`, `clone_vm`, `add_disk`, `deploy_ovf` |
| `HOST_ADMIN` | ESXi host operations | `reboot_host`, `enter_maintenance_mode` |
| `FULL_ADMIN` | Everything including guest OS ops | `run_command_in_guest`, `restart_service` |
### OAuth Groups → Permissions
Users are granted permissions based on their OAuth group memberships:
| OAuth Group | Permissions Granted |
|-------------|---------------------|
| `vsphere-readers` | READ_ONLY |
| `vsphere-operators` | READ_ONLY, POWER_OPS |
| `vsphere-admins` | READ_ONLY, POWER_OPS, VM_LIFECYCLE |
| `vsphere-host-admins` | READ_ONLY, POWER_OPS, VM_LIFECYCLE, HOST_ADMIN |
| `vsphere-super-admins` | ALL (full access) |
**Security Note:** Users with NO recognized groups are denied ALL access. There is no default permission.
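The group table and the deny-by-default note can be read as a union of permission sets with no fallback value. A minimal sketch of that reading, assuming an enum called `Permission` with lowercase string values and a dict called `GROUP_PERMISSIONS`; both names are illustrative, and the real definitions in `src/mcvsphere/permissions.py` may look different:

```python
from enum import Enum


class Permission(Enum):
    """Hypothetical enum mirroring the five documented levels."""
    READ_ONLY = "read_only"
    POWER_OPS = "power_ops"
    VM_LIFECYCLE = "vm_lifecycle"
    HOST_ADMIN = "host_admin"
    FULL_ADMIN = "full_admin"


# Group names and grants are copied from the table; the dict name is assumed.
GROUP_PERMISSIONS = {
    "vsphere-readers": {Permission.READ_ONLY},
    "vsphere-operators": {Permission.READ_ONLY, Permission.POWER_OPS},
    "vsphere-admins": {
        Permission.READ_ONLY,
        Permission.POWER_OPS,
        Permission.VM_LIFECYCLE,
    },
    "vsphere-host-admins": {
        Permission.READ_ONLY,
        Permission.POWER_OPS,
        Permission.VM_LIFECYCLE,
        Permission.HOST_ADMIN,
    },
    "vsphere-super-admins": set(Permission),  # ALL (full access)
}


def permissions_for_groups(groups):
    """Union the grants of every recognized group; unknown groups add nothing."""
    granted = set()
    for group in groups:
        granted |= GROUP_PERMISSIONS.get(group, set())
    return granted
```

Because unrecognized groups contribute an empty set, a user whose token carries no `vsphere-*` group ends up with no permissions at all, which matches the security note.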
### Tool → Permission Mapping
All 94 tools are mapped to permission levels in `src/mcvsphere/permissions.py`:
```python ```python
class CredentialBroker: # READ_ONLY - 32 tools
"""Maps OAuth users to vCenter credentials.""" "list_vms", "get_vm_info", "list_snapshots", "get_vm_stats", ...
def __init__(self, vcenter_host: str, fallback_user: str = None, fallback_password: str = None): # POWER_OPS - 14 tools
self.vcenter_host = vcenter_host "power_on", "power_off", "create_snapshot", "revert_to_snapshot", ...
self.fallback_user = fallback_user # Service account fallback
self.fallback_password = fallback_password
self._connections: dict[str, ServiceInstance] = {}
def get_connection_for_user(self, oauth_user: dict) -> ServiceInstance: # VM_LIFECYCLE - 33 tools
"""Get pyvmomi connection for this OAuth user.""" "create_vm", "clone_vm", "delete_vm", "add_disk", "deploy_ovf", ...
username = oauth_user.get("preferred_username")
# Try per-user credentials first # HOST_ADMIN - 6 tools
try: "enter_maintenance_mode", "reboot_host", "shutdown_host", ...
vcenter_creds = self._lookup_credentials(username)
return self._get_or_create_connection(
vcenter_creds["user"],
vcenter_creds["password"]
)
except KeyError:
# Fall back to service account
if self.fallback_user:
return self._get_or_create_connection(
self.fallback_user,
self.fallback_password
)
raise ValueError(f"No vCenter credentials for user: {username}")
def _lookup_credentials(self, username: str) -> dict: # FULL_ADMIN - 11 tools
"""Look up vCenter credentials for OAuth user.""" "run_command_in_guest", "write_guest_file", "restart_service", ...
# Option 1: Environment variable
env_key = f"VCENTER_PASSWORD_{username.upper().replace('@', '_').replace('.', '_')}"
if password := os.environ.get(env_key):
return {"user": f"{username}@vsphere.local", "password": password}
# Option 2: HashiCorp Vault (production)
# return vault_client.read(f"secret/vcenter/users/{username}")
raise KeyError(f"No credentials found for {username}")
``` ```
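For illustration, the tool lookup could be a plain dict keyed by tool name. This is a sketch under assumptions, not the contents of `permissions.py`: `TOOL_PERMISSIONS`, `required_level`, and `tool_allowed` are hypothetical names, only a handful of the documented tools are listed, and treating an unmapped tool as denied is an assumed fail-closed choice.

```python
from typing import Optional, Set

# A few tools taken from the lists in this doc; levels are lowercase strings.
TOOL_PERMISSIONS = {
    "list_vms": "read_only",
    "get_vm_info": "read_only",
    "power_on": "power_ops",
    "create_snapshot": "power_ops",
    "create_vm": "vm_lifecycle",
    "delete_vm": "vm_lifecycle",
    "reboot_host": "host_admin",
    "run_command_in_guest": "full_admin",
}


def required_level(tool_name: str) -> Optional[str]:
    """Return the level a tool needs, or None if the tool is not mapped."""
    return TOOL_PERMISSIONS.get(tool_name)


def tool_allowed(tool_name: str, granted: Set[str]) -> bool:
    """Allow a call only when the tool is mapped and its level was granted."""
    required = required_level(tool_name)
    if required is None:
        return False  # assumption: unmapped tools are denied
    return required in granted
```

With this shape, `tool_allowed("delete_vm", {"read_only"})` is false, which is the situation behind the `PERMISSION_DENIED` audit event for a `vsphere-readers` user.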
--- ---
## FastMCP OAuth Integration ## Implementation Details
### 1. Add OIDCProxy to server.py ### Key Files
| File | Purpose |
|------|---------|
| `src/mcvsphere/auth.py` | OIDCProxy configuration |
| `src/mcvsphere/permissions.py` | Permission levels and tool mappings |
| `src/mcvsphere/middleware.py` | RBACMiddleware implementation |
| `src/mcvsphere/audit.py` | Audit logging with user context |
| `src/mcvsphere/server.py` | Server setup with OAuth + RBAC |
### RBACMiddleware Flow
```python ```python
import os class RBACMiddleware(Middleware):
from fastmcp import FastMCP """Intercepts all tool calls to enforce permissions."""
from fastmcp.server.auth import OIDCProxy
# Configure OAuth with Authentik async def on_call_tool(self, context, call_next):
auth = OIDCProxy( # 1. Extract user from OAuth token
# Authentik OIDC Discovery URL claims = self._extract_user_from_context(context.fastmcp_context)
config_url=os.environ["AUTHENTIK_OIDC_URL"], username = claims.get("preferred_username", "unknown")
# e.g., "https://auth.example.com/application/o/vsphere-mcp/.well-known/openid-configuration" groups = claims.get("groups", [])
# Application credentials from Authentik # 2. Check permission
client_id=os.environ["AUTHENTIK_CLIENT_ID"], tool_name = context.message.name
client_secret=os.environ["AUTHENTIK_CLIENT_SECRET"], if not check_permission(tool_name, groups):
required = get_required_permission(tool_name)
audit_permission_denied(tool_name, {...}, required.value)
raise PermissionDeniedError(username, tool_name, required)
# MCP Server base URL (for redirects) # 3. Execute tool with timing
base_url=os.environ.get("MCP_BASE_URL", "http://localhost:8000"), start = time.perf_counter()
result = await call_next(context)
duration_ms = (time.perf_counter() - start) * 1000
# Token validation # 4. Audit log
required_scopes=["openid", "profile", "email"], audit_log(tool_name, {...}, result="success", duration_ms=duration_ms)
return result
# Allow Claude Code localhost redirects
allowed_client_redirect_uris=["http://localhost:*", "http://127.0.0.1:*"],
)
# Create MCP server with OAuth
mcp = FastMCP(
"vSphere MCP Server",
auth=auth,
# Use Streamable HTTP transport for OAuth
)
``` ```
### 2. Access User Identity in Tools ### Audit Log Format
```python ```json
from fastmcp import Context {
"timestamp": "2025-12-27T08:15:32.123456+00:00",
"user": "ryan@example.com",
"groups": ["vsphere-admins", "vsphere-operators"],
"tool": "power_on",
"args": {"vm_name": "web-server"},
"duration_ms": 1234.56,
"result": "success"
}
```
@mcp.tool() Permission denied events:
async def power_on_vm(ctx: Context, vm_name: str) -> str: ```json
"""Power on a virtual machine.""" {
# Get authenticated user from OAuth token "timestamp": "2025-12-27T08:15:32.123456+00:00",
user = ctx.request_context.user "user": "guest@example.com",
username = user.get("preferred_username", user.get("sub")) "groups": ["vsphere-readers"],
"tool": "delete_vm",
# Get vCenter connection for this user "args": {"vm_name": "web-server"},
broker = get_credential_broker() "required_permission": "vm_lifecycle",
connection = broker.get_connection_for_user(user) "event": "PERMISSION_DENIED"
}
# Execute operation
content = connection.RetrieveContent()
vm = find_vm(content, vm_name)
vm.PowerOnVM_Task()
# Audit log
logger.info(f"User {username} powered on VM {vm_name}")
return f"VM '{vm_name}' is powering on"
``` ```
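A record in either shape can be assembled with the standard library alone. The helper below is a hedged sketch: `build_audit_record` and its parameters are hypothetical and are not the signatures of `audit_log` or `audit_permission_denied` in `audit.py`; only the field names come from the examples in this doc.

```python
import json
from datetime import datetime, timezone


def build_audit_record(user, groups, tool, args, *, result=None,
                       duration_ms=None, required_permission=None):
    """Build one audit entry as a dict with the documented field names."""
    record = {
        # Timezone-aware UTC timestamp, e.g. 2025-12-27T08:15:32.123456+00:00
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "groups": list(groups),
        "tool": tool,
        "args": args,
    }
    if required_permission is not None:
        # Denied call: record what was missing instead of a result.
        record["required_permission"] = required_permission
        record["event"] = "PERMISSION_DENIED"
    else:
        if duration_ms is not None:
            record["duration_ms"] = round(duration_ms, 2)
        record["result"] = result
    return record


if __name__ == "__main__":
    entry = build_audit_record(
        "ryan@example.com", ["vsphere-admins"], "power_on",
        {"vm_name": "web-server"}, result="success", duration_ms=1234.5612,
    )
    print(json.dumps(entry))
```

Emitting one JSON object per line keeps the log easy to ship to an aggregator and to filter by `user` or `event`.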
--- ---
## MCP Transport: Streamable HTTP ## Configuration
OAuth requires HTTP transport (not stdio). Use Streamable HTTP: ### Environment Variables
```python
# In server.py or via environment
mcp = FastMCP(
"vSphere MCP Server",
auth=auth,
)
def main():
# Run with HTTP transport for OAuth support
mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
```
**Transport characteristics:**
- Single HTTP endpoint (`/mcp`)
- POST requests with JSON-RPC body
- Server-Sent Events (SSE) for streaming responses
- `Authorization: Bearer <token>` header on every request
- `Mcp-Session-Id` header for session continuity
---
## Environment Variables
```bash ```bash
# Authentik OIDC # ═══════════════════════════════════════════════════════════════
AUTHENTIK_OIDC_URL=https://auth.example.com/application/o/vsphere-mcp/.well-known/openid-configuration # OAuth Configuration
AUTHENTIK_CLIENT_ID=<from-authentik-application> # ═══════════════════════════════════════════════════════════════
AUTHENTIK_CLIENT_SECRET=<from-authentik-application> OAUTH_ENABLED=true
OAUTH_ISSUER_URL=https://auth.example.com/application/o/mcvsphere/
OAUTH_CLIENT_ID=<from-oidc-provider>
OAUTH_CLIENT_SECRET=<from-oidc-provider>
OAUTH_BASE_URL=https://mcp.example.com # Public URL for callbacks
OAUTH_SCOPES='["openid", "profile", "email", "groups"]'
# MCP Server # ═══════════════════════════════════════════════════════════════
MCP_BASE_URL=https://mcp.example.com # Public URL for OAuth redirects # Transport (must be HTTP for OAuth)
# ═══════════════════════════════════════════════════════════════
MCP_TRANSPORT=streamable-http MCP_TRANSPORT=streamable-http
MCP_HOST=0.0.0.0
MCP_PORT=8080
# vCenter Connection (service account fallback) # ═══════════════════════════════════════════════════════════════
# vCenter Connection (service account)
# ═══════════════════════════════════════════════════════════════
VCENTER_HOST=vcenter.example.com VCENTER_HOST=vcenter.example.com
VCENTER_USER=svc-mcp@vsphere.local VCENTER_USER=svc-mcvsphere@vsphere.local
VCENTER_PASSWORD=<service-account-password> VCENTER_PASSWORD=<service-account-password>
VCENTER_INSECURE=true VCENTER_INSECURE=false
# Per-user credentials (optional, for testing) # ═══════════════════════════════════════════════════════════════
VCENTER_PASSWORD_RYAN=<ryan's-vcenter-password> # Optional
VCENTER_PASSWORD_ALICE=<alice's-vcenter-password> # ═══════════════════════════════════════════════════════════════
LOG_LEVEL=INFO
```
# User Mapping Mode ### Server Startup Banner
USER_MAPPING_MODE=service_account # or 'per_user', 'ldap_sync'
When OAuth is enabled, the server shows:
```
mcvsphere v0.2.1
────────────────────────────────────────
Starting HTTP transport on 0.0.0.0:8080
OAuth: ENABLED via https://auth.example.com/application/o/mcvsphere/
RBAC: ENABLED - permissions enforced via groups
────────────────────────────────────────
``` ```
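Two details of these variables are easy to get wrong: `OAUTH_SCOPES` is a JSON list inside a string, and OAuth only works over an HTTP transport. The loader below is a sketch of how they might be parsed and checked. `OAuthSettings` and `load_oauth_settings` are hypothetical names, the defaults (OAuth off, `stdio` transport) are assumptions, and the real server may load its settings differently.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class OAuthSettings:
    enabled: bool
    issuer_url: str
    client_id: str
    client_secret: str
    base_url: str
    scopes: List[str]
    transport: str


def load_oauth_settings(env: Optional[Mapping[str, str]] = None) -> OAuthSettings:
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    enabled = env.get("OAUTH_ENABLED", "false").strip().lower() == "true"
    transport = env.get("MCP_TRANSPORT", "stdio")  # default is an assumption
    if enabled and transport == "stdio":
        raise ValueError("OAuth requires an HTTP transport such as streamable-http")

    # OAUTH_SCOPES holds a JSON array, so it needs json.loads, not str.split.
    scopes = json.loads(
        env.get("OAUTH_SCOPES", '["openid", "profile", "email", "groups"]')
    )
    if not (isinstance(scopes, list) and all(isinstance(s, str) for s in scopes)):
        raise ValueError("OAUTH_SCOPES must be a JSON list of strings")

    return OAuthSettings(
        enabled=enabled,
        issuer_url=env.get("OAUTH_ISSUER_URL", ""),
        client_id=env.get("OAUTH_CLIENT_ID", ""),
        client_secret=env.get("OAUTH_CLIENT_SECRET", ""),
        base_url=env.get("OAUTH_BASE_URL", ""),
        scopes=scopes,
        transport=transport,
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to exercise in tests without touching the process environment.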
--- ---
## Authentik Setup (Quick Reference) ## OIDC Provider Setup
### Authentik (Recommended)
1. **Create OAuth2/OIDC Provider:** 1. **Create OAuth2/OIDC Provider:**
- Name: `vsphere-mcp` - Name: `mcvsphere`
- Client Type: Confidential - Client Type: **Confidential**
- Redirect URIs: - Redirect URIs:
- `http://localhost:*/callback` - `http://localhost:*/callback` (for local dev)
- `https://mcp.example.com/auth/callback` - `https://mcp.example.com/auth/callback`
- Scopes: `openid`, `profile`, `email` - Signing Key: Select RS256 certificate
- Signing Key: Select or create RS256 key
2. **Create Application:** 2. **Create Application:**
- Name: `vSphere MCP Server` - Name: `mcvsphere`
- Slug: `vsphere-mcp` - Slug: `mcvsphere`
- Provider: Select the provider above - Provider: Select provider from step 1
- Note the **Client ID** and **Client Secret**
3. **Configure Groups (optional):** 3. **Create Groups:**
- `vsphere-admins` - Full access - `vsphere-readers`
- `vsphere-operators` - Limited access - `vsphere-operators`
- Groups are included in JWT `groups` claim - `vsphere-admins`
- `vsphere-host-admins`
- `vsphere-super-admins`
4. **Add Scope Mapping for Groups:**
- Ensure `groups` claim is included in tokens
- Authentik includes this by default
5. **Note Credentials:**
- Copy Client ID and Client Secret
- Discovery URL: `https://auth.example.com/application/o/mcvsphere/.well-known/openid-configuration`
### Other Providers
The same pattern works with Keycloak, Auth0, Okta, etc. Key requirements:
- OIDC Discovery endpoint (`.well-known/openid-configuration`)
- JWT access tokens (not opaque)
- `groups` claim in tokens with group names
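When evaluating another provider, the first requirement can be checked mechanically. The sketch below derives the discovery URL from an issuer URL and reports which standard metadata fields a discovery document lacks. Both function names are hypothetical, and fetching the document is left to whatever HTTP client is at hand.

```python
# Standard OpenID Connect discovery metadata fields needed for this flow.
REQUIRED_METADATA = (
    "issuer",
    "authorization_endpoint",
    "token_endpoint",
    "jwks_uri",
)


def discovery_url(issuer_url: str) -> str:
    """Append the well-known discovery path to an issuer URL."""
    return issuer_url.rstrip("/") + "/.well-known/openid-configuration"


def missing_metadata(document: dict) -> list:
    """Return the required discovery fields absent from a parsed document."""
    return [field for field in REQUIRED_METADATA if not document.get(field)]
```

For the Authentik example, `discovery_url("https://auth.example.com/application/o/mcvsphere/")` yields the Discovery URL noted in step 5.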
--- ---
## OAuth Flow (End-to-End) ## OAuth Flow
``` ```
1. Claude Code connects to MCP Server 1. Client connects to mcvsphere
→ GET /mcp → POST /mcp (no auth)
→ Server returns 401 Unauthorized → Server returns 401 + OAuth metadata URL
→ WWW-Authenticate header includes OAuth metadata URL
2. Claude Code fetches OAuth metadata 2. Client initiates OAuth flow
→ Discovers Authentik authorization URL → Opens browser to OIDC provider
→ Discovers required scopes → User logs in
→ Provider redirects with authorization code
3. Claude Code initiates OAuth flow 3. Client exchanges code for tokens
→ Opens browser to Authentik login page → POST to provider token endpoint
→ User enters credentials → Receives JWT access token
→ Authentik redirects back with authorization code
4. Claude Code exchanges code for tokens 4. Client reconnects with token
→ POST to Authentik token endpoint
→ Receives JWT access token + refresh token
5. Claude Code reconnects with Bearer token
→ POST /mcp with Authorization: Bearer <jwt> → POST /mcp with Authorization: Bearer <jwt>
→ Server validates JWT via Authentik JWKS → Server validates JWT via JWKS
→ Server extracts user identity → Server extracts user + groups
→ User can now invoke tools → RBACMiddleware checks permissions
→ User can invoke allowed tools
6. Tool invocation 5. Tool invocation
→ Client: "power on web-server VM" → Client: "power on web-server"
→ Server: Validates token, maps user to vCenter creds → Middleware: Validate user has POWER_OPS
→ Server: Executes pyvmomi call → Tool: Execute pyvmomi call
→ Server: Logs "User ryan@example.com powered on web-server" → Audit: Log with user identity
→ Client: Receives success response → Client: Receive response
``` ```
--- ---
## Implementation Checklist ## Implementation Status
### Phase 1: Prepare Server ### Completed
- [ ] Add `fastmcp[auth]` to dependencies
- [ ] Create `auth.py` with OIDCProxy configuration
- [ ] Create `credential_broker.py` for user mapping
- [ ] Add audit logging to all tools
- [ ] Update server.py to use HTTP transport
### Phase 2: Deploy Authentik - [x] OIDCProxy configuration (`auth.py`)
- [ ] Docker Compose for Authentik - [x] Permission levels and tool mappings (`permissions.py`)
- [ ] Create OIDC provider and application - [x] RBACMiddleware with FastMCP integration (`middleware.py`)
- [ ] Configure redirect URIs - [x] Audit logging with user context (`audit.py`)
- [ ] Note client credentials - [x] Server integration with OAuth + RBAC (`server.py`)
- [ ] Test OIDC flow manually with curl - [x] Startup banner showing OAuth/RBAC status
- [x] Security fix: deny-by-default for no groups
- [x] Authentik setup with 5 vsphere-* groups
### Phase 3: Integration ### Future Enhancements
- [ ] Configure environment variables
- [ ] Test with `fastmcp dev` (OAuth mode)
- [ ] Test with Claude Code (`claude mcp add --auth oauth`)
- [ ] Verify audit logs show correct user identity
### Phase 4: Production - [ ] Per-user vCenter credential mapping (Vault integration)
- [ ] HTTPS via Caddy reverse proxy - [ ] Rate limiting per user
- [ ] Secrets in Docker secrets / Vault - [ ] Session management and token refresh
- [ ] Service account with minimal vCenter permissions - [ ] Admin tools for permission management
- [ ] Log aggregation and monitoring - [ ] Prometheus metrics for RBAC decisions
--- ---
## Files to Create/Modify ## Security Considerations
``` 1. **Default Deny**: Users without recognized groups get NO access
esxi-mcp-server/ 2. **Token Validation**: JWTs validated via OIDC provider's JWKS endpoint
├── src/esxi_mcp_server/ 3. **Audit Trail**: All operations logged with user identity
│ ├── auth.py # NEW: OIDCProxy configuration 4. **Secrets**: Client secrets should be stored securely (env vars, Docker secrets, Vault)
│ ├── credential_broker.py # NEW: OAuth → vCenter credential mapping 5. **HTTPS**: Production deployments should use TLS (via Caddy, nginx, etc.)
│ ├── server.py # MODIFY: Add auth, HTTP transport 6. **Service Account**: Use minimal vCenter permissions for the service account
│ └── mixins/
│ └── *.py # MODIFY: Add ctx.request_context.user logging
├── .env.example # MODIFY: Add OAuth variables
└── docker-compose.yml # MODIFY: Add Authentik services
```
--- ---
## Key Insight ## Troubleshooting
The "middleman" role of the MCP server is critical: ### "401 Unauthorized" on all requests
- Check `OAUTH_ISSUER_URL` points to valid OIDC discovery endpoint
- Verify client ID and secret match provider configuration
- Ensure token hasn't expired
``` ### "Permission denied" errors
OAuth Token (Authentik) ──┐ - Check user's group memberships in OIDC provider
- Verify groups claim is included in JWT (decode at jwt.io)
- Confirm group names match exactly (e.g., `vsphere-admins` not `vsphere_admins`)
┌─────────────┐
│ MCP Server │ ← Validates OAuth, maps to vCenter creds
│ (Middleman) │ ← Logs audit trail with OAuth identity
└─────────────┘
vCenter API (pyvmomi)
```
The MCP server doesn't pass OAuth tokens to vCenter. Instead, it: ### Token validation fails
1. **Authenticates** users via OAuth (trusts Authentik) - Ensure OIDC provider issues JWTs (not opaque tokens)
2. **Authorizes** by mapping OAuth identity to vCenter credentials - Check signing key is configured in provider
3. **Audits** by logging all actions with the OAuth user identity - Verify `OAUTH_BASE_URL` matches redirect URI in provider
4. **Executes** vCenter API calls using mapped credentials
This gives you SSO-like experience while working within vCenter 7.0.3's authentication limitations. ### Audit logs not showing user
- Check `groups` scope is requested
- Verify token contains `preferred_username` or `email` claim
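Several of these checks come down to looking at the claims inside the access token. As an offline alternative to pasting a token into a website, the payload can be decoded locally. This sketch only inspects the token and does NOT verify the signature, so it is a debugging aid rather than a validation step; the function names are hypothetical.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        # Opaque tokens usually fail here, which is itself a useful signal.
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def missing_claims(claims: dict) -> list:
    """List the claims this doc relies on that the token does not carry."""
    problems = []
    if not (claims.get("preferred_username") or claims.get("email")):
        problems.append("preferred_username or email")
    if not claims.get("groups"):
        problems.append("groups")
    return problems
```

An empty list from `missing_claims` means the token carries a user identity and at least one group; whether those group names match the `vsphere-*` names exactly still has to be checked by eye.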