🔧 Fix output path security with MCP_PDF_ALLOWED_PATHS environment variable
BREAKING ISSUE FIXED: - Users reported "Output path not allowed: images" error - extract_images tool was rejecting relative paths due to overly restrictive security NEW SECURITY MODEL: - MCP_PDF_ALLOWED_PATHS environment variable controls allowed output directories - If unset: Allows any directory with "security theater" warnings - If set: Restricts outputs to specified colon-separated paths - Cross-platform compatible (: on Unix, ; on Windows) SECURITY PHILOSOPHY ENHANCED: - "TRUST NO ONE" - honest about application-level security limitations - Clear warnings that this is "security theater" - Emphasis on OS-level permissions and process isolation - Educational guidance on real security practices TECHNICAL CHANGES: - validate_output_path() rewritten with environment variable control - Path validation uses relative_to() for proper containment checking - Enhanced warning messages with security education - Updated documentation with honest security assessment DOCUMENTATION UPDATES: - Added MCP_PDF_ALLOWED_PATHS to configuration section - New "REAL Security" section with OS-level recommendations - Clear explanation of security theater vs actual protection Version: 1.1.1 (patch version for critical bugfix)
This commit is contained in:
parent
856dd41996
commit
8cbf542df1
22
CLAUDE.md
22
CLAUDE.md
@ -134,11 +134,19 @@ Critical system dependencies:
|
|||||||
Environment variables (optional):
|
Environment variables (optional):
|
||||||
- `TESSDATA_PREFIX`: Tesseract language data location
|
- `TESSDATA_PREFIX`: Tesseract language data location
|
||||||
- `PDF_TEMP_DIR`: Temporary file processing directory (defaults to `/tmp/mcp-pdf-processing`)
|
- `PDF_TEMP_DIR`: Temporary file processing directory (defaults to `/tmp/mcp-pdf-processing`)
|
||||||
|
- `MCP_PDF_ALLOWED_PATHS`: Colon-separated list of allowed output directories (e.g., `/tmp:/home/user/documents:/var/output`)
|
||||||
|
- If unset: Allows writes to any directory with security warnings
|
||||||
|
- If set: Restricts file outputs to specified directories only
|
||||||
|
- **SECURITY NOTE**: This is "security theater" - real protection requires OS-level permissions and process isolation
|
||||||
- `DEBUG`: Enable debug logging
|
- `DEBUG`: Enable debug logging
|
||||||
|
|
||||||
### Security Features
|
### Security Features
|
||||||
|
|
||||||
The server implements comprehensive security hardening:
|
**🔒 "TRUST NO ONE" Security Philosophy**
|
||||||
|
|
||||||
|
This server implements defense-in-depth, but remember: **application-level security is "theater" - real security comes from the operating system and deployment practices.**
|
||||||
|
|
||||||
|
**Application-Level Protections (Security Theater):**
|
||||||
|
|
||||||
**Input Validation:**
|
**Input Validation:**
|
||||||
- File size limits: 100MB for PDFs, 50MB for images
|
- File size limits: 100MB for PDFs, 50MB for images
|
||||||
@ -167,6 +175,18 @@ The server implements comprehensive security hardening:
|
|||||||
- GitHub Actions workflow for continuous security monitoring
|
- GitHub Actions workflow for continuous security monitoring
|
||||||
- Daily automated vulnerability assessments
|
- Daily automated vulnerability assessments
|
||||||
|
|
||||||
|
**⚡ REAL Security (What Actually Matters):**
|
||||||
|
|
||||||
|
1. **Process Isolation**: Run as non-privileged user with minimal permissions
|
||||||
|
2. **OS-Level Controls**: Use chroot/containers/systemd to limit filesystem access
|
||||||
|
3. **Network Isolation**: Firewall rules, network namespaces, air-gapped environments
|
||||||
|
4. **Resource Limits**: ulimit, cgroups, memory/CPU quotas at the OS level
|
||||||
|
5. **File Permissions**: Proper Unix permissions (chmod/chown) on directories and files
|
||||||
|
6. **Monitoring**: System-level audit logs, not application logs
|
||||||
|
7. **Regular Updates**: Keep OS, libraries, and dependencies patched
|
||||||
|
|
||||||
|
**Remember**: If an attacker has code execution, application-level restrictions are meaningless. Defense-in-depth starts with the operating system.
|
||||||
|
|
||||||
## Development Notes
|
## Development Notes
|
||||||
|
|
||||||
### Testing Strategy
|
### Testing Strategy
|
||||||
|
|||||||
@ -1,6 +1,6 @@
|
|||||||
[project]
|
[project]
|
||||||
name = "mcp-pdf"
|
name = "mcp-pdf"
|
||||||
version = "1.1.0"
|
version = "1.1.1"
|
||||||
description = "Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, forms, annotations, and more"
|
description = "Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, forms, annotations, and more"
|
||||||
authors = [{name = "Ryan Malloy", email = "ryan@malloys.us"}]
|
authors = [{name = "Ryan Malloy", email = "ryan@malloys.us"}]
|
||||||
readme = "README.md"
|
readme = "README.md"
|
||||||
|
|||||||
@ -82,10 +82,32 @@ def validate_output_path(path: str) -> Path:
|
|||||||
if '../' in str(path) or '\\..\\' in str(path):
|
if '../' in str(path) or '\\..\\' in str(path):
|
||||||
raise ValueError("Path traversal detected in output path")
|
raise ValueError("Path traversal detected in output path")
|
||||||
|
|
||||||
# Ensure path is within safe directories
|
# Check allowed output paths from environment variable
|
||||||
safe_prefixes = ['/tmp', '/var/tmp', str(CACHE_DIR.resolve())]
|
allowed_paths = os.getenv('MCP_PDF_ALLOWED_PATHS')
|
||||||
if not any(str(resolved_path).startswith(prefix) for prefix in safe_prefixes):
|
|
||||||
raise ValueError(f"Output path not allowed: {path}")
|
if allowed_paths is None:
|
||||||
|
# No restriction set - warn user but allow any path
|
||||||
|
logger.warning(f"MCP_PDF_ALLOWED_PATHS not set - allowing write to any directory: {resolved_path}")
|
||||||
|
logger.warning("SECURITY NOTE: This restriction is 'security theater' - real protection comes from OS-level permissions")
|
||||||
|
logger.warning("Recommended: Set MCP_PDF_ALLOWED_PATHS='/tmp:/var/tmp:/home/user/documents' AND use proper file permissions")
|
||||||
|
logger.warning("For true security: Run this server with limited user permissions, not as root/admin")
|
||||||
|
return resolved_path
|
||||||
|
|
||||||
|
# Parse allowed paths (semicolon or colon separated for cross-platform compatibility)
|
||||||
|
separator = ';' if os.name == 'nt' else ':'
|
||||||
|
allowed_prefixes = [Path(p.strip()).resolve() for p in allowed_paths.split(separator) if p.strip()]
|
||||||
|
|
||||||
|
# Check if resolved path is within any allowed directory
|
||||||
|
for allowed_prefix in allowed_prefixes:
|
||||||
|
try:
|
||||||
|
resolved_path.relative_to(allowed_prefix)
|
||||||
|
return resolved_path # Path is within allowed directory
|
||||||
|
except ValueError:
|
||||||
|
continue # Path is not within this allowed directory
|
||||||
|
|
||||||
|
# Path not allowed
|
||||||
|
allowed_paths_str = separator.join(str(p) for p in allowed_prefixes)
|
||||||
|
raise ValueError(f"Output path not allowed: {resolved_path}. Allowed paths: {allowed_paths_str}")
|
||||||
|
|
||||||
return resolved_path
|
return resolved_path
|
||||||
|
|
||||||
|
|||||||
4
uv.lock
generated
4
uv.lock
generated
@ -1,5 +1,5 @@
|
|||||||
version = 1
|
version = 1
|
||||||
revision = 2
|
revision = 3
|
||||||
requires-python = ">=3.10"
|
requires-python = ">=3.10"
|
||||||
resolution-markers = [
|
resolution-markers = [
|
||||||
"python_full_version >= '3.13' and sys_platform == 'darwin'",
|
"python_full_version >= '3.13' and sys_platform == 'darwin'",
|
||||||
@ -1032,7 +1032,7 @@ wheels = [
|
|||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "mcp-pdf"
|
name = "mcp-pdf"
|
||||||
version = "1.0.1"
|
version = "1.1.0"
|
||||||
source = { editable = "." }
|
source = { editable = "." }
|
||||||
dependencies = [
|
dependencies = [
|
||||||
{ name = "camelot-py", extra = ["cv"] },
|
{ name = "camelot-py", extra = ["cv"] },
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user