enhanced-mcp-tools/docs/archive/EMERGENCY_LOGGING_GUIDE.md
Ryan Malloy de512018cf refactor: Clean up and organize root directory documentation
2025-06-23 13:50:17 -06:00


🚨 Enhanced Logging Severity Guide

📊 Proper Logging Hierarchy

The severity levels, from most to least severe, are categorized as follows:

🚨 EMERGENCY - ctx.emergency() / log_emergency()

RESERVED FOR TRUE EMERGENCIES

  • Data corruption detected or likely
  • Security breaches or unauthorized access
  • System instability that could affect other processes
  • Backup/recovery failures during critical operations

Examples where we SHOULD use emergency:

# Data corruption scenarios
if checksum_mismatch:
    await self.log_emergency("File checksum mismatch - data corruption detected", ctx=ctx)

# Security issues  
if unauthorized_access_detected:
    await self.log_emergency("Unauthorized file system access attempted", ctx=ctx)

# Critical backup failures
if backup_failed_during_destructive_operation:
    await self.log_emergency("Backup failed during bulk rename - data loss risk", ctx=ctx)
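
A flag like `checksum_mismatch` above has to come from somewhere. One way it might be computed is by comparing a file's SHA-256 digest against a value recorded earlier. This is a minimal sketch, not the project's implementation; `sha256_of` and the `checksum_mismatch` function are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_mismatch(path: Path, expected_hex: str) -> bool:
    """True when the file's current digest differs from the recorded one."""
    return sha256_of(path) != expected_hex.lower()
```

A caller would record the digest before a risky operation and call `checksum_mismatch()` afterwards, logging an emergency only when it returns `True`.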

🔴 CRITICAL - ctx.error() with CRITICAL prefix

Tool completely fails but no data corruption

  • Complete tool method failure
  • Unexpected exceptions that prevent completion
  • Resource exhaustion or system limits hit
  • Network failures for critical operations

Examples (our current usage):

# Complete tool failure
try:
    ...  # run the tool operation (here, git grep)
except Exception as e:
    await ctx.error(f"CRITICAL: Git grep failed | Exception: {type(e).__name__}")

# Resource exhaustion
if memory_usage > critical_threshold:
    await ctx.error("CRITICAL: Memory usage exceeded safe limits")

⚠️ ERROR - ctx.error()

Expected failures and recoverable errors

  • Invalid input parameters
  • File not found scenarios
  • Permission denied cases
  • Configuration errors

🟡 WARNING - ctx.warning()

Non-fatal issues and degraded functionality

  • Fallback mechanisms activated
  • Performance degradation
  • Missing optional dependencies
  • Deprecated feature usage

ℹ️ INFO - ctx.info()

Normal operational information

  • Operation progress
  • Successful completions
  • Configuration changes

🔧 DEBUG - ctx.debug()

Detailed diagnostic information

  • Variable values
  • Execution flow details
  • Performance timings
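
The six levels above form a strict ordering, which can be made explicit in code. The sketch below assumes a hypothetical `Severity` enum and `format_message()` helper; the numeric values are arbitrary and only encode the order. It also reflects that CRITICAL, and EMERGENCY in its fallback form, are sent through `ctx.error()` with a prefix.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from this guide, ordered least to most severe."""
    DEBUG = 10
    INFO = 20
    WARNING = 30
    ERROR = 40
    CRITICAL = 50
    EMERGENCY = 60


def format_message(severity: Severity, message: str) -> str:
    """Prefix CRITICAL and EMERGENCY messages so they stand out in error output."""
    if severity >= Severity.CRITICAL:
        return f"{severity.name}: {message}"
    return message
```

For example, `format_message(Severity.CRITICAL, "Git grep failed")` yields `"CRITICAL: Git grep failed"`, while an ERROR-level message is passed through unchanged.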

🎯 Where We Should Add Emergency Logging

Archive Operations

# In archive extraction - check for path traversal attacks
if "../" in member_path or member_path.startswith("/"):
    await self.log_emergency("Path traversal attack detected in archive", ctx=ctx)
    return {"error": "Security violation: path traversal detected"}

# In file compression - verify integrity
if original_size != decompressed_size:
    await self.log_emergency("File compression integrity check failed - data corruption", ctx=ctx)
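
The substring test for `"../"` above is easy to read, but it can miss some inputs and flag harmless ones such as `a/../b.txt`. A stricter alternative is to resolve the final path and confirm it stays inside the extraction directory. This is a sketch under that assumption; `is_within_directory` is a hypothetical helper, not part of the tools.

```python
import os


def is_within_directory(base_dir: str, member_path: str) -> bool:
    """True if member_path, joined onto base_dir, resolves inside base_dir."""
    base = os.path.realpath(base_dir)
    # os.path.join discards base when member_path is absolute, so absolute
    # members resolve outside base and are rejected below.
    target = os.path.realpath(os.path.join(base, member_path))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised for paths that cannot be compared (e.g. different drives).
        return False
```

With this check, the emergency log would fire when `is_within_directory(extract_dir, member_path)` is `False`, rather than on a raw string match.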

File Operations

# In bulk rename - backup verification
if not verify_backup_integrity():
    await self.log_emergency("Backup integrity check failed before bulk operation", ctx=ctx)
    return {"error": "Cannot proceed - backup verification failed"}

# In file operations - unexpected permission changes
if file_permissions_changed_unexpectedly:
    await self.log_emergency("Unexpected permission changes detected - security issue", ctx=ctx)
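
To detect the unexpected permission changes mentioned above, one option is to record a file's mode bits before an operation and compare them afterwards. A minimal sketch using only the standard library; `permission_bits` and `permissions_changed` are hypothetical names.

```python
import os
import stat


def permission_bits(path: str) -> int:
    """Return the permission bits of a file, e.g. 0o644."""
    return stat.S_IMODE(os.stat(path).st_mode)


def permissions_changed(path: str, recorded_mode: int) -> bool:
    """True when the file's current permission bits differ from recorded_mode."""
    return permission_bits(path) != recorded_mode
```

The recorded mode would be captured with `permission_bits()` before the operation starts, and `permissions_changed()` checked once it finishes.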

Git Operations

# In git operations - repository corruption
if git_status_shows_corruption:
    await self.log_emergency("Git repository corruption detected", ctx=ctx)
    return {"error": "Repository integrity compromised"}
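
The `git_status_shows_corruption` flag is left undefined above. One plausible source for it is `git fsck`, which verifies the object database. The sketch below treats any non-zero exit as a sign of trouble, which is an assumption: a non-zero exit can also mean the path is not a repository. `repository_is_corrupt` is a hypothetical helper, and the `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess
from typing import Callable


def repository_is_corrupt(
    repo_path: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Run `git fsck` in repo_path and report whether it exited with an error."""
    result = run(
        ["git", "-C", repo_path, "fsck"],
        capture_output=True,
        text=True,
        check=False,  # inspect the return code ourselves instead of raising
    )
    return result.returncode != 0
```

If `git` is not installed, `subprocess.run` raises `FileNotFoundError`; a caller would likely treat that as a CRITICAL tool failure rather than an EMERGENCY.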

🔧 Implementation Strategy

Current FastMCP Compatibility

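from typing import Optional

from fastmcp import Context

async def log_emergency(self, message: str, exception: Optional[Exception] = None, ctx: Optional[Context] = None):
    # Include the exception type, if one was passed, so the cause is visible
    if exception is not None:
        message = f"{message} | Exception: {type(exception).__name__}"

    # Skip context logging when no context was supplied (ctx defaults to None)
    if ctx is not None:
        # Future-proof: check if emergency() becomes available
        if hasattr(ctx, 'emergency'):
            await ctx.emergency(f"EMERGENCY: {message}")
        else:
            # Fallback to error with EMERGENCY prefix
            await ctx.error(f"EMERGENCY: {message}")

    # Additional emergency actions:
    # - Write to emergency log file
    # - Send alerts to monitoring systems
    # - Trigger backup procedures if needed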
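
To see the `hasattr` fallback in action without a running server, the dispatch can be exercised against stub contexts. Both stub classes below are hypothetical stand-ins, not FastMCP types, and `log_emergency` is rewritten as a free function purely for the demonstration.

```python
import asyncio


class StubContext:
    """Hypothetical stand-in for a context that only has error()."""

    def __init__(self) -> None:
        self.calls = []

    async def error(self, message: str) -> None:
        self.calls.append(("error", message))


class StubContextWithEmergency(StubContext):
    """Hypothetical future context that also exposes emergency()."""

    async def emergency(self, message: str) -> None:
        self.calls.append(("emergency", message))


async def log_emergency(message: str, ctx) -> None:
    """Use emergency() when the context has it, otherwise fall back to error()."""
    if hasattr(ctx, "emergency"):
        await ctx.emergency(f"EMERGENCY: {message}")
    else:
        await ctx.error(f"EMERGENCY: {message}")


old_ctx, new_ctx = StubContext(), StubContextWithEmergency()
asyncio.run(log_emergency("Backup failed", old_ctx))
asyncio.run(log_emergency("Backup failed", new_ctx))
print(old_ctx.calls)  # [('error', 'EMERGENCY: Backup failed')]
print(new_ctx.calls)  # [('emergency', 'EMERGENCY: Backup failed')]
```

The message text is identical on both paths, so log consumers can match on the `EMERGENCY:` prefix regardless of which method delivered it.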

Severity Decision Tree

Is data corrupted or at risk? 
├─ YES → EMERGENCY
└─ NO → Is the tool completely broken?
        ├─ YES → CRITICAL (error with prefix)
        └─ NO → Is it an expected failure?
                ├─ YES → ERROR
                └─ NO → Is functionality degraded?
                        ├─ YES → WARNING  
                        └─ NO → INFO/DEBUG
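
The decision tree above translates directly into a chain of early returns. A minimal sketch; `classify_severity` and its boolean parameters are hypothetical names that mirror the four questions in the tree.

```python
def classify_severity(
    data_at_risk: bool,
    tool_broken: bool,
    expected_failure: bool,
    degraded: bool,
) -> str:
    """Walk the decision tree top to bottom and return the first matching level."""
    if data_at_risk:
        return "EMERGENCY"
    if tool_broken:
        return "CRITICAL"
    if expected_failure:
        return "ERROR"
    if degraded:
        return "WARNING"
    return "INFO/DEBUG"
```

Because the checks run in order, a failure that both breaks the tool and puts data at risk is classified as EMERGENCY, matching the tree's top-down priority.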

📋 Action Items

  1. ✅ DONE - Updated base class with log_emergency() method
  2. 🔄 TODO - Identify specific emergency scenarios in our tools
  3. 🔄 TODO - Add integrity checks to destructive operations
  4. 🔄 TODO - Implement emergency actions (logging, alerts)
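
For item 4, one simple emergency action is appending to a dedicated log file. The sketch below uses Python's standard `logging` module, whose highest built-in level is CRITICAL, so emergencies are written at that level with an `EMERGENCY:` prefix. The logger name and both helper functions are assumptions for illustration.

```python
import logging


def build_emergency_logger(log_path: str) -> logging.Logger:
    """Create a logger that appends CRITICAL-level records to log_path."""
    logger = logging.getLogger("emergency")  # hypothetical logger name
    logger.setLevel(logging.CRITICAL)
    logger.propagate = False  # keep emergency records out of other handlers
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def record_emergency(logger: logging.Logger, message: str) -> None:
    """Write one emergency record to the logger's file."""
    logger.critical("EMERGENCY: %s", message)
```

A file-based record like this still works when the MCP context itself is unavailable, which is one reason to keep it separate from `ctx` logging.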

emergency() is the most severe level.

Even though FastMCP 2.8.1 doesn't have it yet, we should:

  • Prepare for it with proper severity categorization
  • Use emergency logging only for true emergencies (data corruption, security)
  • Keep critical logging for complete tool failures
  • Future-proof our implementation for when emergency() becomes available
