Ryan Malloy de512018cf refactor: Clean up and organize root directory documentation

🧹 Root Directory Cleanup:
- Remove 9 outdated .md files from root directory
- Keep only essential docs in root (README.md, TODO.md)

📚 Reorganized Documentation:
- Move important docs to docs/: SACRED_TRUST_SAFETY.md, UV_BUILD_GUIDE.md, PACKAGE_READY.md
- Archive historical files in docs/archive/: implementation status docs, fix summaries
- Remove duplicate TODO file (kept TODO.md as primary)

✨ Result: Clean root directory with logical documentation structure
📁 Structure: root (essential) → docs/ (reference) → docs/archive/ (historical)

Improves project maintainability and reduces root directory clutter.

2025-06-23 13:50:17 -06:00

5.2 KiB


🚨 Enhanced Logging Severity Guide

📊 Proper Logging Hierarchy

The correct severity categorization, from most to least severe:

🚨 EMERGENCY - `ctx.emergency()` / `log_emergency()`

RESERVED FOR TRUE EMERGENCIES

Data corruption detected or likely
Security breaches or unauthorized access
System instability that could affect other processes
Backup/recovery failures during critical operations

Examples where we SHOULD use emergency:

# Data corruption scenarios
if checksum_mismatch:
    await self.log_emergency("File checksum mismatch - data corruption detected", ctx=ctx)

# Security issues  
if unauthorized_access_detected:
    await self.log_emergency("Unauthorized file system access attempted", ctx=ctx)

# Critical backup failures
if backup_failed_during_destructive_operation:
    await self.log_emergency("Backup failed during bulk rename - data loss risk", ctx=ctx)
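The `checksum_mismatch` flag in the example above has to come from somewhere. The following is a minimal sketch of one way to compute it, assuming SHA-256 digests are compared before and after a copy. `RecordingContext`, `sha256_of`, and `verify_copy` are hypothetical names used only for illustration; the context class is a stand-in that records messages and is not the FastMCP `Context`.

```python
import asyncio
import hashlib


class RecordingContext:
    """Hypothetical stand-in for a logging context; it only records messages."""

    def __init__(self):
        self.messages = []

    async def error(self, message):
        self.messages.append(("error", message))


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


async def verify_copy(source: bytes, copy: bytes, ctx) -> bool:
    """Return True when the copy matches the source; report an emergency otherwise."""
    if sha256_of(source) != sha256_of(copy):
        # Uses error() with an EMERGENCY prefix, mirroring the fallback in this guide.
        await ctx.error("EMERGENCY: File checksum mismatch - data corruption detected")
        return False
    return True


ctx = RecordingContext()
ok = asyncio.run(verify_copy(b"payload", b"payl0ad", ctx))
```

Comparing digests rather than sizes catches corruption that leaves the length unchanged, which is the case in the demo call.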

🔴 CRITICAL - `ctx.error()` with CRITICAL prefix

Tool completely fails but no data corruption

Complete tool method failure
Unexpected exceptions that prevent completion
Resource exhaustion or system limits hit
Network failures for critical operations

Examples (our current usage):

# Complete tool failure
try:
    ...  # tool body goes here
except Exception as e:
    await ctx.error(f"CRITICAL: Git grep failed | Exception: {type(e).__name__}")

# Resource exhaustion
if memory_usage > critical_threshold:
    await ctx.error("CRITICAL: Memory usage exceeded safe limits")

⚠️ ERROR - `ctx.error()`

Expected failures and recoverable errors

Invalid input parameters
File not found scenarios
Permission denied cases
Configuration errors
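The ERROR level has no example in this guide, so here is a minimal sketch of what expected, recoverable failures might look like in a tool method. `read_text_file` and `RecordingContext` are hypothetical names; the context class is a stand-in that records messages rather than the real FastMCP `Context`.

```python
import asyncio
from pathlib import Path


class RecordingContext:
    """Hypothetical stand-in for a logging context; it only records messages."""

    def __init__(self):
        self.messages = []

    async def error(self, message):
        self.messages.append(("error", message))


async def read_text_file(path: str, ctx) -> dict:
    """Read a text file, reporting expected failures at ERROR level."""
    if not path:
        # Invalid input parameter
        await ctx.error("Invalid input: 'path' must not be empty")
        return {"error": "invalid path"}
    target = Path(path)
    if not target.is_file():
        # File not found
        await ctx.error(f"File not found: {path}")
        return {"error": "file not found"}
    try:
        return {"content": target.read_text()}
    except PermissionError:
        # Permission denied
        await ctx.error(f"Permission denied: {path}")
        return {"error": "permission denied"}


ctx = RecordingContext()
result = asyncio.run(read_text_file("/nonexistent/example.txt", ctx))
```

None of these messages carries a CRITICAL or EMERGENCY prefix, because the tool itself is working as designed and the caller can correct the input and retry.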

🟡 WARNING - `ctx.warning()`

Non-fatal issues and degraded functionality

Fallback mechanisms activated
Performance degradation
Missing optional dependencies
Deprecated feature usage
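As an illustration of the WARNING level, the sketch below logs a warning when an optional dependency is missing and a fallback is activated. The helper name `load_json_module`, the module name passed to it, and `RecordingContext` are all assumptions made for this example.

```python
import asyncio
import importlib
import json


class RecordingContext:
    """Hypothetical stand-in for a logging context; it only records messages."""

    def __init__(self):
        self.messages = []

    async def warning(self, message):
        self.messages.append(("warning", message))


async def load_json_module(ctx, preferred: str):
    """Import the preferred JSON module, falling back to the stdlib json module."""
    try:
        return importlib.import_module(preferred)
    except ImportError:
        # The tool still works, only with degraded performance: WARNING, not ERROR.
        await ctx.warning(
            f"Optional dependency '{preferred}' not installed - falling back to json"
        )
        return json


ctx = RecordingContext()
module = asyncio.run(load_json_module(ctx, preferred="no_such_module_for_demo"))
```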

ℹ️ INFO - `ctx.info()`

Normal operational information

Operation progress
Successful completions
Configuration changes

🔧 DEBUG - `ctx.debug()`

Detailed diagnostic information

Variable values
Execution flow details
Performance timings
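The INFO and DEBUG levels could be combined in a tool method roughly as follows. This is a sketch under the assumption that the context exposes awaitable `info()` and `debug()` methods as listed above; `count_lines` and `RecordingContext` are hypothetical.

```python
import asyncio
import time


class RecordingContext:
    """Hypothetical stand-in for a logging context; it only records messages."""

    def __init__(self):
        self.messages = []

    async def info(self, message):
        self.messages.append(("info", message))

    async def debug(self, message):
        self.messages.append(("debug", message))


async def count_lines(text: str, ctx) -> int:
    """Count lines, logging diagnostics at DEBUG and the outcome at INFO."""
    started = time.perf_counter()
    await ctx.debug(f"count_lines input length: {len(text)}")  # variable value
    total = len(text.splitlines())
    elapsed_ms = (time.perf_counter() - started) * 1000
    await ctx.debug(f"count_lines took {elapsed_ms:.3f} ms")  # performance timing
    await ctx.info(f"Counted {total} lines")  # successful completion
    return total


ctx = RecordingContext()
total = asyncio.run(count_lines("a\nb\nc", ctx))
```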

🎯 Where We Should Add Emergency Logging

Archive Operations

# In archive extraction - check for path traversal attacks
if "../" in member_path or member_path.startswith("/"):
    await self.log_emergency("Path traversal attack detected in archive", ctx=ctx)
    return {"error": "Security violation: path traversal detected"}

# In file compression - verify integrity
if original_size != decompressed_size:
    await self.log_emergency("File compression integrity check failed - data corruption", ctx=ctx)
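The substring check for `"../"` above is a reasonable first filter, but it can flag harmless paths such as `docs/../readme.txt` and does not resolve symlinks. A stricter alternative, sketched here as an assumption rather than as the project's actual implementation, resolves the member path against the extraction root and confirms it stays inside. `is_safe_member` and the example root directory are hypothetical.

```python
import os


def is_safe_member(extract_root: str, member_path: str) -> bool:
    """Return True if member_path stays inside extract_root once resolved."""
    root = os.path.realpath(extract_root)
    # os.path.join discards root when member_path is absolute, so absolute
    # members resolve outside the root and are rejected below.
    target = os.path.realpath(os.path.join(root, member_path))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


safe = is_safe_member("/tmp/extract", "docs/readme.txt")
escaped = is_safe_member("/tmp/extract", "../etc/passwd")
absolute = is_safe_member("/tmp/extract", "/etc/passwd")
```

A tool could call this per archive member and route a `False` result to `log_emergency()` exactly as the check above does.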

File Operations

# In bulk rename - backup verification
if not verify_backup_integrity():
    await self.log_emergency("Backup integrity check failed before bulk operation", ctx=ctx)
    return {"error": "Cannot proceed - backup verification failed"}

# In file operations - unexpected permission changes
if file_permissions_changed_unexpectedly:
    await self.log_emergency("Unexpected permission changes detected - security issue", ctx=ctx)
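`verify_backup_integrity()` is not defined in this guide. One possible shape for it, offered purely as a sketch, is to compare a digest of each original file with its copy in the backup directory. The signature, the flat backup layout (one file per original, same file name), and the helper `file_digest` are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_backup_integrity(originals: list[Path], backup_dir: Path) -> bool:
    """Return True only if every original has an identical copy in backup_dir."""
    for original in originals:
        backup = backup_dir / original.name
        if not backup.is_file() or file_digest(backup) != file_digest(original):
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "src"
    bak_dir = Path(tmp) / "bak"
    src_dir.mkdir()
    bak_dir.mkdir()
    original = src_dir / "notes.txt"
    original.write_text("hello")
    (bak_dir / "notes.txt").write_text("hello")
    good = verify_backup_integrity([original], bak_dir)
    # Simulate a corrupted backup of the same length.
    (bak_dir / "notes.txt").write_text("hellp")
    bad = verify_backup_integrity([original], bak_dir)
```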

Git Operations

# In git operations - repository corruption
if git_status_shows_corruption:
    await self.log_emergency("Git repository corruption detected", ctx=ctx)
    return {"error": "Repository integrity compromised"}

🔧 Implementation Strategy

Current FastMCP Compatibility

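from typing import Optional

async def log_emergency(self, message: str, exception: Optional[Exception] = None, ctx: Optional[Context] = None):
    # Include the exception type, if one was passed, so the cause is not lost
    if exception is not None:
        message = f"{message} | Exception: {type(exception).__name__}"

    # ctx is optional, so guard against None before calling its methods
    if ctx is not None:
        # Future-proof: check if emergency() becomes available
        if hasattr(ctx, 'emergency'):
            await ctx.emergency(f"EMERGENCY: {message}")
        else:
            # Fallback to error with EMERGENCY prefix
            await ctx.error(f"EMERGENCY: {message}")

    # Additional emergency actions:
    # - Write to emergency log file
    # - Send alerts to monitoring systems
    # - Trigger backup procedures if needed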
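The "write to emergency log file" action listed above could be sketched with the standard `logging` module. The logger name, file name, and `build_emergency_logger` helper are assumptions. Note that stdlib `logging` has no EMERGENCY level, so this sketch writes at `CRITICAL`, its highest built-in level, and keeps the EMERGENCY prefix in the message.

```python
import logging
import tempfile
from pathlib import Path


def build_emergency_logger(log_path: Path) -> logging.Logger:
    """Return a logger that appends CRITICAL records to a dedicated file."""
    logger = logging.getLogger("emergency")
    logger.setLevel(logging.CRITICAL)
    logger.propagate = False  # keep emergency records out of the root logger
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


log_path = Path(tempfile.mkdtemp()) / "emergency.log"
logger = build_emergency_logger(log_path)
logger.critical("EMERGENCY: Backup failed during bulk rename - data loss risk")
for handler in logger.handlers:
    handler.flush()
logged = log_path.read_text(encoding="utf-8")
```

Writing to a separate file means the record survives even when the client connection that carries `ctx` messages is the thing that failed.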

Severity Decision Tree

Is data corrupted or at risk, or is security compromised?
├─ YES → EMERGENCY
└─ NO → Is the tool completely broken?
        ├─ YES → CRITICAL (error with prefix)
        └─ NO → Is it an expected failure?
                ├─ YES → ERROR
                └─ NO → Is functionality degraded?
                        ├─ YES → WARNING  
                        └─ NO → INFO/DEBUG
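The decision tree above can also be written as a small function, which makes the precedence of the questions explicit. This is a sketch; the `Severity` enum, the `classify` function, and its parameter names are hypothetical. The final leaf of the tree is "INFO/DEBUG", and this sketch returns `INFO` for it.

```python
from enum import Enum


class Severity(Enum):
    EMERGENCY = "emergency"
    CRITICAL = "critical"
    ERROR = "error"
    WARNING = "warning"
    INFO = "info"


def classify(
    *,
    data_or_security_at_risk: bool = False,
    tool_broken: bool = False,
    expected_failure: bool = False,
    degraded: bool = False,
) -> Severity:
    """Walk the decision tree from the most severe question to the least."""
    if data_or_security_at_risk:
        return Severity.EMERGENCY
    if tool_broken:
        return Severity.CRITICAL  # logged via error() with a CRITICAL prefix
    if expected_failure:
        return Severity.ERROR
    if degraded:
        return Severity.WARNING
    return Severity.INFO


level = classify(tool_broken=True, degraded=True)
```

Because the checks run in order, a situation that matches several questions is always assigned the most severe matching level.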

📋 Action Items

✅ DONE - Updated base class with log_emergency() method
🔄 TODO - Identify specific emergency scenarios in our tools
🔄 TODO - Add integrity checks to destructive operations
🔄 TODO - Implement emergency actions (logging, alerts)

emergency() is the most severe level.

Even though FastMCP 2.8.1 doesn't have it yet, we should:

Prepare for it with proper severity categorization
Use emergency logging only for true emergencies (data corruption, security)
Keep critical logging for complete tool failures
Future-proof our implementation for when emergency() becomes available

