enhanced-mcp-tools/docs/archive/EMERGENCY_LOGGING_GUIDE.md
Ryan Malloy de512018cf refactor: Clean up and organize root directory documentation
🧹 Root Directory Cleanup:
- Remove 9 outdated .md files from root directory
- Keep only essential docs in root (README.md, TODO.md)

📚 Reorganized Documentation:
- Move important docs to docs/: SACRED_TRUST_SAFETY.md, UV_BUILD_GUIDE.md, PACKAGE_READY.md
- Archive historical files in docs/archive/: implementation status docs, fix summaries
- Remove duplicate TODO file (kept TODO.md as primary)

 Result: Clean root directory with logical documentation structure
📁 Structure: root (essential) → docs/ (reference) → docs/archive/ (historical)

Improves project maintainability and reduces root directory clutter.
2025-06-23 13:50:17 -06:00


# 🚨 Enhanced Logging Severity Guide
## 📊 **Proper Logging Hierarchy**
The correct severity categorization, from most to least severe:
### 🚨 **EMERGENCY** - `ctx.emergency()` / `log_emergency()`
**RESERVED FOR TRUE EMERGENCIES**
- **Data corruption detected or likely**
- **Security breaches or unauthorized access**
- **System instability that could affect other processes**
- **Backup/recovery failures during critical operations**
**Examples where we SHOULD use emergency:**
```python
# Data corruption scenarios
if checksum_mismatch:
    await self.log_emergency("File checksum mismatch - data corruption detected", ctx=ctx)

# Security issues
if unauthorized_access_detected:
    await self.log_emergency("Unauthorized file system access attempted", ctx=ctx)

# Critical backup failures
if backup_failed_during_destructive_operation:
    await self.log_emergency("Backup failed during bulk rename - data loss risk", ctx=ctx)
```
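
The `checksum_mismatch` flag in the example has to be computed somewhere. A minimal sketch of one way to do it, assuming a SHA-256 digest was recorded before the operation; the helper names `sha256_of` and `checksum_mismatch` are illustrative and not part of the project's code:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_mismatch(path: Path, expected_hex: str) -> bool:
    """True when the file's current digest differs from the recorded one."""
    return sha256_of(path) != expected_hex.lower()
```

A tool would record `sha256_of(path)` before a destructive step and call `checksum_mismatch(path, recorded)` afterwards, escalating to emergency logging only when it returns `True`.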
### 🔴 **CRITICAL** - `ctx.error()` with CRITICAL prefix
**Tool completely fails but no data corruption**
- **Complete tool method failure**
- **Unexpected exceptions that prevent completion**
- **Resource exhaustion or system limits hit**
- **Network failures for critical operations**
**Examples (our current usage):**
```python
# Complete tool failure
try:
    ...  # tool logic (e.g. the git grep call) goes here
except Exception as e:
    await ctx.error(f"CRITICAL: Git grep failed | Exception: {type(e).__name__}")

# Resource exhaustion
if memory_usage > critical_threshold:
    await ctx.error("CRITICAL: Memory usage exceeded safe limits")
```
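
The "complete tool failure" pattern can be factored into a small wrapper so every tool reports CRITICAL failures the same way. This is a sketch only: `run_with_critical_logging` and `RecordingContext` are hypothetical names, and `RecordingContext` is a stand-in used for illustration, not a FastMCP class.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


class RecordingContext:
    """Hypothetical stand-in for a context object; records error messages."""

    def __init__(self) -> None:
        self.errors: List[str] = []

    async def error(self, message: str) -> None:
        self.errors.append(message)


async def run_with_critical_logging(
    ctx: Any,
    tool_name: str,
    operation: Callable[[], Awaitable[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Run a tool coroutine; on an unexpected exception, log CRITICAL and return an error dict."""
    try:
        return await operation()
    except Exception as e:
        await ctx.error(f"CRITICAL: {tool_name} failed | Exception: {type(e).__name__}")
        return {"error": f"{tool_name} failed: {e}"}
```

Returning an error dict instead of re-raising mirrors the `return {"error": ...}` style used elsewhere in this guide; whether to re-raise is a design choice left to each tool.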
### ⚠️ **ERROR** - `ctx.error()`
**Expected failures and recoverable errors**
- **Invalid input parameters**
- **File not found scenarios**
- **Permission denied cases**
- **Configuration errors**
### 🟡 **WARNING** - `ctx.warning()`
**Non-fatal issues and degraded functionality**
- **Fallback mechanisms activated**
- **Performance degradation**
- **Missing optional dependencies**
- **Deprecated feature usage**
### ℹ️ **INFO** - `ctx.info()`
**Normal operational information**
- **Operation progress**
- **Successful completions**
- **Configuration changes**
### 🔧 **DEBUG** - `ctx.debug()`
**Detailed diagnostic information**
- **Variable values**
- **Execution flow details**
- **Performance timings**
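
The six levels above can be captured in one place so that tools pick a level instead of hand-writing prefixes. A minimal sketch, assuming the context exposes only `debug`, `info`, `warning`, and `error`; the `Severity` enum, its numeric values, and `route` are illustrative and not part of the project or FastMCP:

```python
from enum import IntEnum
from typing import Tuple


class Severity(IntEnum):
    """Levels from this guide; numeric values are arbitrary, only their order matters."""

    DEBUG = 10
    INFO = 20
    WARNING = 30
    ERROR = 40
    CRITICAL = 50
    EMERGENCY = 60


# Context method used for each level. CRITICAL and EMERGENCY go through
# error() with a prefix, as described in this guide.
_CTX_METHOD = {
    Severity.DEBUG: "debug",
    Severity.INFO: "info",
    Severity.WARNING: "warning",
    Severity.ERROR: "error",
    Severity.CRITICAL: "error",
    Severity.EMERGENCY: "error",
}


def route(severity: Severity, message: str) -> Tuple[str, str]:
    """Return (context method name, message text) for a given severity."""
    method = _CTX_METHOD[severity]
    if severity >= Severity.CRITICAL:
        message = f"{severity.name}: {message}"
    return method, message
```

A caller could then dispatch with `await getattr(ctx, method)(text)` after unpacking `method, text = route(...)`.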
## 🎯 **Where We Should Add Emergency Logging**
### Archive Operations
```python
# In archive extraction - check for path traversal attacks
if "../" in member_path or member_path.startswith("/"):
    await self.log_emergency("Path traversal attack detected in archive", ctx=ctx)
    return {"error": "Security violation: path traversal detected"}

# In file compression - verify integrity
if original_size != decompressed_size:
    await self.log_emergency("File compression integrity check failed - data corruption", ctx=ctx)
```
```
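
The substring test for `"../"` is a reasonable first filter, but it misses a bare `..` member and flags harmless names such as `notes../x`. A stricter alternative, sketched here under the assumption of a POSIX-style filesystem, resolves the target path and checks that it stays inside the extraction directory. `is_safe_member` is a hypothetical helper, not an existing function in the project:

```python
import os


def is_safe_member(dest_dir: str, member_path: str) -> bool:
    """True if extracting member_path under dest_dir stays inside dest_dir."""
    base = os.path.realpath(dest_dir)
    # os.path.join discards base when member_path is absolute, so absolute
    # members resolve outside the base and are rejected below.
    target = os.path.realpath(os.path.join(base, member_path))
    return os.path.commonpath([base, target]) == base
```

Because `os.path.realpath` also follows symlinks that already exist on disk, this check covers a member that escapes through a previously extracted link, which a pure string test cannot see.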
### File Operations
```python
# In bulk rename - backup verification
if not verify_backup_integrity():
    await self.log_emergency("Backup integrity check failed before bulk operation", ctx=ctx)
    return {"error": "Cannot proceed - backup verification failed"}

# In file operations - unexpected permission changes
if file_permissions_changed_unexpectedly:
    await self.log_emergency("Unexpected permission changes detected - security issue", ctx=ctx)
```
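
One way to derive a flag like `file_permissions_changed_unexpectedly` is to snapshot file modes before the operation and compare afterwards. The sketch below is illustrative only; `snapshot_modes` and `changed_modes` are hypothetical helpers, and the approach assumes the operation itself is not supposed to alter permissions.

```python
import os
import stat
from typing import Dict, Iterable, List


def snapshot_modes(paths: Iterable[str]) -> Dict[str, int]:
    """Record the permission bits of each path before an operation."""
    return {p: stat.S_IMODE(os.stat(p).st_mode) for p in paths}


def changed_modes(before: Dict[str, int]) -> List[str]:
    """Return paths whose permission bits differ from the snapshot.

    A path that no longer exists is also reported, since its mode
    can no longer be confirmed.
    """
    changed = []
    for path, mode in before.items():
        try:
            current = stat.S_IMODE(os.stat(path).st_mode)
        except FileNotFoundError:
            changed.append(path)
            continue
        if current != mode:
            changed.append(path)
    return changed
```

A non-empty result from `changed_modes` after an operation that should leave permissions alone would be the trigger for the emergency log call.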
### Git Operations
```python
# In git operations - repository corruption
if git_status_shows_corruption:
    await self.log_emergency("Git repository corruption detected", ctx=ctx)
    return {"error": "Repository integrity compromised"}
```
## 🔧 **Implementation Strategy**
### Current FastMCP Compatibility
```python
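from typing import Optional

from fastmcp import Context


# Method on the shared tool base class
async def log_emergency(
    self,
    message: str,
    exception: Optional[Exception] = None,
    ctx: Optional[Context] = None,
):
    if ctx is None:
        return  # No context available, nothing to log to
    if exception is not None:
        message = f"{message} | Exception: {type(exception).__name__}"
    # Future-proof: check if emergency() becomes available
    if hasattr(ctx, "emergency"):
        await ctx.emergency(f"EMERGENCY: {message}")
    else:
        # Fallback to error with EMERGENCY prefix
        await ctx.error(f"EMERGENCY: {message}")
    # Additional emergency actions:
    # - Write to emergency log file
    # - Send alerts to monitoring systems
    # - Trigger backup procedures if needed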
```
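
To see the fallback in action without a running server, the sketch below exercises the same `hasattr`-style dispatch against two stand-in context classes. Both classes are hypothetical test doubles, not FastMCP types, and this `log_emergency` is a standalone function written for the demonstration rather than the base-class method itself.

```python
import asyncio
from typing import List, Tuple


class ErrorOnlyContext:
    """Hypothetical context with error() but no emergency()."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    async def error(self, message: str) -> None:
        self.calls.append(("error", message))


class EmergencyContext(ErrorOnlyContext):
    """Hypothetical future context that also provides emergency()."""

    async def emergency(self, message: str) -> None:
        self.calls.append(("emergency", message))


async def log_emergency(ctx, message: str) -> None:
    """Prefer ctx.emergency() when it exists, otherwise fall back to ctx.error()."""
    method = getattr(ctx, "emergency", None) or ctx.error
    await method(f"EMERGENCY: {message}")
```

Running `asyncio.run(log_emergency(ErrorOnlyContext(), "..."))` records an `error` call, while the same call with `EmergencyContext()` records an `emergency` call; the message text is identical in both cases, so log consumers can match on the `EMERGENCY:` prefix either way.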
### Severity Decision Tree
```
Is data corrupted or at risk?
├─ YES → EMERGENCY
└─ NO → Is the tool completely broken?
    ├─ YES → CRITICAL (error with prefix)
    └─ NO → Is it an expected failure?
        ├─ YES → ERROR
        └─ NO → Is functionality degraded?
            ├─ YES → WARNING
            └─ NO → INFO/DEBUG
```
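
The decision tree translates directly into a chain of early returns. A minimal sketch, with a hypothetical function name and keyword flags chosen for illustration; the final branch returns `"INFO"` and leaves the INFO versus DEBUG choice to the caller:

```python
def classify_severity(
    *,
    data_at_risk: bool = False,
    tool_failed: bool = False,
    expected_failure: bool = False,
    degraded: bool = False,
) -> str:
    """Walk the decision tree top-down; the first matching question wins."""
    if data_at_risk:
        return "EMERGENCY"
    if tool_failed:
        return "CRITICAL"
    if expected_failure:
        return "ERROR"
    if degraded:
        return "WARNING"
    return "INFO"
```

Because the checks run in order, a failure that both breaks the tool and puts data at risk is classified as EMERGENCY, matching the tree's top-down priority.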
## 📋 **Action Items**
1. **✅ DONE** - Updated base class with `log_emergency()` method
2. **🔄 TODO** - Identify specific emergency scenarios in our tools
3. **🔄 TODO** - Add integrity checks to destructive operations
4. **🔄 TODO** - Implement emergency actions (logging, alerts)
---
**`emergency()` is the most severe level.**
Even though FastMCP 2.8.1 doesn't have it yet, we should:
- **Prepare for it** with proper severity categorization
- **Use emergency logging** only for true emergencies (data corruption, security)
- **Keep critical logging** for complete tool failures
- **Future-proof** our implementation for when `emergency()` becomes available