playwright-mcp/RIPGREP_INTEGRATION_DESIGN.md
Ryan Malloy 9afa25855e feat: revolutionary integration of differential snapshots with ripgrep filtering
Combines our 99% response reduction differential snapshots with MCPlaywright's
proven ripgrep filtering system to create unprecedented browser automation precision.

Key Features:
- Universal TypeScript ripgrep filtering engine with async processing
- Seamless integration with React-style differential reconciliation
- Enhanced browser_configure_snapshots with 8 new filtering parameters
- Surgical precision targeting: 99.8%+ total response reduction
- Sub-100ms performance with comprehensive metrics and feedback

Technical Implementation:
- src/filtering/engine.ts: High-performance filtering with temp file management
- src/filtering/models.ts: Type-safe interfaces for differential filtering
- src/filtering/decorators.ts: MCP tool integration decorators
- Enhanced configuration system with intelligent defaults

Performance Achievement:
- Before: 1000+ line snapshots requiring manual parsing
- With Differential: 99% reduction (6-20 lines) with semantic understanding
- With Combined Filtering: 99.8%+ reduction (1-3 lines) with surgical targeting

Establishes new gold standard for browser automation efficiency and precision.
2025-09-20 14:20:41 -06:00

15 KiB

🎯 Ripgrep Integration Design for Playwright MCP

🚀 Vision: Supercharged Differential Snapshots

Goal: Combine our revolutionary 99% response reduction with MCPlaywright's powerful ripgrep filtering to create the most precise browser automation system ever built.

🎪 Integration Scenarios

Scenario 1: Filtered Element Changes

# Command
browser_configure_snapshots {
  "differentialSnapshots": true,
  "filterPattern": "button.*submit|input.*email",
  "filterFields": ["element.text", "element.attributes"]
}

# Enhanced Response  
🔍 Filtered Differential Snapshot (3 matches found)

🆕 Changes detected:
- 🆕 Added: 1 interactive element matching pattern
  - <button class="submit-btn" ref=e234>Submit Form</button>
- 🔄 Modified: 1 element matching pattern  
  - <input type="email" placeholder="Enter email" ref=e156>
- Pattern: "button.*submit|input.*email"
- Fields searched: ["element.text", "element.attributes"]  
- Match efficiency: 3 matches from 847 total changes (99.6% noise reduction)

Scenario 2: Console Error Hunting

# Command
browser_navigate("https://buggy-site.com")
# With filtering: {filterPattern: "TypeError|ReferenceError", filterFields: ["console.message"]}

# Enhanced Response
🔄 Filtered Differential Snapshot (2 critical errors found)

🆕 Changes detected:
- 📍 URL changed: / → /buggy-site.com
- 🔍 Filtered console activity (2 critical errors):
  - TypeError: Cannot read property 'id' of undefined at Component.render:45
  - ReferenceError: validateForm is not defined at form.submit:12
- Pattern: "TypeError|ReferenceError" 
- Total console messages: 127, Filtered: 2 (98.4% noise reduction)

Scenario 3: Form Interaction Precision

# Command  
browser_type("user@example.com", ref="e123")
# With filtering: {filterPattern: "form.*validation|error", filterFields: ["element.text", "console.message"]}

# Enhanced Response
🔍 Filtered Differential Snapshot (validation triggered)

🆕 Changes detected:
- 🆕 Added: 1 validation element
  - <span class="error-message" ref=e789>Invalid email format</span>
- 🔍 Filtered console activity (1 validation event):
  - Form validation triggered: email field validation failed
- Pattern: "form.*validation|error"
- Match precision: 100% (found exactly what matters)

🏗️ Technical Architecture

Enhanced Configuration Schema

// Enhanced: src/tools/configure.ts
const configureSnapshotsSchema = z.object({
  // Existing differential snapshot options
  differentialSnapshots: z.boolean().optional(),
  differentialMode: z.enum(['semantic', 'simple', 'both']).optional(),
  maxSnapshotTokens: z.number().optional(),
  
  // New ripgrep filtering options
  filterPattern: z.string().optional().describe('Ripgrep pattern to filter changes'),
  filterFields: z.array(z.string()).optional().describe('Fields to search: element.text, element.attributes, console.message, url, title'),
  caseSensitive: z.boolean().optional().describe('Case sensitive pattern matching'),
  wholeWords: z.boolean().optional().describe('Match whole words only'),
  invertMatch: z.boolean().optional().describe('Invert match (show non-matches)'),
  maxMatches: z.number().optional().describe('Maximum number of matches to return'),
  
  // Advanced options
  filterMode: z.enum(['content', 'count', 'files']).optional().describe('Type of filtering output'),
  contextLines: z.number().optional().describe('Include N lines of context around matches')
});

Core Integration Points

1. Enhanced Context Configuration

// Enhanced: src/context.ts
export class Context {
  // Existing differential config
  private _differentialSnapshots: boolean = false;
  private _differentialMode: 'semantic' | 'simple' | 'both' = 'semantic';
  
  // New filtering config
  private _filterPattern?: string;
  private _filterFields?: string[];
  private _caseSensitive: boolean = true;
  private _wholeWords: boolean = false;
  private _invertMatch: boolean = false;
  private _maxMatches?: number;
  
  // Enhanced update method
  updateSnapshotConfig(updates: {
    // Existing options
    differentialSnapshots?: boolean;
    differentialMode?: 'semantic' | 'simple' | 'both';
    
    // New filtering options
    filterPattern?: string;
    filterFields?: string[];
    caseSensitive?: boolean;
    wholeWords?: boolean;
    invertMatch?: boolean;
    maxMatches?: number;
  }): void {
    // Update all configuration options
    // Reset differential state if major changes
  }
}

2. Ripgrep Engine Integration

// New: src/tools/filtering/ripgrepEngine.ts
interface FilterableChange {
  type: 'url' | 'title' | 'element' | 'console';
  content: string;
  metadata: Record<string, any>;
}

interface FilterResult {
  matches: FilterableChange[];
  totalChanges: number;
  matchCount: number;
  pattern: string;
  fieldsSearched: string[];
  executionTime: number;
}

class DifferentialRipgrepEngine {
  async filterDifferentialChanges(
    changes: DifferentialSnapshot,
    filterPattern: string,
    options: FilterOptions
  ): Promise<FilterResult> {
    // 1. Convert differential changes to filterable content
    const filterableContent = this.extractFilterableContent(changes, options.filterFields);
    
    // 2. Apply ripgrep filtering
    const ripgrepResults = await this.executeRipgrep(filterableContent, filterPattern, options);
    
    // 3. Reconstruct filtered differential response
    return this.reconstructFilteredResponse(changes, ripgrepResults);
  }
  
  private extractFilterableContent(
    changes: DifferentialSnapshot, 
    fields?: string[]
  ): FilterableChange[] {
    const content: FilterableChange[] = [];
    
    // Extract URL changes
    if (!fields || fields.includes('url') || fields.includes('url_changes')) {
      if (changes.urlChanged) {
        content.push({
          type: 'url',
          content: `url:${changes.urlChanged.from}${changes.urlChanged.to}`,
          metadata: { from: changes.urlChanged.from, to: changes.urlChanged.to }
        });
      }
    }
    
    // Extract element changes
    if (!fields || fields.some(f => f.startsWith('element.'))) {
      changes.elementsAdded?.forEach(element => {
        content.push({
          type: 'element',
          content: this.elementToSearchableText(element, fields),
          metadata: { action: 'added', element }
        });
      });
      
      changes.elementsModified?.forEach(modification => {
        content.push({
          type: 'element', 
          content: this.elementToSearchableText(modification.after, fields),
          metadata: { action: 'modified', before: modification.before, after: modification.after }
        });
      });
    }
    
    // Extract console changes
    if (!fields || fields.includes('console.message') || fields.includes('console')) {
      changes.consoleActivity?.forEach(message => {
        content.push({
          type: 'console',
          content: `console.${message.level}:${message.text}`,
          metadata: { message }
        });
      });
    }
    
    return content;
  }
  
  private elementToSearchableText(element: AccessibilityNode, fields?: string[]): string {
    const parts: string[] = [];
    
    if (!fields || fields.includes('element.text')) {
      parts.push(`text:${element.text}`);
    }
    
    if (!fields || fields.includes('element.attributes')) {
      Object.entries(element.attributes || {}).forEach(([key, value]) => {
        parts.push(`${key}:${value}`);
      });
    }
    
    if (!fields || fields.includes('element.role')) {
      parts.push(`role:${element.role}`);
    }
    
    if (!fields || fields.includes('element.ref')) {
      parts.push(`ref:${element.ref}`);
    }
    
    return parts.join(' ');
  }
  
  private async executeRipgrep(
    content: FilterableChange[],
    pattern: string,
    options: FilterOptions
  ): Promise<RipgrepResult> {
    // Create temporary file with searchable content
    const tempFile = await this.createTempSearchFile(content);
    
    try {
      // Build ripgrep command
      const cmd = this.buildRipgrepCommand(pattern, options, tempFile);
      
      // Execute ripgrep
      const result = await this.runRipgrepCommand(cmd);
      
      // Parse results
      return this.parseRipgrepOutput(result, content);
      
    } finally {
      // Cleanup
      await fs.unlink(tempFile);
    }
  }
}

3. Enhanced Differential Generation

// Enhanced: src/context.ts - generateDifferentialSnapshot method
private async generateDifferentialSnapshot(rawSnapshot: string): Promise<string> {
  // Existing differential generation logic...
  const changes = this.computeSemanticChanges(oldTree, newTree);
  
  // NEW: Apply filtering if configured
  if (this._filterPattern) {
    const ripgrepEngine = new DifferentialRipgrepEngine();
    const filteredResult = await ripgrepEngine.filterDifferentialChanges(
      changes,
      this._filterPattern,
      {
        filterFields: this._filterFields,
        caseSensitive: this._caseSensitive,
        wholeWords: this._wholeWords,
        invertMatch: this._invertMatch,
        maxMatches: this._maxMatches
      }
    );
    
    return this.formatFilteredDifferentialSnapshot(filteredResult);
  }
  
  // Existing formatting logic...
  return this.formatDifferentialSnapshot(changes);
}

private formatFilteredDifferentialSnapshot(filterResult: FilterResult): string {
  const lines: string[] = [];
  
  lines.push('🔍 Filtered Differential Snapshot');
  lines.push('');
  lines.push(`**📊 Filter Results:** ${filterResult.matchCount} matches from ${filterResult.totalChanges} changes`);
  lines.push('');
  
  if (filterResult.matchCount === 0) {
    lines.push('🚫 **No matches found**');
    lines.push(`- Pattern: "${filterResult.pattern}"`);
    lines.push(`- Fields searched: [${filterResult.fieldsSearched.join(', ')}]`);
    lines.push(`- Total changes available: ${filterResult.totalChanges}`);
    return lines.join('\n');
  }
  
  lines.push('🆕 **Filtered changes detected:**');
  
  // Group matches by type
  const grouped = this.groupMatchesByType(filterResult.matches);
  
  if (grouped.url.length > 0) {
    lines.push(`- 📍 **URL changes matching pattern:**`);
    grouped.url.forEach(match => {
      lines.push(`  - ${match.metadata.from}${match.metadata.to}`);
    });
  }
  
  if (grouped.element.length > 0) {
    lines.push(`- 🎯 **Element changes matching pattern:**`);
    grouped.element.forEach(match => {
      const action = match.metadata.action === 'added' ? '🆕 Added' : '🔄 Modified';
      lines.push(`  - ${action}: ${this.summarizeElement(match.metadata.element)}`);
    });
  }
  
  if (grouped.console.length > 0) {
    lines.push(`- 🔍 **Console activity matching pattern:**`);
    grouped.console.forEach(match => {
      const msg = match.metadata.message;
      lines.push(`  - [${msg.level.toUpperCase()}] ${msg.text}`);
    });
  }
  
  lines.push('');
  lines.push('**📈 Filter Performance:**');
  lines.push(`- Pattern: "${filterResult.pattern}"`);
  lines.push(`- Fields searched: [${filterResult.fieldsSearched.join(', ')}]`);
  lines.push(`- Execution time: ${filterResult.executionTime}ms`);
  lines.push(`- Precision: ${((filterResult.matchCount / filterResult.totalChanges) * 100).toFixed(1)}% match rate`);
  
  return lines.join('\n');
}

🎛️ Configuration Examples

Basic Pattern Filtering

# Enable differential snapshots with element filtering
browser_configure_snapshots {
  "differentialSnapshots": true,
  "filterPattern": "button|input", 
  "filterFields": ["element.text", "element.role"]
}

Advanced Error Detection

# Focus on JavaScript errors and form validation
browser_configure_snapshots {
  "differentialSnapshots": true,
  "filterPattern": "(TypeError|ReferenceError|validation.*failed)",
  "filterFields": ["console.message", "element.text"],
  "caseSensitive": false,
  "maxMatches": 10
}

Debugging Workflow

# Track specific component interactions
browser_configure_snapshots {
  "differentialSnapshots": true,
  "differentialMode": "both",
  "filterPattern": "react.*component|props.*validation",
  "filterFields": ["console.message", "element.attributes"],
  "contextLines": 2
}

📊 Expected Performance Impact

Positive Impacts

  • Ultra-precision: From 99% reduction to 99.8%+ reduction
  • Faster debugging: Find exactly what you need instantly
  • Reduced cognitive load: Even less irrelevant information
  • Pattern-based intelligence: Leverage powerful regex capabilities

Performance Considerations

  • ⚠️ Ripgrep overhead: +10-50ms processing time for filtering
  • ⚠️ Memory usage: Temporary files for large differential changes
  • ⚠️ Complexity: Additional configuration options to understand

Mitigation Strategies

  • 🎯 Smart defaults: Only filter when patterns provided
  • 🎯 Efficient processing: Filter minimal differential data, not raw snapshots
  • 🎯 Async operation: Non-blocking ripgrep execution
  • 🎯 Graceful fallbacks: Return unfiltered data if ripgrep fails

🚀 Implementation Timeline

Phase 1: Foundation (Week 1)

  • Create ripgrep engine TypeScript module
  • Enhance configuration schema and validation
  • Add filter parameters to configure tool
  • Basic integration testing

Phase 2: Core Integration (Week 2)

  • Integrate ripgrep engine with differential generation
  • Implement filtered response formatting
  • Add comprehensive error handling
  • Performance optimization

Phase 3: Enhancement (Week 3)

  • Advanced filtering modes (count, context, invert)
  • Streaming support for large changes
  • Field-specific optimization
  • Comprehensive testing

Phase 4: Polish (Week 4)

  • Documentation and examples
  • Performance benchmarking
  • User experience refinement
  • Integration validation

🎉 Success Metrics

Technical Goals

  • Maintain 99%+ response reduction with optional filtering
  • Sub-100ms filtering performance for typical patterns
  • Zero breaking changes to existing functionality
  • Comprehensive test coverage for all filter combinations

User Experience Goals

  • Intuitive configuration with smart defaults
  • Clear filter feedback showing match counts and performance
  • Powerful debugging capabilities for complex applications
  • Seamless integration with existing differential workflows

🌟 Conclusion

By integrating MCPlaywright's ripgrep system with our revolutionary differential snapshots, we can create the most precise and powerful browser automation response system ever built.

The combination delivers:

  • 99%+ response size reduction (differential snapshots)
  • Surgical precision targeting (ripgrep filtering)
  • Lightning-fast performance (optimized architecture)
  • Zero learning curve (familiar differential UX)

This integration would establish a new gold standard for browser automation efficiency and precision. 🚀