crawailer/TESTING_GUIDE.md
Crawailer Developer fd836c90cf Complete Phase 1 critical test coverage expansion and begin Phase 2
Phase 1 Achievements (47 new test scenarios):
• Modern Framework Integration Suite (20 scenarios)
  - React 18 with hooks, state management, component interactions
  - Vue 3 with Composition API, reactivity system, watchers
  - Angular 17 with services, RxJS observables, reactive forms
  - Cross-framework compatibility and performance comparison

• Mobile Browser Compatibility Suite (15 scenarios)
  - iPhone 13/SE, Android Pixel/Galaxy, iPad Air configurations
  - Touch events, gesture support, viewport adaptation
  - Mobile-specific APIs (orientation, battery, network)
  - Safari/Chrome mobile quirks and optimizations

• Advanced User Interaction Suite (12 scenarios)
  - Multi-step form workflows with validation
  - Drag-and-drop file handling and complex interactions
  - Keyboard navigation and ARIA accessibility
  - Multi-page e-commerce workflow simulation

Phase 2 Started - Production Network Resilience:
• Enterprise proxy/firewall scenarios with content filtering
• CDN failover strategies with geographic load balancing
• HTTP connection pooling optimization
• DNS failure recovery mechanisms

Infrastructure Enhancements:
• Local test server with React/Vue/Angular demo applications
• Production-like SPAs with complex state management
• Cross-platform mobile/tablet/desktop configurations
• Network resilience testing framework

Coverage Impact:
• Before: ~70% production coverage (280+ scenarios)
• After Phase 1: ~85% production coverage (327+ scenarios)
• Target Phase 2: ~92% production coverage (357+ scenarios)

Critical gaps closed for modern framework support (90% of websites)
and mobile browser compatibility (60% of traffic).
2025-09-18 09:35:31 -06:00

16 KiB

Crawailer JavaScript API - Comprehensive Testing Guide

This guide provides complete instructions for running and understanding the production-grade test suite for the Crawailer JavaScript API enhancement.

🎯 Test Suite Overview

The test suite consists of 6 comprehensive test modules covering all aspects of production readiness:

Test Categories

Category File Focus Tests Priority
Edge Cases test_edge_cases.py Error scenarios, malformed inputs, encoding 50+ HIGH
Performance test_performance_stress.py Stress testing, resource usage, benchmarks 40+ HIGH
Security test_security_penetration.py Injection attacks, XSS, privilege escalation 60+ CRITICAL
Compatibility test_browser_compatibility.py Cross-browser, viewport, user agents 45+ MEDIUM
Production test_production_scenarios.py Real-world workflows, integrations 35+ HIGH
Regression test_regression_suite.py Comprehensive validation, backwards compatibility 50+ CRITICAL

Total: 280+ comprehensive test cases

🚀 Quick Start

Prerequisites

# Install test dependencies
uv pip install -e ".[dev]"

# Additional testing dependencies (optional but recommended)
uv pip install pytest-asyncio pytest-timeout pytest-cov pytest-html memory-profiler psutil

Running Tests

1. Smoke Tests (Development)

# Quick validation - runs in ~2 minutes
python run_comprehensive_tests.py smoke

2. Critical Tests (Pre-release)

# Essential functionality - runs in ~15 minutes
python run_comprehensive_tests.py critical

3. Full Test Suite (Release validation)

# Complete validation - runs in ~45 minutes
python run_comprehensive_tests.py full

4. Performance Benchmarking

# Performance analysis with resource monitoring
python run_comprehensive_tests.py performance

5. Security Audit

# Security penetration testing
python run_comprehensive_tests.py security

6. CI/CD Pipeline

# Optimized for automated testing
python run_comprehensive_tests.py ci

📊 Test Execution Modes

Smoke Tests

  • Purpose: Quick validation during development
  • Duration: ~2 minutes
  • Coverage: Basic functionality, core features
  • Command: python run_comprehensive_tests.py smoke

Critical Tests

  • Purpose: Pre-release validation
  • Duration: ~15 minutes
  • Coverage: Security, core functionality, error handling
  • Command: python run_comprehensive_tests.py critical

Full Suite

  • Purpose: Complete production readiness validation
  • Duration: ~45 minutes
  • Coverage: All test categories
  • Command: python run_comprehensive_tests.py full

Performance Benchmark

  • Purpose: Performance regression testing
  • Duration: ~20 minutes
  • Coverage: Stress tests, resource monitoring, benchmarks
  • Command: python run_comprehensive_tests.py performance

Security Audit

  • Purpose: Security vulnerability assessment
  • Duration: ~10 minutes
  • Coverage: Injection attacks, privilege escalation, data exfiltration
  • Command: python run_comprehensive_tests.py security

CI/CD Pipeline

  • Purpose: Automated testing in CI environments
  • Duration: ~10 minutes
  • Coverage: Non-slow tests, optimized for automation
  • Command: python run_comprehensive_tests.py ci

🔍 Individual Test Categories

Edge Cases (test_edge_cases.py)

Tests boundary conditions and error scenarios:

# Run edge case tests
pytest tests/test_edge_cases.py -v

# Run specific edge case categories
pytest tests/test_edge_cases.py::TestMalformedJavaScriptCodes -v
pytest tests/test_edge_cases.py::TestNetworkFailureScenarios -v
pytest tests/test_edge_cases.py::TestConcurrencyAndResourceLimits -v

Key Test Classes:

  • TestMalformedJavaScriptCodes - Syntax errors, infinite loops, memory exhaustion
  • TestNetworkFailureScenarios - Timeouts, DNS failures, SSL errors
  • TestConcurrencyAndResourceLimits - Concurrent execution, resource cleanup
  • TestInvalidParameterCombinations - Invalid URLs, empty scripts, timeouts
  • TestEncodingAndSpecialCharacterHandling - Unicode, binary data, control characters

Performance & Stress (test_performance_stress.py)

Tests performance characteristics and resource usage:

# Run performance tests
pytest tests/test_performance_stress.py -v -s

# Run with resource monitoring
pytest tests/test_performance_stress.py::TestHighConcurrencyStress -v -s

Key Test Classes:

  • TestLargeScriptExecution - Large code, large results, complex DOM processing
  • TestHighConcurrencyStress - 100+ concurrent executions, memory usage
  • TestLongRunningScriptTimeouts - Timeout precision, recovery patterns
  • TestResourceLeakDetection - Memory leaks, cleanup verification
  • TestPerformanceRegression - Baseline metrics, throughput measurement

Security Penetration (test_security_penetration.py)

Tests security vulnerabilities and attack prevention:

# Run security tests
pytest tests/test_security_penetration.py -v

# Run specific security categories
pytest tests/test_security_penetration.py::TestScriptInjectionPrevention -v
pytest tests/test_security_penetration.py::TestDataExfiltrationPrevention -v

Key Test Classes:

  • TestScriptInjectionPrevention - Code injection, XSS, CSP bypass
  • TestPrivilegeEscalationPrevention - File access, cross-origin, Node.js escape
  • TestInformationDisclosurePrevention - Sensitive data, fingerprinting, timing attacks
  • TestResourceExhaustionAttacks - Infinite loops, memory bombs, DOM bombing
  • TestDataExfiltrationPrevention - Network exfiltration, covert channels, DNS tunneling

Browser Compatibility (test_browser_compatibility.py)

Tests cross-browser and device compatibility:

# Run compatibility tests
pytest tests/test_browser_compatibility.py -v

# Test specific browser engines
pytest tests/test_browser_compatibility.py::TestPlaywrightBrowserEngines -v

Key Test Classes:

  • TestPlaywrightBrowserEngines - Chromium, Firefox, WebKit differences
  • TestHeadlessVsHeadedBehavior - Mode differences, window properties
  • TestViewportAndDeviceEmulation - Responsive design, device pixel ratios
  • TestUserAgentAndFingerprinting - UA consistency, automation detection
  • TestCrossFrameAndDomainBehavior - iframe access, CORS restrictions

Production Scenarios (test_production_scenarios.py)

Tests real-world production workflows:

# Run production scenario tests
pytest tests/test_production_scenarios.py -v -s

# Test specific workflows
pytest tests/test_production_scenarios.py::TestComplexWorkflows -v

Key Test Classes:

  • TestComplexWorkflows - E-commerce monitoring, social media analysis, news aggregation
  • TestDatabaseIntegrationEdgeCases - Transaction handling, connection failures
  • TestFileSystemInteractionEdgeCases - File downloads, large files, permissions
  • TestNetworkInterruptionHandling - Timeout recovery, partial failures
  • TestProductionErrorScenarios - Cascading failures, resource exhaustion

Regression Suite (test_regression_suite.py)

Comprehensive validation and backwards compatibility:

# Run regression tests
pytest tests/test_regression_suite.py -v

# Test specific aspects
pytest tests/test_regression_suite.py::TestVersionCompatibility -v
pytest tests/test_regression_suite.py::TestContinuousIntegration -v

Key Test Classes:

  • TestRegressionSuite - Full regression validation
  • TestVersionCompatibility - Feature evolution, migration paths
  • TestContinuousIntegration - CI/CD smoke tests, resource cleanup

📈 Performance Benchmarks

The test suite establishes performance baselines:

Execution Time Benchmarks

  • Basic Script Execution: < 100ms average
  • DOM Query Operations: < 200ms average
  • Data Processing (1K items): < 300ms average
  • Concurrent Operations (10): < 2s total
  • Large Data Handling (10MB): < 30s total

Resource Usage Thresholds

  • Memory Growth: < 100MB per 100 operations
  • Thread Leakage: < 5 threads delta after cleanup
  • File Descriptor Leaks: < 20 FDs delta
  • CPU Usage: < 80% average during stress tests

Throughput Targets

  • Serial Execution: > 10 operations/second
  • Concurrent Execution: > 20 operations/second
  • Speedup Ratio: > 1.5x concurrent vs serial

🔒 Security Test Coverage

The security test suite covers:

Injection Attacks

  • JavaScript code injection
  • XSS payload testing
  • SQL injection attempts
  • Command injection prevention

Privilege Escalation

  • File system access attempts
  • Cross-origin resource access
  • Node.js context escape attempts
  • Prototype pollution attacks

Information Disclosure

  • Sensitive data access attempts
  • Browser fingerprinting prevention
  • Timing attack prevention
  • Error message sanitization

Resource Exhaustion

  • Infinite loop protection
  • Memory bomb prevention
  • DOM bombing protection
  • Network flood prevention

Data Exfiltration

  • Network-based exfiltration
  • Covert channel prevention
  • DNS tunneling prevention
  • Encoding bypass attempts

🎯 Quality Metrics & Thresholds

Pass Rate Requirements

  • Critical Tests: 100% pass rate required
  • Performance Tests: 90% pass rate required
  • Security Tests: 100% pass rate required
  • Compatibility Tests: 85% pass rate required

Performance Thresholds

  • Test Execution Time: < 45 minutes for full suite
  • Memory Usage: < 500MB peak during testing
  • CPU Usage: < 90% peak during stress tests
  • Resource Cleanup: 100% successful cleanup

Coverage Requirements

  • Code Coverage: > 90% (with pytest-cov)
  • Feature Coverage: 100% of JavaScript API features
  • Error Scenario Coverage: > 95% of error conditions
  • Browser Coverage: Chrome, Firefox, Safari equivalents

🛠️ Advanced Testing Options

Custom Pytest Arguments

# Run with custom markers
pytest -m "security and critical" -v

# Run with coverage reporting
pytest --cov=src/crawailer --cov-report=html

# Run with performance profiling
pytest --tb=short --durations=0

# Run with parallel execution
pytest -n auto  # Requires pytest-xdist

# Run with timeout protection
pytest --timeout=300  # Requires pytest-timeout

Environment Variables

# Skip slow tests
export PYTEST_SKIP_SLOW=1

# Increase verbosity
export PYTEST_VERBOSITY=2

# Custom test timeout
export PYTEST_TIMEOUT=600

# Generate HTML reports
export PYTEST_HTML_REPORT=1

Custom Test Configurations

Create custom pytest configurations in pytest.ini:

[tool:pytest]
# Custom marker for your specific needs
markers =
    custom: marks tests for custom scenarios

# Custom test paths
testpaths = tests custom_tests

# Custom output format
addopts = --tb=long --capture=no

📋 Continuous Integration Setup

GitHub Actions Example

name: Comprehensive Test Suite

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.11, 3.12]
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    
    - name: Install dependencies
      run: |
        pip install uv
        uv pip install -e ".[dev]"
        playwright install chromium
    
    - name: Run smoke tests
      run: python run_comprehensive_tests.py smoke
    
    - name: Run critical tests
      run: python run_comprehensive_tests.py critical
    
    - name: Run security audit
      run: python run_comprehensive_tests.py security
    
    - name: Upload test results
      if: always()
      uses: actions/upload-artifact@v3
      with:
        name: test-results
        path: test-results.xml

Jenkins Pipeline Example

pipeline {
    agent any
    
    stages {
        stage('Setup') {
            steps {
                sh 'pip install uv'
                sh 'uv pip install -e ".[dev]"'
                sh 'playwright install chromium'
            }
        }
        
        stage('Smoke Tests') {
            steps {
                sh 'python run_comprehensive_tests.py smoke'
            }
        }
        
        stage('Critical Tests') {
            steps {
                sh 'python run_comprehensive_tests.py critical'
            }
        }
        
        stage('Security Audit') {
            when { branch 'main' }
            steps {
                sh 'python run_comprehensive_tests.py security'
            }
        }
        
        stage('Full Suite') {
            when { branch 'release/*' }
            steps {
                sh 'python run_comprehensive_tests.py full'
            }
        }
    }
    
    post {
        always {
            publishTestResults testResultsPattern: 'test-results.xml'
            archiveArtifacts artifacts: 'test_results_*.json'
        }
    }
}

🐛 Troubleshooting

Common Issues

Test Timeouts

# Increase timeout for slow environments
pytest --timeout=600 tests/

# Skip timeout-prone tests
pytest -m "not slow" tests/

Memory Issues

# Run tests with memory monitoring
python run_comprehensive_tests.py performance --save-results

# Check for memory leaks
pytest tests/test_performance_stress.py::TestResourceLeakDetection -v -s

Browser Issues

# Reinstall browser binaries
playwright install chromium

# Run tests with headed browsers for debugging
pytest tests/test_browser_compatibility.py -v -s

Concurrency Issues

# Run tests serially
pytest -n 1 tests/

# Check for race conditions
pytest tests/test_edge_cases.py::TestConcurrencyAndResourceLimits -v -s

Debug Mode

Enable verbose debugging:

# Maximum verbosity
pytest -vvv -s --tb=long tests/

# Show test setup/teardown
pytest --setup-show tests/

# Show test durations
pytest --durations=0 tests/

# Debug specific test
pytest tests/test_edge_cases.py::TestMalformedJavaScriptCodes::test_syntax_error_javascript -vvv -s

📊 Test Reporting

Generate Comprehensive Reports

# Generate HTML report
python run_comprehensive_tests.py full --report-file test_report.html

# Save detailed results
python run_comprehensive_tests.py full --save-results

# Generate JUnit XML for CI
pytest --junitxml=test-results.xml tests/

# Generate coverage report
pytest --cov=src/crawailer --cov-report=html tests/

Report Formats

The test suite generates multiple report formats:

  • Console Output: Real-time progress and results
  • JSON Results: Machine-readable test data
  • HTML Reports: Detailed visual reports
  • JUnit XML: CI/CD integration format
  • Coverage Reports: Code coverage analysis

🎯 Best Practices

For Developers

  1. Run smoke tests before committing code
  2. Run critical tests before merging to main
  3. Check performance impact for optimization changes
  4. Verify security for any API modifications
  5. Update tests when adding new features

For Release Managers

  1. Run full suite before any release
  2. Review security audit results carefully
  3. Check performance benchmarks for regressions
  4. Validate browser compatibility across targets
  5. Ensure all critical tests pass at 100%

For CI/CD Setup

  1. Use appropriate test modes for different triggers
  2. Set proper timeouts for your environment
  3. Archive test results for historical analysis
  4. Configure notifications for critical failures
  5. Run security audits on every release branch

📞 Support

For questions about the test suite:

  1. Check the test output for specific error messages
  2. Review the troubleshooting section above
  3. Run tests in debug mode for detailed information
  4. Check the individual test file documentation
  5. Review the CI/CD pipeline logs for environment issues

The comprehensive test suite ensures production readiness of the Crawailer JavaScript API enhancement with 280+ test cases covering all aspects of functionality, security, performance, and compatibility.