# Crawailer JavaScript API - Comprehensive Testing Guide

This guide provides complete instructions for running and understanding the production-grade test suite for the Crawailer JavaScript API enhancement.

## 🎯 Test Suite Overview

The test suite consists of **6 comprehensive test modules** covering all aspects of production readiness:

### Test Categories

| Category | File | Focus | Tests | Priority |
|----------|------|-------|-------|----------|
| **Edge Cases** | `test_edge_cases.py` | Error scenarios, malformed inputs, encoding | 50+ | HIGH |
| **Performance** | `test_performance_stress.py` | Stress testing, resource usage, benchmarks | 40+ | HIGH |
| **Security** | `test_security_penetration.py` | Injection attacks, XSS, privilege escalation | 60+ | CRITICAL |
| **Compatibility** | `test_browser_compatibility.py` | Cross-browser, viewport, user agents | 45+ | MEDIUM |
| **Production** | `test_production_scenarios.py` | Real-world workflows, integrations | 35+ | HIGH |
| **Regression** | `test_regression_suite.py` | Comprehensive validation, backwards compatibility | 50+ | CRITICAL |

**Total: 280+ comprehensive test cases**

## 🚀 Quick Start

### Prerequisites

```bash
# Install test dependencies
uv pip install -e ".[dev]"

# Additional testing dependencies (optional but recommended)
uv pip install pytest-asyncio pytest-timeout pytest-cov pytest-html memory-profiler psutil
```

### Running Tests

#### 1. Smoke Tests (Development)

```bash
# Quick validation - runs in ~2 minutes
python run_comprehensive_tests.py smoke
```

#### 2. Critical Tests (Pre-release)

```bash
# Essential functionality - runs in ~15 minutes
python run_comprehensive_tests.py critical
```

#### 3. Full Test Suite (Release validation)

```bash
# Complete validation - runs in ~45 minutes
python run_comprehensive_tests.py full
```

#### 4. Performance Benchmarking

```bash
# Performance analysis with resource monitoring
python run_comprehensive_tests.py performance
```

#### 5. Security Audit

```bash
# Security penetration testing
python run_comprehensive_tests.py security
```
#### 6. CI/CD Pipeline

```bash
# Optimized for automated testing
python run_comprehensive_tests.py ci
```

## 📊 Test Execution Modes

### Smoke Tests
- **Purpose**: Quick validation during development
- **Duration**: ~2 minutes
- **Coverage**: Basic functionality, core features
- **Command**: `python run_comprehensive_tests.py smoke`

### Critical Tests
- **Purpose**: Pre-release validation
- **Duration**: ~15 minutes
- **Coverage**: Security, core functionality, error handling
- **Command**: `python run_comprehensive_tests.py critical`

### Full Suite
- **Purpose**: Complete production readiness validation
- **Duration**: ~45 minutes
- **Coverage**: All test categories
- **Command**: `python run_comprehensive_tests.py full`

### Performance Benchmark
- **Purpose**: Performance regression testing
- **Duration**: ~20 minutes
- **Coverage**: Stress tests, resource monitoring, benchmarks
- **Command**: `python run_comprehensive_tests.py performance`

### Security Audit
- **Purpose**: Security vulnerability assessment
- **Duration**: ~10 minutes
- **Coverage**: Injection attacks, privilege escalation, data exfiltration
- **Command**: `python run_comprehensive_tests.py security`

### CI/CD Pipeline
- **Purpose**: Automated testing in CI environments
- **Duration**: ~10 minutes
- **Coverage**: Non-slow tests, optimized for automation
- **Command**: `python run_comprehensive_tests.py ci`

## 🔍 Individual Test Categories

### Edge Cases (`test_edge_cases.py`)

Tests boundary conditions and error scenarios:

```bash
# Run edge case tests
pytest tests/test_edge_cases.py -v

# Run specific edge case categories
pytest tests/test_edge_cases.py::TestMalformedJavaScriptCodes -v
pytest tests/test_edge_cases.py::TestNetworkFailureScenarios -v
pytest tests/test_edge_cases.py::TestConcurrencyAndResourceLimits -v
```

**Key Test Classes:**
- `TestMalformedJavaScriptCodes` - Syntax errors, infinite loops, memory exhaustion
- `TestNetworkFailureScenarios` - Timeouts, DNS failures, SSL errors
- `TestConcurrencyAndResourceLimits` - Concurrent execution, resource cleanup
- `TestInvalidParameterCombinations` - Invalid URLs, empty scripts, timeouts
- `TestEncodingAndSpecialCharacterHandling` - Unicode, binary data, control characters

### Performance & Stress (`test_performance_stress.py`)

Tests performance characteristics and resource usage:

```bash
# Run performance tests
pytest tests/test_performance_stress.py -v -s

# Run with resource monitoring
pytest tests/test_performance_stress.py::TestHighConcurrencyStress -v -s
```

**Key Test Classes:**
- `TestLargeScriptExecution` - Large code, large results, complex DOM processing
- `TestHighConcurrencyStress` - 100+ concurrent executions, memory usage
- `TestLongRunningScriptTimeouts` - Timeout precision, recovery patterns
- `TestResourceLeakDetection` - Memory leaks, cleanup verification
- `TestPerformanceRegression` - Baseline metrics, throughput measurement
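Leak-detection tests of this kind typically follow a measure-exercise-measure pattern around process-level metrics. The snippet below is a minimal sketch of that pattern, assuming `psutil` is installed; `run_workload` is a placeholder, not the suite's actual helper:

```python
import gc

import psutil


def run_workload() -> None:
    """Placeholder for the operation under test (hypothetical helper)."""
    _ = [object() for _ in range(10_000)]


def test_memory_growth_stays_bounded():
    process = psutil.Process()
    gc.collect()
    baseline = process.memory_info().rss

    # Exercise the workload repeatedly, mirroring the "per 100 operations" budget.
    for _ in range(100):
        run_workload()

    gc.collect()
    growth_mb = (process.memory_info().rss - baseline) / (1024 * 1024)

    # Matches the documented threshold: < 100MB growth per 100 operations
    # (see Resource Usage Thresholds under Performance Benchmarks below).
    assert growth_mb < 100, f"Memory grew by {growth_mb:.1f}MB"
```

The thread and file-descriptor deltas listed under Resource Usage Thresholds can be checked the same way, using `process.num_threads()` and `process.num_fds()` as the before/after measurements.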
### Security Penetration (`test_security_penetration.py`)

Tests security vulnerabilities and attack prevention:

```bash
# Run security tests
pytest tests/test_security_penetration.py -v

# Run specific security categories
pytest tests/test_security_penetration.py::TestScriptInjectionPrevention -v
pytest tests/test_security_penetration.py::TestDataExfiltrationPrevention -v
```

**Key Test Classes:**
- `TestScriptInjectionPrevention` - Code injection, XSS, CSP bypass
- `TestPrivilegeEscalationPrevention` - File access, cross-origin, Node.js escape
- `TestInformationDisclosurePrevention` - Sensitive data, fingerprinting, timing attacks
- `TestResourceExhaustionAttacks` - Infinite loops, memory bombs, DOM bombing
- `TestDataExfiltrationPrevention` - Network exfiltration, covert channels, DNS tunneling

### Browser Compatibility (`test_browser_compatibility.py`)

Tests cross-browser and device compatibility:

```bash
# Run compatibility tests
pytest tests/test_browser_compatibility.py -v

# Test specific browser engines
pytest tests/test_browser_compatibility.py::TestPlaywrightBrowserEngines -v
```

**Key Test Classes:**
- `TestPlaywrightBrowserEngines` - Chromium, Firefox, WebKit differences
- `TestHeadlessVsHeadedBehavior` - Mode differences, window properties
- `TestViewportAndDeviceEmulation` - Responsive design, device pixel ratios
- `TestUserAgentAndFingerprinting` - UA consistency, automation detection
- `TestCrossFrameAndDomainBehavior` - iframe access, CORS restrictions

### Production Scenarios (`test_production_scenarios.py`)

Tests real-world production workflows:

```bash
# Run production scenario tests
pytest tests/test_production_scenarios.py -v -s

# Test specific workflows
pytest tests/test_production_scenarios.py::TestComplexWorkflows -v
```

**Key Test Classes:**
- `TestComplexWorkflows` - E-commerce monitoring, social media analysis, news aggregation
- `TestDatabaseIntegrationEdgeCases` - Transaction handling, connection failures
- `TestFileSystemInteractionEdgeCases` - File downloads, large files, permissions
- `TestNetworkInterruptionHandling` - Timeout recovery, partial failures
- `TestProductionErrorScenarios` - Cascading failures, resource exhaustion

### Regression Suite (`test_regression_suite.py`)

Comprehensive validation and backwards compatibility:

```bash
# Run regression tests
pytest tests/test_regression_suite.py -v

# Test specific aspects
pytest tests/test_regression_suite.py::TestVersionCompatibility -v
pytest tests/test_regression_suite.py::TestContinuousIntegration -v
```

**Key Test Classes:**
- `TestRegressionSuite` - Full regression validation
- `TestVersionCompatibility` - Feature evolution, migration paths
- `TestContinuousIntegration` - CI/CD smoke tests, resource cleanup

## 📈 Performance Benchmarks

The test suite establishes performance baselines:

### Execution Time Benchmarks
- **Basic Script Execution**: < 100ms average
- **DOM Query Operations**: < 200ms average
- **Data Processing (1K items)**: < 300ms average
- **Concurrent Operations (10)**: < 2s total
- **Large Data Handling (10MB)**: < 30s total

### Resource Usage Thresholds
- **Memory Growth**: < 100MB per 100 operations
- **Thread Leakage**: < 5 threads delta after cleanup
- **File Descriptor Leaks**: < 20 FDs delta
- **CPU Usage**: < 80% average during stress tests

### Throughput Targets
- **Serial Execution**: > 10 operations/second
- **Concurrent Execution**: > 20 operations/second
- **Speedup Ratio**: > 1.5x concurrent vs serial
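The speedup target above can be measured by timing the same batch of operations serially and then concurrently and comparing the two. A minimal, self-contained sketch using `asyncio`; the `run_script` coroutine is a stand-in for a real browser-backed execution, not the Crawailer API:

```python
import asyncio
import time


async def run_script(i: int) -> int:
    """Stand-in for one JavaScript execution (hypothetical); the sleep
    simulates I/O-bound page work."""
    await asyncio.sleep(0.05)
    return i


async def measure() -> None:
    n = 10

    # Serial baseline: one operation at a time.
    start = time.perf_counter()
    for i in range(n):
        await run_script(i)
    serial = time.perf_counter() - start

    # Concurrent run: all operations in flight at once.
    start = time.perf_counter()
    await asyncio.gather(*(run_script(i) for i in range(n)))
    concurrent = time.perf_counter() - start

    speedup = serial / concurrent
    print(f"serial={serial:.2f}s concurrent={concurrent:.2f}s speedup={speedup:.1f}x")
    # Documented target: speedup ratio > 1.5x concurrent vs serial.
    assert speedup > 1.5


if __name__ == "__main__":
    asyncio.run(measure())
```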
## 🔒 Security Test Coverage

The security test suite covers:

### Injection Attacks
- JavaScript code injection
- XSS payload testing
- SQL injection attempts
- Command injection prevention

### Privilege Escalation
- File system access attempts
- Cross-origin resource access
- Node.js context escape attempts
- Prototype pollution attacks

### Information Disclosure
- Sensitive data access attempts
- Browser fingerprinting prevention
- Timing attack prevention
- Error message sanitization

### Resource Exhaustion
- Infinite loop protection
- Memory bomb prevention
- DOM bombing protection
- Network flood prevention

### Data Exfiltration
- Network-based exfiltration
- Covert channel prevention
- DNS tunneling prevention
- Encoding bypass attempts

## 🎯 Quality Metrics & Thresholds

### Pass Rate Requirements
- **Critical Tests**: 100% pass rate required
- **Performance Tests**: 90% pass rate required
- **Security Tests**: 100% pass rate required
- **Compatibility Tests**: 85% pass rate required

### Performance Thresholds
- **Test Execution Time**: < 45 minutes for full suite
- **Memory Usage**: < 500MB peak during testing
- **CPU Usage**: < 90% peak during stress tests
- **Resource Cleanup**: 100% successful cleanup

### Coverage Requirements
- **Code Coverage**: > 90% (with pytest-cov)
- **Feature Coverage**: 100% of JavaScript API features
- **Error Scenario Coverage**: > 95% of error conditions
- **Browser Coverage**: Chrome, Firefox, Safari equivalents

## 🛠️ Advanced Testing Options

### Custom Pytest Arguments

```bash
# Run with custom markers
pytest -m "security and critical" -v

# Run with coverage reporting
pytest --cov=src/crawailer --cov-report=html

# Run with performance profiling
pytest --tb=short --durations=0

# Run with parallel execution
pytest -n auto  # Requires pytest-xdist

# Run with timeout protection
pytest --timeout=300  # Requires pytest-timeout
```

### Environment Variables

```bash
# Skip slow tests
export PYTEST_SKIP_SLOW=1

# Increase verbosity
export PYTEST_VERBOSITY=2

# Custom test timeout
export PYTEST_TIMEOUT=600

# Generate HTML reports
export PYTEST_HTML_REPORT=1
```

### Custom Test Configurations

Create custom pytest configurations in `pytest.ini`:

```ini
[pytest]
# Custom marker for your specific needs
markers =
    custom: marks tests for custom scenarios

# Custom test paths
testpaths = tests custom_tests

# Custom output format
addopts = --tb=long --capture=no
```
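Note that switches such as `PYTEST_SKIP_SLOW` and `PYTEST_HTML_REPORT` are project-level conventions rather than built-in pytest behavior (`PYTEST_TIMEOUT` is honored natively by pytest-timeout). If your environment does not already wire them up, the following is a minimal `conftest.py` sketch of one way to honor the skip-slow switch; it is illustrative, not the repository's shipped conftest:

```python
# conftest.py (sketch) - honor the PYTEST_SKIP_SLOW switch described above
import os

import pytest


def pytest_collection_modifyitems(config, items):
    if os.environ.get("PYTEST_SKIP_SLOW") != "1":
        return
    skip_slow = pytest.mark.skip(reason="skipped because PYTEST_SKIP_SLOW=1")
    for item in items:
        # "slow" is the marker the suite already uses (see `pytest -m "not slow"`).
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

With this hook in place, `PYTEST_SKIP_SLOW=1 pytest tests/` skips everything marked `slow`, similar in effect to `pytest -m "not slow" tests/`.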
## 📋 Continuous Integration Setup

### GitHub Actions Example

```yaml
name: Comprehensive Test Suite

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12"]

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          pip install uv
          uv pip install -e ".[dev]"
          playwright install chromium

      - name: Run smoke tests
        run: python run_comprehensive_tests.py smoke

      - name: Run critical tests
        run: python run_comprehensive_tests.py critical

      - name: Run security audit
        run: python run_comprehensive_tests.py security

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: test-results
          path: test-results.xml
```

### Jenkins Pipeline Example

```groovy
pipeline {
    agent any

    stages {
        stage('Setup') {
            steps {
                sh 'pip install uv'
                sh 'uv pip install -e ".[dev]"'
                sh 'playwright install chromium'
            }
        }

        stage('Smoke Tests') {
            steps {
                sh 'python run_comprehensive_tests.py smoke'
            }
        }

        stage('Critical Tests') {
            steps {
                sh 'python run_comprehensive_tests.py critical'
            }
        }

        stage('Security Audit') {
            when { branch 'main' }
            steps {
                sh 'python run_comprehensive_tests.py security'
            }
        }

        stage('Full Suite') {
            when { branch 'release/*' }
            steps {
                sh 'python run_comprehensive_tests.py full'
            }
        }
    }

    post {
        always {
            junit 'test-results.xml'
            archiveArtifacts artifacts: 'test_results_*.json'
        }
    }
}
```

## 🐛 Troubleshooting

### Common Issues

#### Test Timeouts

```bash
# Increase timeout for slow environments
pytest --timeout=600 tests/

# Skip timeout-prone tests
pytest -m "not slow" tests/
```

#### Memory Issues

```bash
# Run tests with memory monitoring
python run_comprehensive_tests.py performance --save-results

# Check for memory leaks
pytest tests/test_performance_stress.py::TestResourceLeakDetection -v -s
```

#### Browser Issues

```bash
# Reinstall browser binaries
playwright install chromium

# Run tests with headed browsers for debugging
pytest tests/test_browser_compatibility.py -v -s
```

#### Concurrency Issues

```bash
# Run tests serially
pytest -n 1 tests/

# Check for race conditions
pytest tests/test_edge_cases.py::TestConcurrencyAndResourceLimits -v -s
```

### Debug Mode

Enable verbose debugging:

```bash
# Maximum verbosity
pytest -vvv -s --tb=long tests/

# Show test setup/teardown
pytest --setup-show tests/

# Show test durations
pytest --durations=0 tests/

# Debug specific test
pytest tests/test_edge_cases.py::TestMalformedJavaScriptCodes::test_syntax_error_javascript -vvv -s
```

## 📊 Test Reporting

### Generate Comprehensive Reports

```bash
# Generate HTML report
python run_comprehensive_tests.py full --report-file test_report.html

# Save detailed results
python run_comprehensive_tests.py full --save-results

# Generate JUnit XML for CI
pytest --junitxml=test-results.xml tests/

# Generate coverage report
pytest --cov=src/crawailer --cov-report=html tests/
```

### Report Formats

The test suite generates multiple report formats:

- **Console Output**: Real-time progress and results
- **JSON Results**: Machine-readable test data
- **HTML Reports**: Detailed visual reports
- **JUnit XML**: CI/CD integration format
- **Coverage Reports**: Code coverage analysis

## 🎯 Best Practices

### For Developers
1. **Run smoke tests** before committing code
2. **Run critical tests** before merging to main
3. **Check performance impact** for optimization changes
4. **Verify security** for any API modifications
5. **Update tests** when adding new features

### For Release Managers
1. **Run full suite** before any release
2. **Review security audit** results carefully
3. **Check performance benchmarks** for regressions
4. **Validate browser compatibility** across targets
5. **Ensure all critical tests pass** at 100%

### For CI/CD Setup
1. **Use appropriate test modes** for different triggers
2. **Set proper timeouts** for your environment
3. **Archive test results** for historical analysis
4. **Configure notifications** for critical failures
5. **Run security audits** on every release branch

---

## 📞 Support

For questions about the test suite:

1. Check the test output for specific error messages
2. Review the troubleshooting section above
3. Run tests in debug mode for detailed information
4. Check the individual test file documentation
5. Review the CI/CD pipeline logs for environment issues

The comprehensive test suite ensures production readiness of the Crawailer JavaScript API enhancement with 280+ test cases covering all aspects of functionality, security, performance, and compatibility.