flamenco/tests/README.md
Ryan Malloy 2f82e8d2e0 Implement comprehensive Docker development environment with major performance optimizations
2025-09-09 12:11:08 -06:00

Flamenco Test Suite

Comprehensive testing infrastructure for the Flamenco render farm management system.

Overview

The test suite covers four areas that together check Flamenco's reliability and performance:

  1. API Testing (tests/api/) - Comprehensive REST API validation
  2. Performance Testing (tests/performance/) - Load testing with multiple workers
  3. Integration Testing (tests/integration/) - End-to-end workflow validation
  4. Database Testing (tests/database/) - Migration and data integrity testing

Quick Start

Running All Tests

# Run all tests
make test-all

# Run specific test suites
make test-api
make test-performance
make test-integration
make test-database

Docker-based Testing

# Start test environment
docker compose -f tests/docker/compose.test.yml up -d

# Run tests in containerized environment
docker compose -f tests/docker/compose.test.yml --profile test-runner up

# Performance testing with additional workers
docker compose -f tests/docker/compose.test.yml --profile performance up -d

# Clean up test environment
docker compose -f tests/docker/compose.test.yml down -v

Test Categories

API Testing (tests/api/)

Tests all OpenAPI endpoints with comprehensive validation:

  • Meta endpoints: Version, configuration, health checks
  • Job management: CRUD operations, job lifecycle
  • Worker management: Registration, status updates, task assignment
  • Authentication/Authorization: Access control validation
  • Error handling: 400, 404, 500 response scenarios
  • Schema validation: Request/response schema compliance
  • Concurrent requests: API behavior under load

Key Features:

  • OpenAPI schema validation
  • Concurrent request testing
  • Error scenario coverage
  • Performance boundary testing

Performance Testing (tests/performance/)

Validates system performance under realistic render farm loads:

  • Multi-worker simulation: 5-10 concurrent workers
  • Job processing: Multiple simultaneous job submissions
  • Task distribution: Proper task assignment and load balancing
  • Resource monitoring: Memory, CPU, database performance
  • Throughput testing: Jobs per minute, tasks per second
  • Stress testing: System behavior under extreme load
  • Memory profiling: Memory usage and leak detection
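
As a rough sketch of what multi-worker simulation can look like, the following distributes tasks to a pool of goroutines over a channel and counts how many each one handled. The function name simulateWorkers is hypothetical, and the "work" is a counter increment; an actual performance test would presumably call the Manager API where the comment indicates.

```go
package main

import (
	"fmt"
	"sync"
)

// simulateWorkers distributes taskCount tasks across workerCount goroutines
// and returns how many tasks each simulated worker processed.
func simulateWorkers(workerCount, taskCount int) []int {
	tasks := make(chan int)
	processed := make([]int, workerCount)

	var wg sync.WaitGroup
	for w := 0; w < workerCount; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for range tasks {
				// A real test would fetch and execute a task via the API here.
				// Each goroutine writes only to its own slot, so no lock is needed.
				processed[id]++
			}
		}(w)
	}

	for i := 0; i < taskCount; i++ {
		tasks <- i
	}
	close(tasks) // lets the workers' range loops finish
	wg.Wait()
	return processed
}

func main() {
	counts := simulateWorkers(10, 500)
	total := 0
	for id, n := range counts {
		fmt.Printf("worker %d processed %d tasks\n", id, n)
		total += n
	}
	fmt.Println("total:", total)
}
```

The per-worker counts vary from run to run because scheduling is nondeterministic; only the total is stable, which is why assertions should target the total or a distribution bound rather than exact per-worker numbers.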

Key Metrics:

  • Requests per second (RPS)
  • Average/P95/P99 latency
  • Memory usage patterns
  • Database query performance
  • Worker utilization rates

Integration Testing (tests/integration/)

End-to-end workflow validation covering complete render job lifecycles:

  • Complete workflows: Job submission to completion
  • Worker coordination: Multi-worker task distribution
  • Real-time updates: WebSocket communication testing
  • Failure recovery: Worker failures and task reassignment
  • Job status transitions: Proper state machine behavior
  • Asset management: File handling and shared storage
  • Network resilience: Connection failures and recovery

Test Scenarios:

  • Single job, single worker workflow
  • Multi-job, multi-worker coordination
  • Worker failure and recovery
  • Network partition handling
  • Large job processing (1000+ frames)
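
Checking job status transitions amounts to validating an observed status history against a table of legal moves. The sketch below shows the idea; the status names and the transition table are assumptions chosen for illustration and do not claim to match Flamenco's actual state machine.

```go
package main

import "fmt"

// JobStatus is a simplified, hypothetical set of job states.
type JobStatus string

const (
	StatusQueued    JobStatus = "queued"
	StatusActive    JobStatus = "active"
	StatusCompleted JobStatus = "completed"
	StatusFailed    JobStatus = "failed"
)

// allowed lists the transitions this sketch treats as legal.
var allowed = map[JobStatus][]JobStatus{
	StatusQueued: {StatusActive},
	StatusActive: {StatusCompleted, StatusFailed},
	StatusFailed: {StatusQueued}, // requeue after a failure
}

// canTransition reports whether moving from one status to another is legal.
func canTransition(from, to JobStatus) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

// validateHistory returns an error describing the first illegal step
// in an observed sequence of statuses, or nil if every step is legal.
func validateHistory(history []JobStatus) error {
	for i := 1; i < len(history); i++ {
		if !canTransition(history[i-1], history[i]) {
			return fmt.Errorf("illegal transition %q -> %q at step %d",
				history[i-1], history[i], i)
		}
	}
	return nil
}

func main() {
	ok := []JobStatus{StatusQueued, StatusActive, StatusFailed, StatusQueued, StatusActive, StatusCompleted}
	bad := []JobStatus{StatusQueued, StatusCompleted}
	fmt.Println("ok history:", validateHistory(ok))
	fmt.Println("bad history:", validateHistory(bad))
}
```

An integration test could record each status it observes (for example from WebSocket updates) and pass the sequence to a validator like this at the end of the workflow.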

Database Testing (tests/database/)

Comprehensive database operation and integrity testing:

  • Schema migrations: Up/down migration testing
  • Data integrity: Foreign key constraints, transactions
  • Concurrent access: Multi-connection race conditions
  • Query performance: Index usage and optimization
  • Backup/restore: Data persistence and recovery
  • Large datasets: Performance with realistic data volumes
  • Connection pooling: Database connection management

Test Areas:

  • Migration idempotency
  • Transaction rollback scenarios
  • Concurrent write operations
  • Query plan analysis
  • Data consistency validation

Test Environment Setup

Prerequisites

  • Go 1.24+ with test dependencies
  • Docker and Docker Compose
  • SQLite for local testing
  • PostgreSQL for advanced testing (optional)

Environment Variables

# Test configuration
export TEST_ENVIRONMENT=docker
export TEST_DATABASE_DSN="sqlite://test.db"
export TEST_MANAGER_URL="http://localhost:8080"
export TEST_SHARED_STORAGE="/tmp/flamenco-test-storage"
export TEST_TIMEOUT="30m"

# Performance test settings
export PERF_TEST_WORKERS=10
export PERF_TEST_JOBS=50
export PERF_TEST_DURATION="5m"

Test Data Management

Test data is managed through:

  • Fixtures: Predefined test data in tests/helpers/
  • Factories: Dynamic test data generation
  • Cleanup: Automatic cleanup after each test
  • Isolation: Each test runs with fresh data

Running Tests

Local Development

# Install dependencies
go mod download

# Run unit tests
go test ./...

# Run specific test suites
go test ./tests/api/... -v
go test ./tests/performance/... -v -timeout=30m
go test ./tests/integration/... -v -timeout=15m
go test ./tests/database/... -v

# Run with coverage
go test ./tests/... -cover -coverprofile=coverage.out
go tool cover -html=coverage.out

Continuous Integration

# Run all tests with timeout and coverage
go test ./tests/... -v -timeout=45m -race -coverprofile=coverage.out

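# Generate test reports (this re-runs the suite, so keep the longer timeout;
# go test otherwise stops after its default 10 minutes)
go test ./tests/... -json -timeout=45m > test-results.json
go tool cover -html=coverage.out -o coverage.html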

Performance Profiling

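# Run performance tests with profiling
# (go test rejects -cpuprofile/-memprofile when the pattern matches more than
# one package, so profile a single package at a time)
go test ./tests/performance -v -timeout=30m -cpuprofile=cpu.prof -memprofile=mem.prof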

# Analyze profiles
go tool pprof cpu.prof
go tool pprof mem.prof

Test Configuration

Test Helper Usage

func TestExample(t *testing.T) {
    helper := helpers.NewTestHelper(t)
    defer helper.Cleanup()

    // Start the test server
    server := helper.StartTestServer()

    // Create test data
    job := helper.CreateTestJob("Example Job", "simple-blender-render")
    worker := helper.CreateTestWorker("example-worker")

    // Run tests against server, job, and worker here. Go rejects unused
    // variables, so this placeholder keeps the skeleton compiling.
    _, _, _ = server, job, worker
}

Custom Test Fixtures

fixtures := helper.LoadTestFixtures()
for _, job := range fixtures.Jobs {
    // Test with predefined job data; replace this placeholder with assertions
    // (an unused loop variable would not compile).
    _ = job
}

Test Reporting

Coverage Reports

Test coverage reports are generated in multiple formats:

  • HTML: coverage.html - Interactive coverage visualization
  • Text: Terminal output showing coverage percentages (go tool cover -func=coverage.out)
  • JSON: test-results.json - Machine-readable test results from go test -json for CI/CD

Performance Reports

Performance tests generate detailed metrics:

  • Latency histograms: Response time distributions
  • Throughput graphs: Requests per second over time
  • Resource usage: Memory and CPU utilization
  • Error rates: Success/failure ratios

Integration Test Results

Integration tests provide workflow validation:

  • Job completion times: End-to-end workflow duration
  • Task distribution: Worker load balancing effectiveness
  • Error recovery: Failure handling and recovery times
  • WebSocket events: Real-time update delivery

Troubleshooting

Common Issues

  1. Test Database Locks

    # Clean up test databases
    rm -f /tmp/flamenco-test-*.sqlite*
    
  2. Port Conflicts

    # Check for running services
    lsof -i :8080
    # Kill conflicting processes or use different ports
    
  3. Docker Issues

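    # Clean up test containers and volumes
    docker compose -f tests/docker/compose.test.yml down -v
    # Caution: this prunes unused Docker resources system-wide,
    # not only those created by the Flamenco tests
    docker system prune -f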
    
  4. Test Timeouts

    # Increase test timeout
    go test ./tests/... -timeout=60m
    

Debug Mode

Enable debug logging for test troubleshooting:

export LOG_LEVEL=debug
export TEST_DEBUG=true
go test ./tests/... -v

Contributing

Adding New Tests

  1. Choose the appropriate test category (api, performance, integration, database)
  2. Follow existing test patterns and use the test helper utilities
  3. Include proper cleanup to avoid test pollution
  4. Add documentation for complex test scenarios
  5. Validate test reliability by running each new test multiple times (for example, go test -count=10 on its package)

Test Guidelines

  • Isolation: Tests must not depend on each other
  • Determinism: Tests should produce consistent results
  • Performance: Tests should complete in reasonable time
  • Coverage: Aim for high code coverage with meaningful tests
  • Documentation: Document complex test scenarios and setup requirements

Performance Test Guidelines

  • Realistic loads: Simulate actual render farm usage patterns
  • Baseline metrics: Establish performance baselines for regression detection
  • Resource monitoring: Track memory, CPU, and I/O usage
  • Scalability: Test system behavior as load increases

CI/CD Integration

GitHub Actions

name: Test Suite
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-go@v4
        with:
          go-version: '1.24'
      - name: Run Tests
        run: make test-all
      - name: Upload Coverage
        uses: codecov/codecov-action@v3

Test Reports

Test results are automatically published as:

  • Coverage reports: Code coverage metrics and visualizations
  • Performance dashboards: Historical performance trend tracking
  • Integration summaries: Workflow validation results
  • Database health: Migration and integrity test results

Architecture

Test Infrastructure

tests/
├── api/              # REST API endpoint testing
├── performance/      # Load and stress testing
├── integration/      # End-to-end workflow testing
├── database/         # Database and migration testing
├── helpers/          # Test utilities and fixtures
├── docker/           # Containerized test environment
└── README.md         # This documentation

Dependencies

  • Testing Framework: Go's standard testing package with testify
  • Test Suites: stretchr/testify/suite for organized test structure
  • HTTP Testing: net/http/httptest for API endpoint testing
  • Database Testing: In-memory SQLite with transaction isolation
  • Mocking: golang/mock for dependency isolation (the upstream repository is archived; go.uber.org/mock is its maintained fork)
  • Performance Testing: Custom metrics collection and analysis

The test suite is designed to provide comprehensive validation of Flamenco's functionality, performance, and reliability in both development and production environments.