mcbluetooth-esp32/docs/automated-e2e-testing.md
Ryan Malloy 88d006e9c4 Add automated E2E testing documentation and test prompts
- docs/automated-e2e-testing.md: Guide for running headless Claude CLI
  tests with both mcbluetooth and mcbluetooth-esp32 MCP servers
- tests/prompts/test-prompt-v4.md: 71-test suite covering Classic BT,
  BLE GATT, HCI capture, device management
- tests/prompts/test-prompt-v5.md: 76-test suite adding Battery Service
  (0x180F) and bt_ble_battery verification

Test results from v4: 71/71 PASS with 143 HCI packets captured
2026-02-03 11:18:37 -07:00


Automated E2E Testing with Claude CLI

This document describes how to run fully automated end-to-end Bluetooth tests using the Claude CLI in headless mode. The tests exercise the complete Bluetooth stack across two devices: a Linux host running mcbluetooth (BlueZ) and an ESP32 running the mcbluetooth-esp32 firmware.

Architecture

┌─────────────────────────────────────────────────────────┐
│               Claude CLI (headless mode)                │
│              Orchestrates both MCP servers              │
└───────────────────────────┬─────────────────────────────┘
                            │
            ┌───────────────┴────────────────┐
            │                                │
    ┌───────┴───────┐               ┌────────┴────────┐
    │  mcbluetooth  │               │mcbluetooth-esp32│
    │  MCP Server   │               │   MCP Server    │
    │ (bt_* tools)  │               │ (esp32_* tools) │
    └───────┬───────┘               └────────┬────────┘
            │                                │
       D-Bus/BlueZ                      Serial/UART
            │                                │
    ┌───────┴───────┐               ┌────────┴────────┐
    │  Linux Host   │◄─ Bluetooth ─►│      ESP32      │
    │    (hci1)     │  (over air)   │  (peripheral)   │
    └───────────────┘               └─────────────────┘

Prerequisites

Hardware

  • ESP32 dev board connected via USB (typically /dev/ttyUSB0 or /dev/ttyUSB4)
  • Linux host with Bluetooth adapter (typically hci0 or hci1)

Software

  • ESP32 flashed with mcbluetooth-esp32 firmware
  • Both MCP servers installed and accessible via uvx
  • Claude CLI installed

Permissions

For HCI packet capture tests, grant btmon the required capability:

sudo setcap cap_net_raw+ep /usr/bin/btmon

Test Environment Setup

1. Create a test directory

mkdir -p /tmp/bt-e2e-test
cd /tmp/bt-e2e-test

2. Create MCP configuration

Create .mcp.json with both MCP servers:

{
  "mcpServers": {
    "esp32": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcbluetooth-esp32"],
      "env": {
        "ESP32_SERIAL_PORT": "/dev/ttyUSB4"
      }
    },
    "bluez": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcbluetooth"]
    }
  }
}
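
The serial port differs between machines, so it can help to generate .mcp.json instead of editing it by hand. The following is a minimal Python sketch, assuming the same server names and the ESP32_SERIAL_PORT variable shown above; the helper names build_mcp_config and write_mcp_config are hypothetical and not part of either MCP server.

```python
import json
from pathlib import Path


def build_mcp_config(serial_port: str) -> dict:
    """Return the two-server MCP configuration for the given ESP32 port."""
    return {
        "mcpServers": {
            "esp32": {
                "type": "stdio",
                "command": "uvx",
                "args": ["mcbluetooth-esp32"],
                "env": {"ESP32_SERIAL_PORT": serial_port},
            },
            "bluez": {
                "type": "stdio",
                "command": "uvx",
                "args": ["mcbluetooth"],
            },
        }
    }


def write_mcp_config(directory: str, serial_port: str) -> Path:
    """Write .mcp.json into `directory` and return the path of the file."""
    path = Path(directory) / ".mcp.json"
    path.write_text(json.dumps(build_mcp_config(serial_port), indent=2) + "\n")
    return path
```

For example, write_mcp_config("/tmp/bt-e2e-test", "/dev/ttyUSB0") would produce the same file as above with a different port.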

3. Initialize git (required for Claude CLI)

git init

Running Tests

Basic Command Structure

claude -p "$(cat test-prompt.md)" \
  --mcp-config .mcp.json \
  --allowedTools "mcp__esp32__*,mcp__bluez__*" \
  --output-format json \
  2>/dev/null | tee results.json | jq -r '.result'

Key flags:

  • -p: Print/headless mode (non-interactive)
  • --mcp-config: Path to MCP server configuration
  • --allowedTools: Glob patterns for permitted tools (required in headless mode)
  • --output-format json: Machine-parseable output
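
When the run is driven from a script rather than a shell, the same invocation can be assembled as an argument list. This is a sketch only: it uses just the flags listed above, and the function names build_claude_command and run_suite are hypothetical.

```python
import subprocess
from pathlib import Path


def build_claude_command(prompt: str, mcp_config: str = ".mcp.json") -> list[str]:
    """Assemble the argv list for a headless run, using only the flags listed above."""
    return [
        "claude",
        "-p", prompt,
        "--mcp-config", mcp_config,
        "--allowedTools", "mcp__esp32__*,mcp__bluez__*",
        "--output-format", "json",
    ]


def run_suite(prompt_file: str, results_file: str) -> int:
    """Run the suite, save stdout to results_file, and return the exit code."""
    cmd = build_claude_command(Path(prompt_file).read_text())
    # stderr is captured and dropped, mirroring 2>/dev/null in the shell version.
    proc = subprocess.run(cmd, capture_output=True, text=True)
    Path(results_file).write_text(proc.stdout)
    return proc.returncode
```

Passing the prompt as a single list element avoids shell quoting problems when the prompt file contains quotes or backticks.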

Full Test Suite (76 tests)

The comprehensive test suite covers:

  • ESP32 connection and system commands
  • BlueZ adapter management
  • Classic Bluetooth SSP pairing with auto-accept
  • BLE GATT service creation (Environmental Sensing + Battery Service)
  • HCI packet capture and analysis
  • GATT read/write/notify operations
  • Device management (trust, block, alias)

Run the full suite with:

claude -p "$(cat test-prompt-v5.md)" \
  --mcp-config .mcp.json \
  --allowedTools "mcp__esp32__*,mcp__bluez__*" \
  --output-format json 2>/dev/null | tee results-v5.json

Analyzing Results

Extract the summary:

jq -r '.result' results-v5.json

Check pass/fail statistics:

jq -r '.result' results-v5.json | grep -E "(PASS|FAIL|Total)"

View full metrics:

jq '{
  duration_ms: .duration_ms,
  num_turns: .num_turns,
  total_cost_usd: .total_cost_usd,
  success: (.is_error == false)
}' results-v5.json
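
The same metrics can be pulled out in Python when jq is not available. This is a sketch that assumes the results file has the top-level fields used by the jq filter above; the function name summarize_run is hypothetical.

```python
import json


def summarize_run(raw_json: str) -> dict:
    """Extract the same fields as the jq filter from a results JSON string."""
    data = json.loads(raw_json)
    return {
        "duration_ms": data.get("duration_ms"),
        "num_turns": data.get("num_turns"),
        "total_cost_usd": data.get("total_cost_usd"),
        # True only when is_error is present and explicitly false.
        "success": data.get("is_error") is False,
    }
```

Usage would look like summarize_run(open("results-v5.json").read()).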

Test Phases

The test suite is organized into phases that must run sequentially:

| Phase | Tests | Coverage |
|-------|-------|----------|
| 1. ESP32 Connection | 1-4 | connect, ping, get_info, status |
| 2. BlueZ Adapter | 5-8 | list_adapters, adapter_info, pairable, discoverable |
| 3. Classic BT + SSP | 9-24 | enable, configure, SSP mode, scan, pair, device management |
| 4. Classic Cleanup | 25-29 | disable, events, clear_events |
| 5. BLE GATT Setup | 30-42 | Battery Service, Environmental Sensing, advertising |
| 6. HCI Capture + Discovery | 43-51 | capture_start, BLE scan, connect, services, characteristics |
| 7. Analyze Capture | 52-55 | capture_stop, parse, analyze, read_raw |
| 8. GATT Write + Notify | 56-63 | write, subscribe, notify, unsubscribe |
| 9. BLE Cleanup | 64-68 | stop advertising, clear GATT, disable BLE |
| 10. Adapter Management | 69-73 | set_alias, restore, disable discoverable |
| 11. Final Cleanup | 74-76 | ESP32 reset, disconnect, final check |
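
Because the phases must run in order and the test numbers must cover 1 through 76 without gaps, a small consistency check is useful when editing a prompt. The sketch below copies the ranges from the phase list above; the PHASES constant and the check_phases helper are illustrative names, not part of the test suite.

```python
# (phase name, first test, last test), copied from the phase list above.
PHASES = [
    ("ESP32 Connection", 1, 4),
    ("BlueZ Adapter", 5, 8),
    ("Classic BT + SSP", 9, 24),
    ("Classic Cleanup", 25, 29),
    ("BLE GATT Setup", 30, 42),
    ("HCI Capture + Discovery", 43, 51),
    ("Analyze Capture", 52, 55),
    ("GATT Write + Notify", 56, 63),
    ("BLE Cleanup", 64, 68),
    ("Adapter Management", 69, 73),
    ("Final Cleanup", 74, 76),
]


def check_phases(phases, expected_total):
    """Return a list of problems; an empty list means the ranges are contiguous."""
    problems = []
    next_test = 1
    for name, first, last in phases:
        if first != next_test:
            problems.append(f"{name}: starts at {first}, expected {next_test}")
        if last < first:
            problems.append(f"{name}: empty range {first}-{last}")
        next_test = last + 1
    if next_test - 1 != expected_total:
        problems.append(f"suite ends at {next_test - 1}, expected {expected_total}")
    return problems
```

Calling check_phases(PHASES, 76) returns an empty list for the ranges listed here.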

SSP Pairing: The auto_accept Flag

Numeric Comparison SSP requires both sides to confirm the passkey. In headless mode, this creates a deadlock:

  1. Linux calls bt_pair(), which blocks until the ESP32 confirms
  2. The ESP32 never receives the confirmation command, because the LLM cannot issue another tool call until bt_pair() returns

Solution: The ESP32 firmware supports auto_accept mode:

esp32_set_ssp_mode(mode="numeric_comparison", auto_accept=true)

This makes the ESP32 automatically confirm SSP pairings, breaking the deadlock.
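
The deadlock can be pictured with a toy model. This is not the firmware or the MCP server code; ToyEsp32 and blocking_pair are hypothetical stand-ins that only capture the ordering problem: a single orchestrator makes one blocking call at a time, so a confirmation that needs a second call never arrives.

```python
class ToyEsp32:
    """Toy stand-in for the ESP32 side of Numeric Comparison pairing."""

    def __init__(self, auto_accept: bool):
        self.auto_accept = auto_accept
        self.confirmed = False

    def on_pairing_request(self) -> None:
        # With auto_accept the device confirms on its own. Without it, the
        # device waits for a separate confirmation command from the orchestrator.
        if self.auto_accept:
            self.confirmed = True


def blocking_pair(esp32: ToyEsp32) -> str:
    """Model a blocking pair call that cannot return until the peer confirms.

    The orchestrator issues one tool call at a time, so no confirmation
    command can be sent while this call is still in progress.
    """
    esp32.on_pairing_request()
    return "paired" if esp32.confirmed else "timeout"
```

In this model, blocking_pair(ToyEsp32(auto_accept=False)) ends in a timeout, while the auto_accept=True case pairs.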

Battery Service Test

The test suite creates a standard Battery Service (UUID 0x180F) on the ESP32:

  1. Add Battery Service as primary GATT service
  2. Add Battery Level characteristic (UUID 0x2A19) with read property
  3. Set value to "4b" (hex 0x4B = 75 decimal, i.e. 75%)
  4. After BLE connection, call bt_ble_battery on Linux
  5. Verify it returns 75

This tests the dedicated bt_ble_battery tool in mcbluetooth, which reads the standard Battery Level characteristic.
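
The hex value in step 3 is the battery percentage encoded as a single unsigned byte. A short sketch of the conversion in both directions follows; the helper names are illustrative only.

```python
def encode_battery_level(percent: int) -> str:
    """Encode a battery percentage as a one-byte hex string."""
    if not 0 <= percent <= 100:
        raise ValueError("battery level must be 0-100")
    return bytes([percent]).hex()


def decode_battery_level(hex_value: str) -> int:
    """Decode a one-byte hex string back to a percentage."""
    raw = bytes.fromhex(hex_value)
    if len(raw) != 1:
        raise ValueError("Battery Level is a single byte")
    return raw[0]
```

Here encode_battery_level(75) gives "4b", and decode_battery_level("4b") gives 75, which is the value step 5 expects.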

HCI Packet Capture

Tests 43-55 exercise the btsnoop capture functionality:

bt_capture_start(adapter="hci1", output_file="/tmp/ble-gatt-capture.btsnoop")
# ... BLE operations ...
bt_capture_stop(capture_id="...")
bt_capture_parse(filepath="...", max_packets=50)
bt_capture_analyze(filepath="...")
bt_capture_read_raw(filepath="...", count=20)

Typical captures include 100-150 packets covering:

  • HCI commands (LE scanning, connection)
  • ACL data (GATT operations)
  • HCI events (connection complete, encryption)
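
To check the packet count independently of the MCP tools, the capture file can be walked directly. The sketch below assumes the file follows the standard btsnoop layout (a 16-byte file header followed by records that each carry a 24-byte big-endian header); whether bt_capture_start writes exactly this layout is an assumption, and count_btsnoop_packets is a hypothetical helper.

```python
import struct

BTSNOOP_MAGIC = b"btsnoop\x00"
FILE_HEADER = struct.Struct(">8sII")     # magic, version, datalink type
RECORD_HEADER = struct.Struct(">IIIIq")  # orig len, incl len, flags, drops, timestamp


def count_btsnoop_packets(data: bytes) -> int:
    """Count complete packet records in a btsnoop capture held in memory."""
    magic, _version, _datalink = FILE_HEADER.unpack_from(data, 0)
    if magic != BTSNOOP_MAGIC:
        raise ValueError("not a btsnoop file")
    offset = FILE_HEADER.size
    count = 0
    while offset + RECORD_HEADER.size <= len(data):
        _orig, incl, _flags, _drops, _ts = RECORD_HEADER.unpack_from(data, offset)
        offset += RECORD_HEADER.size + incl
        if offset > len(data):
            break  # final record is truncated, do not count it
        count += 1
    return count
```

A capture could be checked with count_btsnoop_packets(open("/tmp/ble-gatt-capture.btsnoop", "rb").read()).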

Test Prompt Format

Test prompts follow a structured format:

# Test Suite Title

## Phase N: Phase Name (Tests X-Y)

N. **Test Name**: Call `tool_name` with params — expected result

## Summary

After all tests, print a DETAILED summary table:

| # | Test | Result | Notes |
|---|------|--------|-------|
| 1 | Connect | PASS/FAIL | ... |
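
If the model follows the summary table format above, pass and fail counts can be tallied from the result text. This is a sketch under that assumption; the output of a headless run is free text, so rows that do not match the pattern are skipped, and tally_results is a hypothetical helper.

```python
import re
from collections import Counter

# Matches rows such as "| 1 | Connect | PASS | notes |".
ROW = re.compile(r"^\|\s*(\d+)\s*\|\s*(.*?)\s*\|\s*(PASS|FAIL)\s*\|\s*(.*?)\s*\|\s*$")


def tally_results(summary: str) -> Counter:
    """Count PASS and FAIL rows in a summary table."""
    counts = Counter()
    for line in summary.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group(3)] += 1
    return counts
```

The header row, the separator row, and the literal "PASS/FAIL" template row do not match the pattern, so they are not counted.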

Troubleshooting

Serial port busy

Error: could not open port /dev/ttyUSB4

Check for other processes using the port:

lsof /dev/ttyUSB4

btmon permission denied

Error: Failed to open HCI raw socket

Grant capability:

sudo setcap cap_net_raw+ep /usr/bin/btmon

ESP32 not responding

Power cycle the ESP32 and check the firmware is flashed:

# Monitor serial output
screen /dev/ttyUSB4 115200

Press the reset button; a boot event should appear as JSON.

Pairing timeout

Ensure auto_accept=true is set for SSP numeric comparison mode before initiating pairing from Linux.

Example Results

A successful v5 run produces:

{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 320000,
  "num_turns": 88,
  "result": "All 76 tests passed..."
}

Key metrics from successful runs:

  • Duration: ~5-6 minutes
  • API turns: 80-90
  • HCI packets captured: 100-150
  • Cost: ~$1.50-1.70 USD