- docs/automated-e2e-testing.md: Guide for running headless Claude CLI tests with both mcbluetooth and mcbluetooth-esp32 MCP servers
- tests/prompts/test-prompt-v4.md: 71-test suite covering Classic BT, BLE GATT, HCI capture, device management
- tests/prompts/test-prompt-v5.md: 76-test suite adding Battery Service (0x180F) and bt_ble_battery verification

Test results from v4: 71/71 PASS with 143 HCI packets captured

# Automated E2E Testing with Claude CLI

This document describes how to run fully automated end-to-end Bluetooth tests using the Claude CLI in headless mode. The tests exercise the complete Bluetooth stack across two devices: a Linux host running `mcbluetooth` (BlueZ) and an ESP32 running the `mcbluetooth-esp32` firmware.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                 Claude CLI (headless mode)                  │
│                Orchestrates both MCP servers                │
└──────────────────────────────┬──────────────────────────────┘
                               │
            ┌──────────────────┴──────────────────┐
            │                                     │
    ┌───────┴───────┐                   ┌─────────┴─────────┐
    │  mcbluetooth  │                   │ mcbluetooth-esp32 │
    │  MCP Server   │                   │    MCP Server     │
    │ (bt_* tools)  │                   │  (esp32_* tools)  │
    └───────┬───────┘                   └─────────┬─────────┘
            │                                     │
       D-Bus/BlueZ                           Serial/UART
            │                                     │
    ┌───────┴───────┐                   ┌─────────┴─────────┐
    │  Linux Host   │ ◄── Bluetooth ──► │       ESP32       │
    │    (hci1)     │    (over air)     │   (peripheral)    │
    └───────────────┘                   └───────────────────┘
```

## Prerequisites

### Hardware
- ESP32 dev board connected via USB (typically `/dev/ttyUSB0` or `/dev/ttyUSB4`)
- Linux host with Bluetooth adapter (typically `hci0` or `hci1`)

### Software
- ESP32 flashed with mcbluetooth-esp32 firmware
- Both MCP servers installed and accessible via `uvx`
- Claude CLI installed

### Permissions
For HCI packet capture tests, grant btmon the required capability:

```bash
sudo setcap cap_net_raw+ep /usr/bin/btmon
```

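A quick preflight check can catch a missing tool or an unplugged board before a multi-minute run starts. The sketch below is only an illustration: the helper names `missing_cmds` and `missing_paths` are hypothetical, and the device paths in the usage comment are the examples used in this guide, so adjust them to your setup.

```bash
# Hypothetical helper: print every command from the argument list that is
# not on PATH. Returns 0 when all are present, 1 otherwise.
missing_cmds() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Hypothetical helper: print every path from the argument list that does
# not exist. Returns 0 when all exist, 1 otherwise.
missing_paths() {
    rc=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing path: $p"
            rc=1
        fi
    done
    return "$rc"
}

# Example (paths are assumptions taken from this guide):
#   missing_cmds claude uvx jq git btmon
#   missing_paths /dev/ttyUSB4 /sys/class/bluetooth/hci1
```

Both helpers only report what is absent; they do not verify that the ESP32 firmware is flashed or that btmon has its capability set.
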
## Test Environment Setup

### 1. Create a test directory

```bash
mkdir -p /tmp/bt-e2e-test
cd /tmp/bt-e2e-test
```

### 2. Create MCP configuration

Create `.mcp.json` with both MCP servers:

```json
{
  "mcpServers": {
    "esp32": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcbluetooth-esp32"],
      "env": {
        "ESP32_SERIAL_PORT": "/dev/ttyUSB4"
      }
    },
    "bluez": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcbluetooth"]
    }
  }
}
```

### 3. Initialize git (required for Claude CLI)

```bash
git init
```

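Because the serial port differs between machines, it may be convenient to generate `.mcp.json` rather than edit it by hand. The following is a minimal sketch that reproduces the configuration shown in step 2; the function name `mcp_config` and its `/dev/ttyUSB0` default are assumptions made for illustration.

```bash
# Hypothetical helper: print an .mcp.json document for the given ESP32
# serial port (defaults to /dev/ttyUSB0 when no argument is passed).
mcp_config() {
    esp32_port="${1:-/dev/ttyUSB0}"
    cat <<EOF
{
  "mcpServers": {
    "esp32": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcbluetooth-esp32"],
      "env": {
        "ESP32_SERIAL_PORT": "$esp32_port"
      }
    },
    "bluez": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcbluetooth"]
    }
  }
}
EOF
}

# Example:
#   mcp_config /dev/ttyUSB4 > .mcp.json
```
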
## Running Tests

### Basic Command Structure

```bash
claude -p "$(cat test-prompt.md)" \
  --mcp-config .mcp.json \
  --allowedTools "mcp__esp32__*,mcp__bluez__*" \
  --output-format json \
  2>/dev/null | tee results.json | jq -r '.result'
```

**Key flags:**
- `-p`: Print/headless mode (non-interactive)
- `--mcp-config`: Path to MCP server configuration
- `--allowedTools`: Glob patterns for permitted tools (required in headless mode)
- `--output-format json`: Machine-parseable output

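The `--allowedTools` patterns in the command above use the server keys from `.mcp.json` (`esp32` and `bluez`), so renaming a server in the configuration means updating the patterns too. As a small sketch, a hypothetical `allowed_tools` helper can derive the value from the server names:

```bash
# Hypothetical helper: build an --allowedTools value of the form
# mcp__<server>__* from the MCP server names given as arguments.
allowed_tools() {
    out=""
    for server in "$@"; do
        # Prefix a comma only when a pattern has already been added.
        out="${out:+$out,}mcp__${server}__*"
    done
    printf '%s\n' "$out"
}

# Example:
#   claude -p "$(cat test-prompt.md)" \
#     --mcp-config .mcp.json \
#     --allowedTools "$(allowed_tools esp32 bluez)" \
#     --output-format json
```
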
### Full Test Suite (76 tests)

The comprehensive test suite covers:
- ESP32 connection and system commands
- BlueZ adapter management
- Classic Bluetooth SSP pairing with auto-accept
- BLE GATT service creation (Environmental Sensing + Battery Service)
- HCI packet capture and analysis
- GATT read/write/notify operations
- Device management (trust, block, alias)

```bash
claude -p "$(cat test-prompt-v5.md)" \
  --mcp-config .mcp.json \
  --allowedTools "mcp__esp32__*,mcp__bluez__*" \
  --output-format json 2>/dev/null | tee results-v5.json
```

### Analyzing Results

Extract the summary:

```bash
jq -r '.result' results-v5.json
```

Check pass/fail statistics:

```bash
jq -r '.result' results-v5.json | grep -E "(PASS|FAIL|Total)"
```

View full metrics:

```bash
jq '{
  duration_ms: .duration_ms,
  num_turns: .num_turns,
  total_cost_usd: .total_cost_usd,
  success: (.is_error == false)
}' results-v5.json
```

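For use in CI, the `grep` above can be replaced by something that counts rows and fails the job when a test did not pass. The sketch below assumes the model printed its summary in the table layout requested by the test prompt (`| # | Test | Result | Notes |`); since that output is generated text, treat the parsing as best-effort. The function name `summarize_results` is hypothetical.

```bash
# Hypothetical helper: read a Markdown summary table on stdin and print
# "pass=N fail=M". Exits non-zero when any row failed or no rows were found.
summarize_results() {
    awk -F'|' '
        # Only rows whose first cell is a test number are counted.
        $2 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            result = $4
            gsub(/[ \t]/, "", result)
            if (result == "PASS") pass++
            else if (result == "FAIL") fail++
        }
        END {
            printf "pass=%d fail=%d\n", pass, fail
            if (fail > 0 || pass + fail == 0) exit 1
        }
    '
}

# Example:
#   jq -r '.result' results-v5.json | summarize_results
```
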
## Test Phases

The test suite is organized into phases that must run sequentially:

| Phase | Tests | Coverage |
|-------|-------|----------|
| 1. ESP32 Connection | 1-4 | connect, ping, get_info, status |
| 2. BlueZ Adapter | 5-8 | list_adapters, adapter_info, pairable, discoverable |
| 3. Classic BT + SSP | 9-24 | enable, configure, SSP mode, scan, pair, device management |
| 4. Classic Cleanup | 25-29 | disable, events, clear_events |
| 5. BLE GATT Setup | 30-42 | Battery Service, Environmental Sensing, advertising |
| 6. HCI Capture + Discovery | 43-51 | capture_start, BLE scan, connect, services, characteristics |
| 7. Analyze Capture | 52-55 | capture_stop, parse, analyze, read_raw |
| 8. GATT Write + Notify | 56-63 | write, subscribe, notify, unsubscribe |
| 9. BLE Cleanup | 64-68 | stop advertising, clear GATT, disable BLE |
| 10. Adapter Management | 69-73 | set_alias, restore, disable discoverable |
| 11. Final Cleanup | 74-76 | ESP32 reset, disconnect, final check |

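Since the phases depend on each other, the first failing test number usually points to the phase worth investigating. As an illustration, the table above can be encoded as a lookup; `phase_for_test` is a hypothetical name, and the ranges are copied from the v5 table and would need updating for other suite versions.

```bash
# Hypothetical helper: map a v5 test number (1-76) to its phase name,
# following the phase table. Returns 1 for numbers outside the suite.
phase_for_test() {
    n="$1"
    if   [ "$n" -lt 1 ];  then echo "unknown"; return 1
    elif [ "$n" -le 4 ];  then echo "ESP32 Connection"
    elif [ "$n" -le 8 ];  then echo "BlueZ Adapter"
    elif [ "$n" -le 24 ]; then echo "Classic BT + SSP"
    elif [ "$n" -le 29 ]; then echo "Classic Cleanup"
    elif [ "$n" -le 42 ]; then echo "BLE GATT Setup"
    elif [ "$n" -le 51 ]; then echo "HCI Capture + Discovery"
    elif [ "$n" -le 55 ]; then echo "Analyze Capture"
    elif [ "$n" -le 63 ]; then echo "GATT Write + Notify"
    elif [ "$n" -le 68 ]; then echo "BLE Cleanup"
    elif [ "$n" -le 73 ]; then echo "Adapter Management"
    elif [ "$n" -le 76 ]; then echo "Final Cleanup"
    else echo "unknown"; return 1
    fi
}

# Example:
#   phase_for_test 47
```
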
## SSP Pairing: The auto_accept Flag

Numeric Comparison SSP requires **both sides** to confirm the passkey. In headless mode, this creates a deadlock:

1. Linux calls `bt_pair()` which blocks waiting for ESP32 confirmation
2. ESP32 can't receive the confirmation command because the LLM is blocked

**Solution:** The ESP32 firmware supports `auto_accept` mode:

```
esp32_set_ssp_mode(mode="numeric_comparison", auto_accept=true)
```

This makes the ESP32 automatically confirm SSP pairings, breaking the deadlock.

## Battery Service Test

The test suite creates a standard Battery Service (UUID 0x180F) on the ESP32:

1. Add Battery Service as primary GATT service
2. Add Battery Level characteristic (UUID 0x2A19) with read property
3. Set value to "4b" (hex 0x4B, i.e. 75%)
4. After BLE connection, call `bt_ble_battery` on Linux
5. Verify it returns 75

This tests the dedicated `bt_ble_battery` tool in mcbluetooth, which reads from the standard Battery Level characteristic.

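The Battery Level characteristic holds a single byte with the percentage, which is why the value written on the ESP32 is a two-digit hex string while the Linux side reports a decimal number. A small sketch of the conversion in both directions (the helper names are hypothetical):

```bash
# Hypothetical helper: convert a percentage (0-100) to the two-digit hex
# string of a single byte, as used when setting the characteristic value.
battery_to_hex() {
    printf '%02x\n' "$1"
}

# Hypothetical helper: convert a hex byte string back to decimal.
hex_to_battery() {
    printf '%d\n' "0x$1"
}

# Example:
#   battery_to_hex 75     # prints 4b
#   hex_to_battery 4b     # prints 75
```
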
## HCI Packet Capture

Tests 43-55 exercise the btsnoop capture functionality:

```
bt_capture_start(adapter="hci1", output_file="/tmp/ble-gatt-capture.btsnoop")
# ... BLE operations ...
bt_capture_stop(capture_id="...")
bt_capture_parse(filepath="...", max_packets=50)
bt_capture_analyze(filepath="...")
bt_capture_read_raw(filepath="...", count=20)
```

Typical captures include 100-150 packets covering:
- HCI commands (LE scanning, connection)
- ACL data (GATT operations)
- HCI events (connection complete, encryption)

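When a capture test fails, it can be useful to check outside the MCP tools whether a capture file was written at all. The sketch below assumes the output file uses the standard btsnoop format, as its extension suggests; such files begin with the ASCII identification pattern `btsnoop` followed by a NUL byte. The function name `is_btsnoop` is hypothetical.

```bash
# Hypothetical helper: return 0 if the file starts with the btsnoop
# identification pattern, non-zero otherwise (including unreadable files).
is_btsnoop() {
    # Read only the 7 printable bytes; the trailing NUL cannot be stored
    # in a shell variable.
    magic=$(dd if="$1" bs=1 count=7 2>/dev/null)
    [ "$magic" = "btsnoop" ]
}

# Example:
#   is_btsnoop /tmp/ble-gatt-capture.btsnoop && echo "capture file looks valid"
```

This only inspects the header; it says nothing about how many packets the file contains.
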
## Test Prompt Format

Test prompts follow a structured format:

```markdown
# Test Suite Title

## Phase N: Phase Name (Tests X-Y)

N. **Test Name**: Call `tool_name` with params — expected result

## Summary

After all tests, print a DETAILED summary table:

| # | Test | Result | Notes |
|---|------|--------|-------|
| 1 | Connect | PASS/FAIL | ... |
```

## Troubleshooting

### Serial port busy

```
Error: could not open port /dev/ttyUSB4
```

Check for other processes using the port:
```bash
lsof /dev/ttyUSB4
```

### btmon permission denied

```
Error: Failed to open HCI raw socket
```

Grant capability:
```bash
sudo setcap cap_net_raw+ep /usr/bin/btmon
```

### ESP32 not responding

Power cycle the ESP32 and check that the firmware is flashed:
```bash
# Monitor serial output
screen /dev/ttyUSB4 115200
```

Press the reset button; the serial output should show a boot event in JSON.

### Pairing timeout

Ensure `auto_accept=true` is set for SSP numeric comparison mode before initiating pairing from Linux.

## Example Results

A successful v5 run produces:

```json
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 320000,
  "num_turns": 88,
  "result": "All 76 tests passed..."
}
```

Key metrics from successful runs:
- Duration: ~5-6 minutes
- API turns: 80-90
- HCI packets captured: 100-150
- Cost: ~$1.50-1.70 USD

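The `duration_ms` field is in milliseconds, so the 320000 in the example corresponds to 5 minutes 20 seconds, in line with the typical duration listed above. A small sketch of that conversion (the helper name `format_duration` is hypothetical):

```bash
# Hypothetical helper: convert a duration in milliseconds to "XmYYs".
format_duration() {
    total_s=$(( $1 / 1000 ))
    printf '%dm%02ds\n' $(( total_s / 60 )) $(( total_s % 60 ))
}

# Example:
#   format_duration "$(jq -r '.duration_ms' results-v5.json)"
```
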