Docs: extend JOURNEY through the HA + harness + demo arc; add CHANGELOG

docs/JOURNEY.md — replaced the placeholder 'What's next' section with
seven new chronological entries covering everything that happened after
the panel-search comedy:

  - HA rebuild Phase A: poll-vs-push decision, pure-function helpers
    extraction, 61 unit tests with no HA imports
  - HA Phase B: the six new entity platforms, the Omni state-byte
    overload, security-mode-to-alarm-state mapping, the scene-platform
    skip decision
  - HA Phase C: services + diagnostics + repairs flow
  - 'wait, did we mock enough?' — catching the missing Thermostat
    (6) and Button (3) RequestProperties handlers BEFORE the HA
    harness ever touched the mock
  - HA test harness rough patches: requires-python conflict, pytest_socket
    fight, the CONF_ENTRY_ID-doesn't-exist-in-HA find, teardown hang
    fixed by converting configured_panel into a generator
  - Docker dev stack: mounting only src/ to dodge the read-only-venv
    problem with uv
  - Automated onboarding + screenshots: the auth_code OAuth dance, the
    template-endpoint device-id trick, playwright auto-injection of
    hassTokens, the discovery-during-onboarding nice surprise

Plus appended five new entries to 'Things worth remembering':
  - Pure functions are the cheapest thing in test suites
  - Mocking the entire protocol counterpart catches whole categories
  - pytest_socket + real network can coexist
  - The 'build without a real device' loop is unreasonably effective
  - (existing entries kept verbatim)

Final length: ~6800 words, 27 dated sections plus the lessons list.

CHANGELOG.md — new file. Single 2026.5.10 entry under Keep-a-Changelog-
ish format, broken into seven sections matching the project layers:
Protocol layer (RE findings), Library, Home Assistant integration,
Tests, Developer tooling, Documentation, Known gaps. Cites the source
line numbers for the two non-public protocol quirks. Lists every
public module + every entity platform. Linked to git tag template at
the bottom (release not pushed yet).

Tests still 351 + 1 skip. No code changed.
This commit is contained in:
Ryan Malloy 2026-05-10 16:29:41 -06:00
parent f6a09592f1
commit 7b4052624c
2 changed files with 353 additions and 12 deletions

85
CHANGELOG.md Normal file
View File

@ -0,0 +1,85 @@
# Changelog
All notable changes to this project. Date-based versioning ([CalVer](https://calver.org/), `YYYY.M.D`); each release date corresponds to a backwards-incompatible boundary.
## [2026.5.10] — 2026-05-10
First release. Working library + Home Assistant custom component, validated end-to-end against an in-process mock panel and a real HA instance running in Docker. Not yet validated against a live panel because the user's panel's network module is currently off.
### Protocol layer (the reverse engineering)
- Decompiled HAI's PC Access 3.17 (.NET) with ilspycmd; identified two namespaces — `HAI_Shared` (protocol/crypto/domain) and `PCAccess3` (UI). Decompilation lives in `pca-re/decompiled/`.
- Reverse-engineered the `.pca` and `PCA01.CFG` file format — Borland-Pascal LCG keystream XORed byte-by-byte. Two hardcoded keys:
- `KEY_PC01 = 0x14326573` for `PCA01.CFG`
- `KEY_EXPORT = 0x17569237` for import/export `.pca`
Per-installation `.pca` files use a third key derived from the panel's installer code; that key is stored in plaintext inside `PCA01.CFG` after first-stage decryption.
- Documented the Omni-Link II wire protocol byte-for-byte (`pca-re/notes/handshake.md`), including **two non-public quirks** absent from `jomnilinkII`, `pyomnilink`, and every public Omni-Link writeup we found:
1. **Session key = `ControllerKey[0:11] || (ControllerKey[11:16] XOR SessionID[0:5])`** — not just the panel's ControllerKey directly. Source: `clsOmniLinkConnection.cs:1886-1892`.
2. **Per-block XOR pre-whitening before AES** — first two bytes of every 16-byte block are XORed with the packet's 16-bit sequence number, same mask all blocks. Source: `clsOmniLinkConnection.cs:396-401`.
- Located a latent bug in PC Access itself: a `LargeVocabulary` skip-path uses a buffer sized for the non-LargeVocabulary case. Harmless on every shipping panel (the count check always satisfies the constraint) but documented in `pca-re/notes/body_parser.md`.
### Library — `omni_pca`
- `crypto.py` — AES-128-ECB with PaddingMode.Zeros semantics, `derive_session_key()`, per-block XOR pre-whitening, `encrypt_message_payload()`/`decrypt_message_payload()`. All citations to C# source line numbers.
- `opcodes.py` — Three IntEnums byte-exact to the C# decompilation: `PacketType` (12 values), `OmniLinkMessageType` (104 v1 opcodes), `OmniLink2MessageType` (83 v2 opcodes). Plus `ConnectionType`, `ProtocolVersion`.
- `packet.py` / `message.py` — Outer `Packet` (4-byte header + payload) and inner `Message` framing. CRC-16/MODBUS (poly `0xA001`).
- `pca_file.py` — Borland LCG XOR cipher, `PcaReader` with `u8/u16/u32/string8/string8_fixed/string16/string16_fixed`, `parse_pca01_cfg()`, `parse_pca_file()`. Account-info fields default `repr=False` to avoid accidental PII leakage in logs.
- `connection.py``OmniConnection`: async TCP, full secure-session handshake (4 packets), monotonic per-direction sequence numbers with `0xFFFF → 1` wraparound (skips 0), TCP framing that decrypts the first 16-byte block to learn the inner message length, reader task dispatching solicited replies to Futures and unsolicited messages to a queue, automatic reconnect on `OmniConnectionError`, custom exceptions (`HandshakeError`, `InvalidEncryptionKeyError`, `ProtocolError`, `RequestTimeoutError`).
- `models.py` — 21 typed frozen-slots dataclasses for every Omni object: `SystemInformation`, `SystemStatus`, `ZoneProperties/Status`, `UnitProperties/Status`, `AreaProperties/Status`, `ThermostatProperties/Status`, `ButtonProperties`, `ProgramProperties`, `CodeProperties`, `MessageProperties`, `AuxSensorStatus`, `AudioZoneProperties/Status`, `AudioSourceProperties/Status`, `UserSettingProperties/Status`. Plus `SecurityMode`, `HvacMode`, `FanMode`, `HoldMode`, `ZoneType`, `ObjectType` enums and temperature converters (Omni's linear `°F = round(raw * 9/10) - 40`).
- `commands.py``Command` IntEnum (64 values, sourced from `enuUnitCommand.cs` which is the canonical command enum despite the misleading name), `SecurityCommandResponse`, `CommandFailedError`.
- `client.py` — High-level `OmniClient` with 18 methods: `get_system_information`, `get_system_status`, `get_object_properties`, `list_*_names`, `execute_security_command`, `execute_command`, `get_object_status`, `get_extended_status`, `acknowledge_alerts`, typed wrappers (`turn_unit_on/off`, `set_unit_level`, `bypass_zone/restore_zone`, `set_thermostat_{system,fan,hold}_mode`, `set_thermostat_{heat,cool}_setpoint_raw`, `execute_button`, `execute_program`, `show_message`, `clear_message`), `events()` async iterator over typed `SystemEvent` objects.
- `events.py``SystemEvent` hierarchy. 26 typed subclasses (`ZoneStateChanged`, `UnitStateChanged`, `ArmingChanged`, `AlarmActivated/Cleared`, `AcLost/Restored`, `BatteryLow/Restored`, `UserMacroButton`, `PhoneLineDead/Restored`, …) + `UnknownEvent` catch-all. SystemEvents (opcode 55) packets carry multiple events; `parse_events()` returns a list. `EventStream` flattens batches across messages.
- `mock_panel.py` — Stateful async TCP server emulating an Omni Pro II controller. Handles handshake, `RequestSystemInformation/Status`, `RequestProperties` for Zone/Unit/Area/Thermostat/Button, `RequestStatus`/`RequestExtendedStatus`, `Command`, `ExecuteSecurityCommand`, `AcknowledgeAlerts`. State changes push synthesized `SystemEvents` packets back to the client.
- `__main__.py` — CLI: `omni-pca decode-pca <file> [--field controller_key|host|port] [--include-pii]`, `omni-pca mock-panel`, `omni-pca version`. PII opt-in.
### Home Assistant integration — `custom_components/omni_pca/`
- `coordinator.py``OmniDataUpdateCoordinator` with long-lived `OmniClient`, one-time discovery pass at first refresh (enumerates zones, units, areas, thermostats, buttons), periodic 30s poll for live state, background event-listener task consuming `client.events()` and patching state in-place on each push. `ConfigEntryAuthFailed` on `InvalidEncryptionKeyError` triggers HA's reauth flow.
- Eight platforms wrapping the library client:
- `alarm_control_panel` — one per area, supports Day/Night/Away/Vacation/DayInstant arm modes with code validation
- `binary_sensor` — one per binary zone (state + bypass diagnostic) plus 3 system-level (AC, battery, trouble)
- `button` — one per panel button macro
- `climate` — one per thermostat (OFF/HEAT/COOL/HEAT_COOL + fan + preset modes)
- `event` — one per panel, relays 12 typed event types to HA automations
- `light` — one per unit (dimmable; non-dimmable relays silently ignore brightness)
- `sensor` — analog zones (temperature/humidity/power), per-thermostat diagnostic temp/humidity/outdoor sensors, panel model+firmware sensor, last-event sensor
- `switch` — per-zone bypass control (config entity_category)
- `config_flow.py` — User + reauth steps. Host/port/controller_key with hex validation. Probes the panel via `OmniClient.get_system_information()` before creating the entry; surfaces auth/cannot_connect errors with HA-friendly strings.
- `services.yaml` + `services.py` — 7 services (`bypass_zone`, `restore_zone`, `execute_program`, `show_message`, `clear_message`, `acknowledge_alerts`, `send_command`). Idempotent registration; each takes a `config_entry` selector so users pick the panel.
- `diagnostics.py` — Snapshot dump with controller key redacted and zone/unit/area names sha256-hashed.
- `helpers.py` — Pure functions for everything HA-shape-dependent: zone-type→device-class, brightness conversion, HVAC mode round-trip, temperature inverse, alarm state translation, event-type strings. No `homeassistant.*` imports; 61 unit tests covering it.
- `manifest.json``iot_class: local_push`, `version: 2026.5.10`, `config_flow: true`, requires `omni-pca==2026.5.10`.
- `hacs.json` at project root for HACS distribution.
### Tests
- **351 passing, 1 skipped.** Ruff clean across `src/`, `tests/`, `custom_components/`.
- 17 e2e tests connecting `OmniClient` to `MockPanel` over real TCP, proving the full handshake + encryption + framing stack roundtrips.
- 12 HA-side integration tests using `pytest-homeassistant-custom-component` — boot HA in-process, drive the config flow, exercise services, verify state mutations. Full HA-side suite runs in <1 second.
- 61 unit tests on `custom_components/omni_pca/helpers.py` running without HA installed.
- Unit tests for every library module (crypto KAT vectors, CRC-16, packet/message ser-de, .pca decrypt, command payloads, event parsing).
### Developer tooling
- `dev/docker-compose.yml` + `dev/Makefile` — One-command HA + MockPanel stack for manual smoke testing and screenshot capture.
- `dev/run_mock_panel.py` — Long-running mock seeded with 5 zones, 4 units, 2 areas, 2 thermostats, 3 buttons, 2 user codes.
- `dev/screenshot.py` — End-to-end automated demo: onboards HA via REST, adds the integration via config-flow API, drives headless chromium via playwright to capture six deep-linked PNGs (overview, integrations list, integration detail, device page, entities table, developer states).
### Documentation
- `docs/JOURNEY.md` — 6,000+ word raw chronological narrative from "pile of binaries" through "351 tests green, screenshots captured". Source material for future writeups.
- `pca-re/notes/findings.md` — RE technical findings (cipher, file format, protocol overview).
- `pca-re/notes/handshake.md` — Byte-level handshake spec with C# source line citations.
- `pca-re/notes/body_parser.md` — .pca body schema + the LargeVocabulary latent bug.
- Top-level `README.md` — Library + HA quick start.
- `custom_components/omni_pca/README.md` — Entity table, services list, automation example, troubleshooting.
- `dev/README.md` — Docker dev stack walkthrough.
### Known gaps
- **Live panel validation**: blocked on the user's panel's Ethernet module being enabled. Mock panel proves the stack roundtrips; the live lap is one TCP connect away once the panel is reachable.
- **Programs discovery**: the library's v1.0 has no `RequestProperties` path for Program objects; the HA coordinator returns an empty programs dict. Programs can still be executed by index via the `omni_pca.execute_program` service.
- **PyPI publish**: `omni-pca` not yet on PyPI; HA `manifest.json` requirements line will only resolve once it is. For now users either install the wheel manually or pip-install from a Git URL.
- **HACS submission**: pending live-panel validation.
[2026.5.10]: https://github.com/rsp2k/omni-pca/releases/tag/v2026.5.10

View File

@ -639,20 +639,240 @@ enables the Ethernet module, save, reboot. Then the live validation
becomes a five-minute test. Until then, the mock is the best we have, becomes a five-minute test. Until then, the mock is the best we have,
and the mock is a faithful enough emulator that we trust it. and the mock is a faithful enough emulator that we trust it.
## What's next ## 2026-05-10 evening — HA rebuild Phase A
The Home Assistant custom_component is being rebuilt on top of the v1.0 The first HA scaffold (a placeholder `binary_sensor` for zones, written
library surface — alarm_control_panel, light, switch, climate, sensor, before the library was complete) needed to come down and get rebuilt on
scene, button, event entities, plus services.yaml and diagnostics. That the v1.0 surface. The interesting design choice: how should the
work is in progress and will be validated as soon as we can bring the coordinator pull state?
panel's network module online.
When we do, the moment of truth is one TCP connect to port 4369 and Option A: re-poll everything every N seconds.
one `RequestSystemInformation` exchange. If it comes back with Option B: rely on the panel's unsolicited push messages and only poll
`Omni Pro II / 2.12 r1`, the entire stack — file decryption, key as a backstop.
extraction, key derivation, XOR pre-whitening, AES, the works — was
right end to end. If it comes back with `ControllerSessionTerminated`, We picked B. The Omni panel is genuinely chatty — when a zone trips,
we missed something subtle. The mock says we didn't. We'll find out. when an area arms, when AC fails, when a unit toggles, the panel pushes
a `SystemEvents` packet within a few hundred ms. Our `OmniConnection`
already decodes those into typed `SystemEvent` objects via an async
iterator (`client.events()`). The coordinator now runs a long-lived
background task consuming that iterator and patches the relevant slice
of state in-place, then calls `async_set_updated_data()` so HA reacts
immediately. The 30-second poll is a safety net for state we missed.
The piece that took longer than expected was extracting pure functions
from the entity-class soup so we could unit-test without HA installed
in the venv. We ended up with `helpers.py`: zone-type → device-class
mapping, latched-vs-current-condition logic per zone family, name
prettifier (`FRONT_DOOR``Front Door`). 61 unit tests for `helpers.py`
alone, all running without importing `homeassistant.*`. Sounds excessive
until you remember that pure-function tests are the only ones that run
in <100ms; you don't want to wait 15 seconds for HA to boot just to
verify that zone-type 32 (FIRE) maps to `BinarySensorDeviceClass.SMOKE`.
## 2026-05-10 evening — HA Phase B (the entity build-out)
Six platforms in one pass: `alarm_control_panel` (per area, with code
validation), `light` (per unit, dimmable), `switch` (per zone for
bypass control), `climate` (per thermostat, full HVAC modes),
`sensor` (analog zones + thermostat readings + panel telemetry),
`button` (per panel macro), `event` (one per panel relaying typed
push events as HA event_types).
The mapping work was repetitive but mostly mechanical. The interesting
bits:
- The Omni unit "state" byte is overloaded: 0=off, 1=on (relay),
100..200=brightness percent (state - 100), plus weird ranges for
scene levels (2..13) and ramping codes (17..25). Encoded as a pair
of pure helpers (`omni_state_to_ha_brightness` /
`ha_brightness_to_omni_percent`) so the conversion is unit-tested.
- Omni's `SecurityMode` enum has *both* steady-state values (Off=0,
Day=1, Away=3, …) *and* arming-in-progress values (ArmingDay=9,
ArmingAway=11, …). The HA `AlarmControlPanelState` mapping needs
to bucket the 9..14 range into HA's `arming` state regardless of
destination. Plus alarm_active overrides everything to `triggered`,
and entry-timer running means `pending`, exit-timer means `arming`.
All of this lives in one pure `security_mode_to_alarm_state()`
function so it's unit-testable end to end.
- The HA `event` platform is newer than I'd realised. It exposes
push events as a single entity per integration with `event_types`
and `event_data`. Automations key on `platform: event` filtering
by `event_type`. We surface 12 event-type strings:
`zone_state_changed`, `unit_state_changed`, `arming_changed`,
`alarm_activated`, `alarm_cleared`, `ac_lost`, `ac_restored`,
`battery_low`, `battery_restored`, `user_macro_button`,
`phone_line_dead`, `phone_line_restored`, plus an `unknown`
catch-all for the 14 less common SystemEvent subclasses.
Skipped the `scene` platform entirely. Omni "scenes" are actually
just user-named button macros — the underlying call is the same
`execute_button` that the `button` platform already exposes. Adding
a parallel scene wrapper would just double-count entities. Documented
the choice in the integration README.
## 2026-05-10 evening — HA Phase C (services + diagnostics)
Seven services, all routed through a `services.py` module that's
idempotently registered on first config-entry setup and unloaded on
the last config-entry teardown:
```
omni_pca.bypass_zone
omni_pca.restore_zone
omni_pca.execute_program
omni_pca.show_message
omni_pca.clear_message
omni_pca.acknowledge_alerts
omni_pca.send_command (raw escape hatch)
```
Each takes an `entry_id` field with HA's `config_entry` selector so
the UI gives users a panel picker. `services.yaml` declares the
schema; `services.py` enforces it via `voluptuous`.
Diagnostics endpoint dumps a redacted snapshot for bug reports:
`controller_key` redacted via `async_redact_data`; zone/unit/area
names hashed with sha256 so structure is visible without leaking
PII; counts per object type; last event class; last update success
timestamp. Useful one day, useless until then, but it's three lines
and HA users expect it.
## 2026-05-10 evening — "wait, did we mock the panel enough?"
The thinking-out-loud moment that caught a real bug. The HA test
harness was about to be set up; before doing that, the question was:
does the mock actually answer every opcode the HA coordinator calls?
Mapped HA-side calls to mock-side handlers. Most matched. But the
HA coordinator walks `RequestProperties` for object types Thermostat
(6) and Button (3), and the mock's `_reply_properties` only knew
about Zone/Unit/Area. Both would have returned `Nak`, the coordinator
would have moved on, and HA would have discovered zero thermostats
and zero buttons no matter how `MockState` was seeded.
Added the two handlers (each ~30 lines: build the per-object
Properties body matching the wire format documented in
`models.ThermostatProperties.parse` / `models.ButtonProperties.parse`),
plus two e2e tests that drive the walk with `OmniClient` and assert
the parses come out clean. Caught it before HA ever touched the mock.
This is the kind of bug that *would* have shown up the first time
you tried the integration: zero climate entities, zero button
entities, no error message because the panel just said "no, I have
no thermostats here". You'd spend an hour staring at it. Mock-the-
whole-protocol pays for itself the first time it catches one of
these.
## 2026-05-10 evening — HA test harness, the rough patches
`pytest-homeassistant-custom-component` is the standard HA dev test
harness. It pins to a specific HA version (we got `2026.5.1` paired
with HA `2026.5.x`) and provides fixtures to spin up HA in-process
per test. Sounds simple. Three rough patches:
1. **`requires-python` conflict.** Our library targets `>=3.12`. HA
`2026.5+` requires `>=3.14.2`. uv resolves dependency groups
against the project's `requires-python` and refused to install
the test harness because it couldn't find a Python version
satisfying both. Bumped the project to `>=3.14.2` — fine for HA
users (HA already needs 3.14), library users on older Python
pin to a previous omni-pca version.
2. **`pytest_socket` blocks our e2e tests.** The HA harness installs
`pytest_socket` globally to keep HA unit tests hermetic. That
broke our existing 17 e2e tests that legitimately need to talk
to a localhost MockPanel over a real TCP socket. Fix: a top-
level `tests/conftest.py` autouse fixture requesting the
harness's `socket_enabled` fixture, which re-enables sockets by
default. HA-side tests can opt back into the strict policy if
they want.
3. **`CONF_ENTRY_ID` doesn't exist in HA.** Our `services.py` was
importing `CONF_ENTRY_ID` from `homeassistant.const`. The harness
import-test caught it: HA exports the constant as
`ATTR_CONFIG_ENTRY_ID`, not `CONF_ENTRY_ID`. Without the harness,
this would have crashed on first install in a real HA. Worth the
harness already.
Then teardown started hanging. Each test passed (5-15 seconds for HA
boot + entity discovery + assertions) but the harness's
`verify_cleanup` timed out waiting for the coordinator's background
event-listener task to finish. The coordinator's `async_shutdown()`
cancels it cleanly — but the harness was tearing the test down without
calling unload first. Fix: convert the `configured_panel` fixture into
a generator and call `hass.config_entries.async_unload()` in the
teardown branch. With that, all 12 HA-side tests run in 0.74 seconds
total (each one boots HA, runs config flow, asserts, unloads).
Final score: 351 tests pass, 1 skipped (the gitignored `.pca`
fixture), ruff clean across `src/ tests/ custom_components/`.
## 2026-05-10 late evening — docker dev stack
Wanted a one-command setup so the integration could be browsed
manually and screenshotted for the README. `docker-compose.yml` with
two services: real HA `2026.5` from upstream + a sidecar running
the mock panel.
The interesting wrinkle: the mock panel container needs to import
`omni_pca`. Mounting the project read-only and running `uv` inside
the container failed because uv tried to recreate the host's
`.venv` and the mount was read-only. Fix: mount only `src/` and
`run_mock_panel.py`, set `PYTHONPATH=/tmp/mock/src`, install just
`cryptography` via `uv pip install --system`, run the script
directly. No package install, no venv, just a Python interpreter
with the right import path.
## 2026-05-10 late evening — automated HA onboarding + screenshots
`dev/screenshot.py` does the entire flow:
1. POST `/api/onboarding/users` to create the demo user (returns
`auth_code`)
2. POST `/auth/token` with `grant_type=authorization_code` to get
the access token (HA doesn't support password grant)
3. On subsequent runs: log in via `/auth/login_flow` (cleaner than
re-using a saved token; the token expires in 30 minutes anyway)
4. POST `/api/config/config_entries/flow` to start the omni_pca
config flow, then post the user-input dict to complete it
5. Cache the panel's device_id by calling HA's template endpoint
(`{{ device_id('sensor.omni_pro_ii_panel_model') }}`) — which is
a delightfully clean way to ask HA "what's the device id for this
entity?"
6. Launch headless chromium via the `playwright` Python package,
inject `localStorage.hassTokens` so it skips the login screen,
navigate to six deep-linked pages and screenshot each
The whole script is ~250 lines and produces six PNGs. The
`04-panel-device.png` is the headline shot: HA's device page for
"Omni Pro II / by HAI / Leviton / Firmware: 2.12r1" with all the
Controls (lights, buttons, areas, thermostats), Activity panel,
Diagnostics download. Every entity from the mock visible in real HA
UI in the right shape.
A nice side-effect: HA's onboarding wizard has a "We found compatible
devices!" step that scans the network for known integrations. Our
manifest got picked up — "HAI/Leviton Omni Panel" appeared in that
list during onboarding even though we hadn't done anything explicit
to register it for discovery. The integration name and `iot_class`
in `manifest.json` was enough.
## What's left for future sessions
The panel's network module is still off. When it comes back online,
the moment of truth is one TCP connect to `192.168.1.6:4369` (or
wherever it lives now) and one `RequestSystemInformation`. If the
reply is `Omni Pro II / 2.12 r1` the entire stack — file decryption,
key extraction, key derivation, XOR pre-whitening, AES, framing,
sequencing — was right end to end. The mock says yes. We'll find out.
Other backlog items:
- `Programs` discovery (no `RequestProperties` opcode for Programs;
current implementation returns an empty dict — needs a real
protocol path or a separate `RequestProgramData` style call)
- HACS submission once we've validated against the live panel
- Maybe publish `omni-pca` to PyPI so the HA `manifest.json`
requirements line works without a wheel install
--- ---
@ -693,3 +913,39 @@ satisfies `Count >= Max` for the affected blocks, so the bug never
fires. But it would, on a model that doesn't, and PC Access would fires. But it would, on a model that doesn't, and PC Access would
silently mis-parse its own config file. The kind of bug that lives silently mis-parse its own config file. The kind of bug that lives
in shipping code for a decade because nobody runs the unhappy path. in shipping code for a decade because nobody runs the unhappy path.
**Pure functions are the cheapest thing in test suites.** The HA
custom_component grew six entity platforms before it had any HA
test harness installed. Every translation between Omni's wire
encoding and HA's UI encoding lives in `helpers.py` as a pure
function with no HA imports. 61 unit tests for those alone, all
running in <100ms. When the harness arrived, the only thing left
to test was the wiring itself — and the wiring tests run in 0.74
seconds for the entire 12-test HA-side suite because the pure
parts already had coverage.
**Mocking the entire protocol counterpart, not just the surface,
catches whole categories of bugs.** When the mock and the client
were both being grown, a "did we mock enough?" check caught two
missing `RequestProperties` handlers (Thermostat and Button). HA
would have discovered zero of either type silently. With the
real-world panel offline, mock-the-protocol is the only way to
trust the stack — but even with the panel available, it's the
only way to trust changes without rebooting hardware between every
edit.
**`pytest_socket` and "real network in tests" can coexist.** HA's
test harness disables sockets globally to keep core unit tests
hermetic. Our integration tests need real TCP to talk to the in-
process MockPanel. The fix is one autouse fixture that requests
the harness's `socket_enabled` fixture; takes ten seconds, lets
both worlds work without modification.
**The "build the integration without a real device" loop is
unreasonably effective.** With the docker dev stack, the full
flow is `make dev-up`, click through HA onboarding (or run
`screenshot.py` to do it via REST), see your entities. Make a
code change, `docker compose restart homeassistant`, refresh the
browser, see the change. Repeat. The panel itself becomes optional
for ~95% of the development. The other 5% is the live-validation
lap when the panel comes back online.