Ryan Malloy 9703279bc8 Phase 19: resilience tests via fault injection (v2026.05.04.3)
Fills the highest-priority gap from the test-adequacy audit:
connection-failure recovery. 12 new integration tests using a
thread-based TCP proxy (ControlledProxy) that can be kill()'d at
any moment to simulate network drops or server crashes via TCP RST
(SO_LINGER=0).

Coverage:
* Network drop mid-SELECT — OperationalError, not hang
* Network drop after describe, before fetch
* Network drop during fetch (already-materialized rows still
  readable; fresh execute fails)
* Local socket forced-close (kernel-level disconnect simulation)
* I/O error marks connection unusable post-failure
* Pool evicts connection that died mid-`with` block (size drops)
* Pool revives after all idle connections died (health check on
  acquire mints fresh)
* Async cancellation via asyncio.wait_for — pool stays usable
* Cursor reusable after SQL error
* Connection survives cursor close after error
* Sustained pool load (50 acquire/release cycles, no leak)
* read_timeout fires on a hung connection within bounds

Catches the failure classes that bite production users:
* Hangs (waiting forever on dead socket)
* Silent corruption (EOF treated as valid tuple)
* Double-fault (cleanup raises after primary error)
* Pool poisoning (broken connection returned to pool)
* Stale cursor reuse across error boundaries

Helper:
* tests/_proxy.py — ControlledProxy: thread-based TCP forwarder
  with kill() for fault injection. Two-thread pump model. SO_LINGER=0
  for RST-on-close (mimics router drop).

Total: 69 unit + 203 integration = 272 tests.

Remaining gaps from the audit (UTF-8 multibyte locale, server-version
matrix, performance benchmarks) are real but lower-severity. Phase 19
addressed the one most likely to bite production deployments.
2026-05-04 16:57:06 -06:00
..
2026-05-04 14:50:27 -06:00
2026-05-04 14:46:53 -06:00