Phase 27: Wire lock + async cancellation eviction (2026.05.05.1)
Closes Hamilton audit Critical #2 (concurrency / wire lock) and High #3 (async cancellation evicts cleanly). Phase 26 fixed what gets returned to the pool; Phase 27 fixes what can interleave on the wire while it's running.

What changed:

connections.py:
* Added Connection._wire_lock = threading.RLock(). Wrapped commit(), rollback(), and fast_path_call() under the lock.
* _ensure_transaction() documents the lock as a precondition AND asserts ownership at runtime (_wire_lock._is_owned()) so a future caller adding a third call site fails loudly.
* close() tries to acquire the wire lock with a 0.5s timeout before SQ_EXIT; it skips the polite exit and force-closes if the wire is busy.

cursors.py:
* execute() body extracted into _execute_under_wire_lock() and called under the lock.
* executemany() body wrapped inline.
* _sfetch_at() wrapped — covers all scrollable fetch_* methods that delegate to it.
* close() locks the CLOSE+RELEASE for scrollable cursors.

pool.py:
* release() acquires conn._wire_lock with a 5s timeout before rollback. On timeout: log a WARNING and evict the connection. Constant _RELEASE_WIRE_LOCK_TIMEOUT for tunability.

aio.py:
* AsyncConnectionPool.connection() now catches CancelledError / TimeoutError separately and routes them to broken=True. Combined with the wire lock, asyncio.wait_for around aio DB calls is now safe.
* Updated the docstring; mirrored in docs/USAGE.md.

Margaret Hamilton review surfaced three actionable conditions, all addressed before tagging:
* The cancellation test used contextlib.suppress — it could pass without exercising the cancellation path on a fast runner. Switched to pytest.raises so the test fails if the timeout doesn't fire.
* The _ensure_transaction precondition was documented but unchecked at runtime. Added an assert self._wire_lock._is_owned() guard.
* Connection.close() was unsynchronized. It now tries a 0.5s acquire before SQ_EXIT.
Two new regression tests in tests/test_pool.py:
* test_concurrent_threads_on_one_connection_dont_interleave_pdus (without the lock: garbled results / hangs)
* test_async_wait_for_cancellation_evicts_connection (asserts the pool size shrinks; cancellation actually fires)

72 unit + 228 integration + 28 benchmark = 328 tests; ruff clean.

Hamilton verdict: PRODUCTION READY WITH CAVEATS (was) -> CAVEATS NARROWED FURTHER (now). 0 critical, 2 high remaining (cursor finalizers + bare-except in error drain) — both Phase 28 scope.
This commit is contained in:
parent
5c4a7a57f1
commit
6afdbcabb3
60 CHANGELOG.md
@@ -2,6 +2,66 @@
All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.

## 2026.05.05.1 — Wire lock + async cancellation eviction (Phase 27)

Closes Hamilton audit findings **Critical #2** (concurrency / wire lock) and **High #3** (async cancellation evicts cleanly). Phase 26 fixed *what gets returned* to the pool; Phase 27 fixes *what can interleave* on the wire while it's running.

### What changed

**1. Per-connection wire lock** (`src/informix_db/connections.py`):

- Added `Connection._wire_lock = threading.RLock()`. Wrapped `commit()`, `rollback()`, and `fast_path_call()` in `with self._wire_lock:`.
- `_ensure_transaction()` documents the lock as a precondition and **asserts ownership** (`self._wire_lock._is_owned()`) — a future caller adding a third call site fails loudly in tests rather than corrupting wire state in production.
- `Connection.close()` now tries to acquire the wire lock with a 0.5s timeout before sending `SQ_EXIT`. If another thread is mid-operation, it skips the polite exit and force-closes the socket; the in-flight thread observes EOF on its next read.
- RLock (not Lock) because `pool.release()` holds the lock with a timeout, then calls `conn.rollback()`, which itself acquires it.
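Why the RLock matters can be shown in isolation. A minimal sketch, assuming nothing about the real driver (the `Conn` and `release` names below are illustrative, not the driver's API): with a plain `Lock`, the nested acquire inside `rollback()` would deadlock.

```python
import threading

class Conn:
    """Illustrative stand-in for a connection with a re-entrant wire lock."""
    def __init__(self):
        self._wire_lock = threading.RLock()

    def rollback(self):
        # Second acquire by the same thread: fine with RLock,
        # deadlock with a plain Lock.
        with self._wire_lock:
            pass  # wire I/O would happen here

def release(conn, timeout=5.0):
    # Pool-style release: timeout-acquire, then call a method
    # that re-acquires the same lock on the same thread.
    if not conn._wire_lock.acquire(timeout=timeout):
        return "evict"
    try:
        conn.rollback()
        return "recycled"
    finally:
        conn._wire_lock.release()

print(release(Conn()))  # recycled
```

Swapping the `RLock` for a `Lock` here hangs on `conn.rollback()`, which is exactly the release-path shape described above.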
**2. Cursor wire methods locked** (`src/informix_db/cursors.py`):

- `Cursor.execute()` body extracted into `_execute_under_wire_lock()` and called under the lock.
- `Cursor.executemany()` body wrapped inline.
- `Cursor._sfetch_at()` (the SQ_SFETCH primitive used by every scrollable `fetch_*` method) wrapped — every scrollable cursor op gets the lock for free.
- `Cursor.close()` acquires the lock for the CLOSE+RELEASE on scrollable cursors.
- `read_blob_column` and `write_blob_column` inherit the lock through their internal `self.execute()` calls.
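The extract-the-body split can be sketched generically (a hypothetical `FakeCursor`; only the method-name pattern comes from the changelog): the public entry point owns the lock, and the private body assumes it is already held.

```python
import threading

class FakeCursor:
    """Illustrative shape of the extract-body-under-lock pattern."""
    def __init__(self):
        self._wire_lock = threading.RLock()  # per-connection in the real driver
        self.executed = []

    def execute(self, sql):
        # Public entry point: acquire the lock, then run the wire-bound body.
        with self._wire_lock:
            self._execute_under_wire_lock(sql)

    def _execute_under_wire_lock(self, sql):
        # Caller MUST hold the lock. Keeping the body separate lets other
        # already-locked paths reuse it without a redundant acquire.
        self.executed.append(sql)

cur = FakeCursor()
cur.execute("SELECT 1")
print(cur.executed)  # ['SELECT 1']
```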
**3. Pool release with timeout-acquire** (`src/informix_db/pool.py`):

- `release()` now acquires `conn._wire_lock` with a `_RELEASE_WIRE_LOCK_TIMEOUT = 5.0` budget before rolling back. If a still-running worker thread holds the lock past 5s, the connection is evicted instead of recycled, and the eviction is logged at WARNING level via the Phase 26 logger.
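The timeout-acquire policy can be demonstrated with a bare `RLock` (the `release` helper and the shortened 0.2s budget below are illustrative; the driver's constant is 5.0s):

```python
import threading
import time

_RELEASE_WIRE_LOCK_TIMEOUT = 0.2  # shortened from the driver's 5.0s for the demo

def release(conn_lock, evicted):
    # Sketch of the release path: try the lock within the budget,
    # evict the connection if a worker still holds it.
    if not conn_lock.acquire(timeout=_RELEASE_WIRE_LOCK_TIMEOUT):
        evicted.append(True)  # real code logs a WARNING here, then evicts
        return
    try:
        pass  # rollback + return-to-idle would happen here
    finally:
        conn_lock.release()

lock = threading.RLock()
evicted = []

def busy_worker():
    # Simulates a still-running query holding the wire lock too long.
    with lock:
        time.sleep(0.5)

t = threading.Thread(target=busy_worker)
t.start()
time.sleep(0.05)        # let the worker grab the lock first
release(lock, evicted)  # acquire times out after 0.2s
t.join()
print(evicted)  # [True] — evicted, not recycled
```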
**4. Async cancellation → eviction** (`src/informix_db/aio.py`):

- `AsyncConnectionPool.connection()` now catches `(asyncio.CancelledError, asyncio.TimeoutError)` separately and routes them to `broken=True`. Combined with the wire lock, this means `asyncio.wait_for` around `aio` DB calls is now safe — the connection is either successfully released (worker finished in time) or evicted (worker exceeded the timeout); it is never returned to the pool in a poisoned state.
- Removed the Phase 26 cancellation warning from the docstring; it now describes the new safety guarantee explicitly.
- Mirrored in the `docs/USAGE.md` async section.
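The evict-on-cancellation shape can be sketched with a toy pool (a dict of lists standing in for the real `AsyncConnectionPool`; only the exception routing mirrors the change described above):

```python
import asyncio
from contextlib import asynccontextmanager

@asynccontextmanager
async def connection(pool):
    conn = pool["idle"].pop()
    broken = False
    try:
        yield conn
    except (asyncio.CancelledError, asyncio.TimeoutError):
        # The to_thread worker may still be mid-I/O; the wire state
        # can't be trusted, so route the connection to eviction.
        broken = True
        raise
    finally:
        (pool["evicted"] if broken else pool["idle"]).append(conn)

async def main():
    pool = {"idle": ["conn-1"], "evicted": []}
    try:
        async with connection(pool):
            await asyncio.wait_for(asyncio.sleep(10), timeout=0.01)
    except asyncio.TimeoutError:
        pass  # the timeout fired inside the checkout
    return pool

pool = asyncio.run(main())
print(pool["evicted"], pool["idle"])  # ['conn-1'] [] — never returned to idle
```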
### Margaret Hamilton review pass

Two passes (the Phase 26 review and the system-wide audit) had already shaped this design. The Phase 27 review surfaced three actionable conditions:

- **Test reliability**: the cancellation regression test used `contextlib.suppress(asyncio.TimeoutError)` — it would pass silently if the timeout never fired (e.g., on a fast CI runner where the query completes within 1ms). **Fixed**: switched to `pytest.raises(asyncio.TimeoutError)` so the test fails if the cancellation path isn't actually exercised.
- **Defensive guard for `_ensure_transaction`**: "caller must hold the wire lock" was documented as a precondition, but there was no runtime check. **Fixed**: added `assert self._wire_lock._is_owned()` so a future caller that forgets to lock fails loudly in tests.
- **Symmetry in `Connection.close()`**: the polite SQ_EXIT was unsynchronized — it could interleave with another thread's PDU. **Fixed**: try-acquire with a 0.5s timeout; if busy, skip SQ_EXIT and force-close.
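The ownership assert can be reproduced with a bare `RLock` (an illustrative `Conn`; `_is_owned()` is the CPython-private method named above):

```python
import threading

class Conn:
    """Illustrative connection carrying only the wire lock and the guard."""
    def __init__(self):
        self._wire_lock = threading.RLock()

    def _ensure_transaction(self):
        # Guard: fail loudly if the caller forgot to take the lock.
        assert self._wire_lock._is_owned(), "caller must hold _wire_lock"

conn = Conn()
with conn._wire_lock:
    conn._ensure_transaction()  # lock held by this thread: passes

try:
    conn._ensure_transaction()  # lock not held: the guard fires
except AssertionError as exc:
    print("guard fired:", exc)
```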
Plus one cross-phase note: Phase 27 makes Hamilton's High #5 (cursor finalizers) more visible, because cross-thread `__del__` invocation could deadlock on the wire lock. Tracked for Phase 28; Phase 27 doesn't introduce the underlying hazard.
### Tests

Two new regression tests in `tests/test_pool.py`:

- **`test_concurrent_threads_on_one_connection_dont_interleave_pdus`** — two threads each run 20 distinct queries on a shared `Connection`. Without the wire lock, PDU interleaving causes wrong results, `ProtocolError`, or hangs. With the lock, both threads complete with correct results.
- **`test_async_wait_for_cancellation_evicts_connection`** — spawns a slow query under `asyncio.wait_for(timeout=0.001)`, asserts (via `pytest.raises`) that the timeout actually fires, then verifies the pool size shrinks (connection evicted, not returned to idle).
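The first test's shape can be sketched against a fake connection whose "wire" flags concurrent entry (everything here is illustrative; the real test runs its 20 distinct queries per thread against a live server):

```python
import threading

class FakeConn:
    """Detects interleaved access the way a real wire would garble PDUs."""
    def __init__(self):
        self._wire_lock = threading.RLock()
        self._busy = False
        self.interleaves = 0

    def query(self, i):
        with self._wire_lock:          # without this lock, interleaves could go positive
            if self._busy:
                self.interleaves += 1  # another thread was mid-"PDU"
            self._busy = True
            result = i * i             # stand-in for the round-trip
            self._busy = False
            return result

conn = FakeConn()
results = {}

def worker(name):
    results[name] = [conn.query(i) for i in range(20)]

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(conn.interleaves, results["a"] == results["b"])  # 0 True
```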
Total: 72 unit + 228 integration + 28 benchmark = **328 tests**.
### Hamilton verdict trajectory
| Audit pass | Verdict |
|---|---|
| Phase 21 era | (no audit yet) |
| System-wide audit (pre-Phase 26) | PRODUCTION READY WITH CAVEATS — 2 critical, 3 high |
| Post-Phase 26 | CAVEATS NARROWED — 1 critical, 3 high |
| **Post-Phase 27** | **CAVEATS NARROWED FURTHER — 0 critical, 2 high** |
Remaining audit items (deferred to Phase 28):

- High #4: bare `except: pass` in the `_raise_sq_err` drain
- High #5: no cursor finalizers (server-side resource leak on a mid-fetch raise)
- Plus 5 medium one-liners

## 2026.05.05 — Pool rollback-on-release (Phase 26): CRITICAL data-correctness fix

Fixes the dirty-pool-checkout bug surfaced by Margaret Hamilton's system-wide audit. **This is the most important fix in the project's history so far** — it eliminates a class of silent data-correctness failures that affect any application using both the connection pool and non-autocommit transactions.

--- a/docs/USAGE.md
+++ b/docs/USAGE.md
@@ -494,21 +494,22 @@ await pool.close()
 The async API mirrors the sync API one-to-one. Each blocking I/O call is offloaded to a worker thread via `asyncio.to_thread` — the event loop never blocks; concurrent queries across an `asyncio.gather` actually run in parallel up to `max_size`.
 
-### Cancellation caveat — don't wrap `aio` calls with `asyncio.wait_for`
+### Cancellation and timeouts
 
-`asyncio.to_thread` does not interrupt the underlying worker thread when the awaitable is cancelled. If you wrap a query in `asyncio.wait_for(...)` and the timeout fires, the worker thread keeps running on the socket while the connection is being released back to the pool — and the pool's release path runs its own wire I/O (the Phase 26 transaction rollback). Two threads writing to one socket = wire desync = a poisoned connection in the pool.
-
-**Use connection-level timeouts instead:**
+Both styles are safe under Phase 27:
 
 ```python
-# Good — socket-level timeout, no to_thread race
+# Connection-level — socket-layer timeout, raises OperationalError
 conn = await aio.connect(..., read_timeout=30.0)
 
-# Bad — until Phase 27 lands the per-connection wire lock
+# Awaitable-level — works because the pool evicts on CancelledError
+# and the per-connection wire lock prevents interleaved I/O
 await asyncio.wait_for(cur.execute(big_query), timeout=30.0)
 ```
 
-`connect_timeout` and `read_timeout` apply at the socket layer; on a frozen server they raise a clean `OperationalError` and the cursor/connection state stays consistent.
+How it works: every wire op acquires the connection's `_wire_lock` (a re-entrant lock). When an awaitable is cancelled, the underlying `to_thread` worker may still be running — but the pool's `release()` waits up to 5 seconds for the lock. If the worker finishes in time, normal release proceeds (with a transaction rollback if needed). If it doesn't, the connection is evicted instead of recycled. The pool never returns a connection that two threads are touching.
+
+Pick whichever timeout style fits your code; you don't need to choose for safety reasons.
 
 ## TLS

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.05"
+version = "2026.05.05.1"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }

--- a/src/informix_db/aio.py
+++ b/src/informix_db/aio.py
@@ -15,24 +15,28 @@ event loop yields during each ``await``; a worker thread does the
 actual socket I/O. Only differs for thousands-of-concurrent-connections
 workloads, which need native-async (Phase 17 if anyone asks).
 
-.. warning::
+.. note::
 
-   **Cancellation caveat.** ``asyncio.to_thread`` does not interrupt
-   the underlying OS thread when the awaitable is cancelled — the
-   thread keeps running until the sync call returns naturally. If you
-   wrap a query in ``asyncio.wait_for`` and the timeout fires:
-
-   1. The async call raises ``TimeoutError``.
-   2. The worker thread is still mid-I/O on the socket.
-   3. The connection's pool ``release()`` runs (under Phase 26 it
-      does its own wire I/O for the rollback).
-   4. The two threads can interleave bytes on the socket → wire desync.
-
-   Until Phase 27 lands a per-connection wire lock, **avoid wrapping
-   ``aio`` DB calls with ``asyncio.wait_for``**. Use the connection's
-   ``connect_timeout`` and ``read_timeout`` parameters instead — those
-   apply at the socket level (no to_thread race) and produce clean
-   ``OperationalError`` on timeout.
+   **Cancellation handling (Phase 27).** ``asyncio.to_thread`` does
+   not interrupt the underlying worker thread when the awaitable is
+   cancelled — the thread keeps running until the sync call returns
+   naturally. The driver handles this in two ways:
+
+   1. Every wire operation acquires the connection's ``_wire_lock``
+      (an ``RLock``). Two threads — including a still-running worker
+      and the pool's release path — cannot interleave bytes on the
+      socket; the second blocks until the first releases.
+   2. The async pool's ``connection()`` context manager evicts the
+      connection (``broken=True``) on ``CancelledError`` /
+      ``TimeoutError``, so a partially-cancelled query never returns
+      to the idle list. ``pool.release()`` waits up to 5 seconds for
+      the wire lock; if the worker is still busy past that, the
+      connection is evicted instead of recycled.
+
+   Net effect: ``asyncio.wait_for`` around ``aio`` DB calls is safe.
+   The connection is either successfully released (worker finished
+   in time) or evicted (worker exceeded the timeout); never
+   returned to the pool in a poisoned state.
 
 Usage::

@@ -259,11 +263,27 @@ class AsyncConnectionPool:
     async def connection(
         self, timeout: float | None = None
     ) -> AsyncIterator[AsyncConnection]:
-        """Async context-manager wrapper around acquire/release."""
+        """Async context-manager wrapper around acquire/release.
+
+        Phase 27: cancellation and timeouts route through ``broken=True``.
+        ``asyncio.to_thread`` does not interrupt the underlying worker
+        when the awaitable is cancelled — the worker keeps running on
+        the socket. If the connection were returned to the pool while
+        the worker is still mid-write, the next acquirer would inherit
+        a desynchronized wire. Evicting on cancellation prevents that
+        (combined with the wire lock the pool's ``release()`` acquires
+        with a timeout — see ``pool.release``).
+        """
         conn = await self.acquire(timeout=timeout)
         broken = False
         try:
             yield conn
+        except (asyncio.CancelledError, asyncio.TimeoutError):
+            # Cancellation or wait_for timeout. The to_thread worker
+            # may still be running; we cannot trust the connection's
+            # wire state. Evict.
+            broken = True
+            raise
         except Exception as e:
             # Mirror sync pool's eviction policy: connection-related
             # errors evict, application errors retain.

--- a/src/informix_db/connections.py
+++ b/src/informix_db/connections.py
@@ -128,6 +128,17 @@ class Connection:
         self._autocommit = autocommit
         self._closed = False
         self._lock = threading.Lock()
+        # Phase 27: per-connection wire lock. Held for the duration of
+        # every send-PDU + drain-response round-trip. Two threads on
+        # one connection (or async cancellation leaving a worker still
+        # mid-operation) can no longer interleave bytes on the socket —
+        # the second thread blocks until the first releases.
+        #
+        # RLock (not Lock) because the Pool's release() path acquires
+        # this lock with a timeout, then calls ``conn.rollback()`` —
+        # which itself acquires the lock. Same thread, two acquires.
+        # Reentrance must be cheap and correct.
+        self._wire_lock = threading.RLock()
         # Logged-DB transaction state: True iff there's an open server-side
         # transaction (SQ_BEGIN sent, not yet committed/rolled-back). The
         # cursor uses this to decide whether to send an implicit SQ_BEGIN
@@ -235,9 +246,12 @@ class Connection:
             # only set the flag after a successful BEGIN, so this branch
             # also covers "no DML happened since last commit/rollback".
             return
-        self._sock.write_all(struct.pack("!hh", MessageType.SQ_CMMTWORK, MessageType.SQ_EOT))
-        self._drain_to_eot()
-        self._in_transaction = False
+        with self._wire_lock:
+            self._sock.write_all(
+                struct.pack("!hh", MessageType.SQ_CMMTWORK, MessageType.SQ_EOT)
+            )
+            self._drain_to_eot()
+            self._in_transaction = False
 
     def rollback(self) -> None:
         """Roll back the current transaction (SQ_RBWORK).
@@ -255,11 +269,12 @@
         # The savepoint short is REQUIRED — sending SQ_RBWORK alone hangs
         # the server (it's waiting for the next 2 bytes). SQ_CMMTWORK,
         # by contrast, takes no payload — confirmed in IfxSqli.sendCommit.
-        self._sock.write_all(
-            struct.pack("!hhh", MessageType.SQ_RBWORK, 0, MessageType.SQ_EOT)
-        )
-        self._drain_to_eot()
-        self._in_transaction = False
+        with self._wire_lock:
+            self._sock.write_all(
+                struct.pack("!hhh", MessageType.SQ_RBWORK, 0, MessageType.SQ_EOT)
+            )
+            self._drain_to_eot()
+            self._in_transaction = False
 
     def fast_path_call(
         self, signature: str, *params: object
@@ -308,61 +323,62 @@
         if self._closed:
             raise InterfaceError("connection is closed")
 
-        cached = self._fp_handle_cache.get(signature)
-        if cached is None:
-            # Resolve via SQ_GETROUTINE
-            self._sock.write_all(build_get_routine_pdu(signature))
-            reader = _SocketReader(self._sock)
-            tag = reader.read_short()
-            if tag == MessageType.SQ_ERR:
-                self._raise_sq_err()
-            if tag != MessageType.SQ_GETROUTINE:
-                raise OperationalError(
-                    f"fast-path GETROUTINE: unexpected tag 0x{tag:04x}"
-                )
-            db_name, handle = parse_get_routine_response(reader)
-            tail = reader.read_short()
-            if tail != MessageType.SQ_EOT:
-                raise OperationalError(
-                    f"GETROUTINE response: missing SQ_EOT (got 0x{tail:04x})"
-                )
-            self._fp_handle_cache[signature] = (db_name, handle)
-        else:
-            db_name, handle = cached
-
-        # Now execute via SQ_EXFPROUTINE
-        self._sock.write_all(
-            build_exfp_routine_pdu(
-                db_name, handle, params, encoding=self._encoding
-            )
-        )
-        reader = _SocketReader(self._sock)
-        tag = reader.read_short()
-        if tag == MessageType.SQ_ERR:
-            self._raise_sq_err()
-        if tag != MessageType.SQ_FPROUTINE:
-            raise OperationalError(
-                f"fast-path EXFPROUTINE: unexpected response tag 0x{tag:04x}"
-            )
-        results = parse_fp_routine_response(reader)
-        # Drain any trailing tags until SQ_EOT (server may send
-        # SQ_DONE/SQ_COST/SQ_XACTSTAT before SQ_EOT, same as SQL paths)
-        while True:
-            tag = reader.read_short()
-            if tag == MessageType.SQ_EOT:
-                break
-            elif tag == MessageType.SQ_DONE:
-                reader.read_exact(2 + 4 + 4 + 4)  # warn + rows + rowid + serial
-            elif tag == 55:  # SQ_COST
-                reader.read_int()
-                reader.read_int()
-            elif tag == MessageType.SQ_XACTSTAT:
-                reader.read_exact(2 + 2 + 2)
-            else:
-                raise OperationalError(
-                    f"fast-path response: unexpected tag 0x{tag:04x}"
-                )
-        return results
+        with self._wire_lock:
+            cached = self._fp_handle_cache.get(signature)
+            if cached is None:
+                # Resolve via SQ_GETROUTINE
+                self._sock.write_all(build_get_routine_pdu(signature))
+                reader = _SocketReader(self._sock)
+                tag = reader.read_short()
+                if tag == MessageType.SQ_ERR:
+                    self._raise_sq_err()
+                if tag != MessageType.SQ_GETROUTINE:
+                    raise OperationalError(
+                        f"fast-path GETROUTINE: unexpected tag 0x{tag:04x}"
+                    )
+                db_name, handle = parse_get_routine_response(reader)
+                tail = reader.read_short()
+                if tail != MessageType.SQ_EOT:
+                    raise OperationalError(
+                        f"GETROUTINE response: missing SQ_EOT (got 0x{tail:04x})"
+                    )
+                self._fp_handle_cache[signature] = (db_name, handle)
+            else:
+                db_name, handle = cached
+
+            # Now execute via SQ_EXFPROUTINE
+            self._sock.write_all(
+                build_exfp_routine_pdu(
+                    db_name, handle, params, encoding=self._encoding
+                )
+            )
+            reader = _SocketReader(self._sock)
+            tag = reader.read_short()
+            if tag == MessageType.SQ_ERR:
+                self._raise_sq_err()
+            if tag != MessageType.SQ_FPROUTINE:
+                raise OperationalError(
+                    f"fast-path EXFPROUTINE: unexpected response tag 0x{tag:04x}"
+                )
+            results = parse_fp_routine_response(reader)
+            # Drain any trailing tags until SQ_EOT (server may send
+            # SQ_DONE/SQ_COST/SQ_XACTSTAT before SQ_EOT, same as SQL paths)
+            while True:
+                tag = reader.read_short()
+                if tag == MessageType.SQ_EOT:
+                    break
+                elif tag == MessageType.SQ_DONE:
+                    reader.read_exact(2 + 4 + 4 + 4)  # warn + rows + rowid + serial
+                elif tag == 55:  # SQ_COST
+                    reader.read_int()
+                    reader.read_int()
+                elif tag == MessageType.SQ_XACTSTAT:
+                    reader.read_exact(2 + 2 + 2)
+                else:
+                    raise OperationalError(
+                        f"fast-path response: unexpected tag 0x{tag:04x}"
+                    )
+            return results
 
     def _ensure_transaction(self) -> None:
         """Open a server-side transaction if one isn't already open.
@@ -375,7 +391,22 @@ class Connection:
 
         Idempotent: subsequent calls are no-ops while the transaction
         is open or while we've cached "this DB doesn't support BEGIN".
+
+        **Precondition (Phase 27):** caller MUST hold ``self._wire_lock``.
+        Every actual call site is inside a cursor method that has
+        already acquired the lock; this method does its own wire I/O
+        but doesn't re-acquire to avoid redundant work.
         """
+        # Defensive guard: fail loudly in development if a future caller
+        # forgets to lock. ``RLock._is_owned()`` is a CPython-private
+        # method but stable across versions; cheap (~50ns) and only
+        # checks the current thread. If it ever changes shape, drop
+        # this assert — the doc still names the precondition.
+        assert self._wire_lock._is_owned(), (
+            "_ensure_transaction called without _wire_lock held; "
+            "the cursor method that called it must wrap its body in "
+            "`with self._conn._wire_lock:`"
+        )
         if self._autocommit or self._in_transaction or self._closed:
             return
         if self._supports_begin_work is False:
@@ -393,13 +424,29 @@
             raise
 
     def close(self) -> None:
-        """Send SQ_EXIT and tear down the socket. Idempotent."""
+        """Send SQ_EXIT and tear down the socket. Idempotent.
+
+        Phase 27: tries to acquire the wire lock with a short timeout
+        before sending SQ_EXIT. If another thread is mid-operation,
+        ``SQ_EXIT`` would interleave bytes with their PDU; better to
+        skip the polite-exit and just close the socket. The in-flight
+        thread observes EOF on its next read.
+        """
         with self._lock:
             if self._closed:
                 return
             self._closed = True
             try:
-                self._send_exit()
+                # Short timeout — close() shouldn't block long. If the
+                # wire is busy, skip the polite SQ_EXIT and force-close
+                # the socket; the in-flight thread will get an OSError
+                # on its next read, which surfaces cleanly to the caller.
+                got_lock = self._wire_lock.acquire(timeout=0.5)
+                if got_lock:
+                    try:
+                        self._send_exit()
+                    finally:
+                        self._wire_lock.release()
             finally:
                 self._sock.close()

--- a/src/informix_db/cursors.py
+++ b/src/informix_db/cursors.py
@@ -166,6 +166,12 @@ class Cursor:
         ``parameters`` is a sequence (tuple/list) matching the ``?`` or
         ``:N`` placeholders in ``operation``. Phase 4 supports int, float,
         str, bool, None.
+
+        Phase 27: serializes the entire wire round-trip under
+        ``self._conn._wire_lock``. Two threads on one connection (or
+        async cancellation leaving a worker still mid-execute) cannot
+        interleave PDU bytes — the second blocks until the first
+        completes (or is evicted via the pool's release-timeout).
         """
         self._check_open()
@@ -179,6 +185,11 @@
         # If using paramstyle="numeric", rewrite :1 / :2 → ?
         sql = _rewrite_numeric_to_qmark(operation) if params else operation
 
+        with self._conn._wire_lock:
+            self._execute_under_wire_lock(sql, params)
+
+    def _execute_under_wire_lock(self, sql: str, params: tuple) -> None:
+        """Wire-bound body of ``execute``. Caller MUST hold ``_wire_lock``."""
         # Reset previous-execute state.
         self._description = None
         self._columns = []
@@ -718,35 +729,41 @@
 
         sql = _rewrite_numeric_to_qmark(operation)
 
-        # Reset per-execute state.
-        self._description = None
-        self._columns = []
-        self._rowcount = -1
-        self._rows = []
-        self._row_index = -1
-        self._statement_already_done = False
-
-        # Logged-DB transaction guard — same as execute(). Idempotent
-        # within an open transaction.
-        self._conn._ensure_transaction()
-
-        # PREPARE once.
-        self._conn._send_pdu(self._build_prepare_pdu(sql, num_qmarks=first_len))
-        self._read_describe_response()
-
-        # BIND+EXECUTE per parameter set.
-        total_rowcount = 0
-        for params in seq:
-            self._rowcount = -1
-            self._conn._send_pdu(self._build_bind_execute_pdu(tuple(params)))
-            self._drain_to_eot()
-            if self._rowcount > 0:
-                total_rowcount += self._rowcount
-
-        # RELEASE once.
-        self._conn._send_pdu(self._build_release_pdu())
-        self._drain_to_eot()
-        self._rowcount = total_rowcount
+        # Phase 27: full PREPARE+(BIND+EXECUTE)*N+RELEASE round-trip
+        # under the wire lock — N rows commit atomically with respect
+        # to other threads on the connection.
+        with self._conn._wire_lock:
+            # Reset per-execute state.
+            self._description = None
+            self._columns = []
+            self._rowcount = -1
+            self._rows = []
+            self._row_index = -1
+            self._statement_already_done = False
+
+            # Logged-DB transaction guard — same as execute(). Idempotent
+            # within an open transaction.
+            self._conn._ensure_transaction()
+
+            # PREPARE once.
+            self._conn._send_pdu(
+                self._build_prepare_pdu(sql, num_qmarks=first_len)
+            )
+            self._read_describe_response()
+
+            # BIND+EXECUTE per parameter set.
+            total_rowcount = 0
+            for params in seq:
+                self._rowcount = -1
+                self._conn._send_pdu(self._build_bind_execute_pdu(tuple(params)))
+                self._drain_to_eot()
+                if self._rowcount > 0:
+                    total_rowcount += self._rowcount
+
+            # RELEASE once.
+            self._conn._send_pdu(self._build_release_pdu())
+            self._drain_to_eot()
+            self._rowcount = total_rowcount
 
     def fetchone(self) -> tuple | None:
         """Return the next row, or None at EOF.
@@ -957,10 +974,15 @@
             raise ProgrammingError(
                 "scrollable cursor is not open; call execute() first"
             )
-        prior_count = len(self._rows)
-        self._last_tupid = None
-        self._conn._send_pdu(self._build_sfetch_pdu(scrolltype, target))
-        self._read_fetch_response()
+        # Phase 27: hold the wire lock for the SFETCH round-trip.
+        # Cheap (RLock, single op), and lets every scrollable-cursor
+        # caller (fetchone/fetchmany/fetchall/scroll/fetch_*) get the
+        # serialization for free.
+        with self._conn._wire_lock:
+            prior_count = len(self._rows)
+            self._last_tupid = None
+            self._conn._send_pdu(self._build_sfetch_pdu(scrolltype, target))
+            self._read_fetch_response()
         new_count = len(self._rows)
         if new_count == prior_count:
             # No tuple arrived — past-end or empty result set.
@@ -988,13 +1010,18 @@ class Cursor:
         if self._closed:
             return
         if self._scrollable and self._server_cursor_open:
+            # Phase 27: hold the wire lock during CLOSE+RELEASE so we
+            # don't interleave with another thread's pending op on the
+            # connection. Best-effort: any wire failure here is
+            # swallowed (the caller is closing; we don't want to mask
+            # whatever caused them to close).
             try:
-                self._conn._send_pdu(self._build_close_pdu())
-                self._drain_to_eot()
-                self._conn._send_pdu(self._build_release_pdu())
-                self._drain_to_eot()
+                with self._conn._wire_lock:
+                    self._conn._send_pdu(self._build_close_pdu())
+                    self._drain_to_eot()
+                    self._conn._send_pdu(self._build_release_pdu())
+                    self._drain_to_eot()
             except Exception:
-                # Best-effort close — don't mask other errors
                 pass
         self._server_cursor_open = False
         self._closed = True
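The cursor-side hunks above rely on `Connection._wire_lock` being a `threading.RLock`: it is re-entrant, so a helper invoked under the lock (like `_ensure_transaction`) can assert that the calling thread already holds it via the private but long-stable `_is_owned()` method. A minimal sketch of that assert-ownership pattern, using an illustrative stand-in class rather than the driver's real `Connection`:

```python
import threading


class Conn:
    """Toy stand-in for the driver's Connection (names are illustrative)."""

    def __init__(self) -> None:
        self._wire_lock = threading.RLock()

    def _ensure_transaction(self) -> None:
        # Precondition: the caller must already hold the wire lock.
        # _is_owned() is True only for the owning thread, so a future
        # call site that forgets the lock fails loudly right here.
        assert self._wire_lock._is_owned(), "wire lock not held"

    def commit(self) -> str:
        with self._wire_lock:              # RLock: re-acquiring is fine
            self._ensure_transaction()     # passes — same thread owns it
            return "committed"


conn = Conn()
print(conn.commit())  # committed

# Calling the guarded helper without the lock trips the assertion.
try:
    conn._ensure_transaction()
except AssertionError as exc:
    print("caught:", exc)
```

The same shape explains why a plain `threading.Lock` would not work here: `commit()` takes the lock and then calls a helper that may need it again, which deadlocks unless the lock is re-entrant.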
@@ -57,6 +57,15 @@ from .exceptions import (
 # wire up ``logging.getLogger("informix_db.pool")`` to their handler.
 _log = logging.getLogger(__name__)
 
+# Phase 27: how long ``release()`` will wait to acquire the connection's
+# wire lock before evicting. The wire lock is only contended when
+# another thread is mid-operation on the same connection — typically
+# because an awaitable was cancelled but its underlying ``to_thread``
+# worker is still running. 5 seconds is generous for any normal query
+# to finish and short enough that a hung worker doesn't block the pool
+# indefinitely.
+_RELEASE_WIRE_LOCK_TIMEOUT = 5.0
+
 
 class PoolClosedError(InterfaceError):
     """Pool was closed before/during acquire."""
@@ -190,18 +199,14 @@ class ConnectionPool:
         are logged at WARNING level via ``logging.getLogger(
         "informix_db.pool")``.
 
-        **Concurrency caveat — async cancellation**: when an awaitable
-        wrapping a query is cancelled (``asyncio.wait_for`` timeout,
-        explicit task.cancel(), etc.), the underlying ``to_thread``
-        worker that's executing the query is NOT interrupted. It
-        keeps running while the async pool's release runs concurrently
-        — and ``release()`` now does its own wire I/O for the rollback.
-        Two threads writing to one socket will interleave bytes and
-        desync the wire. Until Phase 27 lands a per-connection wire
-        lock, **don't put ``asyncio.wait_for`` around `informix_db.aio`
-        DB calls in production**. Use ``connect_timeout`` /
-        ``read_timeout`` on the connection itself instead — those run
-        at the socket level and don't have the to_thread race.
+        **Concurrency (Phase 27)**: the rollback acquires the
+        connection's ``_wire_lock`` with a ~5s timeout before sending.
+        If another thread is mid-operation on the connection (e.g.,
+        a still-running worker after ``asyncio.wait_for`` cancelled
+        the awaitable), the release path either waits for them to
+        finish (if quick) or evicts the connection (if they exceed
+        the timeout). Either way, no two threads ever interleave
+        bytes on the socket.
         """
         if broken or self._closed or conn.closed:
             with self._lock:
@@ -215,7 +220,28 @@ class ConnectionPool:
         # connection isn't yet in ``_idle``, and ``_total`` already
         # counts it as "owned by us", so no other thread can grab it
         # while we're working.
+        #
+        # Phase 27: acquire the connection's wire lock with a timeout
+        # before rolling back. If another thread holds it (typically a
+        # cancelled-async worker that's still running on the socket),
+        # we evict instead of risking interleaved I/O. The connection
+        # is unsafe until that worker finishes; the next caller would
+        # rather get a fresh connection than a poisoned one.
         if conn._in_transaction:
+            if not conn._wire_lock.acquire(
+                timeout=_RELEASE_WIRE_LOCK_TIMEOUT
+            ):
+                _log.warning(
+                    "wire lock held %ss on release; evicting connection "
+                    "(another thread is still mid-operation — likely a "
+                    "cancelled async query whose worker hasn't finished)",
+                    _RELEASE_WIRE_LOCK_TIMEOUT,
+                )
+                with self._lock:
+                    self._total -= 1
+                    self._safe_close(conn)
+                    self._lock.notify()
+                return
             try:
                 conn.rollback()
             except Exception as exc:
@@ -232,6 +258,8 @@ class ConnectionPool:
                 self._safe_close(conn)
                 self._lock.notify()
                 return
+            finally:
+                conn._wire_lock.release()
         with self._lock:
             if self._closed:
                 # Pool was closed while we were rolling back. Don't
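The release-path hunks above boil down to one pattern: try to take the wire lock with a bounded wait, roll back if you get it, evict if you don't. That core flow can be exercised in isolation with nothing but the stdlib; the `release`/`rollback`/`evict` names below are illustrative stand-ins, not the pool's API, and the 0.2s timeout is scaled down from the driver's 5.0s for a quick demo:

```python
import threading
import time

WIRE_LOCK_TIMEOUT = 0.2  # illustrative; the driver uses 5.0 seconds


def release(wire_lock: threading.RLock, rollback, evict) -> str:
    """Roll back under the wire lock, or evict if a peer holds it too long."""
    if not wire_lock.acquire(timeout=WIRE_LOCK_TIMEOUT):
        evict()        # lock contended: the connection may be mid-PDU
        return "evicted"
    try:
        rollback()     # safe: we own the wire for the whole round-trip
        return "recycled"
    finally:
        wire_lock.release()


lock = threading.RLock()

# Uncontended: the rollback proceeds and the connection is recycled.
print(release(lock, rollback=lambda: None, evict=lambda: None))  # recycled


# Contended: a worker thread holds the lock past the timeout, so the
# release path evicts rather than interleaving bytes on the socket.
def hog() -> None:
    with lock:
        time.sleep(1.0)


t = threading.Thread(target=hog)
t.start()
time.sleep(0.05)  # let the worker grab the lock first
print(release(lock, rollback=lambda: None, evict=lambda: None))  # evicted
t.join()
```

Note that the evict branch never calls `wire_lock.release()`: the lock was never acquired there, which is why the real hunk pairs the `acquire(timeout=...)` check with a `try/finally` only around the rollback.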
@@ -7,12 +7,15 @@ safety, and clean shutdown.
 
 from __future__ import annotations
 
+import asyncio
+import contextlib
 import threading
 import time
 
 import pytest
 
 import informix_db
+from informix_db import aio
 from tests.conftest import ConnParams
 
 pytestmark = pytest.mark.integration
@@ -326,7 +329,7 @@ def test_uncommitted_writes_invisible_to_next_acquirer(
     # Setup: fresh table, autocommit so the CREATE lands
     with pool.connection() as setup:
         cur = setup.cursor()
-        with __import__("contextlib").suppress(Exception):
+        with contextlib.suppress(Exception):
             cur.execute(f"DROP TABLE {table}")
         cur.execute(f"CREATE TABLE {table} (id INT, label VARCHAR(64))")
         setup.commit()
@@ -368,7 +371,7 @@ def test_uncommitted_writes_invisible_to_next_acquirer(
         # Cleanup
         with pool.connection() as cleanup:
             cur = cleanup.cursor()
-            with __import__("contextlib").suppress(Exception):
+            with contextlib.suppress(Exception):
                 cur.execute(f"DROP TABLE {table}")
             cleanup.commit()
     finally:
@@ -400,7 +403,7 @@ def test_committed_writes_survive_pool_checkout(
     try:
         with pool.connection() as setup:
             cur = setup.cursor()
-            with __import__("contextlib").suppress(Exception):
+            with contextlib.suppress(Exception):
                 cur.execute(f"DROP TABLE {table}")
             cur.execute(f"CREATE TABLE {table} (id INT)")
             setup.commit()
@@ -423,8 +426,142 @@ def test_committed_writes_survive_pool_checkout(
 
         with pool.connection() as cleanup:
             cur = cleanup.cursor()
-            with __import__("contextlib").suppress(Exception):
+            with contextlib.suppress(Exception):
                 cur.execute(f"DROP TABLE {table}")
             cleanup.commit()
     finally:
         pool.close()
+
+
+# -------- Phase 27: wire-lock thread-safety + async cancellation eviction --------
+
+
+def test_concurrent_threads_on_one_connection_dont_interleave_pdus(
+    conn_params: ConnParams,
+) -> None:
+    """Phase 27 wire-lock regression test.
+
+    Per PEP 249 Threadsafety=1, threads aren't supposed to share
+    connections — but the async layer effectively does this when a
+    cancelled task's worker keeps running. We verify the wire lock
+    serializes correctly: two threads doing concurrent SELECTs on
+    one Connection should produce correct results, not garbled wire
+    state.
+
+    Without the wire lock, the two threads' PDU bytes interleave on
+    the socket and at least one query produces wrong results, raises
+    ``ProtocolError``, or hangs.
+    """
+    conn = informix_db.connect(
+        host=conn_params.host,
+        port=conn_params.port,
+        user=conn_params.user,
+        password=conn_params.password,
+        database=conn_params.database,
+        server=conn_params.server,
+        autocommit=True,
+    )
+    try:
+        results: list[int] = []
+        errors: list[Exception] = []
+        results_lock = threading.Lock()
+
+        def worker(query_id: int) -> None:
+            try:
+                for _ in range(20):
+                    cur = conn.cursor()
+                    cur.execute(
+                        "SELECT FIRST 1 tabid FROM systables WHERE tabid = ?",
+                        (query_id,),
+                    )
+                    (val,) = cur.fetchone()
+                    cur.close()
+                    with results_lock:
+                        results.append(val)
+            except Exception as exc:
+                with results_lock:
+                    errors.append(exc)
+
+        # Two threads, each doing 20 queries with distinct expected results
+        t1 = threading.Thread(target=worker, args=(1,))
+        t2 = threading.Thread(target=worker, args=(2,))
+        t1.start()
+        t2.start()
+        t1.join(timeout=30.0)
+        t2.join(timeout=30.0)
+        assert not t1.is_alive(), "thread 1 hung — wire lock failed"
+        assert not t2.is_alive(), "thread 2 hung — wire lock failed"
+        assert errors == [], (
+            f"Threads errored out — likely PDU interleaving: {errors!r}"
+        )
+        # Each worker did 20 queries, so 40 results total. Each result
+        # should be the query_id its thread used.
+        assert results.count(1) == 20
+        assert results.count(2) == 20
+    finally:
+        conn.close()
+
+
+async def test_async_wait_for_cancellation_evicts_connection(
+    conn_params: ConnParams,
+) -> None:
+    """Phase 27 async-cancellation regression test.
+
+    Before Phase 27, a cancelled awaitable left the connection in the
+    pool's idle list with a possibly-still-running worker writing to
+    its socket. Now: cancellation routes to ``broken=True``, and the
+    pool evicts the connection rather than recycling it.
+    """
+    pool = await aio.create_pool(
+        host=conn_params.host,
+        port=conn_params.port,
+        user=conn_params.user,
+        password=conn_params.password,
+        database=conn_params.database,
+        server=conn_params.server,
+        min_size=0,
+        max_size=2,
+    )
+    try:
+        # Force-grow to 1 connection so we have something to evict
+        async with pool.connection() as warmup_conn:
+            cur = await warmup_conn.cursor()
+            await cur.execute("SELECT 1 FROM systables WHERE tabid = 1")
+            await cur.fetchone()
+            await cur.close()
+        size_before = pool.size
+        assert size_before == 1, f"expected 1 connection, got {size_before}"
+
+        # Trigger cancellation mid-query.
+        async def slow_query() -> None:
+            async with pool.connection() as conn:
+                cur = await conn.cursor()
+                # A query that will run for >100ms on the dev image:
+                # systables joined to itself a few times.
+                await cur.execute(
+                    "SELECT COUNT(*) FROM systables a, systables b, "
+                    "systables c WHERE a.tabid > 0"
+                )
+                await cur.fetchone()
+                await cur.close()
+
+        # Use pytest.raises (NOT contextlib.suppress) so the test fails
+        # if the timeout never fires — otherwise the test could pass on
+        # a fast CI runner where the query completes within 1ms,
+        # silently skipping the cancellation path it claims to test.
+        with pytest.raises(asyncio.TimeoutError):
+            await asyncio.wait_for(slow_query(), timeout=0.001)
+
+        # After cancellation, the connection must NOT have rejoined the
+        # pool's idle list. It should have been evicted (broken=True).
+        # Allow a moment for the release to complete.
+        await asyncio.sleep(0.5)
+        assert pool.size <= size_before, (
+            f"Connection wasn't evicted on cancellation; pool.size={pool.size} "
+            f"(expected ≤ {size_before}). The cancelled connection rejoined "
+            "the idle list — Phase 27 fix did not apply."
+        )
+    finally:
+        await pool.close()
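The async test above exists because cancelling an awaitable does not interrupt a thread that is already running: `asyncio.wait_for` cancels the coroutine, but a `to_thread` worker keeps executing to completion on the executor. That behavior can be demonstrated with the stdlib alone, no driver code involved:

```python
import asyncio
import time

done = {"worker_finished": False}


def blocking_query() -> None:
    # Stands in for a wire round-trip; task cancellation cannot stop it.
    time.sleep(0.3)
    done["worker_finished"] = True


async def main() -> None:
    try:
        # Cancellation is requested after 10ms, but the thread running
        # blocking_query() is not interruptible and keeps going.
        await asyncio.wait_for(asyncio.to_thread(blocking_query), timeout=0.01)
    except (asyncio.TimeoutError, TimeoutError):
        pass
    # Give the worker thread time to run to completion.
    await asyncio.sleep(0.5)


asyncio.run(main())
print(done["worker_finished"])  # True — the worker outlived the cancellation
```

This is exactly the window the wire lock closes: the still-running worker owns the socket until it finishes, so the pool's release path must either wait for the lock or evict the connection instead of reusing it.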