Ryan Malloy 01757415a5 Phase 32: Benchmark improvements (Tier 1 + Tier 2)
Tier 1 — make existing benchmarks reliable:
* Bumped slow-bench rounds: cold_connect_disconnect 5->15, executemany
  series 3->10. Single-round outliers no longer dominate.
* Switched bench reporting to median + IQR. Mean was being moved by
  individual GC pauses / scheduler hiccups (IfxPy executemany IQR
  was 8.2 ms on a 28 ms median - 29% spread - mean was unreliable).
* Updated ifxpy_bench.py to also report median + IQR alongside mean
  for cross-comparable numbers.
* Makefile bench targets now show median, iqr, mean, stddev, ops, rounds.

The robust statistics flipped the comparison story:

  Old (mean, 3 rounds):   us 9% faster  / IfxPy 30% faster on 2 of 5
  New (median, 10+ rds):  us faster on 4 of 5 benchmarks

| Benchmark | IfxPy | informix-db | Δ |
|---|---|---|---|
| select_one_row             | 170us | 119us | us 30% faster |
| select_systables_first_10  | 186us | 142us | us 24% faster |
| select_bench_table_all 1k  | 980us | 832us | us 15% faster |
| executemany 1k in txn      | 28.3ms | 31.3ms | us 10% slower |
| cold_connect_disconnect    | 12.0ms | 10.7ms | us 11% faster |

Tier 2 — add benchmarks for claims we make but don't verify:

tests/benchmarks/test_observability_perf.py:
* test_streaming_fetch_memory_profile — RSS sampling during a
  cursor iteration. Documents memory growth shape; regression
  wall at 100 MB / 1k rows. Currently flat (in-memory cursor
  doesn't grow detectably for 278 rows).
* test_select_1_latency_percentiles — 1000-query distribution
  with p50/p90/p95/p99/max. Result: p99/p50 = 1.42x (tight tail).
  p50=108us, p99=153us.
* test_concurrent_pool_throughput[2,4,8] — N worker threads
  through pool, measures aggregate QPS + per-thread fairness.
  Plateaus at ~6K QPS (server-bound); per-thread latency scales
  ~linearly with N (server serialization expected).

README.md (project root): updated Compared-to-IfxPy table with
the median-based numbers + IQR awareness note.
tests/benchmarks/compare/README.md: added "Statistical robustness"
section explaining why median over mean for fair comparison.

236 integration tests pass; ruff clean.
2026-05-05 12:01:11 -06:00

informix-db

Pure-Python driver for IBM Informix IDS, speaking the SQLI wire protocol over raw sockets. No IBM Client SDK. No JVM. No native libraries. PEP 249 compliant; sync + async APIs; built-in connection pool; TLS support.

To our knowledge this is the first pure-socket Informix driver in any language — every other Informix driver (IfxPy, the legacy informixdb, ODBC bridges, JPype/JDBC, Perl DBD::Informix) wraps either IBM's CSDK or the JDBC JAR.

```shell
pip install informix-db
```

Requires Python ≥ 3.10.

Status

Production ready. Every finding from a system-wide failure-mode audit (data correctness, wire safety, resource leaks, concurrency, async cancellation) has been addressed:

| Severity | Finding | Status |
|---|---|---|
| Critical | Pool returns connections with open transactions | Fixed (Phase 26) |
| Critical | Unsynchronized wire path → PDU interleaving | Fixed (Phase 27) — per-connection wire lock |
| High | Async cancellation leaks running workers onto recycled connections | Fixed (Phase 27) |
| High | `_raise_sq_err` bare-except masks wire desync | Fixed (Phase 28) |
| High | Cursor finalizers — server-side resources leak on mid-fetch raise | Fixed (Phase 28+29) |
| Medium | 5 hardening items | Fixed (Phase 28+30) |

0 critical, 0 high, 0 medium audit findings remain. Every architectural change went through a Margaret Hamilton-style review focused on silent-failure modes, recovery paths, and documented invariants. Each documented invariant is paired with either a runtime guard or a CI tripwire test.

Test coverage: 300+ tests across unit / integration / benchmark suites. Integration tests run against the official IBM Informix Developer Edition Docker image (15.0.1.0.3DE).

Quick start

```python
import informix_db

with informix_db.connect(
    host="db.example.com", port=9088,
    user="informix", password="...",
    database="mydb", server="informix",
) as conn:
    cur = conn.cursor()
    cur.execute("SELECT id, name FROM users WHERE id = ?", (42,))
    user_id, name = cur.fetchone()
```

Async (FastAPI / aiohttp / asyncio)

```python
import asyncio
from informix_db import aio

async def main():
    pool = await aio.create_pool(
        host="db.example.com", user="informix", password="...",
        database="mydb",
        min_size=1, max_size=10,
    )
    async with pool.connection() as conn:
        cur = await conn.cursor()
        await cur.execute("SELECT id, name FROM users WHERE id = ?", (42,))
        row = await cur.fetchone()
    await pool.close()

asyncio.run(main())
```

Connection pool (sync)

```python
import informix_db

pool = informix_db.create_pool(
    host="db.example.com", user="informix", password="...",
    database="mydb",
    min_size=1, max_size=10, acquire_timeout=5.0,
)

with pool.connection() as conn:
    cur = conn.cursor()
    cur.execute("...")

pool.close()
```

TLS

```python
import ssl

import informix_db

# Production: bring your own context
ctx = ssl.create_default_context(cafile="/path/to/ca.pem")
informix_db.connect(host="...", port=9089, ..., tls=ctx)

# Dev / self-signed: tls=True disables verification
informix_db.connect(host="127.0.0.1", port=9089, ..., tls=True)
```

Informix uses dedicated TLS-enabled listener ports (configured server-side in sqlhosts) rather than STARTTLS upgrade — point port at the TLS listener (often 9089) when tls is enabled.
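As a rough mental model (an assumption about the driver's internals, not confirmed by the source), `tls=True` behaves like a stdlib client context with verification turned off — the usual pattern for self-signed dev servers:

```python
import ssl

# Roughly what tls=True presumably does under the hood: an unverified
# client context. check_hostname must be disabled before verify_mode,
# or ssl raises ValueError.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
```

For production, prefer passing your own verified context as shown above.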

Type support

| SQL type | Python type |
|---|---|
| SMALLINT / INT / BIGINT / SERIAL | `int` |
| FLOAT / SMALLFLOAT | `float` |
| DECIMAL(p,s) / MONEY | `decimal.Decimal` |
| CHAR / VARCHAR / NCHAR / NVARCHAR / LVARCHAR | `str` |
| BOOLEAN | `bool` |
| DATE | `datetime.date` |
| DATETIME YEAR TO ... | `datetime.datetime` / `datetime.time` / `datetime.date` |
| INTERVAL DAY TO FRACTION | `datetime.timedelta` |
| INTERVAL YEAR TO MONTH | `informix_db.IntervalYM` |
| BYTE / TEXT (legacy in-row blobs) | `bytes` / `str` |
| BLOB / CLOB (smart-LOBs) | `informix_db.BlobLocator` / `informix_db.ClobLocator` (read via `cursor.read_blob_column`, write via `cursor.write_blob_column`) |
| ROW(...) | `informix_db.RowValue` |
| SET(...) / MULTISET(...) / LIST(...) | `informix_db.CollectionValue` |
| NULL | `None` |
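To illustrate the mapping, binding a DECIMAL and a DATETIME column means passing the corresponding Python types; a minimal sketch (the `prices` table and its columns are hypothetical):

```python
import datetime
from decimal import Decimal

def insert_price(cur, sku: int, amount: str, updated: datetime.datetime):
    # DECIMAL(p,s) binds from decimal.Decimal (never float, which would
    # introduce binary rounding); DATETIME binds from datetime.datetime.
    cur.execute(
        "INSERT INTO prices (sku, amount, updated) VALUES (?, ?, ?)",
        (sku, Decimal(amount), updated),
    )
```

On fetch, the same columns come back as `decimal.Decimal` and `datetime.datetime` per the table above.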

Smart-LOB (BLOB / CLOB) read & write

```python
# Read: returns the actual bytes
data = cur.read_blob_column(
    "SELECT data FROM photos WHERE id = ?", (42,)
)

# Write: BLOB_PLACEHOLDER token marks where the BLOB goes
cur.write_blob_column(
    "INSERT INTO photos VALUES (?, BLOB_PLACEHOLDER)",
    blob_data=jpeg_bytes,
    params=(42,),
)
```

Both work end-to-end in pure Python via the lotofile / filetoblob server functions intercepted at the SQ_FILE (98) wire-protocol level — no native code anywhere in the execution path. See docs/DECISION_LOG.md §1011 for the architecture pivot that made this possible.

Direct stored-procedure invocation (fast-path)

```python
# Cleanly close a smart-LOB descriptor opened via SQL
result = conn.fast_path_call(
    "function informix.ifx_lo_close(integer)", lofd
)
# result == [0] on success
```

The fast-path RPC (SQ_FPROUTINE / SQ_EXFPROUTINE) bypasses PREPARE → EXECUTE → FETCH for direct UDF/SPL calls. Routine handles are cached per-connection, so repeated calls to the same function take a single round-trip.

Server compatibility

Tested against IBM Informix Dynamic Server 15.0.1.0.3DE (the official icr.io/informix/informix-developer-database Docker image). The wire protocol is stable across modern Informix versions; the driver should work against 12.10+ unmodified.

For features that need server-side configuration (smart-LOBs, logged transactions), see docs/DECISION_LOG.md:

  • Phase 7 — logged-DB transactions
  • Phase 8 — BYTE/TEXT (needs blobspace)
  • Phase 10/11 — BLOB/CLOB (needs sbspace + SBSPACENAME config + level-0 archive)

Performance

Single-connection benchmarks against the dev container on loopback:

| Operation | Mean | Throughput |
|---|---|---|
| decode(int) per cell | 139 ns | 7.2M ops/sec |
| parse_tuple_payload per row (5 cols) | 1.4 µs | 715K rows/sec |
| SELECT 1 round-trip | ~140 µs | ~7K queries/sec |
| 1000-row SELECT | ~1.0 ms | ~990K rows/sec sustained |
| executemany(1000) in transaction | 32 ms | ~31,000 rows/sec |
| Pool acquire + query + release | 295 µs | ~3.4K queries/sec |
| Cold connect (login handshake) | 11 ms | ~90 connections/sec |

Performance gotcha: executemany(...) under autocommit=True is 53× slower than the same call inside a single transaction (the server flushes the transaction log per row). For bulk loads, use autocommit=False (the default) and call conn.commit() once at the end. See docs/USAGE.md for the full performance-tips section.
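A minimal sketch of the fast pattern (the `events` table is hypothetical; `executemany` and `commit` are the driver's PEP 249 surface):

```python
def bulk_load(conn, rows):
    # With autocommit off (the driver default), every INSERT rides one
    # transaction, so the server flushes the transaction log once for the
    # whole batch instead of once per row.
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO events (id, payload) VALUES (?, ?)",
        rows,
    )
    conn.commit()  # single log flush for the entire batch
    return len(rows)
```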

Compared to IfxPy (the C-bound PyPI driver)

Head-to-head benchmarks against IfxPy on identical workloads, same Informix server, matched conditions. Using median + IQR over 10+ rounds to resist outlier-round noise:

| Benchmark | IfxPy 3.0.5 (C-bound) | informix-db (pure Python) | Result |
|---|---|---|---|
| Single-row SELECT round-trip | 170 µs | 119 µs | informix-db 30% faster |
| ~10-row server-side query | 186 µs | 142 µs | informix-db 24% faster |
| 1000-row SELECT (full fetch) | 980 µs | 832 µs | informix-db 15% faster |
| executemany(1000) in transaction | 28.3 ms | 31.3 ms | informix-db 10% slower |
| Cold connect (login handshake) | 12.0 ms | 10.7 ms | informix-db 11% faster |

informix-db is faster on 4 of 5 benchmarks against the C-bound driver. The one loss is bulk-write workloads, where IfxPy's C-level per-row marshaling beats our Python BIND-PDU build by about 10%, which is within IfxPy's own measurement noise (its IQR on that benchmark is 29% of its own median).
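The robust statistics used here are stdlib-computable; a minimal sketch of the median + IQR reduction applied to each benchmark's per-round timings (the sample data below is invented):

```python
import statistics

def median_iqr(samples):
    # Quartiles via statistics.quantiles (default "exclusive" method);
    # IQR = Q3 - Q1 is the spread of the middle 50% of rounds, so a
    # single outlier round barely moves either number.
    q1, q2, q3 = statistics.quantiles(samples, n=4)
    return q2, q3 - q1
```

A single 10× outlier round drags the mean badly but leaves the (median, IQR) pair almost unchanged, which is why the comparison tables report it.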

Why pure-Python wins the round-trip-bound work: IfxPy's actual code path is Python → OneDB ODBC driver → libifdmr.so → wire. Ours is Python → wire. The abstraction-layer overhead IfxPy carries on every call costs more than the C-vs-Python codec gap saves. We hit the wire directly with one less hop.

Full methodology, IQR caveats, install gauntlet, and reproduction in tests/benchmarks/compare/README.md.

A note on IfxPy's install gauntlet: getting it to run on a modern system requires Python ≤ 3.11, setuptools <58, permissive CFLAGS, manual download of a 92 MB ODBC tarball, four LD_LIBRARY_PATH directories, and libcrypt.so.1 (deprecated 2018, missing on Arch / Fedora 35+ / RHEL 9). informix-db's install: pip install informix-db.

Standards & guarantees

  • PEP 249 (DB-API 2.0): connect(), Connection, Cursor, description, rowcount, exception hierarchy
  • paramstyle = "numeric" (Informix's native ESQL/C convention; ? and :1 both work)
  • threadsafety = 1: threads may share the module but not connections; the pool gives per-thread connection access. Phase 27 added a per-connection wire lock that makes accidental sharing safe (interleaved PDUs serialize correctly), but the PEP 249 advice still holds: give each thread its own connection.
  • CalVer versioning: YYYY.MM.DD releases. PEP 440 post-releases (.1, .2) for same-day fixes.
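The threadsafety = 1 pattern in practice is one pooled connection per worker thread; a minimal sketch (the query is trivial on purpose, and `pool.connection()` is the context manager shown in the pool example above):

```python
import threading

def run_workers(pool, n_threads: int):
    results = [None] * n_threads

    def worker(i):
        # Each thread checks out its own connection from the pool; the
        # Connection object itself is never shared across threads.
        with pool.connection() as conn:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            results[i] = cur.fetchone()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```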

Development

The full test + lint workflow is in the Makefile. Quick summary:

```shell
make test                              # 77 unit tests (no Docker)
make ifx-up && make test-integration   # 231 integration tests
make bench                             # benchmark suite
make lint                              # ruff
```

For the smart-LOB tests specifically, the dev container needs additional one-time setup (blobspace + sbspace + level-0 archive). See docs/DECISION_LOG.md §10 for the onspaces / onmode / ontape commands.

Documentation

  • docs/USAGE.md — practical recipes: connections, parameter binding, type mapping, transactions, performance tips, scrollable cursors, BLOBs, async, TLS, locale/Unicode, error handling, known limitations
  • tests/benchmarks/README.md — performance baselines, headline numbers, how to run regressions
  • CHANGELOG.md — phase-by-phase release notes

Project history & design rationale

This driver was built incrementally across 30 phases, each with a focused scope and decision log. The reasoning trail lives in docs/DECISION_LOG.md.

Notable architectural pivots documented in the decision log:

  • Phase 10/11 (smart-LOB read/write): used lotofile/filetoblob SQL functions + SQ_FILE protocol intercept instead of the heavier SQ_FPROUTINE + SQ_LODATA stack — ~3x smaller than originally projected
  • Phase 7 (logged-DB transactions): discovered Informix requires explicit SQ_BEGIN before each transaction in non-ANSI mode, plus SQ_RBWORK needs a savepoint short payload
  • Phase 16 (async): shipped thread-pool wrapping (~250 lines) instead of full I/O abstraction refactor (~2000 lines); functionally equivalent for typical FastAPI workloads

License

MIT.

Project site: https://informix-db.warehack.ing