cucx-docs's 007 empirically proved that route_translation_chain's
candidate filter `WHERE np.tkpatternusage IN (3, 5, 7)` excluded
Device DNs (tkpatternusage=2), which caused false-positive HIGH
findings on CTI-RP-to-CTI-RP failsafe chains — the typical CER
deployment shape.
The Bingham canary: 911-CTI-RP CFNA → 912 (DN of 912-CTI-RP) under
911CER-CSS. Direct numplan query against 911CER-PT returns 26 rows;
translation_chain reported `candidates_evaluated: 23`. The 3-row
gap is exactly the 3 Device DNs, excluded by the pre-fix filter
regardless of input number.
CUCM's runtime CFNA matcher includes Device DNs (otherwise no one
could dial 912 and reach the device). My tool's exclusion diverged
from production routing semantics. Result: every cluster using a
CTI-RP-to-CTI-RP failsafe pattern got at least one false-positive
HIGH finding on its first cti_failsafe_reachability run, wasting
operator investigation time on a phantom defect.
This commit broadens the candidate filter:
- WHERE np.tkpatternusage IN (3, 5, 7)
+ WHERE np.tkpatternusage IN (2, 3, 5, 7)    (added 2 = Device DN)
Side effect: route_translation_chain now also surfaces Device DNs as
matches when called directly, which matches production routing
semantics. Existing callers benefit automatically.
The _note in the response now names the candidate set explicitly so
future readers don't have to dig into the SQL to know what's
included.
Updated comment block above the WHERE clause documents:
- which tkpatternusage values are included and why
- the empirical observation that motivated including Device DNs
- cross-reference to cti-audit-prompts/007 for the smoking-gun
candidates_evaluated:23-vs-26 evidence
Tests: +2 in TestDeviceDnInTranslationChainCandidates:
- test_translation_chain_sql_includes_device_dn_usage: lock the
SQL down so a future contributor can't re-narrow the filter to
(3, 5, 7) and re-introduce the false-positive class
- test_cti_rp_to_cti_rp_failsafe_does_not_false_positive: the
Bingham canary scenario — 911-CTI-RP forwarding to a Device DN
in a reachable partition correctly produces zero findings
The dispatch fake's SQL match-string updated from "(3, 5, 7)" to
"(2, 3, 5, 7)" to keep the existing 31 cti tests green; net
mcaxl suite: 269 → 271 passing.
Live re-run pending — will ping the agent thread with post-patch
output once the MCP server reloads.
Re-run expectations (per cucx-docs's 007):
- 911-CTI-RP / 912 finding (CFNA + CFUR): GONE — Device DN matches
- 912-CTI-RP / 10911 finding: UNCHANGED — Route pattern still
unreachable (CER911-PT not in 911CER-CSS)
- 913-CTI-RP / 60003 finding: UNCHANGED — destination doesn't
exist anywhere
Findings: 6 → 4 (the 4 that actually matter).
Limitation surfaced by the live Bingham smoke-test (cti-audit-prompts/004):
the canonical 912-CTI-RP finding got the broken-forward flag correct,
but the suggested-fix message couldn't name CER911-PT (where pattern
'10.911' lives) because the exact-literal lookup
`WHERE np.dnorpattern = '10911'` doesn't match the dot-form `10.911`.
The CUCM separator-dot in patterns is not a dialed digit; it only
marks the access-code boundary. A destination string `10911`
should therefore match a configured pattern `10.911`, since both
represent the same dialed digits.
Two-stage match in _suggest_failsafe_fix:
1. Exact-literal: WHERE np.dnorpattern = '<dest>' (current behavior)
2. Dot-stripped: pull all patterns with `.` in them, filter
Python-side by `pattern.replace('.', '') == dest`
Stage 2 only runs when stage 1 returns no partitions, so the common
case (exact-literal hit) takes the fast path. Falls back to the
wildcard-investigation generic message only when neither stage finds
a match.
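A minimal sketch of the two-stage lookup described above. The function name, the two callables standing in for the SQL queries, and the sample rows are illustrative assumptions, not the actual _suggest_failsafe_fix code:

```python
from typing import Callable, Iterable, List, Tuple

Row = Tuple[str, str]  # (dnorpattern, partition name)


def find_pattern_partitions(
    dest: str,
    exact_lookup: Callable[[str], List[Row]],
    dotted_lookup: Callable[[], Iterable[Row]],
) -> Tuple[str, List[Row]]:
    """Return (match_kind, rows); match_kind is 'exact', 'dot_stripped' or 'none'."""
    # Stage 1: exact-literal match (the fast path).
    hits = exact_lookup(dest)
    if hits:
        return "exact", hits
    # Stage 2: only reached when stage 1 found nothing. Compare the
    # destination against each dotted pattern with its dots removed.
    stripped = [(p, pt) for p, pt in dotted_lookup() if p.replace(".", "") == dest]
    if stripped:
        return "dot_stripped", stripped
    # Neither stage matched: the caller falls back to the generic message.
    return "none", []


# Hypothetical in-memory stand-in for the numplan rows.
ROWS: List[Row] = [("10.911", "CER911-PT"), ("9.911", "PSTN-PT"), ("912", "911CER-PT")]


def lookup_exact(dest: str) -> List[Row]:
    return [row for row in ROWS if row[0] == dest]


def lookup_dotted() -> List[Row]:
    return [row for row in ROWS if "." in row[0]]


print(find_pattern_partitions("10911", lookup_exact, lookup_dotted))
# → ('dot_stripped', [('10.911', 'CER911-PT')])
```

Returning the match kind alongside the rows lets the caller pick between the two message wordings without re-deriving which stage hit.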
The fix message also distinguishes the two cases:
- Exact-literal hit → "Pattern '10911' lives in partition X..."
- Dot-stripped hit → "Pattern '10.911' (matches destination '10911')
lives in partition X..."
Naming both the pattern form and the destination keeps the operator
oriented when the dialed digits and the configured pattern look
different.
Tests: +5 in TestDotStrippedFixSuggestion exercising:
- dot-stripped match cites the dotted pattern form
- exact-literal takes precedence over dotted match
- multi-partition dotted match
- no-exact-no-dotted falls back to generic
- irrelevant dot-positions correctly excluded from match
One existing assertion updated from "no exact-literal pattern" to
"no exact-literal or dot-stripped pattern" (more accurate after the
patch).
Full mcaxl suite: 264 → 269 passing (+5 dot-stripped tests).
The 1 unrelated test_wildcard.py timing flake is pre-existing
(regex-backtracking timing assertion fails by 36ms under load).
Cross-references:
- Live smoke-test findings: agent-threads/cti-audit-prompts/004
- Original tool: agent-threads/cti-audit-prompts/002, commit d33cd7c
Closes the bug class cucx-docs flagged at Bingham — a CTI Route
Point's CFNA destination points at a number that is structurally
unreachable from the configured CFNA-CSS, so the failsafe forward
fires but finds no matching pattern and the call dies. Invisible
from any single-record inspection (CTI RP record looks fine,
destination pattern exists in some partition, CSS is fine — defect
lives in the relationship between CFNA-CSS and destination's
partition).
The motivating Bingham finding (life-safety severity):
912-CTI-RP (Secondary CER) CFNA + CFUR → "10911" via 911CER-CSS
Pattern "10.911" exists in CER911-PT
911CER-CSS does NOT contain CER911-PT
→ failsafe is structurally broken; both CER servers down would
produce fast-busy on 911 calls instead of routing through ELIN-10
to the PSAP
Implementation per axl/agent-threads/cti-audit-prompts/002:
- Tool, not prompt — output is structured + deterministic; same
shape as route_patterns_targeting (Q1 confirmed as proposed)
- Three-tier severity: HIGH for life-safety descriptions, MEDIUM
for non-life-safety, no LOW (Q2 refined from cucx-docs's
binary proposal — every broken forward is a real bug, just not
all are 911)
- Scope: CFNA + CFUR only for v1; CFB excluded by design (Q3
confirmed — CTI RPs rarely go busy)
- Lives in route_plan.py alongside route_patterns_targeting +
device_grep + translation_chain (Q5 — defer cti.py namespace
until adjacent prompts land)
- Named cti_failsafe_reachability not _audit (Q4 — drops the
_audit suffix per the established tool-vs-prompt naming split;
tools use direct-action names, prompts use _audit)
Life-safety token list (case-insensitive substring match against
name AND description):
("emergency", "911", "cer", "psap", "panic", "alert")
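A small sketch of the token check, assuming a helper named is_life_safety and a two-way HIGH/MEDIUM mapping as described above (both names are illustrative, not the shipped code):

```python
from typing import Optional

LIFE_SAFETY_TOKENS = ("emergency", "911", "cer", "psap", "panic", "alert")


def is_life_safety(name: Optional[str], description: Optional[str]) -> bool:
    """Case-insensitive substring match against name AND description."""
    haystack = f"{name or ''} {description or ''}".lower()
    return any(token in haystack for token in LIFE_SAFETY_TOKENS)


def severity(name: Optional[str], description: Optional[str]) -> str:
    # Every broken forward is a finding; life-safety ones are HIGH.
    return "HIGH" if is_life_safety(name, description) else "MEDIUM"


print(severity("912-CTI-RP", "Secondary CER"))    # → HIGH
print(severity("Paging-RP", "Overhead paging"))   # → MEDIUM
```

Because this is plain substring matching, a short token such as "cer" also matches inside unrelated words (for example "Officer"), so the check errs toward flagging HIGH.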
Suggested-fix message names the partition where the destination's
pattern lives and proposes either "add partition X to CSS Y" or
"change CSS to a CSS containing partition X." Falls back to a
generic "manual investigation needed" message when the destination
matches no exact-literal pattern in any partition (often means a
wildcard pattern is the actual target).
Tests: 26 in TestLifeSafetyDetection + TestCtiFailsafeReachability:
- 16 token-matching cases (10 positive, 4 negative, 2 sentinel)
- 10 tool-level cases including the canonical Bingham bug
reproduced verbatim (assertion compares the entire finding dict
to the expected output from cucx-docs's 001 message)
Full mcaxl suite: 238 → 264 passing (+26 from this work).
Adjacent prompts cucx-docs flagged as lower-priority follow-ups
(cti_route_point_audit, cti_port_pool_audit,
cti_application_user_audit) deferred but tracked.
Third cross-server prompt landing. Story D from
axl/agent-threads/cross-server-prompts/. Lives in mcaxl because the
dial plan is the first layer to investigate — "did the call make it
past CUCM at all?" determines the rest of the triage.
Composes:
- mcaxl (required): route lookup for the dialed DID, identify
egress trunk and any translation patterns
- mcsiphon (recommended): CDR record(s) for the failure window;
extract Q.850 cause codes, duration, connect status, release
direction, device names + IPs
- mcdewey (optional): Cisco's published troubleshooting guidance
for the specific cause code surfaced
Verdict layer names (literal vocabulary so downstream tooling can
pattern-match):
- cucm_dial_plan
- cucm_region_or_css
- cube_sip_trunk_negotiation
- far_end_sbc
- far_end_fax_server
- t38_negotiation_failure
- inconclusive
Cause-code-to-layer mapping table embedded in the prompt body covers
the high-frequency Q.850 codes (1, 16, 17, 19, 27, 31, 38, 47, 65,
79, 127), plus releaseDirection (originator/destination/network)
guidance for narrowing.
Common patterns surfaced explicitly:
- Setup failures with dateTimeConnect=0 (signaling-layer)
- Mid-call drops with non-zero duration (often T.38 renegotiation)
- Cause-code mismatch between origCause and destCause (CUBE
translation layer)
- Receive-and-abandon at the route point (CSS doesn't reach
destination partition)
The prompt's specific high-leverage detail is the "T.38 renegotiation
mid-call" pattern — calls that connect on G.711, then fail when fax
tones trigger T.38 switchover. That shape is invisible from any single
MCP and is exactly what cucx-docs's planned-but-unwritten
runbooks/rightfax-failed-fax-investigation.mdx anticipated.
Tests: registration sentinel updated to 14 prompts. 238 tests passing.
Closes the planned cucx-docs runbook page — that becomes a thin
operational shim around this prompt rather than a from-scratch
troubleshooting tree.
First cross-server prompt landing. Story C from
axl/agent-threads/cross-server-prompts/. Closes a concrete cucx-docs
open finding (orphaned paging DNs 1302 / 1304 at /systems/paging/
carried as "confirm with operator").
A DN is "definitively dead" when:
1. It exists in numplan but has no devicenumplanmap entry —
no device claims it as a line (mcaxl, required)
2. It is not referenced as a UCCX trigger entry point
(mcuccx, strongly recommended)
3. It has no recent CDR activity (mcsiphon, optional)
Each layer narrows the candidate set; the intersection is "safe to
retire — verified across CUCM dial plan + UCCX contact center + CDR
activity."
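The layer-by-layer narrowing can be pictured as plain set arithmetic. This is only a sketch: the function name and the four input sets are assumptions, and the handling of an unavailable sibling MCP (which lowers the tier in the real prompt) is not modeled here:

```python
from typing import Iterable, List


def definitively_dead(
    numplan_dns: Iterable[str],
    mapped_dns: Iterable[str],
    uccx_trigger_dns: Iterable[str],
    cdr_active_dns: Iterable[str],
) -> List[str]:
    # Layer 1 (mcaxl): in numplan but absent from devicenumplanmap.
    orphans = set(numplan_dns) - set(mapped_dns)
    # Layer 2 (mcuccx): drop anything used as a UCCX trigger entry point.
    not_triggers = orphans - set(uccx_trigger_dns)
    # Layer 3 (mcsiphon): drop anything with recent CDR activity.
    return sorted(not_triggers - set(cdr_active_dns))


print(
    definitively_dead(
        numplan_dns={"1302", "1304", "2000", "3000", "4000"},
        mapped_dns={"2000"},
        uccx_trigger_dns={"3000"},
        cdr_active_dns={"4000"},
    )
)
# → ['1302', '1304']
```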
The prompt embodies the architectural decisions confirmed in
cross-server-prompts/002:
- Per-primary-lens placement (Q1): lives in mcaxl because the
dial plan is its primary lens
- Graceful degradation with explicit gaps (Q2): the verdict
declares MCP availability up-front and adjusts confidence per
sibling; distinguishes connected-but-broken (include error) from
not-connected (note "unavailable")
- Normal naming, no cross_* prefix (Q3): just "dead_dn_finder"
Tier output: "definitively dead" / "likely dead, partial coverage" /
"structurally orphan, unconfirmed" / "active". The cucx-docs paging
DNs (1302, 1304) get explicit name-callouts in the verdict if they
appear, closing the loop back to /systems/paging/.
Tests: registration sentinel updated to 13 prompts. 238/238 passing.
Live-cluster smoke test pending — cucx-docs will run against the
Bingham cluster once they consume this thread direction.
Closes items 4-7 of cucx-docs's prompt-suggestions roadmap (see
axl/agent-threads/cucx-prompt-suggestions/ for the source thread).
did_block_overlap(block_pattern) — new prompt. LLM-orchestrated audit
that finds carveout patterns inside a DID block and surfaces silent
routing exceptions (e.g., 9498/9499 carved out of the 20878594XX block
to route to a different fax server). Composes the existing
route_patterns(filter=) tool with post-processing rather than
introducing a new tool — cucx-docs's #3 was originally pitched as a
tool, but the audit-narrative output is more naturally a prompt.
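As a rough illustration of the carveout check, the sketch below treats `X` as the only wildcard in the block pattern and only considers fully literal candidate patterns. Both restrictions, and the helper names, are assumptions made to keep the example short; the prompt itself works on the route_patterns(filter=) output:

```python
import re
from typing import Iterable, List


def block_to_regex(block_pattern: str) -> "re.Pattern[str]":
    # X matches one digit; every other character is taken literally.
    parts = ["[0-9]" if ch == "X" else re.escape(ch) for ch in block_pattern]
    return re.compile("".join(parts))


def find_carveouts(block_pattern: str, candidates: Iterable[str]) -> List[str]:
    """Literal patterns that fall inside the block and so override it."""
    rx = block_to_regex(block_pattern)
    return [
        p
        for p in candidates
        if p != block_pattern and p.isdigit() and rx.fullmatch(p)
    ]


patterns = ["20878594XX", "2087859498", "2087859499", "2087859500", "911"]
print(find_carveouts("20878594XX", patterns))
# → ['2087859498', '2087859499']
```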
partition_summary(partition_name=None) — new prompt. "What is this
partition for?" orientation report composing route_partitions,
route_patterns, route_calling_search_spaces, and the new
route_patterns_targeting. No new SQL — this is pure orchestration.
Useful when walking into an unfamiliar cluster and seeing a partition
name like RTC-MGW-Inbound and needing to figure out its role before
touching anything.
cucm_sql_help — deepened with five schema-landmark sections that cost
real audit sessions 3-5 query attempts each to discover. Topics:
numplan↔device M:N via devicenumplanmap; non-existence of
sipdestination as a table; routelist (singular) ≠ numplan→RL;
LEFT-JOIN convention for type-decoder enum tables; CDR/CMR timestamp
localization (cluster-TZ-conditional). Also updated the docs-search
reference from "cisco-docs MCP" to "mcdewey MCP" to match yesterday's
rename.
cucm-schema-cheatsheet docs — appended a "Schema gotchas (from real
audit sessions)" section mirroring the cucm_sql_help content. Two
locations because they serve different consumers: the prompt is read
by an LLM at query time, the docs page is read by a human reviewing
the cluster offline.
Tests: registration sentinel updated to include the two new prompts
(catches the case where a new module is added without a server.py
shim — the prompt would otherwise be invisible to the LLM). Full
suite still 238 passing.
Q3 verification (CDR timestamp empirical) still pending — cluster TLS
intermittent this session. The schema-landmark text is conditional
on cluster TZ per cucx-docs's caveat, so even an unverified ship is
defensible.
Two complementary additions from cucx-docs's prompt-suggestions handoff
(see axl/agent-threads/cucx-prompt-suggestions/ for the source thread).
device_grep(pattern, classes=None) — fuzzy device discovery by name OR
description, optionally filtered by device class name (typeclass.name via device.tkclass). Surfaces "wait, there
are TWO of these?" findings (parallel fax servers, duplicate CUBEs,
vestigial conference bridges) by grouping matches by class so the
structure of what matched is visible at a glance. SQL LIKE-style % wildcards
work; case-insensitive matching via UPPER(); single quotes properly
escaped via _esc.
axl_sql error hints — when AXL returns an error AND the query contains
the trigger phrase, append a path-correction hint to the error message.
Two patterns shipped:
- "Column (fkdevice) not found" + numplan in query → suggest the
devicenumplanmap M:N join (the literal multi-attempt schema-discovery
experience cucx-docs hit at Bingham — numplan has no direct fkdevice)
- "not in database" + sipdestination in query → suggest sipdestinationgroup
+ sipprofile + axl_list_tables(pattern='sip%') for discovery (the
`sipdestination` table is reasonable-sounding but doesn't exist)
Hints are surgical (both error fragment AND query trigger must match)
to keep false-positive risk near zero. Validator behavior unchanged —
this is post-execution error augmentation, not gate enhancement. Failing
queries now raise RuntimeError(augmented) when a hint applies; otherwise
the original exception passes through unchanged.
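The "both must match" rule can be sketched as below. The hint wording, the table layout, and the function names are assumptions; the error fragments and query triggers are the two pairs listed above:

```python
from typing import Callable, Optional

# (error fragment, query trigger, hint): both fragment and trigger must match.
_HINTS = (
    (
        "Column (fkdevice) not found",
        "numplan",
        "numplan has no direct fkdevice column; join through devicenumplanmap (M:N).",
    ),
    (
        "not in database",
        "sipdestination",
        "No sipdestination table exists; try sipdestinationgroup, sipprofile, "
        "or axl_list_tables(pattern='sip%').",
    ),
)


def augment_error(error_text: str, query: str) -> Optional[str]:
    err, q = error_text.lower(), query.lower()
    for fragment, trigger, hint in _HINTS:
        if fragment.lower() in err and trigger in q:
            return f"{error_text}\nHint: {hint}"
    return None


def run_with_hints(execute: Callable[[str], object], query: str) -> object:
    try:
        return execute(query)
    except Exception as exc:
        augmented = augment_error(str(exc), query)
        if augmented is None:
            raise  # no hint applies: original exception passes through
        raise RuntimeError(augmented) from exc
```

Since the hint is attached only after the query has already failed, the validator's accept/reject decisions are unaffected.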
Tests: +19 (8 device_grep + 11 error-hints with end-to-end mock through
execute_sql_query). Full suite 219 → 238 passing.
Live-cluster smoke test still pending (TLS handshake intermittent
this session). Sequencing nit from cucx-docs's msg 003 (move error-hint
earlier) honored — bundled with device_grep in this single commit.
Inverse of list_route_lists_and_groups — given a destination device,
return every numplan whose direct target is that device. Closes the
highest-priority gap from cucx-docs's prompt-suggestions handoff
(see axl/agent-threads/cucx-prompt-suggestions/001 + 003 for the
multi-message context).
Schema walk: device → devicenumplanmap → numplan, with LEFT JOINs
to routepartition + typepatternusage for friendly output. M:N is the
landmark — numplan does NOT have a direct fkdevice column, which was
cucx-docs's literal multi-attempt schema-discovery experience that
motivated the tool.
Three wildcard-expansion modes per cucx-docs Q2:
- False (default) — patterns intact
- "count" — per-pattern digit-string estimate + total surface;
unbounded patterns (`!`, `@`) report a count of None and force
the total to None, so an auditor sees that the measurement is
partial; per-pattern counts are capped at 10,000 to prevent
runaway estimation
- "enumerate" — actual digit-string list, only for tightly-bounded
patterns (no `!`, no `@`, no `.`, ≤ 1,000 expansions); patterns
that violate any constraint return null with a skip reason
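A sketch of the "count" arithmetic, under stated assumptions: `X` stands for ten digits, a bracket set such as `[2-9]` contributes its member count, literal digits and the separator dot contribute a factor of 1, and the 10,000 cap is applied by clamping. The helper names and the clamping choice are illustrative; negated bracket sets are not handled:

```python
from typing import Dict, Iterable, Optional, Tuple

PER_PATTERN_CAP = 10_000  # cap from the text; clamping is an assumption


def _bracket_size(body: str) -> int:
    members = set()
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # Range such as 2-9: add every character in between.
            members.update(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            members.add(body[i])
            i += 1
    return len(members)


def count_expansions(pattern: str) -> Optional[int]:
    if "!" in pattern or "@" in pattern:
        return None  # unbounded: no finite count
    total = 1
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "X":
            total *= 10
        elif ch == "[":
            end = pattern.index("]", i)
            total *= _bracket_size(pattern[i + 1:end])
            i = end
        i += 1  # literals and '.' fall through with a factor of 1
    return min(total, PER_PATTERN_CAP)


def total_surface(patterns: Iterable[str]) -> Tuple[Dict[str, Optional[int]], Optional[int]]:
    counts = {p: count_expansions(p) for p in patterns}
    # One unbounded pattern makes the whole total unknown.
    if any(c is None for c in counts.values()):
        return counts, None
    return counts, sum(c for c in counts.values() if c is not None)


print(count_expansions("20878594XX"))  # → 100
print(total_surface(["912", "1XXX", "9.!"])[1])  # → None
```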
Direct-target only per Q1 — full transitive reachability composes
with route_lists_and_groups + route_translation_chain, called out
both in the docstring and in the response's _note field.
37 new tests cover the math layer (count/enumerate helpers in
isolation), the bounds/cap behavior, the unbounded-pattern flagging,
empty-result handling, SQL injection escaping, and the integration
through a FakeAxlClient. Full suite: 219/219 passing.
Live-cluster smoke test pending — cluster TLS intermittently failing
this session; will re-verify once stable.
The sibling docs server was renamed from `mcp-cisco-docs` to `mcdewey`
(generalized from a Cisco-only corpus to a multi-vendor docs library).
Update the prompt-enrichment section to point at the new package name +
its PyPI URL, and adjust the prose to call it "the sibling docs server"
generically rather than "cisco-docs" specifically.
The CHANGELOG entry referencing this project's own pre-rename name
(`mcp-cucm-axl`) is left intact — that's legitimate historical record
of why this project is now `mcaxl`.
Drift between the docs ("every tool is read-only") and reality
(cache_clear mutates the local SQLite cache) is the bug being
addressed here. The code is fine — cache_clear has zero CUCM-side
effect — but the docs over-promised by not naming the local-cache
exception explicitly.
cache_clear docstring (server.py): now leads with "Local-only:
mutates the SQLite response cache ... Does NOT touch CUCM" with a
pointer to the explanation page.
reference/tools.md: read-only claim qualified as "against CUCM";
the two enforcement layers (sqlparse validator + allowlist proxy)
named explicitly; cache_clear flagged as the lone local-mutation
tool.
explanation/read-only-by-structure.md: validator section updated
with the full forbidden-keyword list, multi-statement detection,
and an explanation of how sqlparse fixes the regex blindspots.
New "Defense-in-depth: read-only allowlist proxy" section
describing _ReadOnlyServiceProxy and the parallel RisPort gate.
New "What read-only does NOT mean" section enumerating the
local-cache exception and the AXL_CACHE_TTL=0 opt-out for
read-only-filesystem deployments.
Today, mcaxl is read-only against CUCM by *absence* — the tools
never call write methods. But absence isn't enforced: a future
contributor adding a tool could write
self._service.addRoutePartition(...) and zeep would happily
dispatch it. There's no positive guard.
Two new chokepoints close that gap:
AXL side — _ReadOnlyServiceProxy wraps the zeep service object.
__getattr__ refuses any method outside _ALLOWED_AXL_METHODS
(currently {getCCMVersion, executeSQLQuery}) with a new
ReadOnlyViolation exception, raised at attribute lookup BEFORE
zeep serializes a SOAP envelope. Underscore-prefixed and dunder
attributes pass through (zeep introspects via _binding_options,
__class__, etc., and those don't dispatch SOAP).
RisPort side — RisPort70 envelopes are hand-rolled, so the proxy
pattern doesn't apply directly. The equivalent chokepoint lives in
the envelope builders: _check_operation_allowed(name) is the first
line of every builder, and _ALLOWED_RISPORT_OPERATIONS is the
allowlist (currently {selectCmDevice}).
Operators can verify the proxy is active via the health tool —
connection_status() now reports read_only_proxy: true and
allowed_axl_methods: [...].
Tests:
- new tests/test_readonly_proxy.py (13 tests):
* allowed methods dispatch through to inner service
* 9 parameterized refusals (addRoutePartition, updatePhone,
removeUser, applyPhone, resetPhone, restartPhone,
executeSQLUpdate, doDeviceLogin, wipePhone)
* allowlist drift detection (set must be exactly what we
advertise — accidental widening fails red)
* dunder + underscore-prefixed passthrough
- tests/test_risport.py: +TestReadOnlyAllowlist (7 tests):
* selectCmDevice passes _check_operation_allowed
* 6 parameterized refusals (addCmDevice, removeCmDevice,
resetDevice, restartDevice, applyCmDevice, executeSQLUpdate)
* allowlist drift detection
182 tests pass total (was 161; +13 proxy + 7 risport + 1 allowlist
drift catch).
The regex-based validator worked for everything tested, but had a
class of structural blindspot: it didn't actually know what a token
was, so it accepted `SELECT 1; SELECT 2` (no forbidden keyword in
either statement) and relied entirely on the keyword scan catching
write verbs. With sqlparse we get:
- Explicit multi-statement detection via `len(sqlparse.parse(query))`
— `SELECT 1; SELECT 2` is now refused with a clear "Multiple
statements detected" message.
- Proper string/comment boundary handling — `'log: DROP detected'`
is one Literal.String token; the DROP inside it never reaches the
forbidden-keyword scan. `inserted_at` is one Name token; INSERT
isn't matched as a substring.
- Same conservative behavior for keywords-as-identifiers (sqlparse
is a non-validating parser that classifies tokens lexically, so
`SELECT delete FROM device` is still refused; CUCM's data
dictionary doesn't use SQL keywords as column names anyway).
Hamilton review CRITICAL #1 preserved: the cleaned query returned to
the caller is still byte-for-byte the input (modulo trailing ; and
outer whitespace). sqlparse is consulted for analysis only.
Tests: +6 sqlparse-specific cases in TestSqlparseSpecific covering
multi-statement, comment-disguised injection, keyword-substring
identifiers, and CTE walks. 2 existing tests broadened from
match="DROP" to match="DROP|Multiple" — same query refused, the
diagnosis just got more accurate (multi-statement caught earlier
than forbidden-keyword scan).
36/36 validator tests pass.
The Astro docs site doesn't belong in the published sdist (node_modules,
build artefacts, dev container scaffolding). Adds `docs/` to the existing
sdist exclude list, alongside the other dev-only paths.
Compose project name pinned to `mcaxl-docs` via the v2 `name:` field.
Without it, Compose defaults to the parent directory's basename — and
all three sibling docs sites live in `docs/`, so they were colliding
and cross-recreating each other on every `up`.
Three additions to the docs site, all atomic to docs/:
1. Deployment configs (Dockerfile + Caddyfile + docker-compose.yml +
.env.example + Makefile) mirroring bingham/cucx's pattern. The
compose service uses caddy-docker-proxy labels with the operator's
.mcp.l.supported.systems wildcard DNS pattern; suggested subdomain
is mcaxl-docs.mcp.l.supported.systems.
2. Logo + favicon (forest-green palette matching the existing custom.css
accent). Wordmark uses ui-monospace with currentColor so Starlight
inverts on light/dark; icon-mark is a terminal chevron + three
diminishing query-row lines (audit-by-query motif).
3. Live cluster examples in reference/tools.md for axl_version,
axl_list_tables (route% pattern), and axl_describe_table
(routepartition). Outputs sanitized per python.md PII rules
(15.0.1.12900(234) → 15.0(1); cluster-fingerprinting build string
removed).
Build clean: 17 pages built, pagefind search index across all,
favicon resolves to /favicon.svg, logo fingerprinted into _astro/.
Not yet deployed — operator wires docker compose up when ready.
17-page Astro/Starlight site mirroring the bingham/cucx conventions
(telemetry off, devToolbar off, astro-icon + lucide, separate
custom.css, Diátaxis-structured sidebar with autogenerate per
directory). Green accent palette differentiates from bingham/cucx's
teal.
Pages by Diátaxis quadrant:
- Getting Started (3): installation, configuration, first-audit
- How-To (4): sip-trunk-report (port from docs/query-patterns/),
route-plan-overview, investigate-pattern (mermaid flowchart),
find-orphan-resources
- Reference (4): tools (all 19), prompts (all 10), env-vars,
cucm-schema-cheatsheet
- Explanation (4): read-only-by-structure, cluster-isolated-cache,
hamilton-review-patterns, pypi-yank-lesson
Build-verified clean (npm run build → 17 pages in 7.88s, pagefind
search index built across all pages, zero errors).
Legacy docs/query-patterns/sip-trunk-report.md kept in place — that
file ships in the published Python sdist's docs/ tree, deletion would
be a package change not just a docs-site change. The new how-to
version is a near-verbatim port.
Content gaps for follow-up: real cluster-output examples in tool/
prompt reference pages, verified CUCM 15 SQL in
find-orphan-resources.md, optional favicon.
Not yet wired for deployment (Caddyfile/Dockerfile out of scope for
v1). Local preview: cd docs && npm run dev.
The original 2026.04.27 was published-then-deleted from PyPI within
hours after a stricter audit (against the unpacked sdist, not just
curated source paths) found cluster-fingerprint content that the
pre-publish grep had missed. This release supersedes the deleted one;
no functional differences.
Issues found in 2026.04.27 that this fixes:
1. docs/query-patterns/sip-trunk-report.md — "Live result snapshot"
section (38 lines) contained the live cluster's actual SIP trunk
inventory: real hostnames (exp-c-p.binghammemorial.org), real
internal IPs (172.20.6.99, .104, .105, .114, .120, .222, plus
172.20.2.22, 172.20.14.105, 172.24.10.10), real trunk-name +
description rows. Section removed entirely. The query-pattern doc
itself still ships — schema/SQL guidance is generic and useful.
One inline FQDN example (`exp-c-p.binghammemorial.org`) replaced
with `exp-c-p.example.com`. Status line that named the specific
maintenance release (`Validated against CUCM 15.0.1.12900-234 on
2026-04-25.`) genericized to `Validated against CUCM 15.`
2. .mcp.json shipping in sdist with `/home/rpm/bingham/axl` as the
`--directory` argument. Local filesystem path = username and site-name leak.
Added to `[tool.hatch.build.targets.sdist] exclude`. File stays
in the source repo for development; no longer ships.
3. pyproject.toml comment about the audit workflow ironically
contained the literal word "bingham" as the example grep token.
Rewritten to use "site-specific tokens" generically.
Audit verification (against the unpacked sdist this time):
tar -xzf dist/mcaxl-2026.4.27.1.tar.gz -C /tmp/sdist-inspect
grep -rnEi 'bingham|binghammemorial|10\.[0-9]+\.[0-9]+\.[0-9]+|
172\.(1[6-9]|2[0-9]|3[01])\.[0-9]+\.[0-9]+|
192\.168\.[0-9]+\.[0-9]+|SupportedSystems|CCX-AXL|
CER-AXL|CUC-AXL|TabSync|variphy|15\.0\.1\.12900|
production cluster|/home/rpm|cucm-pub\.bingham'
/tmp/sdist-inspect/
→ returns empty (verified)
Tests still 155/155.
Lesson encoded for next time: the pre-publish audit MUST run against
the unpacked sdist, not just the five explicitly named paths in the
python.md rule (src/, tests/, README.md, pyproject.toml, .env.example).
The sdist also pulls in docs/, top-level dotfiles, and uv.lock.
CHANGELOG.md spells this out in the post-release note for next time.
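The same audit can be run without unpacking to disk. The sketch below scans every regular file inside an sdist tarball for a token regex; the function names and the sample token pattern are placeholders, not the project's actual audit script:

```python
import io
import re
import tarfile
from typing import BinaryIO, Dict, List, Tuple


def scan_sdist(fileobj: BinaryIO, token_pattern: str) -> List[Tuple[str, int, str]]:
    """Return (member name, line number, line) for every matching line."""
    rx = re.compile(token_pattern, re.IGNORECASE)
    hits = []
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue
            extracted = tar.extractfile(member)
            if extracted is None:
                continue
            text = extracted.read().decode("utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), 1):
                if rx.search(line):
                    hits.append((member.name, lineno, line.strip()))
    return hits


def make_sdist(files: Dict[str, str]) -> io.BytesIO:
    """Build a tiny in-memory .tar.gz, used only to exercise the scanner."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, content in files.items():
            data = content.encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


sample = make_sdist({
    "pkg-1.0/README.md": "clean\n",
    "pkg-1.0/.mcp.json": '{"dir": "/home/someuser/site/axl"}\n',
})
print(scan_sdist(sample, r"/home/someuser|site-token"))
```

Scanning the built archive rather than a hand-picked path list is the point: dotfiles and docs/ are covered automatically because they are in the tarball.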
Six surgical scrubs to clear cluster-fingerprint references before the
PyPI release. Per `~/.claude/rules/python.md`'s pre-publish PII audit
section: specific build strings (`15.0.1.12900-234`-style maintenance
release IDs) and cluster role descriptors ("production") narrow the
fingerprint of which deployment the developer tested against. Replaced
with the more accurate Cisco user-facing version ("CUCM 15.0(1)" or
"CUCM 15") and operational descriptor ("live cluster" — same trust
signal without the prod disclosure).
Files:
README.md
"Tested against CUCM 15.0.1.12900" → "Tested against CUCM 15.0(1)"
placeholder host hardened to "cucm-pub.example.com" (RFC 2606-reserved
`example.com` domain, per the rule's documented convention)
CHANGELOG.md
"production CUCM 15.0.1.12900 cluster" → "live CUCM 15 cluster"
src/mcaxl/risport.py
Comment: "verified against CUCM 15.0.1.12900 documentation" →
"verified against CUCM 15 RisPort70 docs"
src/mcaxl/route_plan.py
Comment: "the typepatternusage table in CUCM 15.0.1.12900" →
"the typepatternusage table in CUCM 15"
.env.example
Normalized to RFC-reserved values:
cucm-pub → cucm-pub.example.com
AxlUser → axl-readonly (descriptive function, not
a real-account-shape name)
TopSecret... → replace-with-your-password (clearly a placeholder)
Audit verification:
grep -rnE '15\.0\.1\.12900|bingham|SupportedSystems|CCX-AXL|CER-AXL|
CUC-AXL|TabSync|variphy|production|10\.[0-9]+\.[0-9]+\.[0-9]+|
172\.(1[6-9]|2[0-9]|3[01])\.[0-9]+\.[0-9]+|192\.168\.[0-9]+\.[0-9]+'
src/ pyproject.toml README.md CHANGELOG.md .env.example
→ returns empty (verified)
Sdist verification:
tar -tzf dist/mcaxl-2026.4.27.tar.gz | grep -iE 'CLAUDE|axlsqltoolkit|
bingham|tests/'
→ returns empty (verified)
Tests directory IS excluded from sdist via
`[tool.hatch.build.targets.sdist] exclude = ["tests/"]` — important
because test fixtures contain real cluster hostnames in mock SOAP
responses (test_risport.py SAMPLE_RESPONSE). Tests stay in the source
repo for development; they don't ship to PyPI.
Tests still pass: 155/155.
Ready for `uv publish --token …`.
Renames the package from `mcp-cucm-axl` to `mcaxl` to fit the
operator's mc<interface> naming convention (mcusb, mcaxl, …),
and scrubs Bingham-specific defaults so the package works for
anyone, anywhere.
Rename:
- pyproject.toml: name, scripts entry point, description
- src/mcp_cucm_axl/ → src/mcaxl/ (git mv preserves history)
- All Python imports updated via sed
- Cache directory: ~/.cache/mcp-cucm-axl/ → ~/.cache/mcaxl/
- Log prefix [mcp-cucm-axl] → [mcaxl]
- Package version lookup: importlib.metadata.version("mcaxl")
- .mcp.json command updated to invoke `mcaxl` script
- All 155 tests pass under the new name (verified)
Bingham-specific scrubs:
- docs_loader._DEFAULT_INDEX_DIR: hardcoded /home/rpm/bingham/...
path removed; defaults to None. Operators set CISCO_DOCS_INDEX_PATH
env var; without it, prompts gracefully degrade with a fallback
notice instructing the LLM to use the cisco-docs MCP search_docs
tool instead.
- prompts/_common.docs_or_empty_msg: removed the explicit
/home/rpm/bingham/... path from the fallback message text.
- server.py: removed dead-code copy of _docs_or_empty_msg() that
was leftover from before the prompts package extraction.
- README.md: completely rewritten as a public-facing readme. Lead
paragraph names CUCM as the target platform, install instructions
cover uvx / pip / Claude Code MCP add. Recommends cisco-cucm-mcp
as the operations counterpart.
PyPI metadata:
- Initial CalVer version: 2026.04.27
- License: MIT (LICENSE file added)
- Project URLs: Homepage / Source / Issues / Changelog all point
at git.supported.systems/mcp/mcaxl (newly-created Gitea repo
in the mcp/ org for PyPI releases)
- Classifiers: Beta / Telecommunications Industry / Topic:Telephony
- Keywords: mcp, cisco, cucm, axl, risport, voip, sip, audit
- sdist excludes: CLAUDE.md, .env*, axlsqltoolkit.zip, audits/,
tests/, pytest/ruff caches. Verified clean: wheel ships only the
mcaxl/ source tree + LICENSE + METADATA + entry_points.
CHANGELOG.md added with a 2026.04.27 initial-release entry,
documenting tool/prompt counts, structural read-only guarantees,
Hamilton review closure, live-cluster verification, and known
limitations.
Build verification:
- `uv build` produces clean wheel + sdist
- Wheel: 22 source files, 195KB total, no Bingham-specific files
- Sdist excludes verified: no CLAUDE.md, no axlsqltoolkit.zip
- Entry point: `mcaxl = mcaxl.server:main`
- Package installs as mcaxl==2026.4.27
Two ideas borrowed from cisco-cucm-mcp (calltelemetry/cisco-cucm-mcp,
MIT licensed): real-time device registration via RisPort70, and
exponential-backoff retry on transient HTTP 5xx errors. Both are
purpose-built for the audit use case rather than general-purpose
ports — RisPort tools exist to inform audit findings, not as a
standalone "look at my devices" interface.
Rate limit / 503 backoff (~30 lines + 3 tests):
AxlClient now mounts an HTTPAdapter with a urllib3 Retry policy
(3 retries, exponential backoff, status_forcelist=[502,503,504]).
Configurable via AXL_RATE_LIMIT_RETRIES (default 3, 0 disables).
Surfaces in connection_status() so operators can see the policy.
Closes a real reliability gap: CUCM SOAP rate-limits under load
during change windows or with multiple concurrent admins; pre-fix
any 503 was a hard failure.
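The retry behaviour described above can be sketched without any HTTP
library. This is a stdlib stand-in, not the HTTPAdapter/urllib3 Retry
wiring the commit actually uses: `send`, `call_with_backoff`,
`retries_from_env`, and the 0.5s base delay are all assumptions made
for illustration (the commit does not state the backoff factor).

```python
import os
import time

RETRYABLE_STATUSES = {502, 503, 504}  # mirrors status_forcelist above


def retries_from_env(default=3):
    """Read AXL_RATE_LIMIT_RETRIES; 0 disables retrying."""
    return max(0, int(os.environ.get("AXL_RATE_LIMIT_RETRIES", default)))


def call_with_backoff(send, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out.

    `send` is a hypothetical zero-argument callable returning an HTTP status.
    The delay doubles after each failed attempt: base, 2*base, 4*base, ...
    """
    attempt = 0
    while True:
        status = send()
        if status not in RETRYABLE_STATUSES or attempt >= retries:
            return status
        sleep(base_delay * (2 ** attempt))
        attempt += 1
```

Passing `sleep` as a parameter keeps the helper testable without real
waiting; with retries=0 the first 503 is returned immediately, which
matches the "0 disables" behaviour.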
RisPort70 (new src/risport.py + 2 tools + prompt update):
Hand-coded SOAP client for /realtimeservice2/services/RISService70
(avoids dragging in another zeep instance for one operation).
Reuses AXL_URL/USER/PASS env vars — RisPort lives on the same host.
New tools:
device_registration_status(device_class, status, name_filter, page_size)
device_registration_summary() — cluster-wide breakdown by class
Live-cluster verification (cucm-pub.binghammemorial.org):
Phone: 803 registered=679 unregistered=123 rejected=1
Gateway: 85 registered=41 rejected=44 ← real audit finding
SIPTrunk: 22 registered=18 unregistered=4
HuntList: 28 registered=28
H323/CTI: 0 (cluster doesn't use these)
Discovered while live-verifying: CUCM 15 wraps the RisPort response
in an extra <SelectCmDeviceResult> element inside <selectCmDeviceReturn>.
Older CUCM versions exposed the fields directly. The parser accepts
either shape, and tests cover both (test_legacy_response_shape_still_parses
asserts the older shape still works).
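A minimal sketch of the two-shape handling, assuming namespace-agnostic
matching on local tag names. The helper names and the use of a
`TotalDevicesFound` field are illustrative assumptions; the real
parser in src/risport.py may be structured differently.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Drop any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def find_result_container(xml_text):
    """Return the element that directly holds the result fields.

    CUCM 15 shape: selectCmDeviceReturn > SelectCmDeviceResult > fields
    Legacy shape:  selectCmDeviceReturn > fields
    """
    root = ET.fromstring(xml_text)
    ret = next(
        (el for el in root.iter() if local(el.tag) == "selectCmDeviceReturn"),
        None,
    )
    if ret is None:
        raise ValueError("no selectCmDeviceReturn element in response")
    wrapped = next(
        (child for child in ret if local(child.tag) == "SelectCmDeviceResult"),
        None,
    )
    # Fall through to the legacy shape when the wrapper is absent.
    return wrapped if wrapped is not None else ret


def total_devices(xml_text):
    """Read the (assumed) TotalDevicesFound field from either shape."""
    for child in find_result_container(xml_text):
        if local(child.tag) == "TotalDevicesFound":
            return int(child.text)
    return None
```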
phone_inventory_report prompt updated:
New Step 3 — "Cross-reference with real-time registration" — recommends
device_registration_summary() + device_registration_status(status="UnRegistered")
to surface configured-but-never-registered phones (strongest orphan signal),
PartiallyRegistered phones (firewall/cert/version mismatch indicator),
and registration-state vs config-state mismatches.
Tooling delta worth noting:
AXL device count: 1,377 phones
RisPort device count: 803 phones
Delta (~574) likely templates, hidden phones, or stale config —
itself an audit finding the new tool will surface
to anyone running phone_inventory_report.
README updated:
- Added health(), device_registration_status, device_registration_summary
- Added "Scope and complement" section recommending @calltelemetry/cisco-cucm-mcp
alongside for operational debugging (logs, perfmon, packet capture,
service control). The two servers answer different questions; the LLM
with both can compose audit findings with operational state.
- Listed all 10 prompts (was 4 outdated entries).
Tests: 134 → 155 (+21).
Closes bingham/mcp-cucm-axl#1
route_devices_using_css missed device.fkcallingsearchspace_cgpntransform
and _cdpntransform — the columns trunks use to attach calling-party and
called-party number transformation CSSs. A CSS only referenced via these
columns showed up as "0 references" in impact analysis, leading an
operator to conclude safe-to-delete and break outbound transformations.
Same failure shape as the earlier shakedown defect #2 (false-zero impact
analysis) but at a different schema layer: that fix added 8 reference
categories covering the obvious cases; this fix closes the rest.
What's covered now (71 fkcallingsearchspace_* columns total across 14
tables in CUCM 15):
Templates added for the bulk cases:
_device_query(suffix) — device.fkcallingsearchspace_<suffix>
_devicepool_query(suffix) — devicepool.fkcallingsearchspace_<suffix>
_numplan_query(suffix) — numplan.fkcallingsearchspace_<suffix>
Categories added (51 new):
11× device variants (incl. _cgpntransform and _cdpntransform — the issue)
17× devicepool inheritance variants (closes M1 caveat from audit reports)
13× numplan forwarding/transformation variants (cfbint/cfhr/etc.)
site, externalcallcontrolprofile, recordingprofile, usageprofile,
vipre164transformation×2, incomingtransformationprofile×4
Schema gotchas discovered and codified:
- devicepool, externalcallcontrolprofile, recordingprofile have no
`description` column (verified against syscolumns 2026-04-26)
- site has neither `name` nor `description` — uses `tksite` enum joined
against `typesite.name` for the human-readable form
Live verification on cucm-pub.binghammemorial.org (CUCM 15.0.1.12900-234):
XFORM-Outbound-ANI: 0 → 1 ref (PSTN-Router-SIP-Trk via _cgpntransform)
XFORM-Outbound-DNIS: 0 → 1 ref (PSTN-Router-SIP-Trk via _cdpntransform)
E911CSS: unchanged at 0, but now with `complete: True`
— upgrades from "appears orphan with caveat" to
"confirmed orphan" since DP variants now covered
Internal-CSS: 163 → 174 refs (DP + extra numplan variants)
Tests (128 → 134, +6):
test_issue_1_cgpntransform_column_enumerated
test_issue_1_cdpntransform_column_enumerated
test_finds_trunk_via_cgpntransform_reference (mock-driven E2E)
test_complete_schema_coverage_against_known_columns
— encodes the 71-column snapshot from CUCM 15. If a future CUCM
version adds a new fkcallingsearchspace_* column, the test fires
red so the contributor knows to add it to _CSS_REFERENCE_QUERIES.
test_no_duplicate_table_column_pairs
— guards against double-counting if two categories accidentally
reference the same column.
test_error_in_multiple_tables_propagates
— verifies error reporting works across the new shared-suffix cases
(e.g., _cgpnunknown on both device AND devicepool).
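The coverage and duplicate guards can be sketched as two set
operations. The abbreviated pair list below is illustrative only (the
real snapshot has 71 columns), and `missing_columns` and
`duplicate_pairs` are hypothetical helper names, not the test code
from this commit.

```python
from collections import Counter

# (table, column) pairs the tool queries. Abbreviated for illustration.
COVERED = [
    ("device", "fkcallingsearchspace_cgpntransform"),
    ("device", "fkcallingsearchspace_cdpntransform"),
    ("device", "fkcallingsearchspace_cgpnunknown"),
    ("devicepool", "fkcallingsearchspace_cgpnunknown"),
]


def missing_columns(covered, known_schema):
    """Columns present in the schema snapshot but not queried by the tool."""
    return sorted(set(known_schema) - set(covered))


def duplicate_pairs(covered):
    """Pairs listed more than once, which would double-count references."""
    return sorted(pair for pair, n in Counter(covered).items() if n > 1)
```

The same suffix on two tables (here _cgpnunknown) is not a duplicate,
because the guard compares whole (table, column) pairs.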
Operator-suggested prompt: "what does my AXL account *actually* have
permission to do?" Resolves the user → access-control-group →
function-role chain for a single account, defaulting to the AXL service
account from AXL_USER env when no userid is given.
The prompt as originally suggested used table names from older Cisco
docs (`enduserauthgroupmap`, `dirgrouprolemap`) that don't exist on
CUCM 15. The shipped SQL uses the verified CUCM 15 names
(`enduserdirgroupmap`, `functionroledirgroupmap`); a regression test
asserts the deprecated names don't appear in the rendered SQL section,
so any future "fix" reverting to the older names fires red.
Live verification on cucm-pub.binghammemorial.org found the existing
AXL service account (`SupportedSystemsReadOnly`) has 4 roles via the
`ReadOnly-AXL` access control group:
- Standard AXL API Access (full RW — group misnamed)
- Standard AXL Read Only API Access (the genuinely-read-only one)
- Standard Packet Sniffing (PHI-relevant in healthcare)
- Standard RealtimeAndTraceCollection
The first finding is structural: the group `ReadOnly-AXL` contains
the FULL RW role `Standard AXL API Access` despite its name. The
MCP server's structural read-only enforcement (no write methods
registered) is what prevents this from mattering — but the account
itself is over-privileged relative to what the tool needs. The
prompt's findings template surfaces this kind of misnamed-group
case explicitly.
Also discovered (and documented in the prompt body): AXL auth is
case-insensitive for usernames, but SQL `WHERE name = 'X'` is
case-sensitive. Step 3 of the prompt handles the case-mismatch
fallback so a typo like `SupportedSYstemsReadOnly` (env) vs
`SupportedSystemsReadOnly` (cluster canonical) doesn't produce a
silently-empty result.
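One way to express the case-mismatch fallback, together with the
single-quote escape, is to render an exact-match query and a
case-folded fallback. The `applicationuser` lookup and both helper
names are assumptions for illustration; the shipped prompt may phrase
its Step 3 differently.

```python
def sql_quote(value):
    """Quote a string for Informix-style SQL by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def userid_lookup_queries(userid):
    """Return (exact, fallback) lookup queries for one account name.

    The fallback compares lower-cased names, so an env value that differs
    from the cluster's canonical spelling only by case still finds the row.
    """
    exact = (
        "SELECT pkid, name FROM applicationuser "
        f"WHERE name = {sql_quote(userid)}"
    )
    fallback = (
        "SELECT pkid, name FROM applicationuser "
        f"WHERE LOWER(name) = {sql_quote(userid.lower())}"
    )
    return exact, fallback
```

Running the exact query first and the fallback only on an empty result
keeps the common case cheap while avoiding the silently-empty answer.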
5 new tests:
- correct CUCM 15 table names embedded in SQL
- explicit userid threads through to the query
- default reads AXL_USER from env
- missing userid AND missing env → clear instruction
- SQL injection defense (single-quote escape)
123 → 128 tests; 9 → 10 prompts. Prompt registration smoke test
updated to assert the new shim is wired.
Builds on the prompts-package extraction. Each new prompt embeds
schema-verified SQL plus a findings template tuned to surface
audit-actionable issues (orphans, drift, capacity outliers, security
posture).
phone_inventory_report(filter=None):
Aggregates by model / device pool / CSS, then anomaly queries for
phones with no description, phones whose description echoes their
MAC-based name, phones with no owner, phones in non-default CSS.
Cross-references owner status (phones owned by inactive users
surface as findings).
user_audit(focus=full|admin|inactive|app_users):
End user + application user inventory, role/group assignments via
the enduserdirgroupmap → dirgroup → functionroledirgroupmap →
functionrole join chain. Security-critical findings: app users
with admin-grade role memberships, local-user accounts with admin
privileges, phones owned by inactive users.
inbound_did_audit():
Reusable form of today's cucm-inbound-did-inventory work. XFORM-
Inbound-DNIS curated list categorized (pass-through, block-trans,
specific renames, wildcards, catch-all hazard). Cross-checked
against Internal-PT route patterns and the operator-curated
PSTN-Screen-PT spam blocklist. Findings for orphan target
extensions and the silent !-catch-all risk.
hunt_pilot_audit():
Hunt pilot inventory with queue settings, line group membership,
and distribution algorithm decoding. Schema knowledge already
Hamilton-verified: huntpilotqueue joins via fknumplan_pilot, NOT
fknumplan (the test asserts the correct column appears in the
rendered prompt). Findings: queue misconfigurations (NULL
destinations, infinite max-wait), empty line groups, dead pilots
with no route-list destination.
Implementation notes:
- Each prompt's SQL was validated against the live cluster
(cucm-pub.binghammemorial.org, CUCM 15.0.1.12900-234).
- user_audit originally used UNION ALL with NULL-typed status
column for the headcounts query; Informix rejected it. Split
into two simpler queries (commented in the prompt body).
- phone_inventory_report uses a Hamilton-style SQL escape for
the optional name_filter (single quotes doubled).
- All four prompts gracefully degrade when the docs index isn't
loaded (verified by test_all_new_prompts_render_without_docs).
114 → 123 tests; 5 → 9 prompts. Full live-cluster verification:
- 12 phone models, 629 Cisco 7841 phones (largest model)
- 1,246 active end users, 25 application users
- Hunt pilots with named distribution algorithms (Broadcast, Top
Down, etc.) — confirms typedistributealgorithm join works
- Hamilton-fixed huntpilotqueue.fknumplan_pilot column verified
in the embedded SQL via dedicated regression test.
Refactor: the four existing inline prompts in server.py move into
individual modules under src/mcp_cucm_axl/prompts/. Server.py keeps
thin @mcp.prompt-decorated shims that delegate to the corresponding
render() function — FastMCP needs the shims because it introspects
their signatures to expose parameters to the LLM, but the prompt
*content* now lives one-prompt-per-file.
Why: server.py's prompt section had grown to ~200 lines of inline
markdown. As more query patterns get documented (see
docs/query-patterns/) this would only worsen. Per-module bodies are
easier to diff, review, and unit-test in isolation.
Layout:
src/mcp_cucm_axl/prompts/
__init__.py
_common.py — shared helpers, keyword sets, render_schema_block
route_plan_overview.py
investigate_pattern.py
audit_routing.py
cucm_sql_help.py
sip_trunk_report.py — NEW
Each prompt module exports a `render(docs, *args) -> str` function
that takes the DocsIndex as a parameter (no module globals). The
shim in server.py grabs the runtime `_docs` and passes it in. Pure
functions = trivially unit-testable.
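A rough sketch of the render/shim split, with the FastMCP decorator
left out so the example runs on its own. `FakeDocs`, its `lookup`
method, and the prompt text are hypothetical; only the shape
(pure render function, thin shim passing the runtime `_docs`) comes
from the description above.

```python
def render(docs, pattern):
    """Pure prompt renderer: everything it needs arrives as an argument."""
    lines = [f"Investigate pattern {pattern}."]
    if docs is not None:
        # Hypothetical DocsIndex method; the real interface may differ.
        lines.append(docs.lookup("route pattern"))
    else:
        lines.append("(docs index not loaded; schema hints omitted)")
    return "\n".join(lines)


class FakeDocs:
    """Test double standing in for the DocsIndex."""

    def lookup(self, topic):
        return f"[schema notes for {topic}]"


_docs = None  # set at server bootstrap in the real module


def investigate_pattern(pattern: str) -> str:
    """Thin shim; in server.py this would carry the @mcp.prompt decorator."""
    return render(_docs, pattern)
```

Because render() never touches module state, a test can hand it a fake
index or None and assert on the returned string directly.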
NEW prompt: sip_trunk_report.
Implementation reference: docs/query-patterns/sip-trunk-report.md
(written separately as a query-pattern doc, validated against the
live cluster). The prompt embeds:
- Step 1: trunk inventory SQL (device + sipdevice + 5 LEFT JOINs)
- Step 2: per-destination SQL (siptrunkdestination)
- Step 3: pointer to existing route_lists_and_groups() tool
- Step 4: findings template (SPOF, profile sprawl, CSS asymmetry,
codec heterogeneity, DNS-vs-IP, security posture)
Optional `name_filter` parameter narrows the inventory via LIKE; the
filter value is escaped for SQL safety (single quotes doubled per
Informix convention).
Tests: 14 new in tests/test_prompts_package.py covering each
prompt's render() with and without docs, plus a registration smoke
test that confirms the FastMCP shim set matches the prompts package
exports (catches the case where a new module is added without its
shim).
Total: 100 → 114 tests; 5 prompts registered; live verification
against cucm-pub.binghammemorial.org confirms the embedded SQL
produces real inventory data. The four original prompts are
behaviorally identical to before — same content, just relocated.
Document the SQL queries used to build a comprehensive SIP trunk
inventory (device + sipdevice + siptrunkdestination joins, plus
route_lists_and_groups for membership). Captures rationale for each
column, common gotchas (routelistdetail doesn't exist, lvarchar(1)
flag fields return 't'/'f' strings), and a draft prompt signature
suggesting how to extract this into a @mcp.prompt function in
server.py — same shape as the existing route_plan_overview /
investigate_pattern / audit_routing prompts.
The empty src/mcp_cucm_axl/prompts/ directory remains unused; this
document lives under docs/ since it's reference material rather than a
runtime prompt. A future commit can promote the queries into the prompt
function and delete this document if it becomes redundant.
Live result snapshot included for reference (CUCM 15.0.1.12900-234,
2026-04-25, 11 trunks).
Closes the four remaining findings from the margaret-hamilton review.
13 new regression tests; all 100 pass; live cluster smoke verified.
MAJOR #4 — wildcard regex catastrophic backtracking + silent malformed.
Two changes to _wildcard_to_regex():
a) Bounded the `!` and `@` wildcards to \d{1,50} (was \d+). Adjacent
`!` patterns previously compiled to (\d+)(\d+)... which has
exponential backtracking on near-miss inputs. CUCM dial strings
are practically capped well below 50 digits; the bound keeps
complexity polynomial without losing real-world coverage.
Verified: 10 adjacent `!` against a 30-digit near-miss now finishes
in ~240ms (was unbounded; could have been minutes on real
pathological cases).
b) Unclosed `[` now raises ValueError instead of silently treating the
bracket as a literal. _pattern_matches_number catches the error
and returns False so a single bad pattern doesn't crash
translation_chain — but the bad pattern is no longer invisibly
producing wrong matches. The previous silent fallback meant a
pattern like `[0-9` (typo, missing `]`) would match input
containing the literal characters `[` `0` `-` `9`.
3 new tests covering: bounded-regex shape (`\d{1,N}`), pathological
input completes quickly, unclosed bracket raises explicitly,
well-formed character class still works.
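Both changes can be illustrated with a small converter. This is a
sketch under stated assumptions, not the project's _wildcard_to_regex:
it assumes `X` matches one digit, `.` is a separator that consumes no
dialed digit, and character classes can be passed through to `re`
unchanged.

```python
import re

MAX_DIGITS = 50  # bound applied to the open-ended wildcards


def wildcard_to_regex(pattern):
    """Translate a CUCM-style dial pattern into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "X":
            out.append(r"\d")
        elif ch in "!@":
            out.append(r"\d{1,%d}" % MAX_DIGITS)  # bounded, not \d+
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                raise ValueError(f"unclosed '[' in pattern {pattern!r}")
            out.append(pattern[i:end + 1])
            i = end
        elif ch == ".":
            pass  # assumed: separator only, matches nothing in the input
        else:
            out.append(re.escape(ch))
        i += 1
    try:
        return re.compile("".join(out))
    except re.error as exc:
        raise ValueError(f"malformed pattern {pattern!r}") from exc


def pattern_matches_number(pattern, number):
    """A malformed pattern never matches, but it does not crash the caller."""
    try:
        return wildcard_to_regex(pattern).fullmatch(number) is not None
    except ValueError:
        return False
```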
MAJOR #5 — distinguish config errors from operational errors.
Pre-fix: any first-time connection failure set `_connection_error`
and pinned it forever. A transient network blip or session timeout
required restarting the MCP server. Hamilton's framing: Apollo's
software was *designed* to recover from transient faults; pinning
forever is the antithesis of "design the error path first."
Fix: split into two state fields:
_config_error — permanent until restart (missing env vars only)
_last_error — last operational failure, NOT a pin
Operational failures (zeep Client construction, network, TLS, session)
are recorded but never pinned: the next call attempts a fresh connection.
Configuration errors (missing AXL_URL etc.) stay pinned because
they don't get better on retry.
Added _ConfigError as a private subclass to make the distinction
explicit at the raise site, and connection_status() to expose
connected/connected_at/config_error/last_error for diagnostic
transparency.
3 new tests: config errors pin, operational errors don't pin,
connection_status() reports state.
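The two-field split might look roughly like the following. The
`Connector` class, its `connect` factory argument, and the choice to
clear last_error on success are assumptions for illustration; the
commit's AxlClient and _ConfigError are not reproduced here.

```python
import os


class ConfigError(RuntimeError):
    """Missing configuration; retrying cannot fix this."""


class Connector:
    def __init__(self, connect, env=None):
        self._connect = connect          # hypothetical factory: url -> client
        self._env = os.environ if env is None else env
        self._client = None
        self._config_error = None        # pinned until restart
        self._last_error = None          # diagnostic only, never blocks a retry

    def client(self):
        if self._config_error is not None:
            raise ConfigError(self._config_error)
        if self._client is not None:
            return self._client
        url = self._env.get("AXL_URL")
        if not url:
            self._config_error = "AXL_URL is not set"
            raise ConfigError(self._config_error)
        try:
            self._client = self._connect(url)
        except Exception as exc:
            self._last_error = str(exc)  # remember it, but do not pin
            raise
        self._last_error = None          # assumed: cleared on success
        return self._client

    def connection_status(self):
        return {
            "connected": self._client is not None,
            "config_error": self._config_error,
            "last_error": self._last_error,
        }
```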
MINOR #6 — _to_int silent coercion of bad data.
Pre-fix: a non-numeric value from the cluster (data corruption,
schema drift across CUCM versions) silently became None, which
downstream sort logic defaulted to 0 — jumbling the failover order
in the displayed result with no warning.
Fix: bad data still returns None (caller error path unchanged), but
the offending value is logged to stderr so an operator notices that
something is wrong at the data layer. A genuine None input stays
silent, since it is a legitimately unset column.
2 new tests: real None is silent, bad string logs to stderr with
the offending value visible.
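A minimal sketch of the silent-None versus logged-garbage distinction.
The `field` and `warn` parameters are additions for illustration (the
injectable `warn` just makes the behaviour easy to test); the real
_to_int signature is not shown in this log.

```python
import sys


def to_int(value, field="?", warn=None):
    """Coerce to int; None passes silently, bad data is reported."""
    if warn is None:
        warn = lambda msg: print(msg, file=sys.stderr)
    if value is None:
        return None  # legitimately unset column: nothing to report
    try:
        return int(value)
    except (TypeError, ValueError):
        warn(f"to_int: non-numeric value for {field}: {value!r}")
        return None
```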
MINOR #7 — standardize tool failure shapes; add health() tool.
Pre-fix: cache_stats and cache_clear returned `{"error": "..."}`
when _cache was None, while AXL-touching tools raised RuntimeError.
LLM consumers had to handle two shapes.
Fix: _require_cache() helper raises RuntimeError consistently with
_client(). All tool failures now use the same exception shape.
Added health() tool that reports cache/axl/docs initialization
status plus the AXL connection_status — gives operators a
self-diagnostic when something fails at bootstrap.
3 new tests: cache_stats raises, cache_clear raises, health()
reports each subsystem.
Three findings from a margaret-hamilton-style review of the MCP server,
fixed with regression tests written first (red → green). One bonus
finding (huntpilotqueue column name) was surfaced by the third fix
itself — exactly the audit-trust failure mode that fix exists to expose.
CRITICAL #1 — sql_validator: comment-strip mutated string literals.
The cleaned query returned by validate_select() is what travels to AXL.
Previously, the comment-strip pass ran before the literal-aware pass,
so `--` or `/* */` markers inside a string literal were silently eaten:
input: WHERE description = 'Smith -- old line'
to AXL: WHERE description = 'Smith (truncated mid-literal)
The LLM saw rows that looked plausible but were not what its query
asked for. "Confidently wrong" is exactly the failure mode the review
was hunting.
Fix: only strip comments on the analysis-only copy used for keyword
detection. The cleaned output preserves the input verbatim (modulo
trailing semicolon and outer whitespace). 6 new tests covering literal
preservation across `--`, `/* */`, LIKE patterns with embedded comment
markers, and forbidden keywords inside real comments.
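The analysis-copy idea can be sketched with a single masking regex
that tries string literals before comments, so a `--` inside quotes is
consumed as part of the literal. The keyword list and this
validate_select body are illustrative assumptions, not the contents of
sql_validator.py.

```python
import re

FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE"}

# Literal first ('' is an escaped quote), then line and block comments.
_MASK = re.compile(r"'(?:[^']|'')*'|--[^\n]*|/\*.*?\*/", re.DOTALL)


def validate_select(query):
    """Return the query to send, or raise ValueError if it is not read-only."""
    # The cleaned query is the input verbatim, minus outer whitespace
    # and a trailing semicolon.
    cleaned = query.strip().rstrip(";").strip()
    # Only this throwaway copy has literals and comments blanked out.
    analysis = _MASK.sub(" ", cleaned).upper()
    tokens = re.findall(r"[A-Z_]+", analysis)
    if not tokens or tokens[0] not in ("SELECT", "WITH"):
        raise ValueError("query must begin with SELECT or WITH")
    bad = FORBIDDEN.intersection(tokens)
    if bad:
        raise ValueError(f"forbidden keyword(s): {sorted(bad)}")
    return cleaned
```

In this sketch a forbidden word inside a comment is ignored by the
keyword check; whether the real validator rejects or allows that case
is not stated here.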
CRITICAL #2 — cache key omitted cluster identity.
The on-disk cache key was `method::args_json`. An operator swapping
AXL_URL between test and prod (or between two clusters) would silently
serve stale data from cluster A as if from cluster B. The audit
report would be confidently wrong with no signal anything happened.
Fix: AxlCache now takes cluster_id and prefixes all keys with it.
Server bootstrap derives cluster_id as a 12-char SHA-256 prefix of
AXL_URL. cache_stats() surfaces both the current cluster_id and a
`foreign_cluster_entries` count so an env-swap is visible. Schema
migration handles pre-fix cache files via PRAGMA table_info introspection
plus a one-shot ALTER TABLE ADD COLUMN. 5 new tests covering isolation,
shared-id sharing, stats reporting, legacy DB upgrade, and per-cluster
clear() scoping.
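Cluster-scoped keys can be sketched as follows. The table layout and
the `ClusterCache` class are assumptions for illustration, and the
legacy-schema migration is left out; only the 12-character SHA-256
prefix and the foreign-entry count come from the description above.

```python
import hashlib
import json
import sqlite3


def cluster_id_for(axl_url):
    """12-character SHA-256 prefix of the AXL URL."""
    return hashlib.sha256(axl_url.encode("utf-8")).hexdigest()[:12]


class ClusterCache:
    def __init__(self, cluster_id, conn):
        self.cluster_id = cluster_id
        self.db = conn
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS responses "
            "(key TEXT PRIMARY KEY, cluster_id TEXT, body TEXT)"
        )

    def _key(self, method, args):
        # The cluster prefix keeps two clusters from sharing an entry.
        return f"{self.cluster_id}::{method}::{json.dumps(args, sort_keys=True)}"

    def put(self, method, args, body):
        self.db.execute(
            "INSERT OR REPLACE INTO responses VALUES (?, ?, ?)",
            (self._key(method, args), self.cluster_id, body),
        )

    def get(self, method, args):
        row = self.db.execute(
            "SELECT body FROM responses WHERE key = ?",
            (self._key(method, args),),
        ).fetchone()
        return row[0] if row else None

    def stats(self):
        (foreign,) = self.db.execute(
            "SELECT COUNT(*) FROM responses WHERE cluster_id != ?",
            (self.cluster_id,),
        ).fetchone()
        return {"cluster_id": self.cluster_id, "foreign_cluster_entries": foreign}
```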
MAJOR #3 — find_devices_using_css summary undercounted partial failures.
The function is per-category resilient (one failed query doesn't kill
the whole impact analysis), but the resilience never propagated up to
the response. total_returned and any_truncated only reflected SUCCESSFUL
categories. An LLM consuming "47 references" had no way to know 5
categories errored and the real number was likely much higher.
Fix: response now includes complete: bool, categories_with_errors: int,
and error_categories: [list]. The LLM/auditor sees the partial-failure
state and can decide whether to act on incomplete data. 5 new tests
using a FakeAxlClient stand-in to simulate per-category failures.
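The propagated partial-failure state can be sketched like this. The
`summarize` function and its `run_query` callable are hypothetical;
the three reported fields match the names listed in the fix.

```python
def summarize(categories, run_query):
    """Run each category's query; report failures instead of hiding them.

    `categories` maps a category name to its SQL; `run_query` is a
    hypothetical callable that returns a list of rows or raises.
    """
    results, failed = {}, []
    for name, sql in categories.items():
        try:
            results[name] = run_query(sql)
        except Exception:
            failed.append(name)  # one bad category must not kill the rest
    return {
        "results": results,
        "total_returned": sum(len(rows) for rows in results.values()),
        "complete": not failed,
        "categories_with_errors": len(failed),
        "error_categories": failed,
    }
```

A consumer that sees complete=False knows total_returned is a lower
bound rather than the real reference count.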
BONUS finding (uncovered by Major #3 fix): huntpilotqueue join used
the wrong column. Three CSS impact categories (huntpilot_max_wait_css,
huntpilot_no_agent_css, huntpilot_queue_full_css) were silently
erroring with "Column (fknumplan) not found" because huntpilotqueue
joins via fknumplan_pilot, not fknumplan. With the Major #3 fix in
place, this surfaced immediately as `complete: False, error_categories:
[3 huntpilot_*]` against the live cluster. Fixed inline; live re-run
now reports `complete: True, total_returned: 163` for Internal-CSS.
87 unit tests passing (up from 70). Live cluster smoke test
(cucm-pub.binghammemorial.org, CUCM 15.0.1.12900-234) verifies all
three fixes plus the bonus finding work end-to-end.
Two defects found during live-cluster audit shakedown.
1. SQL validator false-positives on string literals
The forbidden-keyword check tokenized the entire query, including
contents of single-quoted string literals. CSS names like
'Call Forward-CSS', DN descriptions containing 'DELETE', or partition
names with 'INSERT' all tripped the validator even though the SQL
itself was clean read-only. Found while running impact analysis on
"Call Forward-CSS".
Fix: strip string literals (single-quoted, with '' as escape) into
whitespace before the forbidden-keyword tokenization. The cleaned
query returned to the caller still contains the literals — they're
only invisible to the analysis pass.
7 new tests covering: words inside literals (Call/Drop/Delete/etc.),
escaped quotes, multiple literals, and the critical case where a
forbidden keyword appears immediately after a literal.
2. CSS impact analysis missed primary device CSS + 7 other refs
Running route_devices_using_css("E911CSS") returned total=0 even
though E911CSS is configured in the cluster. Root cause: our
enumeration covered device.fkcallingsearchspace_{reroute,restrict,
refer,rdntransform} but not the primary device.fkcallingsearchspace
itself — the column the GUI sets when assigning a CSS to a phone.
The simple unsuffixed name didn't match our earlier "%css%" schema
filter (the actual column spells out "callingsearchspace").
Added 8 new reference categories:
device_primary_css — the big one
device_cgpn_unknown_css — calling-party-unknown
line_monitoring_css — devicenumplanmap monitoring CSS
gateway_h323_called_xform_css — H.323 gateway transform
gateway_sip_called_xform_css — SIP trunk transform
huntpilot_max_wait_css — hunt pilot queue handling
huntpilot_no_agent_css — hunt pilot queue handling
huntpilot_queue_full_css — hunt pilot queue handling
Re-running on live cluster:
Internal-CSS: 146 -> 163 refs (16 new device_primary_css matches)
Call Forward-CSS: previously rejected by validator -> 150 refs
E911CSS: still 0 — high-confidence orphan finding now
Two MCP tools blew the per-response token cap when run against a real
medium-sized cluster (Bingham Memorial, ~1500 patterns in Internal-PT,
20 route filters with hundreds of member rules each):
route_devices_using_css("Internal-CSS") -> 103,590 chars
route_filters() -> 304,639 chars
Both responses are now compact-by-default with opt-in detail:
route_filters(include_members=False, default):
- returns name, clause, dial_plan, and member_count per filter
- 304,639 -> 17,354 chars (94% reduction)
- member_count is the audit-relevant signal anyway: filters with
100+ rules are complex; the count tells you that without paying
for the full rule listing
- include_members=True scopes detail to a single named filter
(BLK-ALWAYS-RF with 432 rules: 40K chars; tractable per-filter)
route_devices_using_css(max_per_category=50, default):
- each category returns at most max_per_category rows
- truncated: bool flag set when underlying count exceeds the cap
- 103,590 -> 13,855 chars (87% reduction)
- implementation uses SELECT FIRST max+1, so no extra COUNT query
per category — single round-trip with accurate truncation flag
- LLM can drill in via higher max_per_category or axl_sql when
truncated=true
Both changes are backward-compatible defaults; existing callers continue
to work and just get smaller, structured responses.
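The SELECT FIRST max+1 trick can be sketched in two small helpers:
fetch one row more than the cap, then use the presence of that extra
row as the truncation signal. Both function names are hypothetical.

```python
def capped_query(base_select, max_rows):
    """Rewrite 'SELECT ...' as 'SELECT FIRST <max_rows + 1> ...'."""
    prefix = "SELECT "
    if not base_select.upper().startswith(prefix):
        raise ValueError("expected a query starting with SELECT")
    return f"SELECT FIRST {max_rows + 1} " + base_select[len(prefix):]


def apply_cap(rows, max_rows):
    """Trim to the cap; the extra row, if present, proves truncation."""
    return {"rows": rows[:max_rows], "truncated": len(rows) > max_rows}
```

Because the extra row is fetched in the same query, no separate COUNT
round-trip is needed to set the flag accurately.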
- route_plan.py: drop `NULL AS context` from voicemail_pilot_css query.
Informix rejected it as a syntax error; the column wasn't carrying any
signal anyway, so the simpler SELECT works and matches the other
reference-point queries.
- README.md: tool table now covers all 16 tools (route_device_pool_route_groups,
route_devices_using_css, route_filters were missing).
- .gitignore: explicitly ignore .env. Already covered by ~/.gitignore_global,
but worth being self-contained — anyone cloning without the global ignore
shouldn't be one stray `git add` away from leaking AXL credentials.
Read-only MCP server for Cisco Unified CM 15 AXL — built for LLM-driven
cluster auditing, with a particular focus on the Route Plan Report:
partitions, calling search spaces, route patterns, translation patterns,
called/calling party transformations, and digit-discard instructions.
Pairs intentionally with the sibling mcp-cisco-docs server (live
cluster state + vendor docs in one LLM context).
Architecture:
- zeep SOAP client to CUCM AXL
- WSDL bootstrap from Cisco's axlsqltoolkit.zip (auto-extract on
first launch; zip is gitignored, vendor-licensed)
- SQLite response cache at ~/.cache/mcp-cucm-axl/responses/
- Schema-grounded prompts that pull chunks from the sibling
cisco-docs index (docs_loader.py)
Read-only by structural guarantee — never registers AXL write methods
(no executeSQLUpdate, no add*/update*/remove*/apply*/reset*/restart*
tools). SQL queries are also validated client-side (sql_validator.py)
and must begin with SELECT or WITH.
Tools exposed:
Foundational: axl_version, axl_sql, axl_list_tables,
axl_describe_table, cache_stats, cache_clear
Route plan: route_partitions, route_calling_search_spaces,
route_patterns, route_inspect_pattern,
route_lists_and_groups, route_translation_chain,
route_digit_discard_instructions
Prompts (schema-grounded):
route_plan_overview, investigate_pattern, audit_routing,
cucm_sql_help
Tests cover cache, docs_loader, normalize, sql_validator, wildcard.