Closes the four remaining findings from the margaret-hamilton review.
13 new regression tests; all 100 pass; live cluster smoke verified.
MAJOR #4 — wildcard regex catastrophic backtracking + silent malformed.
Two changes to _wildcard_to_regex():
a) Bounded the `!` and `@` wildcards to \d{1,50} (was \d+). Adjacent
`!` patterns previously compiled to (\d+)(\d+)..., which backtracks
combinatorially on near-miss inputs: the engine tries every way of
splitting the digits across the groups. CUCM dial strings
are practically capped well below 50 digits; the bound caps the
worst case without losing real-world coverage.
Verified: 10 adjacent `!` against a 30-digit near-miss now finishes
in ~240ms (was unbounded; could have been minutes on real
pathological cases).
b) Unclosed `[` now raises ValueError instead of silently treating the
bracket as a literal. _pattern_matches_number catches the error
and returns False, so a single bad pattern doesn't crash
translation_chain, but the bad pattern no longer silently
produces wrong matches. The previous silent fallback meant a
pattern like `[0-9` (typo, missing `]`) would match input
containing the literal characters `[` `0` `-` `9`.
3 new tests covering: bounded-regex shape (`\d{1,N}`), pathological
input completes quickly, unclosed bracket raises explicitly,
well-formed character class still works.
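A minimal sketch of the two changes, not the actual _wildcard_to_regex():
the names wildcard_to_regex, pattern_matches_number and MAX_DIGITS are
hypothetical, and only `X`, `!`, `@` and `[...]` are handled.

```python
import re

MAX_DIGITS = 50  # assumed constant mirroring the \d{1,50} bound described above


def wildcard_to_regex(pattern: str) -> str:
    """Translate a CUCM-style wildcard pattern into an anchored regex string."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "X":
            out.append(r"\d")  # X matches exactly one digit
        elif ch in "!@":
            out.append(r"\d{1,%d}" % MAX_DIGITS)  # bounded repetition, was \d+
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                # Fail loudly instead of treating the bracket as a literal.
                raise ValueError(f"unclosed '[' in pattern {pattern!r}")
            out.append(pattern[i:end + 1])  # pass the character class through
            i = end
        else:
            out.append(re.escape(ch))
        i += 1
    return "^" + "".join(out) + "$"


def pattern_matches_number(pattern: str, number: str) -> bool:
    """Return False for malformed patterns so one bad entry cannot crash a scan."""
    try:
        regex = wildcard_to_regex(pattern)
    except ValueError:
        return False
    return re.match(regex, number) is not None
```

With this shape, `[0-9` is reported as malformed at the conversion layer,
while callers that only need a yes/no answer still get False.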
MAJOR #5 — distinguish config errors from operational errors.
Pre-fix: any first-time connection failure set `_connection_error`
and pinned it forever. A transient network blip or session timeout
required restarting the MCP server. Hamilton's framing: Apollo's
software was *designed* to recover from transient faults; pinning
forever is the antithesis of "design the error path first."
Fix: split into two state fields:
_config_error — permanent until restart (missing env vars only)
_last_error — last operational failure, NOT a pin
Operational failures (zeep Client construction, network, TLS, session)
are recorded but not pinned: the next call attempts a fresh connection.
Configuration errors (missing AXL_URL etc.) stay pinned because
they don't get better on retry.
Added _ConfigError as a private subclass to make the distinction
explicit at the raise site, and connection_status() to expose
connected/connected_at/config_error/last_error for diagnostic
transparency.
3 new tests: config errors pin, operational errors don't pin,
connection_status() reports state.
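A sketch of the pin/no-pin split under stated assumptions: the
Connection class, the injected connect callable, and the env mapping are
hypothetical stand-ins, and ConfigError is assumed to subclass
RuntimeError (the message above does not name the base class).

```python
import os
import time


class ConfigError(RuntimeError):
    """Hypothetical stand-in for the private _ConfigError subclass."""


class Connection:
    def __init__(self, connect, env=None):
        self._connect = connect  # callable that builds the client; may raise
        self._env = os.environ if env is None else env
        self._client = None
        self._connected_at = None
        self._config_error = None  # pinned until restart
        self._last_error = None    # informational only, never blocks a retry

    def client(self):
        if self._config_error is not None:
            raise ConfigError(self._config_error)
        if self._client is not None:
            return self._client
        if not self._env.get("AXL_URL"):
            # Missing configuration will not improve on retry: pin it.
            self._config_error = "missing AXL_URL"
            raise ConfigError(self._config_error)
        try:
            self._client = self._connect(self._env["AXL_URL"])
        except Exception as exc:
            # Record the failure but leave the door open for the next call.
            self._last_error = str(exc)
            raise RuntimeError(f"AXL connection failed: {exc}") from exc
        self._connected_at = time.time()
        return self._client

    def connection_status(self):
        return {
            "connected": self._client is not None,
            "connected_at": self._connected_at,
            "config_error": self._config_error,
            "last_error": self._last_error,
        }
```

Keeping `_last_error` after a later success is a choice made for this
sketch; it preserves a trace of the most recent blip for diagnostics.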
MINOR #6 — _to_int silent coercion of bad data.
Pre-fix: a non-numeric value from the cluster (data corruption,
schema drift across CUCM versions) silently became None, which
downstream sort logic defaulted to 0 — jumbling the failover order
in the displayed result with no warning.
Fix: still returns None on bad data (the caller's error path is unchanged),
but logs the offending value to stderr so an operator notices
that something is wrong at the data layer. A genuine None input stays
silent, since it is a legitimately unset column.
2 new tests: real None is silent, bad string logs to stderr with
the offending value visible.
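The behavior described above could look roughly like this; to_int and
its field parameter are illustrative names, not the real _to_int
signature.

```python
import sys


def to_int(value, field="?"):
    """Return int(value), or None when value is unset or not numeric."""
    if value is None:
        return None  # legitimately unset column: stay silent
    try:
        return int(value)
    except (TypeError, ValueError):
        # Same None result as before, but the bad value is now visible.
        print(f"to_int: non-numeric value {value!r} for {field}", file=sys.stderr)
        return None
```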
MINOR #7 — standardize tool failure shapes; add health() tool.
Pre-fix: cache_stats and cache_clear returned `{"error": "..."}`
when _cache was None, while AXL-touching tools raised RuntimeError.
LLM consumers had to handle two shapes.
Fix: _require_cache() helper raises RuntimeError, consistent with
_client(). All tool failures now use the same exception shape.
Added health() tool that reports cache/axl/docs initialization
status plus the AXL connection_status(), giving operators a
self-diagnostic when something fails at bootstrap.
3 new tests: cache_stats raises, cache_clear raises, health()
reports each subsystem.
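A sketch of the unified failure shape, assuming a module-level _cache
that is None until bootstrap succeeds. The dict-like cache, the
health() parameters, and the returned keys are hypothetical.

```python
_cache = None  # set at bootstrap; None means the cache never initialized


def require_cache():
    """Return the cache or raise, matching how AXL-touching tools fail."""
    if _cache is None:
        raise RuntimeError("response cache is not initialized")
    return _cache


def cache_stats():
    return {"entries": len(require_cache())}


def cache_clear():
    require_cache().clear()
    return {"cleared": True}


def health(axl=None, docs=None):
    """Report per-subsystem initialization without raising."""
    return {
        "cache": {"initialized": _cache is not None},
        "axl": {
            "initialized": axl is not None,
            # axl is assumed to expose connection_status() as described above
            "connection": axl.connection_status() if axl is not None else None,
        },
        "docs": {"initialized": docs is not None},
    }
```

health() deliberately never raises, so it stays usable when the other
tools fail.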

Read-only MCP server for Cisco Unified CM 15 AXL, built for LLM-driven
cluster auditing, with a particular focus on the Route Plan Report:
partitions, calling search spaces, route patterns, translation patterns,
called/calling party transformations, and digit-discard instructions.
Pairs intentionally with the sibling mcp-cisco-docs server (live
cluster state + vendor docs in one LLM context).
Architecture:
- zeep SOAP client to CUCM AXL
- WSDL bootstrap from Cisco's axlsqltoolkit.zip (auto-extract on
first launch; zip is gitignored, vendor-licensed)
- SQLite response cache at ~/.cache/mcp-cucm-axl/responses/
- Schema-grounded prompts that pull chunks from the sibling
cisco-docs index (docs_loader.py)
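For illustration only, an SQLite-backed response cache might be shaped
like this. The ResponseCache class, its table layout, the hashed key,
and the TTL are all assumptions, not the project's actual cache.

```python
import hashlib
import json
import sqlite3
import time


class ResponseCache:
    def __init__(self, path=":memory:", ttl_seconds=300):
        self._db = sqlite3.connect(path)
        self._ttl = ttl_seconds  # assumed expiry policy
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS responses ("
            "key TEXT PRIMARY KEY, body TEXT NOT NULL, stored_at REAL NOT NULL)"
        )

    @staticmethod
    def _key(method, params):
        # Stable key: the same method and params always hash the same way.
        raw = json.dumps([method, params], sort_keys=True)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, method, params):
        row = self._db.execute(
            "SELECT body, stored_at FROM responses WHERE key = ?",
            (self._key(method, params),),
        ).fetchone()
        if row is None or time.time() - row[1] > self._ttl:
            return None  # miss or expired
        return json.loads(row[0])

    def put(self, method, params, body):
        self._db.execute(
            "INSERT OR REPLACE INTO responses VALUES (?, ?, ?)",
            (self._key(method, params), json.dumps(body), time.time()),
        )
        self._db.commit()
```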
Read-only by structural guarantee: never registers AXL write methods
(no executeSQLUpdate, no add*/update*/remove*/apply*/reset*/restart*
tools). SQL queries are also validated client-side (sql_validator.py) to
begin with SELECT or WITH.
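The prefix check can be sketched as follows. validate_read_only is a
hypothetical name, and the real sql_validator.py may check more than the
leading keyword.

```python
import re

# Leading whitespace is allowed; \b rejects identifiers such as SELECTED.
_READ_ONLY_START = re.compile(r"^\s*(SELECT|WITH)\b", re.IGNORECASE)


def validate_read_only(sql: str) -> str:
    """Return sql unchanged, or raise if it does not start with SELECT/WITH."""
    if not _READ_ONLY_START.match(sql):
        raise ValueError("only SELECT or WITH queries are allowed")
    return sql
```

A prefix check alone is a second line of defense. The primary guarantee
remains that no write method is registered at all.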
Tools exposed:
Foundational: axl_version, axl_sql, axl_list_tables,
axl_describe_table, cache_stats, cache_clear
Route plan: route_partitions, route_calling_search_spaces,
route_patterns, route_inspect_pattern,
route_lists_and_groups, route_translation_chain,
route_digit_discard_instructions
Prompts (schema-grounded):
route_plan_overview, investigate_pattern, audit_routing,
cucm_sql_help
Tests cover cache, docs_loader, normalize, sql_validator, wildcard.