mcaxl

5 Commits 1 Branch 0 Tags

Author	SHA1	Message	Date
Ryan Malloy	dee5fdacda	Hamilton review fixes: validator literal preservation, cache cluster id, CSS impact partial-failure reporting Three findings from a Margaret Hamilton-style review of the MCP server, fixed with regression tests written first (red → green). One bonus finding (huntpilotqueue column name) was surfaced by the third fix itself, exactly the audit-trust failure mode that fix exists to expose. CRITICAL #1 - sql_validator: comment-strip mutated string literals. The cleaned query returned by validate_select() is what travels to AXL. Previously, the comment-strip pass ran before the literal-aware pass, so `--` or `/* */` markers inside a string literal were silently eaten. Input: `WHERE description = 'Smith -- old line'`; sent to AXL: `WHERE description = 'Smith` (truncated mid-literal). The LLM saw rows that looked plausible but were not what its query asked for. "Confidently wrong" is exactly the failure mode the review was hunting. Fix: only strip comments on the analysis-only copy used for keyword detection. The cleaned output preserves the input verbatim (modulo trailing semicolon and outer whitespace). 6 new tests covering literal preservation across `--`, `/* */`, LIKE patterns with embedded comment markers, and forbidden keywords inside real comments. CRITICAL #2 - cache key omitted cluster identity. The on-disk cache key was `method::args_json`. An operator swapping AXL_URL between test and prod (or between two clusters) would silently serve stale data from cluster A as if from cluster B. The audit report would be confidently wrong with no signal anything happened. Fix: AxlCache now takes cluster_id and prefixes all keys with it. Server bootstrap derives cluster_id as a 12-char SHA-256 prefix of AXL_URL. cache_stats() surfaces both the current cluster_id and a `foreign_cluster_entries` count so an env-swap is visible. Schema migration handles pre-fix cache files via PRAGMA table_info introspection plus a one-shot ALTER TABLE ADD COLUMN.
5 new tests covering isolation, shared-id sharing, stats reporting, legacy DB upgrade, and per-cluster clear() scoping. MAJOR #3 - find_devices_using_css summary undercounted partial failures. The function is per-category resilient (one failed query doesn't kill the whole impact analysis), but the resilience never propagated up to the response. total_returned and any_truncated only reflected SUCCESSFUL categories. An LLM consuming "47 references" had no way to know 5 categories errored and the real number was likely much higher. Fix: response now includes complete: bool, categories_with_errors: int, and error_categories: [list]. The LLM/auditor sees the partial-failure state and can decide whether to act on incomplete data. 5 new tests using a FakeAxlClient stand-in to simulate per-category failures. BONUS finding (uncovered by Major #3 fix): huntpilotqueue join used the wrong column. Three CSS impact categories (huntpilot_max_wait_css, huntpilot_no_agent_css, huntpilot_queue_full_css) were silently erroring with "Column (fknumplan) not found" because huntpilotqueue joins via fknumplan_pilot, not fknumplan. With the Major #3 fix in place, this surfaced immediately as `complete: False, error_categories: [3 huntpilot_*]` against the live cluster. Fixed inline; live re-run now reports `complete: True, total_returned: 163` for Internal-CSS. 87 unit tests passing (up from 70). Live cluster smoke test (cucm-pub.binghammemorial.org, CUCM 15.0.1.12900-234) verifies all three fixes plus the bonus finding work end-to-end.	2026-04-25 23:09:55 -06:00
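
The partial-failure reporting described under MAJOR #3 can be sketched roughly as follows. This is a minimal illustration, not the project's code: `summarize_categories` and the callable-per-category shape are hypothetical, and only the response field names (`complete`, `categories_with_errors`, `error_categories`, `total_returned`) come from the commit message.

```python
from typing import Any, Callable

Row = dict[str, Any]


def summarize_categories(queries: dict[str, Callable[[], list[Row]]]) -> dict[str, Any]:
    """Run every category query, keep going on failure, and report what failed.

    Hypothetical helper: each value in `queries` is a zero-argument callable
    that returns rows or raises.
    """
    categories: dict[str, dict[str, Any]] = {}
    error_categories: list[str] = []

    for name, run_query in queries.items():
        try:
            rows = run_query()
        except Exception as exc:  # per-category resilience: one failure is not fatal
            categories[name] = {"rows": [], "error": str(exc)}
            error_categories.append(name)
            continue
        categories[name] = {"rows": rows}

    return {
        "categories": categories,
        # Counts successful categories only, so it must be read next to `complete`.
        "total_returned": sum(len(c["rows"]) for c in categories.values()),
        "complete": not error_categories,
        "categories_with_errors": len(error_categories),
        "error_categories": error_categories,
    }


def _failing_query() -> list[Row]:
    # Simulates the error text quoted in the commit message.
    raise RuntimeError("Column (fknumplan) not found")


summary = summarize_categories({
    "device_primary_css": lambda: [{"name": "SEP001"}, {"name": "SEP002"}],
    "huntpilot_max_wait_css": _failing_query,
})
```

In this sketch a consumer that sees `complete` set to `False` knows `total_returned` is a lower bound rather than the real reference count.
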
Ryan Malloy	82d8fbe563	SQL validator: ignore string literals; CSS impact: add primary + 7 more Two defects found during live-cluster audit shakedown. 1. SQL validator false-positives on string literals. The forbidden-keyword check tokenized the entire query, including contents of single-quoted string literals. CSS names like 'Call Forward-CSS', DN descriptions containing 'DELETE', or partition names with 'INSERT' all tripped the validator even though the SQL itself was clean read-only. Found while running impact analysis on "Call Forward-CSS". Fix: strip string literals (single-quoted, with '' as escape) into whitespace before the forbidden-keyword tokenization. The cleaned query returned to the caller still contains the literals; they're only invisible to the analysis pass. 7 new tests covering: words inside literals (Call/Drop/Delete/etc.), escaped quotes, multiple literals, and the critical case where a forbidden keyword appears immediately after a literal. 2. CSS impact analysis missed primary device CSS + 7 other refs. Running route_devices_using_css("E911CSS") returned total=0 even though E911CSS is configured in the cluster. Root cause: our enumeration covered device.fkcallingsearchspace_{reroute,restrict,refer,rdntransform} but not the primary device.fkcallingsearchspace itself, the column the GUI sets when assigning a CSS to a phone. The simple unsuffixed name didn't match our earlier "%css%" schema filter (the actual column spells out "callingsearchspace").
Added 8 new reference categories: device_primary_css (the big one); device_cgpn_unknown_css (calling-party-unknown); line_monitoring_css (devicenumplanmap monitoring CSS); gateway_h323_called_xform_css (H.323 gateway transform); gateway_sip_called_xform_css (SIP trunk transform); huntpilot_max_wait_css, huntpilot_no_agent_css, and huntpilot_queue_full_css (hunt pilot queue handling). Re-running on live cluster: Internal-CSS: 146 -> 163 refs (16 new device_primary_css matches); Call Forward-CSS: previously rejected by validator -> 150 refs; E911CSS: still 0, now a high-confidence orphan finding.	2026-04-25 20:50:57 -06:00
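
The two validator fixes (literals ignored during keyword detection, and comments stripped only on the analysis copy) can be illustrated with a small sketch. It is an assumption-laden approximation, not the contents of sql_validator.py: the function names, the `FORBIDDEN` set, and the single-pass scanner are all hypothetical, and it assumes keywords inside comments are simply ignored.

```python
import re

# Hypothetical keyword list; the real validator's list is not shown in the log.
FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE", "TRUNCATE"}


def _analysis_copy(sql: str) -> str:
    """Return a same-length copy with literals and comments blanked to spaces."""
    out: list[str] = []
    i, n = 0, len(sql)
    while i < n:
        if sql[i] == "'":
            # Single-quoted literal; '' inside it is an escaped quote.
            j = i + 1
            while j < n:
                if sql[j] == "'":
                    if j + 1 < n and sql[j + 1] == "'":
                        j += 2
                        continue
                    break
                j += 1
            end = min(j + 1, n)
        elif sql.startswith("--", i):
            # Line comment outside any literal runs to end of line.
            j = sql.find("\n", i)
            end = n if j == -1 else j
        elif sql.startswith("/*", i):
            # Block comment outside any literal runs to the closing marker.
            j = sql.find("*/", i + 2)
            end = n if j == -1 else j + 2
        else:
            out.append(sql[i])
            i += 1
            continue
        out.append(" " * (end - i))
        i = end
    return "".join(out)


def validate_select_sketch(sql: str) -> str:
    """Check keywords on the blanked copy; return the query text unmodified
    apart from outer whitespace and trailing semicolons."""
    cleaned = sql.strip().rstrip(";").strip()
    tokens = re.findall(r"[A-Za-z_]+", _analysis_copy(cleaned).upper())
    if not tokens or tokens[0] not in {"SELECT", "WITH"}:
        raise ValueError("query must begin with SELECT or WITH")
    bad = FORBIDDEN.intersection(tokens)
    if bad:
        raise ValueError(f"forbidden keyword(s): {sorted(bad)}")
    return cleaned


query = "SELECT name FROM numplan WHERE description = 'Smith -- old line';"
result = validate_select_sketch(query)
```

Because literals are scanned before comment markers are considered, a `--` inside a quoted string never starts a comment, and the returned text still carries the literal intact.
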
Ryan Malloy	e3fb10cb4b	Cap response size on route_filters and route_devices_using_css Two MCP tools blew the per-response token cap when run against a real medium-sized cluster (Bingham Memorial, ~1500 patterns in Internal-PT, 20 route filters with hundreds of member rules each): route_devices_using_css("Internal-CSS") -> 103,590 chars route_filters() -> 304,639 chars Both responses are now compact-by-default with opt-in detail: route_filters(include_members=False, default): - returns name, clause, dial_plan, and member_count per filter - 304,639 -> 17,354 chars (94% reduction) - member_count is the audit-relevant signal anyway: filters with 100+ rules are complex; the count tells you that without paying for the full rule listing - include_members=True scopes detail to a single named filter (BLK-ALWAYS-RF with 432 rules: 40K chars; tractable per-filter) route_devices_using_css(max_per_category=50, default): - each category returns at most max_per_category rows - truncated: bool flag set when underlying count exceeds the cap - 103,590 -> 13,855 chars (87% reduction) - implementation uses SELECT FIRST max+1, so no extra COUNT query per category — single round-trip with accurate truncation flag - LLM can drill in via higher max_per_category or axl_sql when truncated=true Both changes are backward-compatible defaults; existing callers continue to work and just get smaller, structured responses.	2026-04-25 20:43:13 -06:00
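
The "SELECT FIRST max+1" truncation trick from the commit above might look like the following sketch. `fetch_capped`, `fake_run_sql`, and the sample table are hypothetical stand-ins; only the idea of requesting one row more than the cap, so that no separate COUNT query is needed, comes from the commit message.

```python
import re
from typing import Any, Callable

Row = dict[str, Any]


def fetch_capped(
    run_sql: Callable[[str], list[Row]],
    select_body: str,
    max_rows: int = 50,
) -> dict[str, Any]:
    """Fetch at most `max_rows` rows and flag truncation in one round-trip.

    `select_body` is everything after SELECT (hypothetical convention), so the
    Informix FIRST clause can be placed directly after the keyword.
    """
    max_rows = int(max_rows)
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    # Ask for one extra row: if it comes back, more rows exist than the cap.
    rows = run_sql(f"SELECT FIRST {max_rows + 1} {select_body}")
    return {"rows": rows[:max_rows], "truncated": len(rows) > max_rows}


def fake_run_sql(sql: str) -> list[Row]:
    # Stand-in for an AXL client: a five-row table that honours FIRST n.
    table = [{"name": f"SEP{i:03d}"} for i in range(5)]
    limit = int(re.search(r"FIRST (\d+)", sql).group(1))
    return table[:limit]


capped = fetch_capped(fake_run_sql, "name FROM device", max_rows=3)
exact = fetch_capped(fake_run_sql, "name FROM device", max_rows=5)
```

Here `capped` is truncated because a fourth row came back, while `exact` is not, since asking for six rows returned only the five that exist.
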
Ryan Malloy	9340e7385a	Post-initial polish: voicemail SQL fix, README, .env in local ignore - route_plan.py: drop `NULL AS context` from voicemail_pilot_css query. Informix rejected it as a syntax error; the column wasn't carrying any signal anyway, so the simpler SELECT works and matches the other reference-point queries. - README.md: tool table now covers all 16 tools (route_device_pool_route_groups, route_devices_using_css, route_filters were missing). - .gitignore: explicitly ignore .env. Already covered by ~/.gitignore_global, but worth being self-contained — anyone cloning without the global ignore shouldn't be one stray `git add` away from leaking AXL credentials.	2026-04-25 20:34:57 -06:00
Ryan Malloy	8b3da9d729	Initial mcp-cucm-axl Read-only MCP server for Cisco Unified CM 15 AXL — built for LLM-driven cluster auditing, with a particular focus on the Route Plan Report: partitions, calling search spaces, route patterns, translation patterns, called/calling party transformations, and digit-discard instructions. Pairs intentionally with the sibling mcp-cisco-docs server (live cluster state + vendor docs in one LLM context). Architecture: - zeep SOAP client to CUCM AXL - WSDL bootstrap from Cisco's axlsqltoolkit.zip (auto-extract on first launch; zip is gitignored, vendor-licensed) - SQLite response cache at ~/.cache/mcp-cucm-axl/responses/ - Schema-grounded prompts that pull chunks from the sibling cisco-docs index (docs_loader.py) Read-only by structural guarantee — never registers AXL write methods (no executeSQLUpdate, no add/update/remove/apply/reset/restart tools). SQL queries also client-side validated (sql_validator.py) to begin with SELECT or WITH. Tools exposed: Foundational: axl_version, axl_sql, axl_list_tables, axl_describe_table, cache_stats, cache_clear Route plan: route_partitions, route_calling_search_spaces, route_patterns, route_inspect_pattern, route_lists_and_groups, route_translation_chain, route_digit_discard_instructions Prompts (schema-grounded): route_plan_overview, investigate_pattern, audit_routing, cucm_sql_help Tests cover cache, docs_loader, normalize, sql_validator, wildcard.	2026-04-25 20:29:18 -06:00