MCP/mcaxl

Author	SHA1	Message	Date
Ryan Malloy	1b92f83dc4	route_plan: fix Python-comment-in-f-string regression in translation_chain

Live re-run after the c995bc2 Device-DN fix returned "A syntax error has occurred" from Informix.

Cause: the explanatory comment block documenting why tkpatternusage=2 was added landed inside the translation_chain f-string instead of above it. Python's `#` line comments only work outside string literals; inside an f-string, each `#` is literal text. Informix received SQL with `#` lines after the JOIN clauses, parsed them as illegal tokens, and rejected.

Offline tests didn't catch this because the FakeAxlClient dispatches on substring matches without actually parsing the SQL. The bug only manifested when the SQL hit a real Informix engine.

Fix: move the comment block ABOVE the `sql = f"""` assignment so it becomes a real Python comment instead of literal SQL text.

Sentinel test added (test_no_python_comment_chars_leak_into_sql): captures all SQL emitted by a cti_failsafe_reachability call and asserts no `#` character appears anywhere. CUCM's data dictionary doesn't use `#` in any table or column name, and Informix uses `--` / `/* */` for comments, so a `#` in any captured query is almost certainly an escaped Python comment. Catches this exact class of regression for any future contributor.

Tests: cti suite 33 → 34; full mcaxl suite 271 → 272 passing.

Operational impact: this regression made cti_failsafe_reachability unrunnable against any live cluster between c995bc2 and this fix. The Bingham 6→4 finding-reduction verification queued in 008 is unblocked once the MCP server reloads with this commit.	2026-05-09 05:09:20 -06:00
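The regression described in 1b92f83dc4 can be reproduced in a few lines. The sketch below is illustrative only: the builder functions, the helper `leaked_comment_lines`, and the exact SQL text are assumptions made for this example, not the code in route_plan.py. It shows that a `#` line inside an f-string reaches the database as literal text, and how a sentinel check in the spirit of test_no_python_comment_chars_leak_into_sql could detect it.

```python
def build_query_buggy(partition: str) -> str:
    """Hypothetical builder with the explanatory note INSIDE the f-string."""
    return f"""
        SELECT np.dnorpattern
        FROM numplan np
        JOIN routepartition rp ON np.fkroutepartition = rp.pkid
        # tkpatternusage=2 added so Device DNs are candidates
        WHERE rp.name = '{partition}'
          AND np.tkpatternusage IN (2, 3, 5, 7)
    """


def build_query_fixed(partition: str) -> str:
    """Same hypothetical builder with the note moved above the string."""
    # tkpatternusage=2 added so Device DNs are candidates
    return f"""
        SELECT np.dnorpattern
        FROM numplan np
        JOIN routepartition rp ON np.fkroutepartition = rp.pkid
        WHERE rp.name = '{partition}'
          AND np.tkpatternusage IN (2, 3, 5, 7)
    """


def leaked_comment_lines(sql: str) -> list[str]:
    """Return every line of the emitted SQL that contains a '#' character."""
    return [line.strip() for line in sql.splitlines() if "#" in line]


if __name__ == "__main__":
    print(leaked_comment_lines(build_query_buggy("911CER-PT")))
    print(leaked_comment_lines(build_query_fixed("911CER-PT")))
```

The check only inspects the emitted string, so it works with a fake client that never parses SQL. The example interpolates the partition name directly for brevity; that is a simplification of this sketch, not a recommendation.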
Ryan Malloy	c995bc2712	route_plan: translation_chain includes Device DNs (cti-audit-prompts/007)

cucx-docs's 007 empirically proved that route_translation_chain's candidate filter `WHERE np.tkpatternusage IN (3, 5, 7)` excluded Device DNs (tkpatternusage=2), which caused false-positive HIGH findings on CTI-RP-to-CTI-RP failsafe chains, the typical CER deployment shape.

The Bingham canary: 911-CTI-RP CFNA → 912 (DN of 912-CTI-RP) under 911CER-CSS. Direct numplan query against 911CER-PT returns 26 rows; translation_chain reported `candidates_evaluated: 23`. The 3-row gap is exactly the 3 Device DNs, excluded by the pre-fix filter regardless of input number.

CUCM's runtime CFNA matcher includes Device DNs (otherwise no one could dial 912 and reach the device). My tool's exclusion diverged from production routing semantics. Result: every cluster using a CTI-RP-to-CTI-RP failsafe pattern got at least one false-positive HIGH finding on its first cti_failsafe_reachability run, wasting operator investigation time on a phantom defect.

This commit broadens the candidate filter:

  - WHERE np.tkpatternusage IN (3, 5, 7)
  + WHERE np.tkpatternusage IN (2, 3, 5, 7)
                                ^ Device DN

Side effect: route_translation_chain now also surfaces Device DNs as matches when called directly, which matches production routing semantics. Existing callers benefit automatically. The _note in the response now names the candidate set explicitly so future readers don't have to dig into the SQL to know what's included.

Updated comment block above the WHERE clause documents:
- which tkpatternusage values are included and why
- the empirical observation that motivated including Device DNs
- cross-reference to cti-audit-prompts/007 for the smoking-gun candidates_evaluated:23-vs-26 evidence

Tests: +2 in TestDeviceDnInTranslationChainCandidates:
- test_translation_chain_sql_includes_device_dn_usage: lock the SQL down so a future contributor can't re-narrow the filter to (3, 5, 7) and re-introduce the false-positive class
- test_cti_rp_to_cti_rp_failsafe_does_not_false_positive: the Bingham canary scenario, where 911-CTI-RP forwarding to a Device DN in a reachable partition correctly produces zero findings

The dispatch fake's SQL match-string updated from "(3, 5, 7)" to "(2, 3, 5, 7)" to keep the existing 31 cti tests green; net mcaxl suite: 269 → 271 passing.

Live re-run pending; will ping the agent thread with post-patch output once the MCP server reloads.

Re-run expectations (per cucx-docs's 007):
- 911-CTI-RP / 912 finding (CFNA + CFUR): GONE. Device DN matches
- 912-CTI-RP / 10911 finding: UNCHANGED. Route pattern still unreachable (CER911-PT not in 911CER-CSS)
- 913-CTI-RP / 60003 finding: UNCHANGED. Destination doesn't exist anywhere

Findings: 6 → 4 (the 4 that actually matter).	2026-05-09 04:37:41 -06:00
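The effect of the widened filter in c995bc2712 can be shown on a toy data set. This is a minimal sketch, not the tool's SQL: the row tuples, the constant names, and `candidate_patterns` are hypothetical, and the sample patterns are invented for illustration. The only schema fact taken from the commit message is that tkpatternusage=2 marks a Device DN.

```python
# Candidate sets before and after the fix (values from the commit message).
PRE_FIX_USAGES = (3, 5, 7)
POST_FIX_USAGES = (2, 3, 5, 7)  # 2 = Device DN


def candidate_patterns(rows, usages):
    """Keep the patterns whose tkpatternusage is in the allowed set.

    rows: iterable of (dnorpattern, tkpatternusage) tuples.
    """
    return [pattern for pattern, usage in rows if usage in usages]


# Invented sample rows for one partition: one Device DN, two other patterns.
sample_rows = [
    ("912", 2),      # Device DN, the forward destination in the canary
    ("8XXX", 3),
    ("9.1XXX", 5),
]

if __name__ == "__main__":
    before = candidate_patterns(sample_rows, PRE_FIX_USAGES)
    after = candidate_patterns(sample_rows, POST_FIX_USAGES)
    print(len(before), "912" in before)
    print(len(after), "912" in after)
```

With the pre-fix set, the Device DN never becomes a candidate, so a forward to 912 looks unreachable no matter which number is tested. That is the same shape as the 23-versus-26 gap reported in the commit.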
Ryan Malloy	99986daa45	route_plan: cti_failsafe_reachability fix-suggestion handles dotted patterns

Limitation surfaced by the live Bingham smoke-test (cti-audit-prompts/004): the canonical 912-CTI-RP finding got the broken-forward flag correct, but the suggested-fix message couldn't name CER911-PT (where pattern '10.911' lives) because the exact-literal lookup `WHERE np.dnorpattern = '10911'` doesn't match the dot-form `10.911`.

The CUCM separator-dot in patterns is purely visual: it represents the access-code boundary, not a digit. A destination string `10911` should match a configured pattern `10.911` since both represent the same dialed digits.

Two-stage match in _suggest_failsafe_fix:
1. Exact-literal: WHERE np.dnorpattern = '<dest>' (current behavior)
2. Dot-stripped: pull all patterns with `.` in them, filter Python-side by `pattern.replace('.', '') == dest`

Stage 2 only runs when stage 1 returns no partitions, so the common case (exact-literal hit) takes the fast path. Falls back to the wildcard-investigation generic message only when neither stage finds a match.

The fix message also distinguishes the two cases:
- Exact-literal hit → "Pattern '10911' lives in partition X..."
- Dot-stripped hit → "Pattern '10.911' (matches destination '10911') lives in partition X..."

Naming both the pattern form and the destination keeps the operator oriented when the dialed digits and the configured pattern look different.

Tests: +5 in TestDotStrippedFixSuggestion exercising:
- dot-stripped match cites the dotted pattern form
- exact-literal takes precedence over dotted match
- multi-partition dotted match
- no-exact-no-dotted falls back to generic
- irrelevant dot-positions correctly excluded from match

One existing assertion updated from "no exact-literal pattern" to "no exact-literal or dot-stripped pattern" (more accurate after the patch).

Full mcaxl suite: 264 → 269 passing (+5 dot-stripped tests). The 1 unrelated test_wildcard.py timing flake is pre-existing (regex-backtracking timing assertion fails by 36ms under load).

Cross-references:
- Live smoke-test findings: agent-threads/cti-audit-prompts/004
- Original tool: agent-threads/cti-audit-prompts/002, commit d33cd7c	2026-05-09 03:36:51 -06:00
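The two-stage lookup from 99986daa45 can be sketched as a pure function over already-fetched rows. This is an assumption-laden illustration: `find_pattern_partitions` and its return shape are hypothetical, and the real _suggest_failsafe_fix issues SQL for each stage rather than filtering one in-memory list.

```python
def find_pattern_partitions(dest, patterns):
    """Locate the configured pattern(s) that correspond to a destination.

    dest: forward destination as dialed digits, e.g. "10911".
    patterns: iterable of (dnorpattern, partition_name) tuples.
    Returns (match_kind, matches) where match_kind is "exact",
    "dot-stripped", or "none".
    """
    rows = list(patterns)

    # Stage 1: exact-literal match, the fast path for the common case.
    exact = [(p, part) for p, part in rows if p == dest]
    if exact:
        return "exact", exact

    # Stage 2: only dotted patterns, compared with the separator removed.
    dotted = [
        (p, part)
        for p, part in rows
        if "." in p and p.replace(".", "") == dest
    ]
    if dotted:
        return "dot-stripped", dotted

    # Neither stage matched: the caller falls back to the generic message.
    return "none", []


if __name__ == "__main__":
    rows = [("10.911", "CER911-PT"), ("9.1911", "OTHER-PT")]
    print(find_pattern_partitions("10911", rows))
```

Returning the match kind alongside the rows lets the caller choose between the two message forms quoted in the commit, citing the dotted pattern only when stage 2 produced the hit.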
Ryan Malloy	d33cd7c809	route_plan: add cti_failsafe_reachability tool

Closes the bug class cucx-docs flagged at Bingham: a CTI Route Point's CFNA destination points at a number that is structurally unreachable from the configured CFNA-CSS, so the failsafe forward fires but finds no matching pattern and the call dies. Invisible from any single-record inspection (CTI RP record looks fine, destination pattern exists in some partition, CSS is fine; the defect lives in the relationship between CFNA-CSS and destination's partition).

The motivating Bingham finding (life-safety severity):

  912-CTI-RP (Secondary CER) CFNA + CFUR → "10911" via 911CER-CSS
  Pattern "10.911" exists in CER911-PT
  911CER-CSS does NOT contain CER911-PT
  → failsafe is structurally broken; both CER servers down would produce fast-busy on 911 calls instead of routing through ELIN-10 to the PSAP

Implementation per axl/agent-threads/cti-audit-prompts/002:
- Tool, not prompt: output is structured + deterministic; same shape as route_patterns_targeting (Q1 confirmed as proposed)
- Three-tier severity: HIGH for life-safety descriptions, MEDIUM for non-life-safety, no LOW (Q2 refined from cucx-docs's binary proposal; every broken forward is a real bug, just not all are 911)
- Scope: CFNA + CFUR only for v1; CFB excluded by design (Q3 confirmed; CTI RPs rarely go busy)
- Lives in route_plan.py alongside route_patterns_targeting + device_grep + translation_chain (Q5; defer cti.py namespace until adjacent prompts land)
- Named cti_failsafe_reachability not _audit (Q4; drops the _audit suffix per the established tool-vs-prompt naming split; tools use direct-action names, prompts use _audit)

Life-safety token list (case-insensitive substring match against name AND description):

  ("emergency", "911", "cer", "psap", "panic", "alert")

Suggested-fix message names the partition where the destination's pattern lives and proposes either "add partition X to CSS Y" or "change CSS to a CSS containing partition X." Falls back to a generic "manual investigation needed" message when the destination matches no exact-literal pattern in any partition (often means a wildcard pattern is the actual target).

Tests: 26 in TestLifeSafetyDetection + TestCtiFailsafeReachability:
- 16 token-matching cases (10 positive, 4 negative, 2 sentinel)
- 10 tool-level cases including the canonical Bingham bug reproduced verbatim (assertion compares the entire finding dict to the expected output from cucx-docs's 001 message)

Full mcaxl suite: 238 → 264 passing (+26 from this work).

Adjacent prompts cucx-docs flagged as lower-priority follow-ups (cti_route_point_audit, cti_port_pool_audit, cti_application_user_audit) deferred but tracked.	2026-05-09 03:28:49 -06:00
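The core check in d33cd7c809 reduces to a set intersection plus a token scan. The sketch below is a simplified model under stated assumptions: `is_life_safety`, `check_forward`, and the finding dict keys are hypothetical names chosen for this example and do not reproduce the tool's actual output shape. The token tuple is copied from the commit message, and the membership of 911CER-PT in 911CER-CSS is assumed for the demonstration.

```python
LIFE_SAFETY_TOKENS = ("emergency", "911", "cer", "psap", "panic", "alert")


def is_life_safety(name, description):
    """Case-insensitive substring match against name and description."""
    haystack = f"{name} {description}".lower()
    return any(token in haystack for token in LIFE_SAFETY_TOKENS)


def check_forward(name, description, css_partitions, dest_partitions):
    """Return None if the forward is reachable, else a finding dict.

    css_partitions: partitions contained in the forward's CSS.
    dest_partitions: partitions where the destination's pattern lives.
    """
    if set(css_partitions) & set(dest_partitions):
        return None  # At least one partition is visible from the CSS.
    return {
        "device": name,
        "severity": "HIGH" if is_life_safety(name, description) else "MEDIUM",
        "unreachable_partitions": sorted(dest_partitions),
    }


if __name__ == "__main__":
    # Bingham-shaped case: pattern lives in CER911-PT, CSS lacks it.
    print(check_forward(
        "912-CTI-RP", "Secondary CER",
        css_partitions=["911CER-PT"],   # assumed CSS contents
        dest_partitions=["CER911-PT"],
    ))
```

Because the defect lives in the relationship between the CSS and the destination's partition, the function needs both sets as inputs; inspecting either record alone would not expose it. Substring matching is deliberately broad, so short tokens such as "cer" can also match unrelated words.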

4 Commits