prompts + docs: did_block_overlap, partition_summary, schema landmarks

Closes items 4-7 of cucx-docs's prompt-suggestions roadmap (see axl/agent-threads/cucx-prompt-suggestions/ for the source thread).

did_block_overlap(block_pattern): new prompt. LLM-orchestrated audit that finds carveout patterns inside a DID block and surfaces silent routing exceptions (e.g., 9498/9499 carved out of the 20878594XX block to route to a different fax server). Composes the existing route_patterns(filter=) tool with post-processing rather than introducing a new tool. cucx-docs's #3 was originally pitched as a tool, but the audit-narrative output is more naturally a prompt.

partition_summary(partition_name=None): new prompt. "What is this partition for?" orientation report composing route_partitions, route_patterns, route_calling_search_spaces, and the new route_patterns_targeting. No new SQL; this is pure orchestration. Useful when walking into an unfamiliar cluster, seeing a partition name like RTC-MGW-Inbound, and needing to figure out its role before touching anything.

cucm_sql_help: deepened with five schema-landmark sections that each cost real audit sessions 3-5 query attempts to discover. Topics: numplan↔device M:N via devicenumplanmap; non-existence of sipdestination as a table; routelist (singular) ≠ numplan→RL; LEFT-JOIN convention for type-decoder enum tables; CDR/CMR timestamp localization (cluster-TZ-conditional). Also updated the docs-search reference from "cisco-docs MCP" to "mcdewey MCP" to match yesterday's rename.

cucm-schema-cheatsheet docs: appended a "Schema gotchas (from real audit sessions)" section mirroring the cucm_sql_help content. Two locations because they serve different consumers: the prompt is read by an LLM at query time, the docs page is read by a human reviewing the cluster offline.

Tests: registration sentinel updated to include the two new prompts (catches the case where a new module is added without a server.py shim; the prompt would otherwise be invisible to the LLM). Full suite still 238 passing.

Q3 verification (CDR timestamp empirical) still pending: cluster TLS was intermittent this session. The schema-landmark text is conditional on cluster TZ per cucx-docs's caveat, so even an unverified ship is defensible.
2026-05-05 17:52:44 -06:00 · a07f8c7291
commit a07f8c7291
parent 9427e3d4df
7 changed files with 464 additions and 2 deletions
--- a/docs/src/content/docs/reference/cucm-schema-cheatsheet.md
+++ b/docs/src/content/docs/reference/cucm-schema-cheatsheet.md
@@ -187,8 +187,92 @@ Examples: `sipdevice.isanonymous`, `sipdevice.acceptinboundrdnis`,
   bucket have `fkroutepartition = NULL`. The default-CSS rules apply
   to them.
 ## Schema gotchas (from real audit sessions)
 Each landmark below cost a multi-attempt schema-discovery walk in
 actual CUCM 15 audits. They're listed here so the next operator skips
 that walk. `axl_sql` also appends a corrective hint at the error layer
 for the first two — the docs version is for when you're composing a
 query *before* it errors.
 ### `numplan` ↔ `device` is M:N, not a direct foreign key
 `numplan` does **not** have an `fkdevice` column. The link goes through
 `devicenumplanmap`:
 ```sql
 JOIN devicenumplanmap m ON m.fknumplan = numplan.pkid
 JOIN device d ON m.fkdevice = d.pkid
 ```
 A natural-sounding `WHERE numplan.fkdevice = ?` errors with
 `Column (fkdevice) not found`. `axl_sql` appends a hint pointing at
 the join table; this docs entry is the same fact in the pre-write
 phase. The `route_patterns_targeting(device_name=...)` tool wraps
 this join shape if you just want the inverse query.
 ### `sipdestination` is not a table
 Reasonable-sounding name; doesn't exist as an Informix table. SIP
 trunk destinations live on `device` joined with `sipdestinationgroup`
 and `sipprofile`. To see the actual SIP-related tables on this
 cluster, call the `axl_list_tables` tool rather than running SQL:
 ```text
 axl_list_tables(pattern='sip%')
 ```
 `axl_sql` appends a hint when a query references `sipdestination`
 and hits a "table not in database" error.
 ### `routelist` (singular) is the RL→RG join, not numplan→RL
 The naming is confusable. `routelist` holds the
 `(fkdevice, fkroutegroup, selectionorder)` rows that compose a Route
 List from its Route Groups. It is **not** the link from a route
 pattern (`numplan`) to a route list — that link is
 `numplan.fkdestination`, which can point at a Route List's device
 pkid, an external phone, etc.
 ### Always `LEFT JOIN` the enum-decoder type tables
 `typeclass`, `typemodel`, `typepatternusage`, `typecountry`,
 `typecallingsearchspaceuse`, etc. hold integer enum → name mappings.
 **Always `LEFT JOIN`, not `INNER JOIN`** — if Cisco adds a new enum
 value in a future release, an inner join silently drops rows for
 that new value, while a left join returns the row with NULL in the
 enum-name column. NULL is something an auditor notices and
 investigates; a missing row is a silent gap.
 ### CDR/CMR timestamps are local-time-as-UTC-seconds (cluster-TZ-conditional)
 When the cluster's TZ is anything other than UTC, the CDR Repository
 writes wall-clock-as-epoch into `dateTimeOrigination`,
 `dateTimeConnect`, `dateTimeDisconnect`, etc. Decoding via
 `TO_TIMESTAMP()` returns the correct *value* (e.g. `06:53:31`) but
 mislabels it as UTC. **Don't double-convert with `AT TIME ZONE`** —
 the timestamps are already in local time.
 This is conditional on the cluster's TZ setting. A UTC-configured
 cluster emits genuine UTC. Verify before assuming:
 ```sql
 SELECT FIRST 5 ds.name, tz.name AS timezone
 FROM datetimesetting ds
 LEFT JOIN typetimezone tz ON tz.enum = ds.tktimezone
 ```
 If the result shows anything non-UTC (e.g. `America/Denver`), CDR
 timestamps from that cluster are local-as-epoch and tooling that
 treats them as UTC will display correct values with a wrong label —
 or, worse, double-convert and produce wrong values.
 [`mcsiphon`](https://pypi.org/project/mcsiphon/) (operational
 diagnostics MCP server) suffixes its CDR timestamp fields with
 `_local` to make the convention visible to downstream tooling.
 ## See also
 - [Tools reference](/reference/tools/) — the helpers that abstract over the gotchas above
 - [SIP trunk report](/how-to/sip-trunk-report/) — worked example with all the joins
- Cisco's official CUCM data dictionary — search via `cisco-docs` MCP for "data dictionary"
+- Cisco's official CUCM data dictionary — search via [`mcdewey`](https://pypi.org/project/mcdewey/) MCP for "data dictionary"
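Review note on the `LEFT JOIN` convention in the cheatsheet hunk above: the difference is easy to demonstrate outside CUCM. The sketch below uses an in-memory SQLite database with toy `device` and `typeclass` tables (two columns each, which is an assumption for illustration; the real Informix tables are much wider). Enum value `99` and the device name `NEW-THING` are invented to simulate a class added in a later release.

```python
import sqlite3

# Toy stand-ins for CUCM's `device` and `typeclass` tables.
# Enum 99 has no decoder row, simulating a value added in a newer release.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE typeclass (enum INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE device (name TEXT, tkclass INTEGER);
    INSERT INTO typeclass VALUES (1, 'Phone'), (2, 'Gateway');
    INSERT INTO device VALUES
        ('SEP001122334455', 1), ('GW-A', 2), ('NEW-THING', 99);
""")

QUERY = (
    "SELECT d.name, tc.name FROM device d "
    "{join} JOIN typeclass tc ON tc.enum = d.tkclass ORDER BY d.name"
)

# INNER JOIN: the device with the unknown enum disappears from the result.
inner_rows = conn.execute(QUERY.format(join="INNER")).fetchall()

# LEFT JOIN: the device is kept, with NULL (None) as the decoded name.
left_rows = conn.execute(QUERY.format(join="LEFT")).fetchall()

for label, rows in (("INNER", inner_rows), ("LEFT", left_rows)):
    print(label, rows)
```

The inner join returns two rows and gives no indication that a third device exists; the left join returns all three, with `None` marking the enum that could not be decoded.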
--- a/src/mcaxl/prompts/__init__.py
+++ b/src/mcaxl/prompts/__init__.py
@@ -20,9 +20,11 @@ shim is where the parameter contract lives.
 from . import (
    audit_routing,
    cucm_sql_help,
    did_block_overlap,
    hunt_pilot_audit,
    inbound_did_audit,
    investigate_pattern,
    partition_summary,
    phone_inventory_report,
    route_plan_overview,
    sip_trunk_report,
@@ -33,9 +35,11 @@ from . import (
 __all__ = [
    "audit_routing",
    "cucm_sql_help",
    "did_block_overlap",
    "hunt_pilot_audit",
    "inbound_did_audit",
    "investigate_pattern",
    "partition_summary",
    "phone_inventory_report",
    "route_plan_overview",
    "sip_trunk_report",
--- a/src/mcaxl/prompts/cucm_sql_help.py
+++ b/src/mcaxl/prompts/cucm_sql_help.py
@@ -38,12 +38,74 @@ The user asks: **{question}**
 2. Run `axl_describe_table(<table_name>)` for the candidate tables to see
   exact column names and types.
 3. If the schema chunks below already answer the question, draft the SQL
-   directly. If not, also invoke the `cisco-docs` MCP server's `search_docs`
+   directly. If not, also invoke the `mcdewey` MCP server's `search_docs`
   tool with a relevant query (e.g., search_docs("data dictionary table for X")).
 4. Compose the SELECT, run it via `axl_sql(query=...)`.
 5. Summarize the result for the user — counts, anomalies, and what you'd
   recommend doing about them.
 ## Schema landmarks worth knowing before you start
 A handful of CUCM Informix schema details that cost real audit sessions
 3-5 query attempts each before the right path was found. If your
 question touches any of these areas, jump straight to the right shape
 instead of rediscovering it.
 ### `numplan` ↔ `device` is **M:N**, not a direct foreign key
 `numplan` does **not** have an `fkdevice` column. The link goes through
 `devicenumplanmap`:
 ```sql
 JOIN devicenumplanmap m ON m.fknumplan = numplan.pkid
 JOIN device d ON m.fkdevice = d.pkid
 ```
 A natural-sounding `WHERE numplan.fkdevice = ?` will fail with
 "Column (fkdevice) not found" — and `axl_sql` now appends a corrective
 hint to that error, but you'll save the round-trip if you start with
 the join above.
 ### `routelist` (singular) is the RL→RG join, not numplan→RL
 The naming is confusable. `routelist` holds the `(fkdevice, fkroutegroup,
 selectionorder)` rows that compose a Route List from its Route Groups.
 It is **not** the link from a route pattern (`numplan`) to a route list.
 That link is `numplan.fkdestination` (which can point at a Route List's
 device pkid, an external phone, etc.).
 ### `sipdestination` does not exist
 Reasonable-sounding name, surfaces in some Cisco docs, but it isn't an
 Informix table. SIP trunk destinations live on `device` plus the
 `sipdestinationgroup` and `sipprofile` join tables. Run
 `axl_list_tables(pattern='sip%')` to see the actual tables, and
 `axl_sql` will append a corrective hint if you try to FROM it.
 ### Always `LEFT JOIN` the type-decoder enum tables
 `typeclass`, `typemodel`, `typepatternusage`, `typecountry`, `typecallingsearchspaceuse`, etc.
 hold integer enum → name mappings. **Always use `LEFT JOIN`, not `INNER
 JOIN`** — if Cisco adds a new enum value in a future release, an inner
 join silently drops the row, while a left join still returns the row
 with NULL in the enum-name column. The latter is something an auditor
 sees and acts on; the former is a silent gap.
 ### CDR/CMR timestamps are stored as local-time-as-UTC-seconds (not actual UTC)
 When the cluster's TZ is set to anything other than UTC, the CDR
 Repository service writes wall-clock-as-epoch into `dateTimeOrigination`,
 `dateTimeConnect`, `dateTimeDisconnect`, etc. So decoding via
 `TO_TIMESTAMP()` returns the correct *value* (e.g. `06:53:31`) but
 mislabels it as UTC. Don't double-convert with `AT TIME ZONE` — the
 timestamps are already in local time.
 A UTC-configured cluster (rare in production) emits genuine UTC
 timestamps; this is conditional on the cluster's TZ setting, not a
 universal truth. If your question involves CDR analysis or time windows,
 verify the cluster TZ first via `axl_describe_table('datetimesetting')`
 followed by a query, or treat all CDR times as local with `_local` suffixes.
 ## Possibly relevant schema chunks
 {schema_block}
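Review note on the CDR/CMR timestamp landmark in the hunk above: the "correct value, wrong label" behaviour and the double-conversion failure can be reproduced with plain `datetime` arithmetic. This is a sketch under stated assumptions, not a statement about any particular cluster: the fixed UTC-6 offset and the sample call time are invented, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumptions for illustration: a cluster on a fixed UTC-6 zone, and a call
# that started at 06:53:31 local wall-clock time.
CLUSTER_TZ = timezone(timedelta(hours=-6))
wall_clock = datetime(2026, 5, 5, 6, 53, 31)

# "Wall-clock-as-epoch": the local reading encoded as if it were UTC.
stored_epoch = int(wall_clock.replace(tzinfo=timezone.utc).timestamp())


def decode_cdr_local(epoch: int) -> datetime:
    """Decode the stored value and drop the misleading UTC label."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).replace(tzinfo=None)


def decode_double_converted(epoch: int) -> datetime:
    """The mistake: trust the UTC label, then convert to the cluster TZ."""
    as_utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return as_utc.astimezone(CLUSTER_TZ).replace(tzinfo=None)


local = decode_cdr_local(stored_epoch)
wrong = decode_double_converted(stored_epoch)
print("local-as-epoch decode:", local)
print("double-converted:     ", wrong)
```

Under these assumptions the first decode reproduces the wall-clock reading, while the double-converted value is off by the cluster's UTC offset (six hours here).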
--- a/src/mcaxl/prompts/did_block_overlap.py
+++ b/src/mcaxl/prompts/did_block_overlap.py
@@ -0,0 +1,136 @@
 """DID block overlap audit — find carveout patterns inside larger blocks.
 Surfaces the kind of finding that's easy to miss visually but material
 for any DID-block audit: a more-specific pattern carved out of a larger
 block (e.g., `9498` and `9499` carved out of the RightFax `20878594XX`
 100-DID block, routing those 2 DIDs to ZetaFax instead).
 Per CUCM's longest-match rule, the more-specific patterns win. Knowing
 where carveouts exist tells the operator about routing exceptions that
 documentation often omits.
 """
 from __future__ import annotations
 from typing import TYPE_CHECKING
 from ._common import render_schema_block
 if TYPE_CHECKING:
    from ..docs_loader import DocsIndex
 _KEYWORDS = [
    "route pattern", "longest match", "translation pattern",
    "wildcard", "DID block", "called party",
 ]
 def render(docs: "DocsIndex | None", block_pattern: str) -> str:
    """Audit a DID block for carveout patterns that route differently.
    Args:
      block_pattern: the nominal block, e.g. `"20878594XX"`,
        `"+1208524XXXX"`. Wildcard syntax is CUCM's (X for any digit,
        [a-b] for ranges, sets, etc.).
    """
    schema_block = render_schema_block(
        docs, _KEYWORDS, max_chunks=3, max_chars_per_chunk=800
    )
    return f"""# DID Block Overlap Audit — `{block_pattern}`
 Find every numplan entry that matches *anywhere* within the nominal
 block `{block_pattern}`, grouped by destination device, with carveouts
 flagged. Per CUCM's longest-match rule, more-specific patterns win —
 which means a carveout silently overrides the block for those digits.
 ## Step 1 — Discover all numplan entries that fall in or near the block
 ```sql
 SELECT
  np.dnorpattern AS pattern,
  rp.name AS partition,
  np.description,
  np.blockenable,
  tpu.name AS pattern_type,
  d.name AS destination_device,
  tc.name AS destination_class
 FROM numplan np
 LEFT OUTER JOIN routepartition rp ON np.fkroutepartition = rp.pkid
 LEFT OUTER JOIN typepatternusage tpu ON np.tkpatternusage = tpu.enum
 LEFT OUTER JOIN devicenumplanmap m ON m.fknumplan = np.pkid
 LEFT OUTER JOIN device d ON d.pkid = m.fkdevice
 LEFT OUTER JOIN typeclass tc ON tc.enum = d.tkclass
 WHERE np.dnorpattern LIKE '<prefix>%'
 ORDER BY np.dnorpattern;
 ```
 Replace `<prefix>` with the literal-prefix portion of `{block_pattern}`
 (everything up to the first wildcard character). Example: for
 `20878594XX`, the prefix is `20878594` so `LIKE '20878594%'` catches
 both the block and any carveouts.
 ## Step 2 — Categorize each match against the block
 Walk the result rows and assign each pattern to one of:
 - **Block itself**: pattern equals `{block_pattern}`. The "default"
  routing for the block.
 - **Carveout** (more-specific): pattern is fully inside the block but
  contains fewer wildcards. Example: block `20878594XX`, carveout
  `2087859498` (no wildcards) or `208785949[8-9]` (range narrower than
  the block's `XX`).
 - **Sibling block**: pattern shares part of the prefix but doesn't fall
  inside the named block. Example: `208785930X` when the block is
  `20878594XX` shares `2087859` but not `20878594`, so it only shows up
  if the Step 1 `LIKE` prefix is widened.
 - **Adjacent / unrelated**: shorter pattern, longer pattern, or one
  whose prefix doesn't actually fall inside the block.
 ## Step 3 — Group by destination device
 For each pattern in the block-itself + carveout categories, list its
 destination device. The audit-grade output is:
 ```
 Block: 20878594XX → RightFax-Trunk (100 DIDs nominally)
  Carveouts that route differently:
    2087859498 → ZetaFax-Trunk     [single DID exception]
    2087859499 → ZetaFax-Trunk     [single DID exception]
  Effective routing:
    98 DIDs → RightFax-Trunk
    2 DIDs → ZetaFax-Trunk
 ```
 ## Findings to surface
 - **Undocumented carveouts**: any carveout whose `description` doesn't
  explain why it diverges from the block. High-severity if the
  destination differs (silent routing exception); low if same-destination
  (probably accidental duplication).
 - **Same-destination carveouts**: patterns that route to the same device
  as the block. Often harmless duplication that adds noise to the route
  plan but no operational impact. Consolidation candidates.
 - **Mixed-partition matches**: same pattern in multiple partitions can
  produce different routing depending on the calling CSS. Flag if found
  inside the audited block.
 - **Disabled carveouts** (`np.blockenable = 't'`): a carveout that's
  explicitly blocked is unusual — likely a temporary measure that wasn't
  removed. Worth surfacing.
 ## Suggested follow-up calls
 - `route_patterns_targeting('<destination>')` for each unique destination
  device surfaced — confirms what *else* targets the same device beyond
  this block.
 - `route_inspect_pattern('<carveout pattern>')` to see the full route
  trace for any carveout flagged as suspicious.
 ## Reference: longest-match semantics
 """ + schema_block + """
 Produce a structured report grouped by destination device, with the
 "effective routing" summary at the top. Don't enumerate every digit —
 summarize counts ("98 DIDs → A, 2 DIDs → B") and list only the carveout
 patterns explicitly. Flag anything that looks like a silent override."""
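Review note on Steps 1-3 of the prompt above: the prefix extraction, the block/carveout/sibling/adjacent categorization, and the "effective routing" arithmetic can be sketched deterministically. This is a minimal illustration, not code from the commit. The helper names are hypothetical, only literals, `X`, and `[...]` sets or ranges are handled (no `!`, `@`, or negated sets), and the DID count assumes carveouts do not overlap each other.

```python
import math
import re

TOKEN = re.compile(r"\[[^\]]+\]|.")


def literal_prefix(pattern: str) -> str:
    """Everything before the first wildcard character (Step 1's <prefix>)."""
    return re.match(r"[^X\[!@]*", pattern).group(0)


def positions(pattern: str) -> list[set[str]]:
    """One set of allowed characters per position in the pattern."""
    out = []
    for token in TOKEN.findall(pattern):
        if token == "X":
            out.append(set("0123456789"))
        elif token.startswith("["):
            allowed = set()
            for lo, hi, single in re.findall(r"(\d)-(\d)|(\d)", token[1:-1]):
                if single:
                    allowed.add(single)
                else:
                    allowed.update(str(d) for d in range(int(lo), int(hi) + 1))
            out.append(allowed)
        elif token in "!@":
            raise ValueError(f"unsupported wildcard in sketch: {token}")
        else:
            out.append({token})
    return out


def classify(block: str, candidate: str) -> str:
    """Assign a Step 2 category to `candidate` relative to `block`."""
    b, c = positions(block), positions(candidate)
    if b == c:
        return "block"
    if len(b) != len(c):
        return "adjacent"
    if all(cs <= bs for bs, cs in zip(b, c)):
        return "carveout"
    return "sibling"


def did_count(pattern: str) -> int:
    """Number of distinct digit strings the pattern matches."""
    return math.prod(len(s) for s in positions(pattern))


block = "20878594XX"
rows = ["20878594XX", "2087859498", "2087859499", "208785930X", "20878594XXX"]
labels = {p: classify(block, p) for p in rows}
carved = sum(did_count(p) for p, kind in labels.items() if kind == "carveout")
remaining = did_count(block) - carved
print(literal_prefix(block), labels, remaining)
```

With the sample rows, the two single-DID patterns are labeled carveouts and the remaining count for the block is 98, matching the example report in Step 3.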
--- a/src/mcaxl/prompts/partition_summary.py
+++ b/src/mcaxl/prompts/partition_summary.py
@@ -0,0 +1,144 @@
 """Partition orientation report — what is this partition actually for?
 Composes existing tools (route_partitions, route_patterns,
 route_calling_search_spaces, route_patterns_targeting) into a
 "what does this partition do" narrative: pattern count, longest patterns,
 distinct destination devices, internal vs external ratio. Useful when
 you walk into a cluster and see a partition named `RTC-MGW-Inbound` or
 similar and need to figure out its role before touching anything.
 """
 from __future__ import annotations
 from typing import TYPE_CHECKING
 from ._common import render_schema_block
 if TYPE_CHECKING:
    from ..docs_loader import DocsIndex
 _KEYWORDS = [
    "route partition", "calling search space", "called party",
    "route pattern", "translation pattern", "directory number",
 ]
 def render(docs: "DocsIndex | None", partition_name: str | None = None) -> str:
    """Compose a partition's role profile from existing routing data.
    Args:
      partition_name: the partition to profile. If None, the prompt
        instructs the LLM to first list all partitions and pick the
        most-populous to demonstrate the shape.
    """
    schema_block = render_schema_block(
        docs, _KEYWORDS, max_chunks=3, max_chars_per_chunk=700
    )
    target = partition_name or "<largest partition>"
    discovery_step = (
        ""
        if partition_name
        else """## Step 0 — Pick a partition (no name supplied)
 Run `route_partitions()` and pick the most-populous partition (highest
 `pattern_count`) to demonstrate the profile shape. In production use,
 the partition name should be supplied explicitly.
 """
    )
    return f"""# Partition Orientation: `{target}`
 Compose a "what is this partition for?" report from the routing data
 already exposed by `route_partitions`, `route_patterns`,
 `route_calling_search_spaces`, and `route_patterns_targeting`. No new
 SQL is needed; this is an LLM-orchestrated narrative across existing tools.
 {discovery_step}## Step 1 — High-level profile via `route_partitions`
 Call `route_partitions()` and find the row for `{target}`. Note:
 - `pattern_count` — total numplan entries in the partition
 - `css_count` — how many Calling Search Spaces include this partition
  (signal of "blast radius" if you change the partition's contents)
 If `css_count` is 0, the partition is **orphaned** — no calling device
 can reach its patterns. High-severity audit finding (or vestigial cleanup
 candidate). Surface it explicitly.
 ## Step 2 — Pattern inventory via `route_patterns(partition='{target}')`
 Run `route_patterns(partition='{target}', limit=500)`. Categorize the
 result by `kind`:
 - **Directory numbers** (`tkpatternusage = 2`): user-assigned DNs.
  Counted; if zero, partition holds no extensions.
 - **Translation patterns** (`tkpatternusage = 3`): rewrite rules. List
  the 5 longest patterns + 5 shortest.
 - **Route patterns** (`tkpatternusage = 5`): outbound routing. Top
  destinations by pattern count.
 - **Hunt pilots** / **CTI route points** / etc.: count each.
 Compute:
 - **Internal vs external ratio**: count of patterns that look like 4-5
  digit extensions vs 7+ digit external numbers.
 - **Wildcard density**: count of patterns containing `X`, `!`, `@`,
  `[…]` vs literal-only patterns.
 - **Longest pattern**: indicates the partition's "deepest" rule.
 ## Step 3 — CSS membership via `route_calling_search_spaces`
 Call `route_calling_search_spaces()` (no name filter) and find every
 CSS whose ordered partition list contains `{target}`. For each:
 - Position in the CSS (sortorder) — `0` means first-priority lookup
 - Other partitions in the CSS — context for what the partition gets
  paired with
 A partition that's always at position 0 of its CSSes behaves as the
 primary lookup; a partition always at the bottom is a fallback or
 catch-all.
 ## Step 4 — Destination devices via `route_patterns_targeting`
 For the route-pattern subset (Step 2), if there are ≤ 10 distinct
 destination devices, call `route_patterns_targeting(<each device>)`
 to see what *else* targets each. This catches the case where two
 partitions both route to the same trunk — relevant for failover
 analysis.
 ## Step 5 — Synthesize the role
 Produce a 1-paragraph "what this partition is for" summary, then a
 structured report:
 ```
 Partition: <name>
 Role inference: <e.g., "Internal extension home", "PSTN inbound
  transformations", "Fax DID block">
 Pattern count: N (M directory numbers, K route patterns, ...)
 CSS membership: in N CSSes; primary in K, fallback in M
 Internal/external mix: X% internal-shape, Y% external-shape
 Wildcard density: N/M patterns use wildcards
 Longest pattern: <pattern> (length=N)
 Top 3 destinations: <list, if route patterns present>
 ```
 ## Findings to call out
 - **Orphaned partition** (CSS count = 0): not reachable from any device.
 - **Single-DN partition**: holds 1 pattern only. Often vestigial; rarely
  intentional.
 - **Mixed-purpose**: contains a mix of internal extensions AND external
  route patterns. Usually a sign of legacy migration; flag for review.
 - **Position-inconsistent**: partition appears at sortorder 0 in some
  CSSes and sortorder >5 in others. May indicate intentional priority
  inversion or accidental misconfiguration — surface for operator review.
 ## Reference: partition + CSS semantics
 """ + schema_block + """
 Produce the structured report and the role-inference paragraph.
 Don't dump every pattern — categorize, count, and call out exceptions."""
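Review note on the Step 2 "Compute" bullets of the prompt above: the three metrics are simple enough to pin down in code, which may help when checking the LLM's arithmetic. The sketch below is illustrative only. `pattern_length` and `profile` are hypothetical helpers, a bracketed set counts as one position, a leading `+` (or `\+`) is ignored when measuring length, and the 4-5 versus 7+ thresholds are taken from the prompt text.

```python
import re

TOKEN = re.compile(r"\[[^\]]+\]|.")
WILDCARD = re.compile(r"[X!@\[]")


def pattern_length(pattern: str) -> int:
    """Positions in the pattern, ignoring a leading '+' or '\\+'."""
    return len(TOKEN.findall(pattern.lstrip("\\+")))


def profile(patterns: list[str]) -> dict:
    """Compute the Step 2 metrics for one partition's patterns."""
    lengths = {p: pattern_length(p) for p in patterns}
    return {
        "pattern_count": len(patterns),
        # 4-5 positions looks like an internal extension.
        "internal_shape": sum(1 for p in patterns if 4 <= lengths[p] <= 5),
        # 7 or more positions looks like an external number.
        "external_shape": sum(1 for p in patterns if lengths[p] >= 7),
        "wildcarded": sum(1 for p in patterns if WILDCARD.search(p)),
        "longest": max(patterns, key=lengths.get) if patterns else None,
    }


sample = ["1001", "1002", "5XXX", "20878594XX", "+1208524XXXX"]
report = profile(sample)
print(report)
```

For the sample list this reports three internal-shape patterns, two external-shape patterns, three patterns with wildcards, and `+1208524XXXX` as the longest.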
--- a/src/mcaxl/server.py
+++ b/src/mcaxl/server.py
@@ -546,6 +546,36 @@ def hunt_pilot_audit() -> str:
    return _prompts.hunt_pilot_audit.render(_docs)
@mcp.prompt
 def did_block_overlap(block_pattern: str) -> str:
    """Find carveout patterns inside a DID block — more-specific patterns
    that win CUCM's longest-match rule and route differently than the
    nominal block. Surfaces silent routing exceptions like "9498 and 9499
    carved out of the 20878594XX block to route to a different fax server".
    Args:
        block_pattern: the nominal block in CUCM pattern syntax, e.g.
            ``20878594XX`` or ``+1208524XXXX``.
    """
    return _prompts.did_block_overlap.render(_docs, block_pattern)
@mcp.prompt
 def partition_summary(partition_name: str | None = None) -> str:
    """Compose a "what is this partition for?" report from existing
    routing data — pattern count, longest patterns, distinct destination
    devices, internal vs external ratio, CSS membership. Useful for
    walking into a cluster and orienting on an unfamiliar partition
    before touching anything.
    Args:
        partition_name: partition to profile. If omitted, the prompt
            instructs the LLM to first list partitions and pick the
            most-populous one to demonstrate the shape.
    """
    return _prompts.partition_summary.render(_docs, partition_name)
@mcp.prompt
 def whoami(userid: str | None = None) -> str:
    """Look up the role chain for a single user (defaults to the AXL
--- a/tests/test_prompts_package.py
+++ b/tests/test_prompts_package.py
@@ -169,6 +169,8 @@ def test_all_prompts_registered_in_server():
        "inbound_did_audit",
        "hunt_pilot_audit",
        "whoami",
        "did_block_overlap",
        "partition_summary",
    }, f"unexpected prompt set: {names}"
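Review note on the registration sentinel: the failure mode it guards against (a prompt module added without a `server.py` shim) reduces to a set difference. The sketch below shows that idea in isolation. `missing_shims` is a hypothetical helper and the input lists are invented; in the real suite the names would presumably come from the prompts package and from the server's registered prompts.

```python
def missing_shims(prompt_modules: list[str], registered: set[str]) -> set[str]:
    """Return prompt modules that have no registered server-side prompt."""
    return set(prompt_modules) - registered


# Invented inputs: `partition_summary` has a module but no shim, and
# `whoami` is registered without appearing in the module list.
prompt_modules = ["did_block_overlap", "partition_summary", "sip_trunk_report"]
registered = {"did_block_overlap", "sip_trunk_report", "whoami"}

gaps = missing_shims(prompt_modules, registered)
print(sorted(gaps))
```

A one-directional difference like this only flags modules without shims; the exact-set comparison in the test above is stricter, since it also fails when a registered prompt is unexpected.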