feat: Add cursor-based pagination with grep filtering

- Implement pagination system for large responses (10K+ items)
- Add grep/regex filtering capability to results
- Add session isolation for multi-client MCP scenarios
- Add cursor management tools (next, list, delete, delete_all)
- Upgrade to mcp>=1.22.0 for FastMCP Context support
- Switch to date-based versioning (2025.12.1)
- Add prominent _message field to guide LLMs on cursor usage

10 tools with pagination support:
- functions_list - list all functions
- functions_decompile - decompiled code (line pagination)
- functions_disassemble - assembly (instruction pagination)
- functions_get_variables - function variables
- data_list - defined data items
- data_list_strings - string data
- xrefs_list - cross-references
- structs_list - struct types
- analysis_get_callgraph - call graph edges
- analysis_get_dataflow - data flow steps
This commit is contained in:
Ryan Malloy 2025-12-01 12:25:28 -07:00
parent 662e202482
commit c747abe813
3 changed files with 2043 additions and 185 deletions

@@ -6,6 +6,61 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased]
## [2025.12.1] - 2025-12-01
### Added
- **Cursor-Based Pagination System:** Implemented pagination so that large responses (10K+ items) are returned one page at a time instead of filling the context window.
  - `page_size` parameter (default: 50, max: 500) for controlling items per page
  - `cursor_id` returned for navigating to subsequent pages
  - Session isolation prevents cursor cross-contamination between MCP clients
  - TTL-based cursor expiration (5 minutes) with LRU eviction (max 100 cursors)
- **Grep/Regex Filtering:** Added `grep` and `grep_ignorecase` parameters to filter results with regex patterns before pagination.
- **Bypass Option:** Added `return_all` parameter to retrieve complete datasets (with large response warnings).
- **Cursor Management Tools:** New MCP tools for cursor lifecycle management:
  - `cursor_next(cursor_id)` - Fetch next page of results
  - `cursor_list()` - List active cursors for current session
  - `cursor_delete(cursor_id)` - Delete specific cursor
  - `cursor_delete_all()` - Delete all session cursors
- **Enumeration Resources:** New lightweight MCP resources for quick data enumeration (more efficient than tool calls):
  - `/instances` - List all active Ghidra instances
  - `/instance/{port}/summary` - Program overview with statistics
  - `/instance/{port}/functions` - List functions (capped at 1000)
  - `/instance/{port}/strings` - List strings (capped at 500)
  - `/instance/{port}/data` - List data items (capped at 1000)
  - `/instance/{port}/structs` - List struct types (capped at 500)
  - `/instance/{port}/xrefs/to/{address}` - Cross-references to an address
  - `/instance/{port}/xrefs/from/{address}` - Cross-references from an address
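
The cursor lifecycle listed above (page size capped at 500, 5-minute TTL, LRU eviction at 100 cursors, per-session ownership) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: `CursorStore`, its method names, and the entry layout are hypothetical, and the session key is simply passed in as a string.

```python
import time
import uuid
from collections import OrderedDict


class CursorStore:
    """Hypothetical in-memory cursor store (illustrative names only)."""

    def __init__(self, ttl_seconds=300.0, max_cursors=100, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.max_cursors = max_cursors
        self.clock = clock
        # cursor_id -> entry dict, ordered from least to most recently used
        self._cursors = OrderedDict()

    def _purge_expired(self):
        now = self.clock()
        expired = [cid for cid, e in self._cursors.items()
                   if now - e["touched"] > self.ttl]
        for cid in expired:
            del self._cursors[cid]

    def create(self, session_id, items, page_size=50):
        """Return (first_page, cursor_id); cursor_id is None if one page suffices."""
        self._purge_expired()
        page_size = max(1, min(page_size, 500))  # clamp to the documented maximum
        first = items[:page_size]
        if len(items) <= page_size:
            return first, None
        cid = uuid.uuid4().hex
        self._cursors[cid] = {
            "session": session_id,
            "items": items,
            "offset": page_size,
            "page_size": page_size,
            "touched": self.clock(),
        }
        while len(self._cursors) > self.max_cursors:
            self._cursors.popitem(last=False)  # evict the least recently used cursor
        return first, cid

    def next_page(self, session_id, cursor_id):
        """Return (page, cursor_id); cursor_id is None once the data is exhausted."""
        self._purge_expired()
        entry = self._cursors.get(cursor_id)
        # A cursor owned by another session is treated exactly like a missing one.
        if entry is None or entry["session"] != session_id:
            raise KeyError("unknown or expired cursor")
        start = entry["offset"]
        page = entry["items"][start:start + entry["page_size"]]
        entry["offset"] = start + len(page)
        entry["touched"] = self.clock()
        self._cursors.move_to_end(cursor_id)  # mark as most recently used
        if entry["offset"] >= len(entry["items"]):
            del self._cursors[cursor_id]
            return page, None
        return page, cursor_id
```

Injecting the clock keeps the expiry logic testable without sleeping; a real server would also need locking if tools run concurrently.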
### Changed
- **MCP Dependency Upgrade:** Updated from `mcp==1.6.0` to `mcp>=1.22.0` for FastMCP Context support.
- **Version Strategy:** Switched to date-based versioning (YYYY.MM.D format).
- **Tool Updates:** 11 tools now support pagination with grep filtering:
  - `functions_list` - List functions with pagination
  - `functions_decompile` - Decompiled code with line pagination (grep for code patterns)
  - `functions_disassemble` - Assembly with instruction pagination (grep for opcodes)
  - `functions_get_variables` - Function variables with pagination
  - `data_list` - List data items with pagination
  - `data_list_strings` - List strings with pagination
  - `xrefs_list` - List cross-references with pagination
  - `structs_list` - List struct types with pagination
  - `structs_get` - Struct fields with pagination (grep for field names/types)
  - `analysis_get_callgraph` - Call graph edges with pagination
  - `analysis_get_dataflow` - Data flow steps with pagination
- **LLM-Friendly Responses:** Added prominent `_message` field to guide LLMs on cursor continuation.
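
To illustrate how grep filtering, pagination, and the `_message` hint fit together, here is a rough sketch in which the regex filter runs before the page is cut, so counts and cursors refer only to matching items. The helper names `grep_items` and `build_page`, and the response keys `result` and `total_matched`, are assumptions made for this example; only `grep`, `grep_ignorecase`, `page_size`, `cursor_id`, `cursor_next`, and `_message` come from the changelog.

```python
import re


def grep_items(items, pattern, ignorecase=False):
    """Keep items whose string form matches the regex (hypothetical helper)."""
    flags = re.IGNORECASE if ignorecase else 0
    rx = re.compile(pattern, flags)
    return [item for item in items if rx.search(str(item))]


def build_page(items, page_size=50, grep=None, grep_ignorecase=False, cursor_id=None):
    """Filter first, then paginate; the response shape is assumed, not the real one.

    cursor_id is supplied by the caller here; a real server would create it
    only when more than one page of matches exists.
    """
    if grep:
        items = grep_items(items, grep, grep_ignorecase)
    page = items[:page_size]
    remaining = len(items) - len(page)
    response = {"result": page, "total_matched": len(items)}
    if remaining > 0:
        response["cursor_id"] = cursor_id
        # Plain-language hint so an LLM client knows how to continue.
        response["_message"] = (
            f"{remaining} more items available; "
            f"call cursor_next(cursor_id='{cursor_id}') to continue."
        )
    return response
```

Filtering before pagination means a narrow pattern over a large function list usually fits in a single page, which avoids creating a cursor at all.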
### Fixed
- **FastMCP Compatibility:** Removed deprecated `version` parameter from FastMCP constructor.
### Security
- **ReDoS Protection:** Added validation for grep regex patterns to prevent catastrophic backtracking attacks.
  - Pattern length limit (500 chars)
  - Repetition operator limit (15 max)
  - Detection of dangerous nested quantifier patterns like `(a+)+`
- **Session Spoofing Prevention:** Removed user-controllable `session_id` parameter from all tools.
  - Sessions now derived from FastMCP context (`ctx.session`, `ctx.client_id`)
  - Prevents users from accessing or manipulating other sessions' cursors
- **Recursion Depth Limit:** Added depth limit (10) to grep matching to prevent stack overflow on deeply nested data.
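
The pattern checks and the depth limit listed above could look roughly like the following. This is a heuristic sketch under stated assumptions: the function names, the way repetition operators are counted, and the nested-quantifier regex are all illustrative guesses, and the actual validation in this commit may differ.

```python
import re

MAX_PATTERN_LENGTH = 500
MAX_REPETITION_OPS = 15
MAX_DEPTH = 10

# Heuristic: a group that contains + or * and is itself repeated, e.g. (a+)+
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")
# Crude count of repetition operators; also counts the ? in (?:...) groups.
REPETITION_OP = re.compile(r"[+*?]|\{\d+(?:,\d*)?\}")


def validate_grep_pattern(pattern):
    """Reject risky patterns, then return the compiled regex (hypothetical helper)."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern too long")
    if len(REPETITION_OP.findall(pattern)) > MAX_REPETITION_OPS:
        raise ValueError("too many repetition operators")
    if NESTED_QUANTIFIER.search(pattern):
        raise ValueError("nested quantifiers are not allowed")
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regex: {exc}") from exc


def matches(value, rx, depth=0):
    """Search nested dicts/lists for a match, giving up below MAX_DEPTH levels."""
    if depth > MAX_DEPTH:
        return False  # too deeply nested: treat as no match instead of recursing
    if isinstance(value, dict):
        return any(matches(v, rx, depth + 1) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(matches(v, rx, depth + 1) for v in value)
    return rx.search(str(value)) is not None
```

Static checks like these reduce exposure but cannot catch every pathological pattern; a match timeout or a linear-time regex engine would be a stronger guarantee.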
## [2.0.0] - 2025-11-11
### Added
@@ -117,7 +172,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
- Initial project setup
- Basic MCP bridge functionality
-[unreleased]: https://github.com/teal-bauer/GhydraMCP/compare/v2.0.0...HEAD
+[unreleased]: https://github.com/teal-bauer/GhydraMCP/compare/v2025.12.1...HEAD
+[2025.12.1]: https://github.com/teal-bauer/GhydraMCP/compare/v2.0.0...v2025.12.1
[2.0.0]: https://github.com/teal-bauer/GhydraMCP/compare/v1.4.0...v2.0.0
[1.4.0]: https://github.com/teal-bauer/GhydraMCP/compare/v1.3.0...v1.4.0
[1.3.0]: https://github.com/teal-bauer/GhydraMCP/compare/v1.2...v1.3.0

File diff suppressed because it is too large.

@@ -1,12 +1,12 @@
[project]
name = "ghydramcp"
-version = "2.0.0"
+version = "2025.12.1"
description = "AI-assisted reverse engineering bridge: a multi-instance Ghidra plugin exposed via a HATEOAS REST API plus an MCP Python bridge for decompilation, analysis & binary manipulation"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
-"mcp==1.6.0",
-"requests==2.32.3",
+"mcp>=1.22.0",
+"requests>=2.32.3",
]
[project.scripts]