Expand docs site to 15 pages, add project URLs to pyproject.toml

9 new pages organized by diataxis: guides (workflows, cursor
pagination, troubleshooting), reference (REST API, MCP resources,
configuration), concepts (architecture, prior art), and changelog.

Rewrote mcp-tools.md to cover all 64 tools across 14 categories.
Updated overview with architecture diagram and capability summary.
Added Claude Desktop config paths to installation page.

Sidebar now has 5 sections with 12 navigable entries.
Version bumped to 2026.3.7 with docs/repo/issues URLs for PyPI.
Ryan Malloy 2026-03-07 17:21:03 -07:00
parent 1db36464ed
commit 38df6ee12a
14 changed files with 2393 additions and 62 deletions


@ -35,12 +35,38 @@ export default defineConfig({
            { label: 'Installation', slug: 'getting-started/installation' },
          ],
        },
        {
          label: 'Guides',
          items: [
            { label: 'Analysis Workflows', slug: 'guides/workflows' },
            { label: 'Cursor Pagination', slug: 'guides/cursor-pagination' },
            { label: 'Troubleshooting', slug: 'guides/troubleshooting' },
          ],
        },
        {
          label: 'Reference',
          collapsed: true,
          items: [
            { label: 'MCP Tools', slug: 'reference/mcp-tools' },
            { label: 'MCP Resources', slug: 'reference/resources' },
            { label: 'REST API', slug: 'reference/rest-api' },
            { label: 'Configuration', slug: 'reference/configuration' },
            { label: 'Docker Usage', slug: 'reference/docker' },
          ],
        },
        {
          label: 'Concepts',
          collapsed: true,
          items: [
            { label: 'Architecture', slug: 'concepts/architecture' },
            { label: 'Prior Art', slug: 'concepts/prior-art' },
          ],
        },
        {
          label: 'About',
          collapsed: true,
          items: [
            { label: 'Changelog', slug: 'about/changelog' },
          ],
        },
      ],


@ -0,0 +1,88 @@
---
title: Changelog
description: Version history and release notes
---
This page summarizes each release. For full commit-level detail, see the repository history.
## Unreleased
### Added
- Symbol CRUD operations: `symbols_create`, `symbols_rename`, `symbols_delete`, `symbols_imports`, `symbols_exports`
- Bookmark management: `bookmarks_list`, `bookmarks_create`, `bookmarks_delete`
- Enum and typedef creation: `enums_create`, `enums_list`, `typedefs_create`, `typedefs_list`
- Variable management: `variables_list`, `variables_rename`, `functions_variables`
- Namespace and class tools: `namespaces_list`, `classes_list`
- Memory segment listing: `segments_list`
- 13 analysis prompts with progress reporting
- Docker port auto-allocation from a configurable pool (default 8192-8223)
- Lazy `instances_use` -- returns immediately, validates on first real call
- All Docker operations non-blocking via thread executors
- Session isolation for `docker_stop` and `docker_cleanup`
### Fixed
- Eliminated 4+ hour hangs when switching to slow or unreachable instances
- Multiple bug fixes across Docker lifecycle and session management
## 2025.12.1
### Added
- Cursor-based pagination with configurable `page_size` and `cursor_id`
- Grep and regex filtering applied before pagination
- 8 enumeration resources using `ghidra://` URIs
### Security
- ReDoS protection on regex filters
- Session spoofing prevention for cursor operations
## 2.0.0
### Changed
- Full MCP integration refactor using FastMCP
- HATEOAS-driven API v2 with hypermedia links on all responses
### Added
- String listing across program memory
- Data manipulation tools
- Cross-reference analysis tools
- Memory read and write operations
## 1.4.0
### Changed
- Communication between bridge and plugin switched to structured JSON
### Added
- Test suites for bridge and plugin
- Origin checking on HTTP requests
## 1.3.0
### Added
- Variable manipulation tools (rename and retype)
- Dynamic version reporting in API responses
## 1.2.0
### Added
- Multi-instance support -- connect to multiple Ghidra sessions and switch between them
## 1.1.0
### Added
- Initial bridge release connecting MCP server to Ghidra plugin
## 1.0.0
- Initial project setup


@ -0,0 +1,87 @@
---
title: Architecture
description: How MCGhidra's components fit together and why
---
MCGhidra is a three-layer stack. Each layer operates independently, communicates over well-defined boundaries, and can be replaced without affecting the others.
```
┌─────────────────────────────────────────────────────────────────┐
│  MCP Client (Claude Code, Claude Desktop, etc.)                 │
└──────────────────────────┬──────────────────────────────────────┘
                           │ stdio (MCP protocol)
┌──────────────────────────┴──────────────────────────────────────┐
│  MCGhidra Python Server (FastMCP)                               │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐            │
│  │Functions │ │   Data   │ │ Analysis │ │  Docker  │  ...       │
│  │  Mixin   │ │  Mixin   │ │  Mixin   │ │  Mixin   │            │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘            │
│       └────────────┴────────────┴────────────┘                  │
│                     HTTP Client                                 │
└──────────────────────────┬──────────────────────────────────────┘
                           │ HTTP REST (HATEOAS)
┌──────────────────────────┴──────────────────────────────────────┐
│  Ghidra Plugin (Java, runs inside JVM)                          │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  MCGhidraServer.py → HTTP endpoints                       │  │
│  │  Functions / Data / Memory / Xrefs / Analysis             │  │
│  └───────────────────────────────────────────────────────────┘  │
│  Ghidra Analysis Engine (decompiler, disassembler, types)       │
└─────────────────────────────────────────────────────────────────┘
```
## The Three-Layer Stack
The top layer is the MCP client. Claude Code, Claude Desktop, or any MCP-compatible tool connects to MCGhidra over stdio using the Model Context Protocol. The client sees MCP tools, resources, and prompts -- it never deals with HTTP or Ghidra's internals directly.
The middle layer is the Python MCP server, built on FastMCP. It translates MCP tool calls into HTTP requests against the Ghidra plugin's REST API. The server is organized as a set of mixins -- Functions, Data, Analysis, Docker, and others -- each registering its own tools. This keeps the codebase navigable despite having 64+ tools.
The bottom layer is the Ghidra plugin. It runs as a Jython script inside Ghidra's JVM and starts an HTTP server that exposes Ghidra's analysis engine. The plugin does not know or care about MCP. It serves a HATEOAS REST API that any HTTP client can consume.
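The mixin organization of the middle layer can be sketched in plain Python. This is a minimal illustration, not the FastMCP API: the `tool` decorator, the class names, and the returned request strings are all hypothetical stand-ins.

```python
# Illustrative only: plain-Python mixin composition, not the FastMCP API.
from typing import Callable, Dict


def tool(func: Callable) -> Callable:
    """Mark a method as an exposed tool (hypothetical decorator)."""
    func._is_tool = True
    return func


class FunctionsMixin:
    @tool
    def functions_list(self) -> str:
        return "GET /functions"  # placeholder for a real HTTP call


class DataMixin:
    @tool
    def data_list_strings(self) -> str:
        return "GET /strings"  # placeholder path, assumed for the example


class Server(FunctionsMixin, DataMixin):
    """Composes the mixins and collects every marked method."""

    def tools(self) -> Dict[str, Callable]:
        found = {}
        for name in dir(self):
            attr = getattr(self, name)
            if callable(attr) and getattr(attr, "_is_tool", False):
                found[name] = attr
        return found


server = Server()
print(sorted(server.tools()))
```

Each domain lives in its own class, and the composed server only has to enumerate what the mixins contribute.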
### Why a REST intermediary
A direct JVM-to-MCP bridge sounds simpler, but Ghidra's runtime imposes real constraints. The JVM uses OSGi classloading, the scripting environment is Jython (Python 2.7), and Ghidra's internal APIs are not designed for external consumption. HTTP sidesteps all of this. The Ghidra plugin speaks HTTP; the Python server speaks MCP. Each layer uses the language and runtime best suited to its job.
This separation also enables multi-instance support. Multiple Ghidra processes can run on different ports, each analyzing a different binary, and the MCP server routes requests to the right one. If the REST layer were baked into the MCP transport, this routing would be much harder.
Finally, the REST layer is language-independent. The Python server could be replaced with a Go or Rust implementation without touching the Ghidra plugin. This is not a theoretical benefit -- it means the plugin's API is usable outside of MCP entirely.
## HATEOAS Design
Many APIs that call themselves RESTful skip the hypermedia constraint. MCGhidra does not. Every response from the Ghidra plugin includes `_links` pointing to related resources.
A request to `GET /functions/0x401000` returns the function metadata along with links to decompile it, disassemble it, list its variables, and find cross-references. The client follows links rather than constructing URLs from templates.
This matters more for MCP agents than for human users. An agent that follows links does not need to memorize URL patterns or learn the API's layout upfront. It reads a response, sees what actions are available, and picks the relevant one. The API is self-describing at every step.
The practical effect: when the Ghidra plugin adds a new capability, the agent can discover and use it without any changes to the MCP server -- as long as the server forwards the link.
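Link following can be sketched as below. Only the `_links` key comes from the description above; the relation names, the `href` field, and the URLs in the sample response are assumptions made for illustration.

```python
# Hypothetical response shape: only the `_links` key is taken from the docs;
# relation names and hrefs are invented for this example.
from typing import Any, Dict, List

sample_response: Dict[str, Any] = {
    "result": {"name": "main", "address": "0x401000"},
    "_links": {
        "self": {"href": "/functions/0x401000"},
        "decompile": {"href": "/functions/0x401000/decompile"},
        "xrefs": {"href": "/xrefs?to_addr=0x401000"},
    },
}


def available_actions(response: Dict[str, Any]) -> List[str]:
    """List the link relations a response advertises."""
    return sorted(response.get("_links", {}))


def follow(response: Dict[str, Any], rel: str) -> str:
    """Return the URL for a relation instead of building it from a template."""
    try:
        return response["_links"][rel]["href"]
    except KeyError:
        raise LookupError(f"response does not advertise '{rel}'") from None


print(available_actions(sample_response))
print(follow(sample_response, "decompile"))
```

The client never concatenates an address into a path; a relation the server stops advertising surfaces as a `LookupError` rather than a broken URL.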
## Session Isolation
Each MCP client gets a session ID, derived from the FastMCP context. This ID scopes all stateful operations.
Pagination cursors are session-scoped. If two clients are paging through the same function list, their cursors are independent -- advancing one does not affect the other. Docker containers track which session started them, and `docker_stop` validates ownership before killing a container. One client cannot shut down another client's analysis session.
`docker_cleanup` follows the same rule. It only removes containers and port locks belonging to the calling session, unless explicitly asked to clean up orphans.
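The ownership rule can be sketched with an in-memory registry. The registry layout, container names, and function names here are hypothetical; the sketch only shows the check-before-stop logic described above.

```python
# Sketch under assumptions: registry layout and function names are hypothetical.
from typing import Dict, List

# container name -> owning session ID
owners: Dict[str, str] = {
    "ghidra-8192": "session-a",
    "ghidra-8193": "session-b",
}


def stop_container(name: str, session_id: str) -> None:
    """Refuse to stop a container started by a different session."""
    owner = owners.get(name)
    if owner is None:
        raise KeyError(f"unknown container: {name}")
    if owner != session_id:
        raise PermissionError(f"{name} belongs to another session")
    del owners[name]  # a real server would also stop the container here


def cleanup(session_id: str) -> List[str]:
    """Remove only the containers owned by the calling session."""
    mine = [name for name, owner in owners.items() if owner == session_id]
    for name in mine:
        stop_container(name, session_id)
    return mine
```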
## Port Pooling
When Docker provisioning starts a new container, it needs a host port to map to the container's HTTP API. Ports come from a configurable pool, defaulting to 8192-8319 (128 ports).
Allocation uses `flock`-based file locking. Each port has a lock file, and the allocator takes an exclusive lock before assigning it. This is safe across multiple processes -- if two MCP servers run on the same host, they will not collide.
The `PortPool` is lazy. It is not created until the first Docker operation that needs a port. If a user never touches Docker, the lock directory is never created and no background work occurs.
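A minimal sketch of `flock`-based allocation follows. The lock directory, the `<port>.lock` naming, and the `allocate_port` function are assumptions rather than MCGhidra's actual layout, and `fcntl` limits the example to Unix-like systems.

```python
# Sketch only: lock directory and file naming are assumptions.
# Uses fcntl, so it runs on Unix-like systems only.
import fcntl
import os
import tempfile
from typing import IO, Optional, Tuple


def allocate_port(
    lock_dir: str, start: int = 8192, end: int = 8319
) -> Optional[Tuple[int, IO[str]]]:
    """Return (port, open lock file) for the first port whose lock is free."""
    os.makedirs(lock_dir, exist_ok=True)  # created lazily, on first use
    for port in range(start, end + 1):
        handle = open(os.path.join(lock_dir, f"{port}.lock"), "w")
        try:
            # Non-blocking exclusive lock: fails at once if someone holds it.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            handle.close()
            continue
        return port, handle  # keep the handle open while the port is in use
    return None


lock_dir = tempfile.mkdtemp()
first = allocate_port(lock_dir)
second = allocate_port(lock_dir)
print(first[0], second[0])
```

Closing the returned file handle releases the lock, so a crashed process frees its ports automatically when the kernel closes its descriptors.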
A background discovery thread scans the port range every 30 seconds, probing each port with a 0.5-second timeout. This is how the server finds Ghidra instances that were started outside of MCGhidra -- manually launched containers, or Ghidra instances running the plugin natively.
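A single probe with a 0.5-second timeout might look like the sketch below. It only checks that a TCP connection is accepted; whether the real discovery code also issues an HTTP request is not stated here, and the function name is hypothetical.

```python
# Sketch only: a plain TCP connect stands in for the real probe.
import socket


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


# Demo against a throwaway local listener instead of a Ghidra instance
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

print(is_port_open("127.0.0.1", demo_port))
listener.close()
print(is_port_open("127.0.0.1", demo_port))
```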
## Non-Blocking Design
The MCP server runs an asyncio event loop. Blocking that loop would freeze all connected clients. MCGhidra avoids this in several ways.
All Docker subprocess calls (`docker run`, `docker stop`, `docker logs`) run in thread pool executors via `asyncio.to_thread`. The event loop stays responsive while containers start, stop, or produce output.
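The pattern can be sketched as follows. A short Python subprocess stands in for `docker run` so the example works without Docker installed; the `run_command` helper is a hypothetical name, while `asyncio.to_thread` and `subprocess.run` are standard library calls.

```python
# Sketch: a harmless Python subprocess stands in for a docker command.
import asyncio
import subprocess
import sys
from typing import List


async def run_command(cmd: List[str]) -> str:
    """Run a blocking subprocess in a worker thread and return its stdout."""
    completed = await asyncio.to_thread(
        subprocess.run, cmd, capture_output=True, text=True, check=True
    )
    return completed.stdout.strip()


async def main() -> List[str]:
    slow = [sys.executable, "-c",
            "import time; time.sleep(0.2); print('container started')"]
    fast = [sys.executable, "-c", "print('status ok')"]
    # Both calls run concurrently; the event loop stays free while they wait.
    return list(await asyncio.gather(run_command(slow), run_command(fast)))


results = asyncio.run(main())
print(results)
```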
`instances_use` is lazy. When a client switches to a new Ghidra instance, the server creates a stub immediately and returns. It does not validate the connection until the first real tool call against that instance. This avoids the situation where a slow or unreachable Ghidra instance blocks the `instances_use` call for minutes.
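Deferred validation can be sketched like this. The `LazyInstance` class and the `connect` callable are hypothetical; `connect` stands in for whatever performs the real handshake with Ghidra.

```python
# Sketch: class and parameter names are hypothetical.
from typing import Callable, List, Optional


class LazyInstance:
    """Records the target port now and validates the connection on first use."""

    def __init__(self, port: int, connect: Callable[[int], str]) -> None:
        self.port = port
        self._connect = connect
        self._info: Optional[str] = None  # filled in by the first real call

    def info(self) -> str:
        if self._info is None:
            self._info = self._connect(self.port)  # may be slow or raise
        return self._info


calls: List[int] = []


def fake_connect(port: int) -> str:
    calls.append(port)
    return f"ghidra@{port}"


instance = LazyInstance(8195, fake_connect)  # returns at once, no I/O yet
print(calls)
print(instance.info())
print(calls)
```

The cost of a slow or unreachable instance moves from the switch to the first tool call, where a timeout can be reported against the operation that actually needed the connection.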
`docker_auto_start` returns as soon as the container is running. It does not wait for Ghidra to finish loading and analyzing the binary -- that can take minutes for large files. The client is expected to poll `docker_health` until the API responds.
The background port discovery thread runs on its own schedule and never blocks the event loop. It updates the instance list atomically, so clients always see a consistent snapshot.


@ -0,0 +1,39 @@
---
title: Prior Art
description: Acknowledgments and related projects
---
MCGhidra builds on the work of many people and projects. This page gives credit where it is due.
## Ghidra
NSA released Ghidra as open source in 2019 after years of internal development. MCGhidra would not exist without the decade of investment the agency put into building a full-featured analysis engine. The decompiler alone represents person-years of work on intermediate representations, type inference, and control flow recovery. The fact that it runs headless, supports scripting, and handles dozens of processor architectures out of the box made this project feasible.
Ghidra is available at [ghidra-sre.org](https://ghidra-sre.org/).
## GhidraMCP by Laurie Wired
The direct inspiration. [Laurie Wired's GhidraMCP](https://github.com/LaurieWired/GhidraMCP/) demonstrated that connecting Ghidra to the Model Context Protocol was viable and useful. MCGhidra started as a fork of her project and evolved into a different architecture -- a HATEOAS REST intermediary, multi-instance support, Docker provisioning, cursor-based pagination -- but the core idea of letting an MCP agent drive Ghidra traces back to her work. The proof of concept she built made the case that this was worth pursuing further.
## FastMCP
[FastMCP](https://github.com/jlowin/fastmcp) by Jeremiah Lowin is the Python framework that MCGhidra's MCP server is built on. Its decorator-based tool registration and mixin composition pattern made it practical to organize 64+ tools into maintainable domain modules. The `Context` system for session isolation and progress reporting is central to how MCGhidra handles multi-client scenarios. FastMCP removed a large amount of boilerplate that would otherwise dominate the codebase.
## HATEOAS and REST
The Hypermedia as the Engine of Application State constraint comes from Roy Fielding's 2000 dissertation, where he formalized the REST architectural style. Most APIs that call themselves RESTful ignore this constraint. MCGhidra embraces it because agents benefit from self-describing responses -- when every result includes `_links` to related resources, the agent does not need to memorize URL patterns or maintain a hardcoded API map.
## Model Context Protocol
Anthropic's [MCP specification](https://modelcontextprotocol.io/) provides the transport layer between MCGhidra and its clients. The protocol's tool/resource/prompt abstraction maps naturally to reverse engineering workflows: tools for mutating operations like renaming symbols, resources for read-only enumeration like listing functions, and prompts for guided analysis workflows.
## Related Projects
MCGhidra is part of a broader ecosystem of projects bridging reverse engineering tools with external interfaces. Notable related work includes:
- Binary Ninja MCP servers that expose BN's API over the same protocol
- IDA Pro scripting bridges that have connected IDA to external tools for years
- Radare2 and rizin automation frameworks, which pioneered the idea of a scriptable RE command interface
- The growing community of MCP server authors connecting domain-specific tools to language model agents
Each of these projects informed the design decisions in MCGhidra, whether by example or by contrast. The RE tooling community has a long history of building bridges between analysis engines and the outside world -- MCGhidra is one more entry in that tradition.


@ -19,7 +19,13 @@ This installs the MCP server and bundles the Ghidra plugin JAR. No separate plug
## MCP Client Configuration
### Claude Desktop
Add to your Claude Desktop configuration file:
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
- **Linux**: `~/.config/Claude/claude_desktop_config.json`
```json
{
@ -32,6 +38,18 @@ Add MCGhidra to your MCP client configuration. For Claude Code:
}
```
### Claude Code
```bash
claude mcp add mcghidra -- uvx mcghidra
```
Or if running from a local clone:
```bash
claude mcp add mcghidra -- uv run --directory /path/to/MCGhidra mcghidra
```
## Docker Setup (Optional)
If you want automatic container provisioning:


@ -5,6 +5,15 @@ description: What MCGhidra does and how the pieces fit together
MCGhidra is a two-part system that bridges NSA's [Ghidra](https://ghidra-sre.org/) reverse engineering framework with [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) clients.
## What You Get
- **64 MCP tools** across 12 categories (functions, data, structs, symbols, analysis, memory, variables, bookmarks, enums, namespaces, segments, Docker)
- **13 analysis prompts** for guided RE workflows (malware triage, crypto identification, auth bypass hunting, protocol analysis, and more)
- **19 MCP resources** for quick enumeration without tool calls
- **Cursor-based pagination** for handling binaries with 100K+ functions
- **Server-side grep** filtering before results hit the wire
- **Docker provisioning** with automatic port pooling and session isolation
## Components
### Ghidra Plugin (Java)
@ -28,6 +37,24 @@ A [FastMCP](https://github.com/jlowin/fastmcp) server that wraps the REST API as
- **Raw firmware support** -- specify processor language, base address, and loader for binary blobs
- **Session isolation** -- each MCP client gets its own session ID, preventing cross-talk
## How the Pieces Connect
```
┌──────────────┐     MCP      ┌──────────────┐     HTTP     ┌──────────────┐
│  MCP Client  │◄────────────►│   MCGhidra   │◄────────────►│   Ghidra     │
│  (Claude,    │    stdio     │   Python     │   REST API   │   Plugin     │
│   Cursor,    │              │   Server     │  (HATEOAS)   │   (Java)     │
│   etc.)      │              │              │              │              │
└──────────────┘              └──────────────┘              └──────────────┘
                               ┌─────┴─────┐
                               │  Docker   │
                               │  Engine   │
                               └───────────┘
```
The MCP client communicates with MCGhidra over stdio using the Model Context Protocol. MCGhidra translates tool calls into HTTP requests against the Ghidra plugin's REST API. When Docker is available, MCGhidra can also provision and manage Ghidra containers automatically.
## Typical Workflow
1. **Start Ghidra** -- either via Docker (`docker_auto_start`) or by running the plugin in an existing Ghidra instance


@ -0,0 +1,158 @@
---
title: Cursor Pagination
description: Working with large binaries using cursor-based pagination
---
Large binaries can contain tens of thousands of functions, hundreds of thousands of cross-references, and thousands of strings. Returning all of that in a single tool response would overflow the MCP client's context window and produce unusable output. MCGhidra uses cursor-based pagination to deliver results in controlled pages.
## How it works
When a query matches more items than fit on one page, the response includes a `cursor_id`. Pass that cursor ID to `cursor_next` to get the next page. Continue until `has_more` is `false`.
```
# First page: get 100 functions matching a pattern
result = functions_list(page_size=100, grep="crypt|hash")
# Returns:
# {
#   result: [...],
#   pagination: {
#     cursor_id: "a1b2c3d4e5f67890",
#     total_count: 12847,
#     filtered_count: 847,
#     page_size: 100,
#     current_page: 1,
#     total_pages: 9,
#     has_more: true
#   }
# }
# Next page
result = cursor_next(cursor_id="a1b2c3d4e5f67890")
# Returns page 2 of 9
# Continue until has_more is false
result = cursor_next(cursor_id="a1b2c3d4e5f67890")
# ...
```
Each response also includes a `_message` field with a human-readable summary like "Showing 100 of 847 items (page 2/9). To get the next 100 items, call: cursor_next(cursor_id='a1b2c3d4e5f67890')". MCP clients use this to decide whether to continue fetching.
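The paging loop can be written once as a generator. In this sketch, `make_page` and `next_page` are in-memory stand-ins for `functions_list` and `cursor_next`; only the `result`, `pagination`, `cursor_id`, and `has_more` fields are taken from the response shape shown above.

```python
# Sketch: the fake server below replaces the real MCP tool calls.
from typing import Any, Callable, Dict, Iterator, List

Page = Dict[str, Any]


def iter_items(first: Page, fetch_next: Callable[[str], Page]) -> Iterator[Any]:
    """Yield every item, following the cursor until has_more is false."""
    page = first
    while True:
        yield from page["result"]
        if not page["pagination"]["has_more"]:
            return
        page = fetch_next(page["pagination"]["cursor_id"])


# Fake server: 5 items, 2 per page
ITEMS: List[str] = ["f1", "f2", "f3", "f4", "f5"]
position = {"offset": 0}


def make_page() -> Page:
    start = position["offset"]
    position["offset"] = start + 2
    return {
        "result": ITEMS[start:start + 2],
        "pagination": {"cursor_id": "demo", "has_more": start + 2 < len(ITEMS)},
    }


def next_page(cursor_id: str) -> Page:
    return make_page()


all_items = list(iter_items(make_page(), next_page))
print(all_items)
```

Checking `has_more` after yielding the page ensures the final page is processed before the loop ends.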
## Server-side grep filtering
The `grep` parameter filters results on the server before pagination. This is much more efficient than fetching everything and filtering client-side, because only matching items are stored in the cursor and counted toward page totals.
```
# Only functions with "auth" or "login" in their name/address
functions_list(grep="auth|login", page_size=50)
# Case-sensitive search (grep_ignorecase defaults to True)
data_list_strings(grep="BEGIN CERTIFICATE", grep_ignorecase=False, page_size=50)
```
The grep pattern is a regular expression. It matches against all string values in each result item -- for a function, that means the name, address, and signature fields are all searched.
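Matching against every string value in an item could be implemented roughly as below. The function name and the sample records are illustrative assumptions; the real server's filter may differ in detail.

```python
# Sketch: hypothetical filter that searches all string-valued fields.
import re
from typing import Any, Dict, List


def grep_items(
    items: List[Dict[str, Any]], pattern: str, ignorecase: bool = True
) -> List[Dict[str, Any]]:
    """Keep items where any string field matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE if ignorecase else 0)
    return [
        item
        for item in items
        if any(isinstance(v, str) and regex.search(v) for v in item.values())
    ]


functions = [
    {"name": "check_auth", "address": "00401000", "signature": "int check_auth(char *)"},
    {"name": "FUN_00402000", "address": "00402000", "signature": "void FUN_00402000(void)"},
    {"name": "do_LOGIN", "address": "00403000", "signature": "int do_LOGIN(void)"},
]

print([f["name"] for f in grep_items(functions, "auth|login")])
print([f["name"] for f in grep_items(functions, "auth|login", ignorecase=False)])
```

Because addresses are searched too, a pattern such as `00402` selects a function by location as well as by name.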
### Pattern safety
Patterns are validated before execution to prevent runaway matches:
- Maximum 500 characters
- Maximum 15 repetition operators (`*`, `+`, `?`, `{n,m}`)
- Nested quantifiers like `(a+)+` are rejected
If a pattern fails validation, the tool returns an error with code `INVALID_GREP_PATTERN` explaining what to fix.
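The three limits can be approximated with a small validator. This is a rough sketch, not MCGhidra's actual check: the function name, the heuristics, and the error strings are assumptions, and the operator count deliberately ignores escaping, so `\+` is counted as a quantifier.

```python
# Rough sketch of the documented limits; heuristics are assumptions.
import re
from typing import Optional

MAX_LENGTH = 500
MAX_REPETITIONS = 15
# Counts *, +, ? and {n}, {n,}, {n,m}; does not account for escapes.
REPETITION = re.compile(r"[*+?]|\{\d+(?:,\d*)?\}")
# A group containing * or + that is itself followed by a quantifier.
NESTED = re.compile(r"\([^()]*[*+][^()]*\)[*+{]")


def validate_pattern(pattern: str) -> Optional[str]:
    """Return None if the pattern passes, otherwise a reason string."""
    if len(pattern) > MAX_LENGTH:
        return "pattern longer than 500 characters"
    if len(REPETITION.findall(pattern)) > MAX_REPETITIONS:
        return "more than 15 repetition operators"
    if NESTED.search(pattern):
        return "nested quantifier"
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"invalid regex: {exc}"
    return None


print(validate_pattern("crypt|hash"))
print(validate_pattern("(a+)+"))
```

Rejecting patterns up front is cheaper than running them under a timeout, at the cost of refusing some patterns that would have been harmless.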
## The return_all option
When you need all matching results without paging through cursors, pass `return_all=True`:
```
functions_list(grep="crypt", return_all=True)
```
This bypasses pagination and returns every matching item in a single response. There is a token budget guard (default: 8,000 estimated tokens) that kicks in if the response would be too large. When the guard triggers, the response includes:
- A sample of the first 3 items
- The available field names
- Suggested narrower queries (grep patterns, field projections, or pagination)
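A budget guard of this kind might work as sketched here. The four-characters-per-token estimate, the function names, and the keys of the fallback response are assumptions; only the 8,000-token default and the three-item sample come from the description above.

```python
# Sketch: the token estimate and response keys are assumptions.
import json
from typing import Any, Dict, List

TOKEN_BUDGET = 8000   # documented default
CHARS_PER_TOKEN = 4   # crude estimate, assumed for this example


def estimate_tokens(payload: Any) -> int:
    """Estimate token count from the length of the JSON encoding."""
    return len(json.dumps(payload)) // CHARS_PER_TOKEN


def guard(items: List[Dict[str, Any]], budget: int = TOKEN_BUDGET) -> Dict[str, Any]:
    """Return everything if it fits, otherwise a sample plus guidance."""
    if estimate_tokens(items) <= budget:
        return {"result": items}
    return {
        "error": "response too large",
        "sample": items[:3],
        "available_fields": sorted(items[0].keys()),
        "hint": "narrow with grep, fields, or pagination",
    }


small = [{"name": "main", "address": "00401000"}]
big = [{"name": f"FUN_{i:08x}", "address": f"{i:08x}"} for i in range(2000)]

print(list(guard(small)))
print(guard(big)["available_fields"])
```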
Combine `return_all` with `grep` and `fields` to keep the response size down:
```
# Get all crypto-related function names and addresses (nothing else)
functions_list(grep="crypt|aes|sha", fields=["name", "address"], return_all=True)
```
## Page size
The `page_size` parameter controls how many items each page contains.
| Parameter | Default | Maximum |
|-----------|---------|---------|
| `page_size` | 50 | 500 |
For most MCP client contexts, 50-100 items per page is a good balance between making progress and keeping individual responses readable. Going above 200 is rarely useful unless you are scripting.
## Cursor lifecycle
### TTL and eviction
Cursors expire after 5 minutes of inactivity (no `cursor_next` calls). The timer resets each time a cursor is accessed.
When more than 100 cursors exist for a session, the least-recently-used cursor is evicted to make room. In practice, you will rarely hit this limit unless you start many queries without finishing them.
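The combination of an inactivity TTL and least-recently-used eviction can be sketched with an `OrderedDict`. The `CursorStore` class is a hypothetical model of one session's cursors; the clock is injectable so the example does not need to wait five minutes.

```python
# Sketch: hypothetical per-session store with TTL and LRU eviction.
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class CursorStore:
    def __init__(
        self,
        ttl: float = 300.0,
        max_cursors: int = 100,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._max = max_cursors
        self._clock = clock
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()

    def put(self, cursor_id: str, state: Any) -> None:
        self._entries[cursor_id] = (self._clock(), state)
        self._entries.move_to_end(cursor_id)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used

    def get(self, cursor_id: str) -> Optional[Any]:
        entry = self._entries.get(cursor_id)
        if entry is None:
            return None
        last_access, state = entry
        now = self._clock()
        if now - last_access > self._ttl:
            del self._entries[cursor_id]  # expired through inactivity
            return None
        self._entries[cursor_id] = (now, state)  # access resets the timer
        self._entries.move_to_end(cursor_id)
        return state


now = [0.0]
store = CursorStore(ttl=300, max_cursors=2, clock=lambda: now[0])
store.put("a", "page 1")
store.put("b", "page 1")
now[0] = 100.0
store.get("a")            # touching "a" makes "b" the oldest entry
store.put("c", "page 1")  # exceeds the limit, so "b" is evicted
print(store.get("b"))
```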
### Session isolation
Each MCP client session gets its own set of cursors. You cannot access or interfere with another session's cursors. Session IDs are derived from the MCP client context -- they are not user-controllable.
### Management tools
| Tool | What it does |
|------|-------------|
| `cursor_list()` | Show all active cursors for the current session: IDs, page progress, TTL remaining, grep pattern |
| `cursor_delete(cursor_id="...")` | Delete a specific cursor to free memory |
| `cursor_delete_all()` | Delete all cursors for the current session |
These are useful for cleanup during long analysis sessions or when you want to re-run a query from scratch.
## Example: scanning all strings for credentials
```
# Start with a broad credential search
result = data_list_strings(grep="password|secret|key|token|api_key|credential", page_size=100)
# Process first page of results
# ... examine the strings ...
# Get more if there are additional pages
if result["pagination"]["has_more"]:
    result = cursor_next(cursor_id=result["pagination"]["cursor_id"])
```
## Example: iterating through all functions matching a pattern
```
# First page
result = functions_list(grep="handle_|process_|parse_", page_size=50)
# Loop through pages
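while True:
    # Decompile interesting functions from this page
    for func in result["result"]:
        if looks_relevant(func):  # looks_relevant is your own selection logic
            functions_decompile(name=func["name"])
    # Stop after the last page, otherwise advance
    if not result["pagination"]["has_more"]:
        break
    result = cursor_next(cursor_id=result["pagination"]["cursor_id"])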
```
## Tips
- Prefer server-side `grep` over fetching everything. A query like `functions_list(grep="ssl")` is far cheaper than `functions_list(return_all=True)` followed by manual filtering.
- Use `fields` to reduce response size. If you only need names and addresses, `functions_list(fields=["name", "address"], page_size=100)` cuts the per-item size significantly.
- Small page sizes (50-100) keep individual responses from consuming too much context. You can always fetch more pages.
- If a cursor expires (5-minute TTL), just re-run the original query. The cursor IDs are not reusable -- you get a new one each time.
- For very large binaries (100K+ functions), start with grep-filtered queries rather than listing everything. Even paginated, iterating through 2,000 pages of 50 items each is slow and rarely what you actually need.


@ -0,0 +1,257 @@
---
title: Troubleshooting
description: Common issues and solutions when using MCGhidra
---
## Container Issues
### Container will not start
Check that the binary path is correct and accessible from the Docker daemon. The path you pass to `docker_auto_start` must exist on the host machine, and the Docker volume mount must be able to reach it.
```
docker_auto_start(binary_path="/path/to/binary")
```
If this fails, verify:
- The file exists at the specified path
- The `mcghidra:latest` Docker image is built (run `docker_status()` to check)
- Docker is running and your user has permission to access it
### Health check timeouts
Analysis takes time. A small binary (under 1 MB) typically finishes in about 20 seconds. Larger binaries -- especially firmware images or complex C++ programs -- can take several minutes.
Poll `docker_health` to check readiness:
```
docker_health(port=8195)
```
While waiting, check what Ghidra is doing:
```
docker_logs(port=8195)
```
If you see Ghidra import and analysis messages in the logs but the health check never succeeds, the analysis is still running. If the logs show errors or the container has exited, the import likely failed (see "Import failed" below).
### Port conflicts
MCGhidra allocates ports from a pool (default 8192-8319). If another application is using a port in this range, the allocator skips it. If you run many concurrent containers and exhaust the pool, `docker_auto_start` will report that no ports are available.
Check current allocations with:
```
docker_status()
```
You can adjust the port range with environment variables:
| Variable | Default |
|----------|---------|
| `MCGHIDRA_PORT_START` | `8192` |
| `MCGHIDRA_PORT_END` | `8319` |
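Reading the range from the environment could look like the sketch below. The variable names and defaults are those in the table; the `port_range` function and its validation rules are assumptions.

```python
# Sketch: hypothetical helper; variable names and defaults come from the table.
import os
from typing import Mapping, Tuple


def port_range(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Return (start, end) of the port pool, inclusive."""
    start = int(env.get("MCGHIDRA_PORT_START", "8192"))
    end = int(env.get("MCGHIDRA_PORT_END", "8319"))
    if not (0 < start <= end <= 65535):
        raise ValueError(f"invalid port range: {start}-{end}")
    return start, end


print(port_range({}))
print(port_range({"MCGHIDRA_PORT_START": "9000", "MCGHIDRA_PORT_END": "9015"}))
```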
### Viewing container logs
```
docker_logs(port=8195, tail=200)
```
This shows stdout and stderr from the Ghidra headless process. Look for lines containing `ERROR`, `WARN`, or `Exception` to diagnose import or analysis failures.
---
## Connection Issues
### "No Ghidra instance specified"
This means no current instance is set. First, discover available instances, then select one:
```
instances_list()
instances_use(port=8195)
```
If `instances_list` returns no instances, either no Ghidra process is running or it is on a port outside the discovery range.
### Instance not found after starting a container
`docker_auto_start` returns a port, but the MCP server does not automatically register it as the current instance. You need to call:
```
instances_use(port=8195)
```
If `instances_list` does not show the container, the API may not be ready yet. Poll `docker_health` first.
### API version mismatch
If you see version mismatch errors, the Ghidra plugin is older than the MCP server expects. The current server expects API v2. Update the plugin by rebuilding the Docker image or installing the latest MCGhidra release.
### Timeout on first tool call after instances_use
`instances_use` is lazy -- it creates a stub entry without connecting to Ghidra. The first real tool call (like `functions_list`) validates the connection. If Ghidra is not ready yet, that call will time out.
Wait for `docker_health` to report healthy before calling `instances_use`.
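A client-side wait loop might be structured as follows. The `check` callable is a hypothetical stand-in for a `docker_health` call that returns `True` once the API responds; the timeout and interval defaults are illustrative, not documented values.

```python
# Sketch: `check` stands in for a docker_health call; defaults are illustrative.
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until check() succeeds or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo: the fake check reports healthy on the third attempt
attempts = iter([False, False, True])
print(wait_until_healthy(lambda: next(attempts), timeout=5.0, interval=0.0))
```

Only after this returns `True` is it safe to call `instances_use` and expect the first tool call to succeed.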
---
## Analysis Issues
### Import failed
Raw binaries (firmware, bootloaders) need the `language` parameter to tell Ghidra which processor architecture to use. Without it, Ghidra tries to auto-detect the format and will fail on headerless files.
```
docker_auto_start(
    binary_path="/path/to/firmware.bin",
    language="ARM:LE:32:v4t",
    base_address="0x00000000"
)
```
Check the logs if auto-import fails:
```
docker_logs(port=8195)
```
Common causes:
- Missing `language` for raw binaries
- Incorrect base address
- Corrupted or truncated binary file
- Unsupported file format (check with the `file` command on the host)
### OSGi bundle error
This is a known Ghidra limitation that can occur with certain script configurations. It appears as "Failed to get OSGi bundle" in the container logs. It does not usually affect analysis results -- the API still functions. If it blocks operation, rebuilding the Docker image with the latest scripts resolves it in most cases.
### Analysis incomplete
If decompiled output looks wrong (missing function boundaries, incorrect types), Ghidra's auto-analysis may not have finished or may need a second pass:
```
analysis_run()
```
This triggers a full re-analysis of the current program. It can take a while on large binaries.
### Decompilation timeout
For very large or complex functions, the decompiler can take longer than the default timeout. If `functions_decompile` times out, the function may have deeply nested loops, heavy inlining, or obfuscated control flow.
Try disassembly instead for a faster view:
```
functions_disassemble(address="00401234")
```
---
## Pagination Issues
### Cursor expired
Cursors have a 5-minute inactivity TTL. If you wait too long between `cursor_next` calls, the cursor is deleted. Re-run the original query to get a fresh cursor:
```
functions_list(grep="crypt", page_size=100)
```
See [Cursor Pagination](/guides/cursor-pagination/) for details on cursor lifecycle.
### Context window overflow
If tool responses are consuming too much context, reduce the page size:
```
functions_list(page_size=25, grep="your_pattern")
```
Use `fields` to limit which fields are returned:
```
functions_list(page_size=50, fields=["name", "address"])
```
And always prefer `grep` to filter results before they reach the client.
### "Session spoofing" errors
Session IDs are derived from the MCP client context and cannot be set manually. If you see session-related errors, it means a cursor belongs to a different MCP session. Each session (each Claude conversation, for example) has its own isolated cursor space.
---
## Docker-Specific Issues
### docker_auto_start appears to hang
`docker_auto_start` returns immediately after starting the container. It does not wait for analysis to complete. If it seems to hang, the issue is likely Docker itself taking time to pull or start the container. Check:
```
docker_status()
```
### Cross-session interference
Each MCP session has a unique session ID. Docker containers are tagged with their owning session. `docker_stop` validates that the container belongs to your session before stopping it. You cannot stop another session's container.
If you need to clean up containers from a previous session that is no longer active, use:
```
docker_cleanup(session_only=False)
```
Be careful with this -- it removes all orphaned MCGhidra containers, not just yours.
### Stale containers
If containers from previous sessions are still running, they consume ports from the pool. Use `docker_cleanup()` (which defaults to `session_only=True`) to clean up your own stale containers, or `docker_cleanup(session_only=False)` to remove all orphaned containers.
### Build failures
If `docker_build()` fails, make sure:
- The Dockerfile context is correct (it needs both the `docker/` directory and the project root)
- Docker has enough disk space
- The base Ghidra image layers download successfully (network access required for first build)
---
## Debug Mode
Set the `MCGHIDRAMCP_DEBUG` environment variable before starting the MCP server to enable verbose logging:
```bash
MCGHIDRAMCP_DEBUG=1 uvx mcghidra
```
Or in your MCP client configuration:
```json
{
  "mcpServers": {
    "mcghidra": {
      "command": "uvx",
      "args": ["mcghidra"],
      "env": {
        "MCGHIDRAMCP_DEBUG": "1"
      }
    }
  }
}
```
Debug output goes to stderr and includes:
- Instance discovery attempts and results
- HTTP request/response details for Ghidra API calls
- Cursor creation, access, and expiration events
- Docker container lifecycle events
- Port pool allocation and release
Check the MCP server's stderr output in your terminal or in the MCP client's server log viewer.

@ -0,0 +1,236 @@
---
title: Analysis Workflows
description: Common reverse engineering workflows with MCGhidra
---
These workflows assume you have MCGhidra installed and configured as described in the [Installation guide](/getting-started/installation/).
## Triage a Binary
This is the fastest way to get oriented in an unknown binary: start a container, wait for Ghidra to finish analysis, then survey the surface area.
### 1. Start analysis
```
docker_auto_start(binary_path="/path/to/target.exe")
```
This returns immediately with a port number. It does not block while Ghidra runs.
### 2. Wait for analysis to complete
Poll until the HTTP API responds:
```
docker_health(port=8195)
```
For a small binary (under 1 MB), expect about 20 seconds. Larger binaries can take several minutes. Check `docker_logs(port=8195)` while waiting to see Ghidra's progress.
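The polling step can be scripted. The following is a minimal sketch, assuming a caller-supplied `check` callable that stands in for a `docker_health` call; the helper name `wait_until_ready` and its parameters are illustrative and not part of MCGhidra.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` until it returns True or `timeout` seconds elapse.

    `check` is a hypothetical stand-in for a docker_health(port=...) call
    that reports whether the HTTP API is answering yet.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Simulated health check: fails twice, then succeeds.
_responses = iter([False, False, True])
ready = wait_until_ready(lambda: next(_responses), timeout=5.0, interval=0.0)
```

A generous timeout suits large binaries, since analysis can run for several minutes before the API answers.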
### 3. Set the instance as current
```
instances_use(port=8195)
```
After this, every tool call defaults to this instance. No need to pass `port` again.
### 4. Get the program overview
```
program_info()
functions_list(page_size=100)
data_list_strings(page_size=100)
```
`program_info` returns architecture, compiler, and image base address. The function and string listings give a first sense of scale and naming conventions.
### 5. Search for interesting patterns
Use `grep` to find functions and strings related to security-sensitive behavior:
```
functions_list(grep="password|key|auth|crypt|login|verify", page_size=100)
data_list_strings(grep="password|secret|key|token|credential", page_size=100)
```
From here, decompile anything that looks relevant and follow cross-references to understand the surrounding logic.
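To preview what a `grep` pattern will match before running it against a large binary, the same alternation can be tried locally. This sketch uses Python's `re` module with case-insensitive matching; the function names are made up, and MCGhidra's own matching may differ in detail.

```python
import re

# Same alternation as the functions_list example, matched case-insensitively.
pattern = re.compile(r"password|key|auth|crypt|login|verify", re.IGNORECASE)

# Made-up function names, as they might appear in a listing.
names = ["FUN_00401234", "CheckPassword", "AES_encrypt", "main", "get_hotkey_state"]

# Keep every name that contains one of the alternatives anywhere in it.
matches = [name for name in names if pattern.search(name)]
```

Short alternatives such as `key` also match unrelated names like `get_hotkey_state`, so expect some false positives and tighten the pattern if the result set is noisy.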
---
## Rename and Annotate Loop
Ghidra auto-analysis produces generic names like `FUN_00401234`. As you reverse engineer, renaming functions and adding comments makes the decompiled output progressively easier to read.
### 1. Decompile a function
```
functions_decompile(address="00401234")
```
### 2. Identify what it does
Read the pseudocode. Look at string references, called functions, and parameter usage to determine the function's purpose.
### 3. Rename it
```
functions_rename(address="00401234", new_name="validate_user_credentials")
```
### 4. Set the signature
If you can determine the parameter types and return type:
```
functions_set_signature(
    address="00401234",
    signature="int validate_user_credentials(char *username, char *password)"
)
```
### 5. Add a comment
```
functions_set_comment(
    address="00401234",
    comment="Checks username/password against the SQLite user table. Returns 1 on success."
)
```
### 6. Re-decompile
```
functions_decompile(address="00401234")
```
The decompiled output now uses your names, types, and annotations. Functions that call `validate_user_credentials` also show the updated name in their own decompiled output. Repeat this loop for each function you investigate.
---
## Firmware Reverse Engineering
Raw firmware (bootloaders, embedded system images, bare-metal code) requires extra setup because there is no ELF/PE header for Ghidra to parse.
### 1. Start with the right loader
Specify the processor language and base address:
```
docker_auto_start(
    binary_path="/path/to/firmware.bin",
    language="ARM:LE:32:v4t",
    base_address="0x00000000"
)
```
When `language` is set, MCGhidra uses `BinaryLoader` to map the raw bytes at the given address. See the [Installation guide](/getting-started/installation/) for a table of common language IDs.
### 2. Find the entry point
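ARM firmware typically starts with an exception vector table at address 0x00000000, but its layout depends on the core. On Cortex-M parts, the first word is the initial stack pointer and the second word holds the address of the reset handler, so decompile the address stored at offset 0x04 (with the low Thumb bit cleared), not offset 0x04 itself. On classic ARM cores such as the `ARM:LE:32:v4t` example above, the table holds instructions instead, and the reset vector is the instruction at 0x00000000:
```
memory_read(address="0x00000000", length=32, format="hex")
functions_decompile(address="0x00000000")
```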
### 3. Identify peripherals
Embedded firmware talks to hardware through memory-mapped I/O. Look for reads and writes to addresses outside the firmware's code and data regions:
```
data_list_strings(grep="UART|SPI|I2C|GPIO")
functions_list(grep="init_periph|hw_init|bsp_")
```
Constants like `0x40000000`, `0x48000000`, or `0xE000E000` (the ARM Cortex-M System Control Space, which contains the NVIC) are strong indicators of peripheral access.
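When many constants turn up in decompiled code, sorting them by address region helps separate peripheral accesses from ordinary code and data pointers. The sketch below uses the architecturally defined Cortex-M (ARMv7-M) address map; it is an assumption that the target is a Cortex-M part, and other SoCs lay out memory differently. The helper name `classify_address` is illustrative.

```python
# Architecturally defined Cortex-M (ARMv7-M) address regions, inclusive bounds.
CORTEX_M_REGIONS = [
    (0x00000000, 0x1FFFFFFF, "code"),
    (0x20000000, 0x3FFFFFFF, "sram"),
    (0x40000000, 0x5FFFFFFF, "peripheral"),
    (0x60000000, 0x9FFFFFFF, "external ram"),
    (0xA0000000, 0xDFFFFFFF, "external device"),
    (0xE0000000, 0xE00FFFFF, "private peripheral bus"),
    (0xE0100000, 0xFFFFFFFF, "vendor-specific"),
]


def classify_address(addr: int) -> str:
    """Return the name of the Cortex-M region that contains `addr`."""
    for start, end, name in CORTEX_M_REGIONS:
        if start <= addr <= end:
            return name
    raise ValueError(f"address {addr:#x} is outside the 32-bit address space")


regions = {hex(a): classify_address(a) for a in (0x40000000, 0x48000000, 0xE000E000)}
```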
### 4. Trace interrupt handlers
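Interrupt vector tables sit at fixed offsets. On Cortex-M, the table starts at the base address: the first 4-byte word is the initial stack pointer, and each word after it holds the address of a handler:
```
memory_read(address="0x00000000", length=256, format="hex")
```
Create a function at the handler address stored in each non-null entry, with the low (Thumb) bit cleared. The address below is only an example: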
```
functions_create(address="0x00000040")
functions_decompile(address="0x00000040")
```
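Decoding the table by hand is error-prone, so it can help to parse the bytes programmatically. This is a minimal sketch for a little-endian Cortex-M table, starting from raw bytes; the sample table is fabricated, and converting the output of `memory_read` into bytes is left out because its exact response format is not shown here.

```python
import struct


def parse_vector_table(raw: bytes):
    """Split a little-endian Cortex-M vector table into (initial_sp, handlers).

    `handlers` maps table offset -> handler address with the Thumb bit
    cleared. Null entries and the first word (the stack pointer) are skipped.
    """
    count = len(raw) // 4
    if count == 0:
        raise ValueError("need at least one 4-byte word")
    words = struct.unpack(f"<{count}I", raw[: count * 4])
    handlers = {
        index * 4: word & ~1
        for index, word in enumerate(words)
        if index > 0 and word != 0
    }
    return words[0], handlers


# Fabricated 16-byte table: stack pointer, reset, NMI, and a null HardFault entry.
table = bytes.fromhex("00200020" "c1000008" "d5000008" "00000000")
initial_sp, handlers = parse_vector_table(table)
```

Each value in `handlers` is a candidate address for `functions_create`, and the key tells you which exception or interrupt slot it belongs to.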
### 5. Map protocol implementations
Firmware that communicates over a bus (UART, SPI, USB, CAN) will have recognizable patterns: ring buffers, state machines with packet parsing, and checksum calculations. Use call graph analysis to trace from peripheral init functions to protocol handlers:
```
analysis_get_callgraph(name="uart_init", max_depth=4)
```
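The effect of `max_depth` can be pictured as a depth-limited breadth-first walk over call edges. The sketch below assumes edges arrive as `(caller, callee)` pairs and uses invented function names; the actual response shape of `analysis_get_callgraph` may differ.

```python
from collections import deque


def reachable_within(edges, start, max_depth):
    """Return {function: depth} for everything reachable from `start`
    in at most `max_depth` calls, using breadth-first search."""
    callees = {}
    for caller, callee in edges:
        callees.setdefault(caller, []).append(callee)

    depths = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if depths[current] == max_depth:
            continue  # do not expand past the depth limit
        for nxt in callees.get(current, []):
            if nxt not in depths:
                depths[nxt] = depths[current] + 1
                queue.append(nxt)
    return depths


# Invented call edges for illustration.
edges = [
    ("uart_init", "uart_set_baud"),
    ("uart_init", "uart_enable_irq"),
    ("uart_enable_irq", "nvic_enable"),
    ("nvic_enable", "write_reg"),
]
depths = reachable_within(edges, "uart_init", max_depth=2)
```

With a limit of 2, `write_reg` is left out because it sits three calls away from `uart_init`. Raising the limit widens the graph quickly, so start small.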
---
## Using Analysis Prompts
MCGhidra includes 13 built-in prompts that guide Claude through structured analysis workflows. Each prompt defines a series of steps, tool calls, and checks for a specific reverse engineering task.
### Running a prompt
In Claude Code or Claude Desktop, use the `/prompt` command:
```
/prompt malware_triage
```
Claude will then execute a multi-step analysis: listing functions, scanning strings, checking imports, and producing a structured report. Prompts that involve scanning (like `malware_triage` or `identify_crypto`) report progress as they work through each step.
### Available prompts
| Prompt | What it does |
|--------|-------------|
| `malware_triage` | Quick capability assessment across 21 scanning steps: checks for network activity, file manipulation, process injection, anti-analysis tricks, and persistence mechanisms |
| `identify_crypto` | Scans for known crypto constants (AES S-boxes, SHA magic numbers), function names matching crypto libraries, and common key schedule patterns |
| `find_authentication` | Searches for password checks, credential storage, license validation, certificate handling, and authentication bypass patterns |
| `analyze_protocol` | Framework for reversing network or file format protocols: identifies packet structures, state machines, serialization routines |
| `trace_data_flow` | Follows data forward or backward through a program to map how input reaches sensitive operations |
| `find_main_logic` | Navigates past CRT startup, compiler-generated wrappers, and initialization to find the actual application entry point |
| `analyze_imports` | Categorizes imported functions by capability (file I/O, networking, crypto, process management) and flags suspicious combinations |
| `analyze_strings` | Groups strings by category (URLs, file paths, error messages, format strings) and cross-references them to find their usage |
| `analyze_switch_table` | Identifies jump tables and command dispatchers, maps case values to handler functions |
| `find_config_parsing` | Locates configuration file readers, command-line parsers, registry access, and environment variable lookups |
| `compare_functions` | Side-by-side comparison of two functions to identify patches, variants, or shared library code |
| `document_struct` | Traces struct usage across the binary to document field types, offsets, sizes, and purpose |
| `find_error_handlers` | Maps error handling paths, cleanup routines, exception handlers, and exit patterns |
### Prompt examples
Triage an unknown binary for malicious capabilities:
```
/prompt malware_triage
```
Find all cryptographic implementations:
```
/prompt identify_crypto
```
Trace how user input flows to a specific sink:
```
/prompt trace_data_flow
```
### What happens during a prompt
Each prompt orchestrates a series of MCP tool calls. For example, `malware_triage` will:
1. Call `program_info()` to determine the architecture and format
2. Call `functions_list(grep=...)` repeatedly with patterns for each capability category (networking, file ops, process injection, etc.)
3. Call `data_list_strings(grep=...)` to find suspicious string patterns
4. Call `symbols_imports(grep=...)` to categorize imported APIs
5. Produce a summary with findings organized by risk category
Prompts that scan many patterns report numeric progress (e.g., "Step 12/21: Checking for anti-debug techniques") so you can see where they are in the analysis.

@ -0,0 +1,115 @@
---
title: Configuration
description: Environment variables and settings for the MCP server, Docker containers, and port pool
---
MCGhidra is configured through environment variables. No configuration file is required -- defaults work out of the box for local development.
## MCP Server
These variables control the Python MCP server process.
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_HOST` | `localhost` | Hostname for connecting to Ghidra instances. Change this when Ghidra runs on a remote host. |
| `MCGHIDRAMCP_DEBUG` | *unset* | Set to `1` to enable DEBUG-level logging. Shows HTTP requests, pagination details, and discovery results. |
| `MCGHIDRA_FEEDBACK` | `true` | Enable or disable feedback collection. Set to `false` to disable. |
| `MCGHIDRA_FEEDBACK_DB` | `~/.mcghidra/feedback.db` | Path to the SQLite database for feedback data. The parent directory is created automatically. |
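As an illustration of how these variables and their defaults combine, the sketch below reads them from a mapping. The function name `load_settings` and the exact parsing rules (for example, treating anything other than `false` as enabled) are assumptions, not the server's actual code.

```python
import os
from pathlib import Path


def load_settings(env=None):
    """Resolve the MCP server variables from `env`, applying documented defaults."""
    env = os.environ if env is None else env
    return {
        "host": env.get("GHIDRA_HOST", "localhost"),
        # Assumption: only the literal value "1" turns debug logging on.
        "debug": env.get("MCGHIDRAMCP_DEBUG") == "1",
        # Assumption: feedback stays on unless explicitly set to "false".
        "feedback": env.get("MCGHIDRA_FEEDBACK", "true").lower() != "false",
        "feedback_db": Path(
            env.get("MCGHIDRA_FEEDBACK_DB", "~/.mcghidra/feedback.db")
        ).expanduser(),
    }


defaults = load_settings({})
custom = load_settings({"MCGHIDRAMCP_DEBUG": "1", "MCGHIDRA_FEEDBACK": "false"})
```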
### Internal Defaults
These values are set in `MCGhidraConfig` and are not currently exposed as environment variables, but can be overridden programmatically when creating the server:
| Setting | Default | Description |
|---------|---------|-------------|
| `quick_discovery_range` | 18489-18498 | Port range for quick instance discovery scans |
| `full_discovery_range` | 18400-18599 | Port range for full discovery scans (`instances_discover`) |
| `request_timeout` | 30.0s | HTTP request timeout for Ghidra API calls |
| `discovery_timeout` | 0.5s | HTTP timeout per port during discovery scans |
| `default_page_size` | 50 | Default pagination page size |
| `max_page_size` | 500 | Maximum allowed page size |
| `cursor_ttl_seconds` | 300 | Cursor expiration time (5 minutes) |
| `max_cursors_per_session` | 100 | Maximum active cursors per MCP session |
| `max_response_tokens` | 8000 | Hard token budget -- the `return_all` guard triggers above this |
| `expected_api_version` | 2 | Minimum API version required from the Ghidra plugin |
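The interaction between `default_page_size` and `max_page_size` can be sketched as follows. This is an offset-based simplification for illustration only: real cursors are opaque IDs advanced with `cursor_next`, and whether the server clamps or rejects an oversized `page_size` is an assumption here (the sketch clamps).

```python
DEFAULT_PAGE_SIZE = 50   # default_page_size
MAX_PAGE_SIZE = 500      # max_page_size


def page(items, offset=0, page_size=None):
    """Return (chunk, next_offset); next_offset is None on the last page."""
    size = DEFAULT_PAGE_SIZE if page_size is None else page_size
    size = max(1, min(size, MAX_PAGE_SIZE))  # clamp into the allowed range
    chunk = items[offset : offset + size]
    next_offset = offset + size if offset + size < len(items) else None
    return chunk, next_offset


items = list(range(1200))
first_chunk, first_next = page(items)
big_chunk, big_next = page(items, page_size=10_000)
last_chunk, last_next = page(items, offset=1000, page_size=500)
```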
---
## Docker Image
These variables control the MCP server's Docker integration -- how it builds, tags, and starts containers.
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRAMCP_VERSION` | `latest` | Docker image tag to use when starting containers. |
| `MCGHIDRA_PORT` | `8192` | Default port for container API mapping. Overridden by auto-allocation in multi-container mode. |
| `MCGHIDRA_MAXMEM` | `2G` | Max JVM heap size passed to containers. Increase for large binaries. |
| `MCGHIDRA_DOCKER_AUTO` | `false` | When `true`, the server will automatically start a Docker container when a binary is loaded and no Ghidra instance is available. |
---
## Port Pool
The port pool prevents conflicts when multiple MCP sessions run containers simultaneously. Ports are allocated using `flock`-based locking.
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRA_PORT_START` | `8192` | First port in the allocation pool. |
| `MCGHIDRA_PORT_END` | `8319` | Last port in the allocation pool (128 ports total). |
| `MCGHIDRA_PORT_LOCK_DIR` | `/tmp/mcghidra-ports` | Directory for port lock files. Created automatically on first use. |
Port lock files are named `port-{N}.lock` and contain JSON with the session ID, PID, and timestamp. The `docker_cleanup` tool removes stale locks from crashed processes.
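The locking scheme can be sketched with the standard library. The following is a rough illustration, assuming a Unix host (it relies on `fcntl`); the helper names and the JSON key names are hypothetical, and the real allocator may differ.

```python
import fcntl
import json
import os
import tempfile
import time
from pathlib import Path


def try_lock_port(lock_dir: Path, port: int, session_id: str):
    """Try to take an exclusive, non-blocking flock on port-{N}.lock.

    Returns the open file on success (keep it open to hold the lock),
    or None if another holder already has the port.
    """
    lock_dir.mkdir(parents=True, exist_ok=True)
    handle = open(lock_dir / f"port-{port}.lock", "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    # We own the lock: replace any stale contents with our own record.
    handle.seek(0)
    handle.truncate()
    json.dump(
        {"session_id": session_id, "pid": os.getpid(), "timestamp": time.time()},
        handle,
    )
    handle.flush()
    return handle


def allocate_port(lock_dir: Path, start: int, end: int, session_id: str):
    """Return (port, handle) for the first free port in [start, end]."""
    for port in range(start, end + 1):
        handle = try_lock_port(lock_dir, port, session_id)
        if handle is not None:
            return port, handle
    raise RuntimeError("port pool exhausted")


with tempfile.TemporaryDirectory() as tmp:
    pool = Path(tmp)
    first_port, first = allocate_port(pool, 8192, 8195, "session-a")
    second_port, second = allocate_port(pool, 8192, 8195, "session-b")
    conflict = try_lock_port(pool, first_port, "session-c")
    record = json.loads((pool / f"port-{first_port}.lock").read_text())
    first.close()
    second.close()
```

Because the kernel releases a `flock` when its holder exits, a crashed process leaves behind a lock file but not a held lock, which is what makes stale locks detectable during cleanup.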
---
## Container Environment
These variables are read by the Docker entrypoint script (`entrypoint.sh`) inside the container. They configure how Ghidra runs in headless mode.
### Core Settings
| Variable | Default | Description |
|----------|---------|-------------|
| `MCGHIDRA_MODE` | `headless` | Container operating mode. See modes below. |
| `MCGHIDRA_PORT` | `8192` | HTTP API port inside the container. The MCP server maps this to a host port from the pool. |
| `MCGHIDRA_MAXMEM` | `2G` | Max JVM heap size. Passed to Ghidra's `analyzeHeadless` command. |
### Ghidra Paths
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_HOME` | `/opt/ghidra` | Ghidra installation directory inside the container. |
| `SCRIPT_DIR` | `/home/ghidra/ghidra_scripts` | Directory for Ghidra Python scripts (MCGhidraServer.py lives here). |
| `PROJECT_DIR` | `/projects` | Directory where Ghidra stores its project files (.gpr, .rep). |
| `PROJECT_NAME` | `MCGhidra` | Name of the Ghidra project created for the imported binary. |
### Firmware Import Options
These are optional. When omitted, Ghidra auto-detects the binary format.
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_LANGUAGE` | *auto-detect* | Processor language ID. Must match `ARCH:ENDIAN:SIZE:VARIANT` format (e.g., `ARM:LE:32:v4t`). Setting this causes the container to use `BinaryLoader` unless `GHIDRA_LOADER` overrides it. |
| `GHIDRA_BASE_ADDRESS` | *auto-detect* | Base address for raw binaries. Hex format: `0x00000000` or `00000000`. |
| `GHIDRA_LOADER` | *auto-detect* | Loader type override. Common values: `BinaryLoader` (raw bytes), `AutoImporter` (header-based detection). Must be alphanumeric with underscores. |
### Container Modes
The `MCGHIDRA_MODE` variable selects the operating mode:
| Mode | Description |
|------|-------------|
| `headless` | Default. Imports the binary, runs auto-analysis, starts the HTTP API server. This is what `docker_auto_start` and `docker_start` use. |
| `server` | Opens an existing project (no import). Requires a program name as an argument. Useful for re-analyzing a previously imported binary. |
| `analyze` | Imports and analyzes a binary, then exits. No HTTP server. Use for batch processing. |
| `shell` | Drops into an interactive bash shell. Useful for debugging the container environment. |
### Validation
All firmware import parameters are validated before reaching Ghidra:
- `GHIDRA_LANGUAGE` must match `ARCH:ENDIAN:SIZE:VARIANT` (regex-validated).
- `GHIDRA_BASE_ADDRESS` must be valid hex, max 64-bit.
- `GHIDRA_LOADER` must be alphanumeric with underscores.
Invalid values are rejected with a descriptive error before any Docker or Ghidra operations run. The MCP server validates these on the client side as well, so errors surface in tool responses rather than being buried in container logs.
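The three rules above translate naturally into regular expressions. The patterns below are guesses written to match the documented rules, not the expressions MCGhidra actually uses; in particular, real Ghidra language IDs may be broader than the `LE`/`BE` endianness field allowed here. The function name is illustrative.

```python
import re

# Assumed patterns derived from the documented rules.
LANGUAGE_RE = re.compile(r"[A-Za-z0-9_.\-]+:(LE|BE):\d+:[A-Za-z0-9_.\-]+")
BASE_ADDRESS_RE = re.compile(r"(0x)?[0-9A-Fa-f]{1,16}")  # at most 64 bits
LOADER_RE = re.compile(r"[A-Za-z0-9_]+")


def validate_import_options(language=None, base_address=None, loader=None):
    """Return a list of error messages; an empty list means all values pass."""
    errors = []
    if language is not None and not LANGUAGE_RE.fullmatch(language):
        errors.append(f"language {language!r} is not ARCH:ENDIAN:SIZE:VARIANT")
    if base_address is not None and not BASE_ADDRESS_RE.fullmatch(base_address):
        errors.append(f"base address {base_address!r} is not hex within 64 bits")
    if loader is not None and not LOADER_RE.fullmatch(loader):
        errors.append(f"loader {loader!r} must be alphanumeric with underscores")
    return errors


ok = validate_import_options("ARM:LE:32:v4t", "0x00000000", "BinaryLoader")
bad = validate_import_options("ARM-32", "0x1" + "0" * 16, "Binary Loader")
```

Using `fullmatch` rather than `search` matters here: it rejects values that merely contain a valid fragment, which is the property that keeps shell metacharacters out of the container command line.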

@ -1,93 +1,742 @@
---
title: MCP Tools
description: Complete reference for all MCGhidra MCP tools, grouped by domain
---
MCGhidra exposes Ghidra's capabilities as MCP tools. There are 64 tools across 14 categories.
## Pagination Convention
Most list tools share a common set of pagination and filtering parameters. Rather than repeating them in every table, they are documented once here:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | Ghidra instance port. Uses the active instance if omitted. |
| `page_size` | int | `50` | Items per page. Maximum: 500. |
| `grep` | string | *none* | Client-side regex pattern applied to results after fetching. |
| `grep_ignorecase` | bool | `true` | Case-insensitive grep matching. |
| `return_all` | bool | `false` | Bypass pagination and return everything. Triggers a budget guard if the response exceeds ~8000 tokens. |
| `fields` | list[str] | *none* | Field projection -- keep only these keys per item. Reduces response size. |
Tools that accept these parameters are marked with "Supports pagination" below. Use `cursor_next(cursor_id)` to advance through pages.
---
## Instance Management
Tools for discovering, registering, and switching between Ghidra instances.
### `instances_list`
List all active Ghidra instances. Runs a quick discovery scan before returning results.
Returns a dict with an `instances` list containing port, URL, project, and file for each instance.
### `instances_use`
Set the current working instance. All subsequent tool calls default to this instance.
Uses lazy registration -- the instance is recorded immediately without a blocking HTTP call. If the instance is unreachable, the next actual tool call will fail with a clear error.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *required* | Port number of the instance to activate |
Returns confirmation with instance details.
### `instances_current`
Show which instance is currently active, including its port, URL, project, and file. Returns an error message with available instance ports if none is set.
### `instances_register`
Manually register an instance by port. Verifies the instance is responsive and checks API version compatibility before registering.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *required* | Port number |
| `url` | string | *auto* | URL override (defaults to `http://{GHIDRA_HOST}:{port}`) |
Returns confirmation or error message.
### `instances_unregister`
Remove an instance from the registry. If the unregistered instance was the current working instance, the current selection is cleared.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *required* | Port number to unregister |
### `instances_discover`
Force a full discovery scan across the configured port range (ports 18400-18599 by default). Use this when you need to find instances on a different host. For normal use, `instances_list` already runs a quick scan.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `host` | string | *configured* | Host to scan |
### `program_info`
Get full program metadata from the current Ghidra instance: architecture, language ID, compiler spec, image base address, and total memory size.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | Ghidra instance port |
---
## Functions
Tools for listing, decompiling, disassembling, and modifying functions. Supports pagination.
### `functions_list`
List functions with cursor-based pagination. For large binaries, prefer `name_contains` or `name_regex`, which filter on the server before results reach the client.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name_contains` | string | *none* | Server-side substring filter (faster than grep for large binaries) |
| `name_regex` | string | *none* | Server-side regex filter on function name |
| `address` | string | *none* | Filter by exact function address (hex) |
Supports pagination.
### `functions_get`
Get detailed information about a single function: name, address, signature, size, stack depth, calling convention.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name (mutually exclusive with address) |
| `address` | string | *none* | Function address in hex |
| `port` | int | *current* | Ghidra instance port |
### `functions_decompile`
Decompile a function to C pseudocode. Output is split into lines for pagination -- use `grep` to search within the decompiled code.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name (mutually exclusive with address) |
| `address` | string | *none* | Function address in hex |
| `syntax_tree` | bool | `false` | Include the decompiler syntax tree (JSON) |
| `style` | string | `"normalize"` | Decompiler simplification style |
Supports pagination (over decompiled lines).
### `functions_disassemble`
Get assembly-level disassembly for a function. Output is split into instruction lines for pagination.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name (mutually exclusive with address) |
| `address` | string | *none* | Function address in hex |
Supports pagination (over instruction lines).
### `functions_rename`
Rename a function. Identify it by current name or address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `old_name` | string | *none* | Current function name |
| `address` | string | *none* | Function address in hex |
| `new_name` | string | *required* | New name for the function |
| `port` | int | *current* | Ghidra instance port |
### `functions_set_signature`
Set the full prototype of a function, including return type, name, and parameter types.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Function name |
| `address` | string | *none* | Function address in hex |
| `signature` | string | *required* | Full signature (e.g., `"int foo(char* arg1, int arg2)"`) |
| `port` | int | *current* | Ghidra instance port |
### `functions_set_comment`
Set a decompiler-level comment on a function. Tries the function comment first, then falls back to a pre-comment if the address is not a function entry point.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address (preferably function entry point) |
| `comment` | string | `""` | Comment text. Empty string removes the comment. |
| `port` | int | *current* | Ghidra instance port |
### `functions_create`
Create a new function definition at the specified address. Ghidra will attempt to determine the function boundaries automatically.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
### `functions_variables`
List local variables and parameters for a specific function.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Function address in hex |
Supports pagination.
---
## Data
Tools for working with defined data items and strings.
### `data_list`
List defined data items with filtering and cursor-based pagination.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `addr` | string | *none* | Filter by address (hex) |
| `name` | string | *none* | Exact name match (case-sensitive) |
| `name_contains` | string | *none* | Substring name filter (case-insensitive) |
| `type` | string | *none* | Filter by data type (e.g., `"string"`, `"dword"`) |
Supports pagination.
### `data_list_strings`
List all defined strings in the binary. Use `filter` for server-side content matching, or `grep` for client-side regex.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `filter` | string | *none* | Server-side string content filter |
Supports pagination.
### `data_create`
Define a new data item at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `data_type` | string | *required* | Data type (e.g., `"string"`, `"dword"`, `"byte"`) |
| `size` | int | *none* | Size in bytes (optional) |
| `port` | int | *current* | Ghidra instance port |
### `data_rename`
Rename a data item at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `name` | string | *required* | New name |
| `port` | int | *current* | Ghidra instance port |
### `data_set_type`
Change the data type of an existing data item.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `data_type` | string | *required* | New data type (e.g., `"uint32_t"`, `"char[10]"`) |
| `port` | int | *current* | Ghidra instance port |
### `data_delete`
Remove a data definition at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
---
## Structs
Tools for creating and modifying struct (composite) data types.
### `structs_list`
List all struct data types.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `category` | string | *none* | Filter by category path (e.g., `"/winapi"`) |
Supports pagination.
### `structs_get`
Get a struct with all its fields. If the struct has more than 10 fields, the field list is paginated. Use `fields` projection to reduce response size.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Struct name |
Supports pagination (over struct fields).
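Field projection amounts to keeping a subset of keys from each item. The sketch below shows the idea on made-up struct field records; the key names are assumptions, and how MCGhidra treats keys that are missing from an item is not specified here (this sketch skips them).

```python
def project(items, fields):
    """Keep only the requested keys in each item; no fields means no change."""
    if not fields:
        return items
    return [{key: item[key] for key in fields if key in item} for item in items]


# Made-up field records for a hypothetical struct.
struct_fields = [
    {"name": "magic", "offset": 0, "type": "uint32_t", "comment": "file signature"},
    {"name": "length", "offset": 4, "type": "uint32_t", "comment": ""},
]
slim = project(struct_fields, ["name", "offset"])
```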
### `structs_create`
Create a new struct data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Name for the struct |
| `category` | string | *none* | Category path (e.g., `"/custom"`) |
| `description` | string | *none* | Description text |
| `port` | int | *current* | Ghidra instance port |
### `structs_add_field`
Add a field to an existing struct. If `offset` is omitted, the field is appended to the end of the struct.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `struct_name` | string | *required* | Name of the struct |
| `field_name` | string | *required* | Name for the new field |
| `field_type` | string | *required* | Data type (e.g., `"int"`, `"char"`, `"pointer"`) |
| `offset` | int | *none* | Byte offset within the struct |
| `comment` | string | *none* | Field comment |
| `port` | int | *current* | Ghidra instance port |
### `structs_update_field`
Modify an existing field in a struct. Identify the field by name or offset. At least one of `new_name`, `new_type`, or `new_comment` must be provided.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `struct_name` | string | *required* | Name of the struct |
| `field_name` | string | *none* | Current field name (or use `field_offset`) |
| `field_offset` | int | *none* | Field offset (or use `field_name`) |
| `new_name` | string | *none* | New name |
| `new_type` | string | *none* | New data type |
| `new_comment` | string | *none* | New comment |
| `port` | int | *current* | Ghidra instance port |
### `structs_delete`
Remove a struct data type definition.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Struct name to delete |
| `port` | int | *current* | Ghidra instance port |
---
## Symbols
Tools for working with the symbol table: labels, imports, and exports.
### `symbols_list`
List all symbols in the program.
Supports pagination.
### `symbols_create`
Create a new label/symbol at the specified address. If a symbol already exists at that address, it is renamed.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Symbol name |
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
### `symbols_rename`
Rename the primary symbol at an address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `new_name` | string | *required* | New name |
| `port` | int | *current* | Ghidra instance port |
### `symbols_delete`
Delete the primary symbol at an address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
### `symbols_imports`
List imported symbols (external references).
Supports pagination.
### `symbols_exports`
List exported symbols (entry points).
Supports pagination.
---
## Analysis
Tools for triggering and inspecting Ghidra analysis results.
### `analysis_run`
Trigger Ghidra's auto-analysis on the current program.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | Ghidra instance port |
| `analysis_options` | dict | *none* | Analysis options to enable/disable |
### `analysis_get_callgraph`
Generate a call graph starting from a function. Returns nodes and edges. Edges are paginated.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Starting function name |
| `address` | string | *none* | Starting function address |
| `max_depth` | int | `3` | Maximum call depth |
Supports pagination (over edges).
### `analysis_get_dataflow`
Trace data flow forward or backward from an address. Returns a list of steps showing how data propagates through the program.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Starting address in hex |
| `direction` | string | `"forward"` | `"forward"` or `"backward"` |
| `max_steps` | int | `50` | Maximum analysis steps |
Supports pagination (over steps).
### `xrefs_list`
Find cross-references to or from an address. At least one of `to_addr` or `from_addr` is required.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `to_addr` | string | *none* | Find references to this address |
| `from_addr` | string | *none* | Find references from this address |
| `type` | string | *none* | Filter by type: `"CALL"`, `"READ"`, `"WRITE"`, `"DATA"`, `"POINTER"` |
Supports pagination.
### `comments_get`
Get a comment at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `comment_type` | string | `"plate"` | Type: `"plate"`, `"pre"`, `"post"`, `"eol"`, `"repeatable"` |
| `port` | int | *current* | Ghidra instance port |
### `comments_set`
Set a comment at the specified address. Pass an empty string to remove the comment.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `comment` | string | `""` | Comment text |
| `comment_type` | string | `"plate"` | Type: `"plate"`, `"pre"`, `"post"`, `"eol"`, `"repeatable"` |
| `port` | int | *current* | Ghidra instance port |
---
## Memory
Direct memory access tools.
### `memory_read`
Read bytes from a memory address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `length` | int | `16` | Number of bytes to read |
| `format` | string | `"hex"` | Output format: `"hex"`, `"base64"`, or `"string"` |
| `port` | int | *current* | Ghidra instance port |
Returns the bytes in the requested format along with the actual byte count.
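To make the three `format` values concrete, the sketch below renders the same bytes as hex, base64, and string using only the Python standard library. The helper name `render_bytes`, the uppercase hex, the Latin-1 decoding for `"string"`, and the output keys are assumptions for illustration; the plugin's actual encoding choices may differ.

```python
import base64


def render_bytes(data: bytes, fmt: str = "hex") -> dict:
    """Render raw bytes in one of the three documented formats (hypothetical helper)."""
    if fmt == "hex":
        text = data.hex().upper()
    elif fmt == "base64":
        text = base64.b64encode(data).decode("ascii")
    elif fmt == "string":
        # Assumption: one byte maps to one character (Latin-1).
        text = data.decode("latin-1")
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
    return {"format": fmt, "length": len(data), "bytes": text}


sample = b"Hello World!"
for fmt in ("hex", "base64", "string"):
    print(render_bytes(sample, fmt))
```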
### `memory_write`
Write bytes to a memory address. Use with caution -- this modifies the program state.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `bytes_data` | string | *required* | Data to write |
| `format` | string | `"hex"` | Input format: `"hex"`, `"base64"`, or `"string"` |
| `port` | int | *current* | Ghidra instance port |
---
## Variables
Tools for querying and renaming variables.
### `variables_list`
List variables with optional global-only filtering.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `global_only` | bool | `false` | Return only global variables |
Supports pagination.
### `variables_rename`
Rename a variable within a function, and optionally change its data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `function_address` | string | *required* | Function address in hex |
| `variable_name` | string | *required* | Current variable name |
| `new_name` | string | *required* | New name |
| `new_type` | string | *none* | New data type (e.g., `"int"`, `"char*"`) |
| `port` | int | *current* | Ghidra instance port |
---
## Bookmarks
Tools for managing Ghidra bookmarks (annotations at addresses).
### `bookmarks_list`
List bookmarks with optional type and category filtering.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `type` | string | *none* | Filter by type: `"Note"`, `"Warning"`, `"Error"`, `"Info"` |
| `category` | string | *none* | Filter by category |
Supports pagination.
### `bookmarks_create`
Create a bookmark at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `type` | string | `"Note"` | Bookmark type: `"Note"`, `"Warning"`, `"Error"`, `"Info"` |
| `category` | string | `""` | Category string for grouping |
| `comment` | string | `""` | Bookmark comment text |
| `port` | int | *current* | Ghidra instance port |
### `bookmarks_delete`
Delete all bookmarks at the specified address.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `address` | string | *required* | Memory address in hex |
| `port` | int | *current* | Ghidra instance port |
---
## Enums and Typedefs
Tools for managing enum and typedef data types.
### `enums_list`
List enum data types with their members.
Supports pagination.
### `enums_create`
Create a new enum data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Enum name |
| `size` | int | `4` | Size in bytes |
| `port` | int | *current* | Ghidra instance port |
### `typedefs_list`
List typedef data types.
Supports pagination.
### `typedefs_create`
Create a new typedef data type.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *required* | Typedef name |
| `base_type` | string | *required* | Base data type (e.g., `"int"`, `"uint32_t"`, `"char*"`) |
| `port` | int | *current* | Ghidra instance port |
---
## Namespaces
Tools for querying namespaces and class definitions.
### `namespaces_list`
List all non-global namespaces in the program.
Supports pagination.
### `classes_list`
List class namespaces with qualified names.
Supports pagination.
---
## Segments
### `segments_list`
List memory segments (`.text`, `.data`, `.bss`, etc.) with read/write/execute permissions, start address, and size.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name` | string | *none* | Filter by segment name (exact match) |
Supports pagination.
---
## Cursors
Tools for managing pagination state. Every paginated tool response includes a `cursor_id` in the pagination metadata when more pages are available.
### `cursor_next`
Fetch the next page of results for a cursor. Cursors expire after 5 minutes of inactivity.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `cursor_id` | string | *required* | Cursor identifier from a previous paginated response |
Returns the next page of results with updated pagination info.
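A client can drain a paginated result by calling `cursor_next` until no `cursor_id` comes back. The sketch below assumes a response shape with `result` and `pagination` keys and uses a hypothetical `call_tool(name, args)` callable in place of a real MCP client; only `cursor_id` and the tool names come from this page.

```python
from typing import Callable, Iterator


def iterate_pages(call_tool: Callable[[str, dict], dict], tool: str, args: dict) -> Iterator[dict]:
    """Yield every item from a paginated tool, following cursor_id until it is absent."""
    response = call_tool(tool, args)
    while True:
        yield from response.get("result", [])
        cursor_id = response.get("pagination", {}).get("cursor_id")
        if not cursor_id:
            break  # no cursor_id means this was the last page
        response = call_tool("cursor_next", {"cursor_id": cursor_id})


# Stand-in for a real MCP client: serves three pages from memory.
_PAGES = {
    None: {"result": [{"name": "main"}, {"name": "init"}], "pagination": {"cursor_id": "c1"}},
    "c1": {"result": [{"name": "parse"}], "pagination": {"cursor_id": "c2"}},
    "c2": {"result": [{"name": "exit"}], "pagination": {}},
}


def fake_call_tool(name: str, args: dict) -> dict:
    return _PAGES[args.get("cursor_id")]


names = [item["name"] for item in iterate_pages(fake_call_tool, "functions_list", {})]
print(names)
```

Because cursors expire after 5 minutes of inactivity, a long-running consumer would also need to handle an expired-cursor error, which this sketch leaves out.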
### `cursor_list`
List active cursors for the current session.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `all_sessions` | bool | `false` | Include cursors from all sessions |
### `cursor_delete`
Delete a specific cursor to free resources.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `cursor_id` | string | *required* | Cursor identifier to delete |
### `cursor_delete_all`
Delete all cursors for the current session.
---
## Docker
Tools for managing Ghidra Docker containers. See the [Docker Usage](/reference/docker/) page for environment variables and firmware import details.
### `docker_auto_start`
The primary entry point for automatic container management. Checks all pooled ports for an existing instance analyzing the same binary. If none is found, allocates a port and starts a new container. Returns connection info immediately -- poll `docker_health` to check when the API is ready.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `binary_path` | string | *required* | Path to the binary file |
| `language` | string | *none* | Ghidra processor language ID (e.g., `"ARM:LE:32:v4t"`) |
| `base_address` | string | *none* | Base address for raw binaries (e.g., `"0x00000000"`) |
| `loader` | string | *none* | Loader type. Auto-set to `"BinaryLoader"` when language is specified. |
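Since `docker_auto_start` returns before the API is up, callers typically poll until the health check passes. The loop below is a minimal sketch of that pattern: `wait_until_ready` is a hypothetical helper, and `check_health` stands in for whatever wraps a `docker_health` call and reduces it to a boolean. The timeout and interval values are arbitrary.

```python
import time
from typing import Callable


def wait_until_ready(
    check_health: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 2.0,
) -> bool:
    """Poll check_health until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check_health():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated health check: the API "comes up" on the third poll.
attempts = {"count": 0}


def fake_health() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until_ready(fake_health, timeout=10.0, interval=0.0)
print(ready, attempts["count"])
```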
### `docker_start`
Start a container with explicit control over all parameters. Ports are auto-allocated from the pool (8192-8319).
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `binary_path` | string | *required* | Path to the binary file |
| `memory` | string | `"2G"` | Max JVM heap size |
| `name` | string | *auto* | Container name (auto-generated with session ID) |
| `language` | string | *none* | Ghidra processor language ID |
| `base_address` | string | *none* | Base address (hex) |
| `loader` | string | *none* | Loader type |
### `docker_stop`
Stop and optionally remove a container. Session-scoped: you can only stop containers started by your own MCP session.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name_or_id` | string | *required* | Container name or ID |
| `remove` | bool | `true` | Also remove the container |
### `docker_health`
Check if a container's HTTP API is responding. Tries `/health` first, then falls back to the root endpoint for older plugin versions.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `port` | int | *current* | API port to check |
| `timeout` | float | `5.0` | Request timeout in seconds |
### `docker_logs`
Get stdout/stderr from a container. Useful for monitoring analysis progress.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `name_or_id` | string | *required* | Container name or ID |
| `tail` | int | `100` | Number of lines to show |
| `follow` | bool | `false` | Follow log output (not recommended for MCP) |
### `docker_status`
List all MCGhidra containers, Docker images, port pool allocation status, and whether Docker/Compose are available.
### `docker_build`
Build the MCGhidra Docker image from source.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `tag` | string | `"latest"` | Image tag |
| `no_cache` | bool | `false` | Build without Docker cache |
| `project_dir` | string | *auto* | Path to MCGhidra project root |
### `docker_cleanup`
Remove orphaned containers and stale port lock files. By default, only cleans containers from the current session for safety.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `session_only` | bool | `true` | Only clean this session's containers |
| `max_age_hours` | float | `24.0` | Max age for orphaned containers |
| `dry_run` | bool | `false` | Report what would be cleaned without acting |
### `docker_session_info`
Show containers and allocated ports for the current MCP session.


@ -0,0 +1,203 @@
---
title: MCP Resources
description: Reference for MCGhidra's read-only MCP resource URIs
---
MCGhidra registers 19 MCP resources that provide read-only access to Ghidra data. Resources are a good fit for quick enumeration -- they return data without requiring tool calls and work well for populating context at the start of a conversation.
## Resources vs Tools
Resources and tools serve different purposes:
- **Resources** return a capped snapshot of data. They have no pagination controls, no filtering, and a per-resource-type cap on result size. Use them for a quick overview: "what functions exist in this binary?" or "what strings are defined?"
- **Tools** support pagination, grep filtering, field projection, and mutation operations. Use them when you need to page through large result sets, search for specific items, or modify the program.
If a resource hits its cap, the response includes a `_hint` field suggesting which tool to use for full pagination.
## Result Caps
Each resource type has a configurable maximum number of items it will return. These defaults are set in `MCGhidraConfig.resource_caps`:
| Resource Type | Default Cap |
|---------------|-------------|
| functions | 1000 |
| strings | 500 |
| data | 1000 |
| structs | 500 |
| xrefs | 500 |
| symbols | 1000 |
| segments | 500 |
| variables | 1000 |
| namespaces | 500 |
| classes | 500 |
| bookmarks | 1000 |
| enums | 500 |
| typedefs | 500 |
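The capping behavior can be pictured as a simple truncate-and-annotate step. The sketch below is an assumption-laden illustration, not the server's code: `cap_resource` is a hypothetical helper, the `_hint` wording is invented, and "cap reached" is interpreted here as "more items existed than the cap allows".

```python
# Subset of the default caps from the table above.
RESOURCE_CAPS = {"functions": 1000, "strings": 500}


def cap_resource(kind: str, items: list, tool_hint: str) -> dict:
    """Truncate items to the cap for this resource type and annotate the payload."""
    cap = RESOURCE_CAPS[kind]
    capped = len(items) > cap  # assumption: "reached" means items were dropped
    kept = items[:cap]
    payload = {kind: kept, "count": len(kept), "capped_at": cap if capped else None}
    if capped:
        # Hypothetical hint text pointing at the paginated tool.
        payload["_hint"] = f"Use {tool_hint}() for full pagination"
    return payload


small = cap_resource("strings", ["a", "b"], "data_list_strings")
large = cap_resource("strings", [str(i) for i in range(600)], "data_list_strings")
print(small["capped_at"], large["count"], large["capped_at"])
```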
---
## Instance Resources
### `ghidra://instances`
List all active Ghidra instances. Runs a quick discovery scan before returning.
Returns: `instances` array (port, project, file), `count`, and `current_port`.
### `ghidra://instance/{port}`
Get detailed information about a specific Ghidra instance, including program metadata from the plugin's root endpoint.
**URI parameter:** `port` -- the instance port number.
### `ghidra://instance/{port}/summary`
Program overview with aggregate statistics. Fetches function count and string count in addition to basic program metadata (name, language, processor, format).
**URI parameter:** `port` -- the instance port number.
### `ghidra://instance/{port}/program`
Program metadata: architecture, language ID, compiler spec, image base address, and memory size. This is the same data returned by the REST API's `GET /program` endpoint.
**URI parameter:** `port` -- the instance port number.
---
## Function Resources
### `ghidra://instance/{port}/functions`
List functions in the program. Capped at 1000 items.
Returns: `functions` array, `count`, and `capped_at` (non-null if the cap was reached).
If capped, use the `functions_list()` tool for full pagination.
### `ghidra://instance/{port}/function/decompile/address/{address}`
Decompile a function by its address. Returns the C pseudocode as a plain text string.
**URI parameters:** `port`, `address` (hex, e.g., `0x401000`).
### `ghidra://instance/{port}/function/decompile/name/{name}`
Decompile a function by name. Returns the C pseudocode as a plain text string.
**URI parameters:** `port`, `name` (function name, e.g., `main`).
---
## Data Resources
### `ghidra://instance/{port}/strings`
List defined strings in the binary. Capped at 500 items.
Returns: `strings` array, `count`, and `capped_at`.
If capped, use `data_list_strings()` for full pagination.
### `ghidra://instance/{port}/data`
List defined data items. Capped at 1000 items.
Returns: `data` array, `count`, and `capped_at`.
If capped, use `data_list()` for full pagination.
### `ghidra://instance/{port}/structs`
List struct data types. Capped at 500 items.
Returns: `structs` array, `count`, and `capped_at`.
If capped, use `structs_list()` for full pagination.
---
## Cross-Reference Resources
### `ghidra://instance/{port}/xrefs/to/{address}`
Get all cross-references pointing to the specified address. Capped at 500 items.
**URI parameters:** `port`, `address` (hex).
Returns: `address`, `xrefs_to` array, `count`, and `capped_at`.
If capped, use `xrefs_list(to_addr=...)` for full pagination.
### `ghidra://instance/{port}/xrefs/from/{address}`
Get all cross-references originating from the specified address. Capped at 500 items.
**URI parameters:** `port`, `address` (hex).
Returns: `address`, `xrefs_from` array, `count`, and `capped_at`.
If capped, use `xrefs_list(from_addr=...)` for full pagination.
---
## Symbol Resources
### `ghidra://instance/{port}/symbols`
List all symbols in the program. Capped at 1000 items.
Returns: `symbols` array, `count`, and `capped_at`.
If capped, use `symbols_list()` for full pagination.
### `ghidra://instance/{port}/symbols/imports`
List imported symbols (external references). Capped at 1000 items.
Returns: `imports` array, `count`, and `capped_at`.
If capped, use `symbols_imports()` for full pagination.
### `ghidra://instance/{port}/symbols/exports`
List exported symbols (entry points). Capped at 1000 items.
Returns: `exports` array, `count`, and `capped_at`.
If capped, use `symbols_exports()` for full pagination.
---
## Other Resources
### `ghidra://instance/{port}/segments`
List memory segments with names, address ranges, sizes, and permissions. Capped at 500 items.
Returns: `segments` array, `count`, and `capped_at`.
If capped, use `segments_list()` for full pagination.
### `ghidra://instance/{port}/namespaces`
List all non-global namespaces. Capped at 500 items.
Returns: `namespaces` array, `count`, and `capped_at`.
If capped, use `namespaces_list()` for full pagination.
### `ghidra://instance/{port}/classes`
List class namespaces with qualified names. Capped at 500 items.
Returns: `classes` array, `count`, and `capped_at`.
If capped, use `classes_list()` for full pagination.
### `ghidra://instance/{port}/variables`
List variables. Capped at 1000 items.
Returns: `variables` array, `count`, and `capped_at`.
If capped, use `variables_list()` for full pagination.


@ -0,0 +1,423 @@
---
title: REST API
description: Reference for the Ghidra plugin's HATEOAS HTTP API
---
The Ghidra plugin runs an HTTP server inside the JVM and exposes a HATEOAS REST API. Every response includes hypermedia links (`_links`) to related resources, so clients can discover the API by following links rather than hardcoding paths.
The MCP server wraps this API as MCP tools. You generally do not need to call the REST API directly, but understanding it helps when debugging or building custom integrations.
## General Concepts
### Request Format
Standard HTTP verbs: `GET` to read, `POST` to create, `PATCH` to modify, `PUT` to replace, `DELETE` to remove. Request bodies use JSON (`Content-Type: application/json`). Include an `X-Request-ID` header for correlation if needed.
### Response Envelope
Every response follows this structure:
```json
{
  "id": "req-123",
  "instance": "http://localhost:8192",
  "success": true,
  "result": { ... },
  "_links": {
    "self": { "href": "/path/to/resource" },
    "related": { "href": "/path/to/related" }
  }
}
```
- `id` -- Correlation identifier from `X-Request-ID`, or a generated value.
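- `instance` -- URL of the plugin instance that handled the request.
- `success` -- `true` when the request succeeded; `false` when the response carries an `error` object instead of `result`.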
- `result` -- The payload. A single object for detail endpoints, an array for list endpoints.
- `_links` -- HATEOAS links to related resources and actions.
### Error Responses
Errors use standard HTTP status codes and include a structured error object:
```json
{
  "id": "req-456",
  "instance": "http://localhost:8192",
  "success": false,
  "error": {
    "code": "RESOURCE_NOT_FOUND",
    "message": "No function at address 0x999999"
  }
}
```
Common status codes: `200` OK, `201` Created, `400` Bad Request, `404` Not Found, `500` Internal Server Error.
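A custom integration usually wants to turn the envelope into either a payload or an exception. The sketch below shows one way to do that with the fields documented above; `GhidraApiError` and `unwrap` are hypothetical names, and the fallback code `"UNKNOWN"` is an assumption for envelopes that lack an `error` object.

```python
import json


class GhidraApiError(Exception):
    """Raised when an envelope reports success == false (hypothetical name)."""

    def __init__(self, code: str, message: str, request_id: str = ""):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.request_id = request_id


def unwrap(envelope: dict):
    """Return the `result` payload, or raise if the envelope carries an error."""
    if envelope.get("success"):
        return envelope.get("result")
    error = envelope.get("error") or {}
    raise GhidraApiError(
        error.get("code", "UNKNOWN"),
        error.get("message", ""),
        envelope.get("id", ""),
    )


failure = json.loads("""
{
  "id": "req-456",
  "instance": "http://localhost:8192",
  "success": false,
  "error": {"code": "RESOURCE_NOT_FOUND", "message": "No function at address 0x999999"}
}
""")

try:
    unwrap(failure)
except GhidraApiError as exc:
    print(exc.code, exc.request_id)
```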
### Pagination
List endpoints accept `offset` and `limit` query parameters. Responses include `size` (total count), `offset`, `limit`, and `_links` with `next`/`prev` when applicable.
```
GET /functions?offset=50&limit=50
```
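When a client prefers arithmetic over following `_links.next`, the next offset can be derived from the `size`, `offset`, and `limit` fields. The helper below is a small sketch; `next_offset` is a hypothetical name and it assumes all three fields are present at the top level of the response.

```python
from typing import Optional


def next_offset(page: dict) -> Optional[int]:
    """Return the offset of the following page, or None when this is the last one."""
    candidate = page["offset"] + page["limit"]
    return candidate if candidate < page["size"] else None


print(next_offset({"size": 150, "offset": 0, "limit": 50}))
print(next_offset({"size": 150, "offset": 100, "limit": 50}))
```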
### Addressing and Search
Resources can be accessed by hex address or searched by name:
- By address: `GET /functions/0x401000`
- By exact name: `GET /functions?name=main`
- By substring: `GET /functions?name_contains=init`
- By regex: `GET /functions?name_matches_regex=^FUN_`
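Regex patterns often contain characters that need percent-encoding in a query string. The sketch below builds search URLs with `urllib.parse.urlencode`; `search_url` is a hypothetical helper, and the allowed keys are limited to the three search parameters listed above.

```python
from urllib.parse import urlencode

# Search parameters listed in this section.
SEARCH_KEYS = {"name", "name_contains", "name_matches_regex"}


def search_url(base_url: str, resource: str, **criteria: str) -> str:
    """Build a search URL, percent-encoding each value."""
    unknown = set(criteria) - SEARCH_KEYS
    if unknown:
        raise ValueError(f"unsupported search parameters: {sorted(unknown)}")
    return f"{base_url}/{resource}?{urlencode(criteria)}"


print(search_url("http://localhost:8192", "functions", name_matches_regex="^FUN_"))
```

The caret is emitted as `%5E`, which avoids ambiguity in clients or proxies that handle raw `^` inconsistently.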
---
## Meta Endpoints
### `GET /plugin-version`
Returns the plugin build version and API version number. The MCP server uses this for compatibility checks.
```json
{
  "result": {
    "plugin_version": "v2.0.0",
    "api_version": 2
  }
}
```
### `GET /info`
Returns details about the current plugin instance: loaded file, architecture, processor, address size, project name, and server port.
```json
{
  "result": {
    "file": "example.exe",
    "architecture": "x86:LE:64:default",
    "processor": "x86",
    "addressSize": 64,
    "project": "MyProject",
    "serverPort": 8192,
    "instanceCount": 1
  }
}
```
### `GET /instances`
Lists all active plugin instances (one per open program in the Ghidra project). Each entry includes port, type, project, file, and links to connect.
### `GET /program`
Returns program metadata: language ID, compiler spec, image base address, memory size, and analysis status.
```json
{
  "result": {
    "name": "mybinary.exe",
    "languageId": "x86:LE:64:default",
    "compilerSpecId": "gcc",
    "imageBase": "0x400000",
    "memorySize": 1048576,
    "analysisComplete": true
  }
}
```
---
## Functions
### `GET /functions`
List functions. Supports pagination and search parameters (`name`, `name_contains`, `name_matches_regex`, `addr`).
```json
{
  "result": [
    { "name": "main", "address": "0x401000" },
    { "name": "init_peripherals", "address": "0x08001cf0" }
  ],
  "size": 150,
  "offset": 0,
  "limit": 50
}
```
### `POST /functions`
Create a function at an address. Body: `{ "address": "0x401000" }`.
### `GET /functions/{address}`
Get function details: name, signature, size, stack depth, calling convention, varargs status.
```json
{
  "result": {
    "name": "process_data",
    "address": "0x4010a0",
    "signature": "int process_data(char * data, int size)",
    "size": 128,
    "calling_convention": "__stdcall"
  }
}
```
### `PATCH /functions/{address}`
Modify a function. Payload can include `name`, `signature`, and `comment`.
```json
{ "name": "calculate_checksum", "signature": "uint32_t calculate_checksum(uint8_t* buffer, size_t length)" }
```
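For a custom integration, a `PATCH` request like the one above can be assembled with the standard library. The sketch below only builds the request object and does not send it; `build_patch_request` is a hypothetical helper, and the base URL is taken from the example envelope earlier on this page.

```python
import json
import urllib.request
from typing import Optional


def build_patch_request(
    base_url: str,
    address: str,
    changes: dict,
    request_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a PATCH /functions/{address} request."""
    headers = {"Content-Type": "application/json"}
    if request_id:
        # Optional correlation header described under Request Format.
        headers["X-Request-ID"] = request_id
    return urllib.request.Request(
        url=f"{base_url}/functions/{address}",
        data=json.dumps(changes).encode("utf-8"),
        headers=headers,
        method="PATCH",
    )


req = build_patch_request(
    "http://localhost:8192",
    "0x4010a0",
    {"name": "calculate_checksum"},
    request_id="req-123",
)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen(req)` would send it to a running plugin instance.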
### `DELETE /functions/{address}`
Delete the function definition at the specified address.
### `GET /functions/{address}/decompile`
Get decompiled C pseudocode. Optional query parameters:
| Parameter | Description |
|-----------|-------------|
| `syntax_tree` | `true` to include the syntax tree as JSON |
| `style` | Decompiler simplification style (e.g., `normalize`) |
| `timeout` | Decompilation timeout in seconds |
```json
{
  "result": {
    "address": "0x4010a0",
    "ccode": "int process_data(char *param_1, int param_2)\n{\n ...\n}\n"
  }
}
```
### `GET /functions/{address}/disassembly`
Get assembly listing. Supports pagination (`offset`, `limit`).
```json
{
  "result": [
    { "address": "0x4010a0", "mnemonic": "PUSH", "operands": "RBP", "bytes": "55" },
    { "address": "0x4010a1", "mnemonic": "MOV", "operands": "RBP, RSP", "bytes": "4889E5" }
  ]
}
```
### `GET /functions/{address}/variables`
List local variables for a function. Supports name search.
### `PATCH /functions/{address}/variables/{variable_name}`
Modify a local variable. Payload: `{ "name": "new_name", "type": "int" }`.
---
## Data
### `GET /data`
List defined data items. Supports search (`name`, `name_contains`, `addr`, `type`) and pagination.
### `POST /data`
Define data at an address. Body: `{ "address": "0x402000", "type": "dword" }`.
### `GET /data/{address}`
Get data item details (type, size, value representation).
### `PATCH /data/{address}`
Modify a data item: change `name`, `type`, or `comment`.
### `DELETE /data/{address}`
Undefine the data item at the specified address.
### `GET /strings`
List defined strings. Supports pagination and a `filter` parameter for substring matching.
```json
{
  "result": [
    { "address": "0x00401234", "value": "Hello, world!", "length": 14, "type": "string" },
    { "address": "0x00401250", "value": "Error: could not open file", "length": 26, "type": "string" }
  ]
}
```
---
## Structs
### `GET /structs`
List struct data types. Supports pagination and `category` filtering.
### `GET /structs?name={name}`
Get detailed struct information including all fields with offsets, types, and comments.
```json
{
  "result": {
    "name": "MyStruct",
    "size": 16,
    "category": "/custom",
    "fields": [
      { "name": "id", "offset": 0, "length": 4, "type": "int", "comment": "Unique identifier" },
      { "name": "flags", "offset": 4, "length": 4, "type": "dword", "comment": "" }
    ]
  }
}
```
### `POST /structs/create`
Create a struct. Body: `{ "name": "NetworkPacket", "category": "/network" }`.
### `POST /structs/addfield`
Add a field. Body: `{ "struct": "NetworkPacket", "fieldName": "header", "fieldType": "dword" }`.
### `POST /structs/updatefield`
Update a field. Identify by `fieldName` or `fieldOffset`, then provide `newName`, `newType`, and/or `newComment`.
### `POST /structs/delete`
Delete a struct. Body: `{ "name": "NetworkPacket" }`.
---
## Symbols
### `GET /symbols`
List all symbols. Supports search and pagination. Can filter by `type` (`function`, `data`, `label`).
### `POST /symbols`
Create or rename a symbol. Body: `{ "address": "0x401000", "name": "my_label" }`.
### `PATCH /symbols/{address}`
Modify a symbol (rename, change namespace, set as primary).
### `DELETE /symbols/{address}`
Remove the symbol at the specified address.
---
## Memory
### `GET /memory/{address}`
Read bytes from memory.
| Parameter | Description |
|-----------|-------------|
| `length` | Number of bytes to read (required; subject to a server-imposed maximum) |
| `format` | `hex`, `base64`, or `string` (default: `hex`) |
```json
{
  "result": {
    "address": "0x402000",
    "length": 16,
    "format": "hex",
    "bytes": "48656C6C6F20576F726C642100000000"
  }
}
```
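The hex payload in the example response can be decoded on the client with `bytes.fromhex`. The snippet below is a short sketch that also trims at the first NUL byte, on the assumption that the region holds a C-style string.

```python
import json

response = json.loads("""
{
  "result": {
    "address": "0x402000",
    "length": 16,
    "format": "hex",
    "bytes": "48656C6C6F20576F726C642100000000"
  }
}
""")

raw = bytes.fromhex(response["result"]["bytes"])
# Keep everything before the first NUL terminator.
text = raw.split(b"\x00", 1)[0].decode("ascii")
print(len(raw), repr(text))
```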
### `PATCH /memory/{address}`
Write bytes. Body: `{ "bytes": "DEADBEEF", "format": "hex" }`. Use with caution.
---
## Segments
### `GET /segments`
List memory segments (`.text`, `.data`, `.bss`, etc.) with address ranges, sizes, and R/W/X permissions.
### `GET /segments/{name}`
Get details for a specific segment.
---
## Cross-References
### `GET /xrefs`
Find cross-references. At least one query parameter is required:
| Parameter | Description |
|-----------|-------------|
| `to_addr` | References pointing to this address |
| `from_addr` | References originating from this address |
| `type` | Filter: `CALL`, `READ`, `WRITE`, `DATA`, `POINTER` |
Supports pagination.
---
## Analysis
### `GET /analysis`
Get analysis status and list of available analyzers.
```json
{
  "result": {
    "program": "mybinary.exe",
    "analysis_enabled": true,
    "available_analyzers": [
      "Function Start Analyzer",
      "Reference Analyzer",
      "Decompiler Parameter ID"
    ]
  }
}
```
### `POST /analysis`
Trigger re-analysis of the program.
### `GET /analysis/callgraph`
Generate a call graph.
| Parameter | Default | Description |
|-----------|---------|-------------|
| `function` | *entry point* | Starting function name |
| `max_depth` | `3` | Maximum call depth |
Returns `nodes` (functions) and `edges` (calls between them with call-site addresses).
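Once the edges are in hand, a caller-to-callees map is often the most convenient shape to work with. The sketch below groups edges by caller; the field names `from`, `to`, and `call_site` are assumptions, since this page does not spell out the exact edge schema, and the sample data is invented.

```python
from collections import defaultdict


def adjacency(edges: list[dict]) -> dict[str, list[str]]:
    """Group call-graph edges into a caller -> callees mapping."""
    graph = defaultdict(list)
    for edge in edges:
        graph[edge["from"]].append(edge["to"])
    return dict(graph)


# Invented sample edges using the assumed field names.
sample_edges = [
    {"from": "main", "to": "init_peripherals", "call_site": "0x401010"},
    {"from": "main", "to": "process_data", "call_site": "0x401020"},
    {"from": "process_data", "to": "calculate_checksum", "call_site": "0x4010c4"},
]

print(adjacency(sample_edges))
```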
### `GET /analysis/dataflow`
Trace data flow from an address.
| Parameter | Default | Description |
|-----------|---------|-------------|
| `address` | *required* | Starting address |
| `direction` | `forward` | `forward` or `backward` |
| `max_steps` | `50` | Maximum analysis steps |
Returns a list of `steps`, each with an address, instruction, and description.


@ -1,6 +1,6 @@
 [project]
 name = "mcghidra"
-version = "2026.3.6.1"
+version = "2026.3.7"
 description = "Reverse engineering bridge: multi-instance Ghidra plugin with HATEOAS REST API and MCP server for decompilation, analysis & binary manipulation"
 readme = "README.md"
 requires-python = ">=3.11"
@ -17,6 +17,11 @@ dependencies = [
 [project.scripts]
 mcghidra = "mcghidra:main"
+[project.urls]
+Documentation = "https://mcghidra.warehack.ing"
+Repository = "https://git.supported.systems/MCP/mcghidra"
+Issues = "https://git.supported.systems/MCP/mcghidra/issues"
 [build-system]
 requires = ["hatchling"]
 build-backend = "hatchling.build"