Compare commits

No commits in common. "04f30114138e2ff00ff17a9831007763a3f114bb" and "f1986db6cc5fd97fd1e813ac3d173f8d955b160c" have entirely different histories.

3 changed files with 624 additions and 280 deletions

@ -59,12 +59,11 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
### Changed
- **Docker Port Allocation:** Ports are now auto-allocated from pool (8192-8223) instead of client-specified. Prevents session collisions in multi-agent environments.
- **docker_auto_start:** Removed `wait` and `timeout` parameters. Always returns immediately after starting container.
- **Removed docker_wait tool:** This tool blocked for up to 5 minutes in a single call. LLMs should poll `docker_health(port)` in their own loop instead — this gives visibility into progress and ability to check logs between polls.
- **docker_auto_start:** Default `wait=False` for immediate return. Use `docker_wait` separately to poll for container readiness.
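
The pool-based allocation described in the entries above might look roughly like the following. This is a minimal sketch, not the project's implementation: `PORT_POOL` and `allocate_port` are hypothetical names, and only the 8192-8223 range comes from the changelog. A real allocator would likely also need to coordinate across processes, which this sketch does not attempt.

```python
from typing import Optional, Set

# Range taken from the changelog entry: 8192-8223 inclusive.
PORT_POOL = range(8192, 8224)


def allocate_port(in_use: Set[int]) -> Optional[int]:
    """Return the lowest free port in the pool, or None if the pool is exhausted.

    `in_use` is a hypothetical in-memory record of ports already handed out;
    the chosen port is added to it before returning.
    """
    for port in PORT_POOL:
        if port not in in_use:
            in_use.add(port)
            return port
    return None
```

Because the server picks the port, two agents that start containers at the same time cannot request the same port by accident.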
### Fixed
- **instances_use Hanging:** Eliminated 4+ hour hangs by removing blocking HTTP call. Now uses lazy registration — just creates a stub entry, validates on first real tool call.
- **All Docker Operations Non-Blocking:** ALL Docker subprocess calls (`docker ps`, `docker run`, `docker stop`, etc.) now run in thread executor via `run_in_executor()`. Previously only `docker_health` was fixed, but `docker_status`, `docker_start`, `docker_stop`, `docker_logs`, `docker_build`, and `docker_cleanup` still blocked the event loop. This caused `docker_auto_start(wait=True)` to freeze the MCP server.
- **Event Loop Blocking:** `docker_health` now runs HTTP checks in thread executor via `run_in_executor()`, preventing MCP server freeze during health polling.
- **Session Isolation:** `docker_stop` now validates container belongs to current session before stopping. `docker_cleanup` defaults to `session_only=True` to prevent cross-session interference.
- **Background Discovery Thread:** Fixed timeout from 30s to 0.5s for port scanning, reducing discovery cycle from 300s+ to ~15s.
- **Typedef/Variable Type Resolution:** Fixed `handle_typedef_create` and `handle_variable_rename` to use shared `resolve_data_type()` for builtin types (int, char, etc.).
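
The non-blocking pattern named in the entries above (a blocking subprocess call moved off the event loop with `run_in_executor()`) can be sketched as follows. The function names are illustrative assumptions, and `sys.executable` stands in for the `docker` binary so the sketch runs without Docker installed.

```python
import asyncio
import subprocess
import sys
from typing import List


def run_cmd_sync(args: List[str]) -> subprocess.CompletedProcess:
    """Blocking subprocess call; should not run on the event loop thread."""
    return subprocess.run(args, capture_output=True, text=True, check=False)


async def run_cmd(args: List[str]) -> subprocess.CompletedProcess:
    """Run the blocking call in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, run_cmd_sync, args)


async def main() -> str:
    # Stand-in command; a Docker-backed version would pass ["docker", "ps"] etc.
    result = await run_cmd([sys.executable, "-c", "print('ok')"])
    return result.stdout.strip()
```

While `run_cmd` is awaiting, the event loop remains free to serve other requests, which is the property the fix relies on.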

README.md (753 changed lines)

@ -1,302 +1,601 @@
# GhydraMCP
**AI-native reverse engineering.** Give Claude (or any MCP client) direct access to Ghidra's analysis engine.
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ "Analyze the authentication bypass in this firmware" │
│ │
│ Claude: I'll decompile the auth functions and trace the validation logic. │
│ │
│ [functions_list grep="auth|login|verify"] │
│ [functions_decompile name="verify_password"] │
│ [xrefs_list to_addr="0x0040156c"] │
│ [analysis_get_dataflow address="0x00401234" direction="backward"] │
│ │
│ Found it. The password check at 0x401580 compares against a hardcoded │
│ hash, but there's a debug backdoor at 0x401590 that bypasses validation │
│ when the username starts with "debug_". Let me show you the call graph... │
└─────────────────────────────────────────────────────────────────────────────┘
```
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/starsong-consulting/GhydraMCP)](https://github.com/starsong-consulting/GhydraMCP/releases)
[![API Version](https://img.shields.io/badge/API-v2.1-orange)](https://github.com/starsong-consulting/GhydraMCP/blob/main/GHIDRA_HTTP_API.md)
[![GitHub stars](https://img.shields.io/github/stars/starsong-consulting/GhydraMCP)](https://github.com/starsong-consulting/GhydraMCP/stargazers)
[![GitHub forks](https://img.shields.io/github/forks/starsong-consulting/GhydraMCP)](https://github.com/starsong-consulting/GhydraMCP/network/members)
[![GitHub contributors](https://img.shields.io/github/contributors/starsong-consulting/GhydraMCP)](https://github.com/starsong-consulting/GhydraMCP/graphs/contributors)
[![Build Status](https://github.com/starsong-consulting/GhydraMCP/actions/workflows/build.yml/badge.svg)](https://github.com/starsong-consulting/GhydraMCP/actions/workflows/build.yml)
## What You Get
# GhydraMCP v2.1
**64 MCP tools** across 12 categories:
GhydraMCP is a powerful bridge between [Ghidra](https://ghidra-sre.org/) and AI assistants that enables comprehensive AI-assisted reverse engineering through the [Model Context Protocol (MCP)](https://github.com/modelcontextprotocol/mcp).
| Category | Tools | What it does |
|----------|-------|--------------|
| **Functions** | 11 | Decompile, disassemble, rename, set signatures, list variables |
| **Data** | 8 | Create/modify data items, list strings, set types |
| **Structs** | 7 | Create structs, add/update fields, manage data types |
| **Symbols** | 9 | Create labels, rename symbols, list imports/exports |
| **Analysis** | 6 | Call graphs, data flow, cross-references, run analysis |
| **Memory** | 2 | Read/write raw bytes |
| **Variables** | 4 | List/rename function variables, set types |
| **Bookmarks** | 3 | Create/list/delete analysis bookmarks |
| **Enums/Typedefs** | 4 | Create enum and typedef data types |
| **Namespaces** | 2 | List namespaces and classes |
| **Segments** | 1 | List memory segments with permissions |
| **Docker** | 7 | Auto-start containers, health checks, session management |
![GhydraMCP logo](https://github.com/user-attachments/assets/86b9b2de-767c-4ed5-b082-510b8109f00f)
**13 analysis prompts** for common RE workflows:
- `malware_triage` — Quick capability assessment
- `identify_crypto` — Find crypto functions and constants
- `find_authentication` — Locate auth, license checks, credentials
- `analyze_protocol` — Reverse network/file protocols
- `trace_data_flow` — Taint analysis through functions
- And 8 more specialized prompts...
## Overview
**11 MCP resources** for quick enumeration without tool calls.
GhydraMCP v2.1 integrates three key components:
---
1. **Modular Ghidra Plugin**: Exposes Ghidra's powerful reverse engineering capabilities through a HATEOAS-driven REST API
2. **MCP Bridge**: A Python script that translates MCP requests into API calls with comprehensive type checking
3. **Multi-instance Architecture**: Connect multiple Ghidra instances to analyze different binaries simultaneously
## Quick Start
This architecture enables AI assistants like Claude to seamlessly:
- Decompile and analyze binary code with customizable output formats
- Map program structures, function relationships, and complex data types
- Perform advanced binary analysis (cross-references, call graphs, data flow, etc.)
- Make precise modifications to the analysis (rename, annotate, create/delete/modify data, etc.)
- Read memory directly and manipulate binary at a low level
- Navigate resources through discoverable HATEOAS links
### Option 1: Docker (Easiest)
GhydraMCP is based on [GhidraMCP by Laurie Wired](https://github.com/LaurieWired/GhidraMCP/) but has evolved into a comprehensive reverse engineering platform with enhanced multi-instance support, extensive data manipulation capabilities, and a robust HATEOAS-compliant API architecture.
No Ghidra installation needed. Analyze binaries in isolated containers.
# Features
```bash
# Build the image (once)
cd GhydraMCP && docker build -t ghydramcp:latest -f docker/Dockerfile .
GhydraMCP version 2.1 provides a comprehensive set of reverse engineering capabilities to AI assistants through its HATEOAS-driven API:
# Add to your MCP config
claude mcp add ghydramcp -- uv run --directory /path/to/GhydraMCP ghydramcp
```
## Advanced Program Analysis
Then in Claude:
```
Analyze /path/to/suspicious.exe
```
- **Enhanced Decompilation**:
- Convert binary functions to readable C code
- Toggle between clean C-like pseudocode and raw decompiler output
- Show/hide syntax trees for detailed analysis
- Multiple simplification styles for different analysis approaches
- **Comprehensive Static Analysis**:
- Cross-reference analysis (find callers and callees)
- Complete call graph generation and traversal
- Data flow analysis with variable tracking
- Type propagation and reconstruction
- Function relationship mapping
- **Memory Operations**:
- Direct memory reading with hex and raw byte representation
- Address space navigation and mapping
- Memory segment analysis
Claude will auto-start a container, wait for analysis, and begin work.
- **Symbol Management**:
- View and analyze imports and exports
- Identify library functions and dependencies
- Symbol table exploration and manipulation
- Namespace hierarchy visualization
### Option 2: Native Ghidra
## Interactive Reverse Engineering
1. **Install the Ghidra plugin:**
- Download latest [release](https://github.com/starsong-consulting/GhydraMCP/releases)
- In Ghidra: `File → Install Extensions → +` → select the `.zip`
- Restart Ghidra
- Enable in `File → Configure → Developer → GhydraMCPPlugin`
- **Code Understanding**:
- Explore function code with rich context
- Analyze data structures and complex types
- View disassembly with linking to decompiled code
- Examine function prototypes and signatures
- **Comprehensive Annotation**:
- Rename functions, variables, and data
- Add multiple comment types (EOL, plate, pre/post)
- Create and modify data types
- Set and update function signatures and prototypes
2. **Add MCP server:**
```bash
claude mcp add ghydramcp -- uv run --directory /path/to/GhydraMCP ghydramcp
```
## Complete Data Manipulation
3. **Open a binary in Ghidra**, then ask Claude to analyze it.
- **Data Creation and Management**:
- Create new data items with specified types
- Delete existing data items
- Rename data items with proper scope handling
- Set and update data types for existing items
- Combined rename and retype operations
- Type definition management
---
- **Function Manipulation**:
- Rename functions with proper scoping
- Update function signatures with parameter information
- Modify local variable names and types
- Set function return types
## How It Works
## Multi-instance Support
```
┌──────────────┐ MCP ┌──────────────┐ HTTP ┌──────────────┐
│ Claude │◄────────────►│ GhydraMCP │◄────────────►│ Ghidra │
│ (or other │ stdio │ (Python) │ REST API │ Plugin │
│ MCP client) │ │ │ │ (Java) │
└──────────────┘ └──────────────┘ └──────────────┘
```
- Run multiple Ghidra instances simultaneously
- Analyze different binaries in parallel
- Connect to specific instances using port numbers
- Auto-discovery of running Ghidra instances
- Instance metadata with project and file information
- Plugin version and API checking for compatibility
- **Ghidra Plugin**: Exposes Ghidra's analysis via HTTP REST API (HATEOAS)
- **GhydraMCP Server**: Translates MCP tool calls to API requests
- **Multi-instance**: Analyze multiple binaries simultaneously on different ports
- **Session isolation**: Docker containers get unique ports, preventing conflicts
## Program Navigation and Discovery
---
- List and search functions, classes, and namespaces
- View memory segments and layout
- Search by name, pattern, or signature
- Resource discovery through HATEOAS links
- Pagination for handling large result sets
- Filtering capabilities across all resources
## Usage Patterns
# Installation
### Set Current Instance (Then Forget About Ports)
## Prerequisites
- Install [Ghidra](https://ghidra-sre.org)
- Python3
- MCP [SDK](https://github.com/modelcontextprotocol/python-sdk)
## Ghidra
First, download the latest [release](https://github.com/teal-bauer/GhydraMCP/releases) from this repository. The "Complete" artifact contains the zipped Ghidra plugin and the Python MCP bridge. Unpack the outer archive, then, add the plugin to Ghidra:
1. Run Ghidra
2. Select `File` -> `Install Extensions`
3. Click the `+` button
4. Select the `GhydraMCP-[version].zip` file from the downloaded release
5. Restart Ghidra
6. Make sure the GhydraMCPPlugin is enabled in `File` -> `Configure` -> `Developer`
> **Note:** By default, the first CodeBrowser opened in Ghidra gets port 8192, the second gets 8193, and so on. You can check which ports are being used by looking at the Console in the Ghidra main (project) window - click the computer icon in the bottom right to "Open Console". Look for log entries like:
> ```
> (HydraMCPPlugin) Plugin loaded on port 8193
> (HydraMCPPlugin) HydraMCP HTTP server started on port 8193
> ```
>
> GhydraMCP now includes auto-discovery of running Ghidra instances, so manually registering each instance is typically not necessary. The MCP bridge will automatically discover and register instances on startup and periodically check for new ones.
Video Installation Guide:
https://github.com/user-attachments/assets/75f0c176-6da1-48dc-ad96-c182eb4648c3
## MCP Clients
GhydraMCP works with any MCP-compatible client using **stdio transport**. It has been tested and confirmed working with:
- **Claude Desktop** - Anthropic's official desktop application
- **Claude Code** - Anthropic's VS Code extension and CLI tool
- **Cline** - Popular VS Code extension for AI-assisted coding
See the [Client Setup](#client-setup) section below for detailed configuration instructions for each client.
## API Reference (Updated for v2.1)
### Available Tools
GhydraMCP v2.1 organizes tools into logical namespaces for better discoverability and organization:
**Instance Management** (`instances_*`):
- `instances_list`: List active Ghidra instances (auto-discovers on default host) - **use this first**
- `instances_discover`: Discover instances on a specific host (params: host [optional]) - **only use for non-default hosts**
- `instances_register`: Register new instance (params: port, url [optional])
- `instances_unregister`: Remove instance (params: port)
- `instances_use`: Set current working instance (params: port)
- `instances_current`: Get current working instance info
**Function Analysis** (`functions_*`):
- `functions_list`: List all functions (params: offset, limit, port [optional])
- `functions_get`: Get function details (params: name or address, port [optional])
- `functions_decompile`: Get decompiled C code (params: name or address, syntax_tree, style, timeout, port [optional])
- `functions_disassemble`: Get disassembled instructions (params: name or address, port [optional])
- `functions_create`: Create function at address (params: address, port [optional])
- `functions_rename`: Rename a function (params: old_name or address, new_name, port [optional])
- `functions_set_signature`: Update function prototype (params: name or address, signature, port [optional])
- `functions_get_variables`: Get function variables (params: name or address, port [optional])
- `functions_set_comment`: Set function comment (params: address, comment, port [optional])
**Data Manipulation** (`data_*`):
- `data_list`: List data items (params: offset, limit, addr, name, name_contains, port [optional])
- `data_list_strings`: List all defined strings (params: offset, limit, filter, port [optional])
- `data_create`: Create data at address (params: address, data_type, size [optional], port [optional])
- `data_rename`: Rename data item (params: address, name, port [optional])
- `data_delete`: Delete data item (params: address, port [optional])
- `data_set_type`: Change data type (params: address, data_type, port [optional])
**Struct Management** (`structs_*`):
- `structs_list`: List all struct data types (params: offset, limit, category [optional], port [optional])
- `structs_get`: Get detailed struct information (params: name, port [optional])
- `structs_create`: Create new struct (params: name, category [optional], description [optional], port [optional])
- `structs_add_field`: Add field to struct (params: struct_name, field_name, field_type, offset [optional], comment [optional], port [optional])
- `structs_update_field`: Update struct field (params: struct_name, field_name or field_offset, new_name [optional], new_type [optional], new_comment [optional], port [optional])
- `structs_delete`: Delete struct (params: name, port [optional])
**Memory Operations** (`memory_*`):
- `memory_read`: Read bytes from memory (params: address, length, format, port [optional])
- `memory_write`: Write bytes to memory (params: address, bytes_data, format, port [optional])
**Cross-References** (`xrefs_*`):
- `xrefs_list`: List cross-references (params: to_addr [optional], from_addr [optional], type [optional], offset, limit, port [optional])
**Analysis** (`analysis_*`):
- `analysis_run`: Trigger program analysis (params: port [optional], analysis_options [optional])
- `analysis_get_callgraph`: Get function call graph (params: name or address, max_depth, port [optional])
- `analysis_get_dataflow`: Perform data flow analysis (params: address, direction, max_steps, port [optional])
**Example Usage**:
```python
instances_list() # Discover running Ghidra instances
instances_use(port=8192) # Set as current
functions_list() # No port needed!
data_list_strings(grep="password") # Uses current instance
# Instance Management - Always start here
client.use_tool("ghydra", "instances_list") # Auto-discovers instances on localhost
client.use_tool("ghydra", "instances_use", {"port": 8192}) # Set working instance
client.use_tool("ghydra", "instances_current") # Check current instance
# Function Analysis
client.use_tool("ghydra", "functions_list", {"offset": 0, "limit": 100})
client.use_tool("ghydra", "functions_get", {"name": "main"})
client.use_tool("ghydra", "functions_decompile", {"address": "0x00401000"})
client.use_tool("ghydra", "functions_disassemble", {"name": "main"})
client.use_tool("ghydra", "functions_rename", {"address": "0x00401000", "new_name": "process_data"})
client.use_tool("ghydra", "functions_set_signature", {"address": "0x00401000", "signature": "int process_data(char* buf, int len)"})
client.use_tool("ghydra", "functions_set_comment", {"address": "0x00401000", "comment": "Main processing function"})
# Data Manipulation
client.use_tool("ghydra", "data_list_strings", {"filter": "password"}) # Find strings containing "password"
client.use_tool("ghydra", "data_list", {"offset": 0, "limit": 50})
client.use_tool("ghydra", "data_create", {"address": "0x00401234", "data_type": "int"})
client.use_tool("ghydra", "data_rename", {"address": "0x00401234", "name": "counter"})
client.use_tool("ghydra", "data_set_type", {"address": "0x00401238", "data_type": "char *"})
client.use_tool("ghydra", "data_delete", {"address": "0x0040123C"})
# Struct Management
client.use_tool("ghydra", "structs_create", {"name": "NetworkPacket", "category": "/network"})
client.use_tool("ghydra", "structs_add_field", {
"struct_name": "NetworkPacket",
"field_name": "header",
"field_type": "dword",
"comment": "Packet header"
})
client.use_tool("ghydra", "structs_add_field", {
"struct_name": "NetworkPacket",
"field_name": "data_ptr",
"field_type": "pointer"
})
client.use_tool("ghydra", "structs_update_field", {
"struct_name": "NetworkPacket",
"field_name": "header",
"new_name": "packet_header",
"new_comment": "Updated header field"
})
client.use_tool("ghydra", "structs_get", {"name": "NetworkPacket"})
client.use_tool("ghydra", "structs_list", {"category": "/network"})
# Memory Operations
client.use_tool("ghydra", "memory_read", {"address": "0x00401000", "length": 16, "format": "hex"})
client.use_tool("ghydra", "memory_write", {"address": "0x00401000", "bytes_data": "90909090", "format": "hex"})
# Cross-References
client.use_tool("ghydra", "xrefs_list", {"to_addr": "0x00401000"}) # Find callers
client.use_tool("ghydra", "xrefs_list", {"from_addr": "0x00401000"}) # Find callees
# Analysis
client.use_tool("ghydra", "analysis_get_callgraph", {"name": "main", "max_depth": 5})
client.use_tool("ghydra", "analysis_get_dataflow", {"address": "0x00401050", "direction": "forward"})
client.use_tool("ghydra", "analysis_run") # Trigger full analysis
```
### Docker Workflow
## Client Setup
```python
# Start container (returns immediately)
result = docker_auto_start(binary_path="/path/to/malware.exe")
# → {port: 8195, message: "Poll docker_health(port=8195)..."}
GhydraMCP works with any MCP-compatible client. Below are configuration examples for popular AI coding assistants.
# Poll until ready
while True:
health = docker_health(port=8195)
if health["healthy"]:
break
# Can check docker_logs() while waiting
### Installation Methods
# Register and use
instances_use(port=8195)
functions_list() # Ready to analyze
```
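
The polling loop above has no pause or upper bound. A client-side version with both might look like this sketch, where `check` is a hypothetical callable standing in for a `docker_health(port=...)` call, and the default timeout and interval are assumptions rather than values from the project.

```python
import time
from typing import Callable, Dict


def wait_until_healthy(
    check: Callable[[], Dict],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it reports healthy or the timeout elapses.

    Returns True as soon as check() returns a dict with a truthy "healthy"
    key, and False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check().get("healthy"):
            return True
        sleep(interval)  # leave room to inspect logs between polls
    return False
```

Injecting `sleep` keeps the helper easy to test and lets a caller substitute its own wait strategy.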
#### Recommended: Local Installation from Release
### Cursor-Based Pagination
Download the latest [release](https://github.com/starsong-consulting/GhydraMCP/releases) to ensure the bridge and plugin versions are in sync.
Large binaries can have 100K+ functions. Use cursors:
```python
result = functions_list(page_size=100)
# → {items: [...], cursor_id: "abc123", has_more: true}
# Get next page
cursor_next(cursor_id="abc123")
# Or filter server-side
functions_list(grep="crypto|encrypt", page_size=50)
```
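
Draining every page can be wrapped in a small generator. The sketch below assumes the response fields shown in the example above (`items`, `cursor_id`, `has_more`); `fetch_next` is a hypothetical callable standing in for a `cursor_next(cursor_id=...)` call.

```python
from typing import Any, Callable, Dict, Iterator


def iter_items(
    first_page: Dict[str, Any],
    fetch_next: Callable[[str], Dict[str, Any]],
) -> Iterator[Any]:
    """Yield items from the first page and from every following page."""
    page = first_page
    while True:
        yield from page.get("items", [])
        if not page.get("has_more"):
            return
        # Follow the cursor to the next page.
        page = fetch_next(page["cursor_id"])
```

Since the generator is lazy, a caller can stop iterating early and avoid fetching pages it does not need.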
### Analysis Prompts
Built-in prompts for common workflows:
```
/prompt malware_triage
/prompt identify_crypto
/prompt find_authentication
```
These guide Claude through systematic analysis with progress reporting.
---
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `GHIDRA_HYDRA_HOST` | `localhost` | Ghidra instance host |
| `GHIDRA_HYDRA_PORT` | `8192` | Default port |
### MCP Config Examples
**Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
```json
{
"mcpServers": {
"ghydramcp": {
"ghydra": {
"command": "uv",
"args": ["run", "--directory", "/path/to/GhydraMCP", "ghydramcp"]
"args": [
"run",
"/ABSOLUTE_PATH_TO/bridge_mcp_hydra.py"
],
"env": {
"GHIDRA_HYDRA_HOST": "localhost"
}
}
}
}
```
**Claude Code**:
```bash
claude mcp add ghydramcp -- uv run --directory /path/to/GhydraMCP ghydramcp
Replace `/ABSOLUTE_PATH_TO/` with the actual path to your `bridge_mcp_hydra.py` file.
> **Note:** You can also use `python` instead of `uv run`, but then you'll need to manually install the requirements first with `pip install mcp requests`.
#### Alternative: Direct from Repository with uvx
If you want to use the latest development version, you can run directly from the GitHub repository:
```json
{
"mcpServers": {
"ghydra": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/starsong-consulting/GhydraMCP",
"ghydramcp"
],
"env": {
"GHIDRA_HYDRA_HOST": "localhost"
}
}
}
}
```
---
> **Warning:** This method may pull a bridge version that's out of sync with your installed plugin. Only use this if you're tracking the latest development branch.
## Tool Reference
### Claude Desktop Configuration
### Instance Management
```
instances_list # Discover Ghidra instances (use this first!)
instances_use # Set current working instance
instances_current # Show current instance info
Add your chosen configuration method to your Claude Desktop configuration file:
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
### Claude Code Configuration
Claude Code automatically discovers MCP servers configured in Claude Desktop. If you've set up the configuration above, Claude Code will have access to GhydraMCP tools immediately.
Alternatively, you can configure Claude Code separately by adding the same configuration to the MCP settings in Claude Code's configuration.
### Cline Configuration
Cline (VS Code extension) uses a separate configuration file. To set up GhydraMCP with Cline:
1. Open VS Code with Cline installed
2. Click the "MCP Servers" icon in Cline's interface
3. Select the "Configure" tab
4. Click "Configure MCP Servers" to edit `cline_mcp_settings.json`
5. Add the following configuration:
```json
{
"mcpServers": {
"ghydra": {
"command": "uv",
"args": [
"run",
"/ABSOLUTE_PATH_TO/bridge_mcp_hydra.py"
],
"env": {
"GHIDRA_HYDRA_HOST": "localhost"
},
"disabled": false
}
}
}
```
### Function Analysis
```
functions_list # List functions (supports grep, pagination)
functions_get # Get function details by name or address
functions_decompile # Decompile to C pseudocode
functions_disassemble # Get assembly instructions
functions_rename # Rename a function
functions_set_signature # Set function prototype
functions_set_comment # Add decompiler comment
functions_create # Create function at address
functions_variables # List local variables and parameters
If you prefer to use `python` directly instead of `uv`:
```json
{
"mcpServers": {
"ghydra": {
"command": "python",
"args": [
"/ABSOLUTE_PATH_TO/bridge_mcp_hydra.py"
],
"env": {
"GHIDRA_HYDRA_HOST": "localhost"
},
"disabled": false
}
}
}
```
### Data Operations
> **Important:** Replace `/ABSOLUTE_PATH_TO/` with the actual absolute path to your `bridge_mcp_hydra.py` file. For example:
> - **Linux/macOS**: `/home/username/GhydraMCP/bridge_mcp_hydra.py`
> - **Windows**: `C:\\Users\\username\\GhydraMCP\\bridge_mcp_hydra.py`
After saving the configuration, restart Cline to load the GhydraMCP server.
### 5ire Configuration
1. Tool Key: ghydra
2. Name: GhydraMCP
3. Command: `uv run /ABSOLUTE_PATH_TO/bridge_mcp_hydra.py`
## Example Session
Below is an example of how an AI assistant might interact with Ghidra using GhydraMCP v2.1:
**User:**
```
data_list # List defined data items
data_list_strings # List strings (with grep filtering)
data_create # Define data at address
data_rename # Rename data item
data_set_type # Change data type
data_delete # Remove data definition
Analyze the main function in the binary and identify any data structures being used.
```
### Cross-References & Analysis
**Assistant:**
```
xrefs_list # Find cross-references to/from address
analysis_get_callgraph # Generate call graph
analysis_get_dataflow # Trace data flow forward/backward
analysis_run # Trigger Ghidra auto-analysis
I'll analyze the binary for you. Let me start by checking available Ghidra instances.
[Uses instances_list]
Found Ghidra instance on port 8192 analyzing "malware_sample.exe".
[Uses instances_use with port 8192]
Set instance 8192 as current working instance.
[Uses functions_decompile with name "main"]
Here's the decompiled main function:
void main(void) {
NetworkPacket *packet;
int status;
packet = (NetworkPacket *)malloc(0x20);
packet->header = 0x1234;
packet->data_ptr = allocate_buffer(0x100);
status = process_packet(packet);
...
}
I can see this code uses a "NetworkPacket" structure. Let me check if it's already defined.
[Uses structs_get with name "NetworkPacket"]
The struct doesn't exist yet. Let me create it based on the decompiled code.
[Uses structs_create with name "NetworkPacket", category "/network"]
Created NetworkPacket struct.
[Uses structs_add_field for each field]
Added fields:
- header (dword at offset 0)
- data_ptr (pointer at offset 4)
- size (dword at offset 8)
- status (dword at offset 12)
[Uses data_list_strings with filter "Error"]
I also found several error strings that suggest this binary handles network communications with error checking.
The main function appears to be the entry point for a network packet processing routine that allocates and processes data structures for network communication.
```
### Structs & Types
**User:**
```
structs_list # List struct definitions
structs_get # Get struct with all fields
structs_create # Create new struct
structs_add_field # Add field to struct
structs_update_field # Modify existing field
structs_delete # Remove struct
enums_list / enums_create
typedefs_list / typedefs_create
Can you trace where process_packet is called from and show me the call graph?
```
### Docker Management
**Assistant:**
```
docker_auto_start # Start container for binary (auto port allocation)
docker_health # Check if container API is responding
docker_status # List all containers and images
docker_start # Manual container start
docker_stop # Stop container (session-scoped)
docker_logs # Get container logs
docker_cleanup # Remove orphaned containers
I'll trace the call relationships for process_packet.
[Uses xrefs_list with to_addr for process_packet]
Found 3 callers of process_packet:
- main at 0x401000
- worker_thread at 0x401200
- handle_request at 0x401450
[Uses analysis_get_callgraph with name "process_packet", max_depth 3]
Here's the call graph starting from process_packet:
process_packet (0x401100)
├── validate_header (0x401150)
│ └── check_magic (0x401180)
├── parse_data (0x4011A0)
│ ├── extract_field (0x4011D0)
│ └── validate_checksum (0x4011E0)
└── send_response (0x401220)
└── network_send (0x401250)
This shows process_packet coordinates validation, parsing, and response transmission.
```
See `--help` or the [API docs](GHIDRA_HTTP_API.md) for full parameter details.
# JSON Communication
---
GhydraMCP uses structured JSON for all communication between the Python bridge and Java plugin. This ensures consistent and reliable data exchange.
## Building from Source
## API Architecture
```bash
# Clone
git clone https://github.com/starsong-consulting/GhydraMCP
cd GhydraMCP
GhydraMCP v2.1 implements a comprehensive HATEOAS-driven REST API that follows hypermedia design principles:
# Build Ghidra plugin
### Core API Design
- **HATEOAS Architecture**: Each response includes navigational links for resource discovery
- **Versioned Endpoints**: All requests verified against API version for compatibility
- **Structured Responses**: Standardized JSON format with consistent field naming
- **Proper HTTP Methods**: GET for retrieval, POST for creation, PATCH for updates, DELETE for removal
- **Appropriate Status Codes**: Uses standard HTTP status codes for clear error handling
### Response Format
All responses follow this HATEOAS-driven format:
```json
{
"id": "req-123",
"instance": "http://localhost:8192",
"success": true,
"result": "...",
"timestamp": 1712159482123,
"_links": {
"self": {"href": "/endpoint/current"},
"related": [
{"href": "/endpoint/related1", "name": "Related Resource 1"},
{"href": "/endpoint/related2", "name": "Related Resource 2"}
]
}
}
```
For list responses, pagination information is included:
```json
{
"id": "req-123",
"instance": "http://localhost:8192",
"success": true,
"result": [ ... objects ... ],
"size": 150,
"offset": 0,
"limit": 50,
"_links": {
"self": { "href": "/functions?offset=0&limit=50" },
"next": { "href": "/functions?offset=50&limit=50" },
"prev": { "href": "/functions?offset=0&limit=50" }
}
}
```
Error responses include detailed information:
```json
{
"id": "req-123",
"instance": "http://localhost:8192",
"success": false,
"error": {
"code": "RESOURCE_NOT_FOUND",
"message": "Function 'main' not found in current program"
},
"status_code": 404,
"timestamp": 1712159482123,
"_links": {
"self": {"href": "/functions/main"}
}
}
```
This HATEOAS approach enables resource discovery and self-documenting APIs, making integration and exploration significantly easier.
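
A client consuming the response shapes shown above could handle them with two small helpers. This is a sketch under the assumption that responses follow exactly those shapes; `unwrap` and `next_href` are illustrative names, not part of the bridge.

```python
from typing import Any, Dict, Optional


def unwrap(response: Dict[str, Any]) -> Any:
    """Return the "result" of a successful response, or raise on an error response."""
    if response.get("success"):
        return response.get("result")
    err = response.get("error") or {}
    raise RuntimeError(f"{err.get('code', 'UNKNOWN')}: {err.get('message', '')}")


def next_href(response: Dict[str, Any]) -> Optional[str]:
    """Return the href of the "next" link in "_links", if the response has one."""
    link = response.get("_links", {}).get("next")
    return link.get("href") if isinstance(link, dict) else None
```

Following `next_href` until it returns `None` walks a paginated list without the client building URLs itself.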
# Testing
GhydraMCP includes comprehensive test suites for both the HTTP API and MCP bridge. See [TESTING.md](TESTING.md) for details on running the tests.
## HTTP API Tests
Tests the HTTP endpoints exposed by the Java plugin:
- Response format and structure
- JSON structure consistency
- Required fields in responses
- Error handling
## MCP Bridge Tests
Tests the MCP bridge functionality:
- MCP protocol communication
- Tool availability and structure
- Response format and structure
- JSON structure consistency
# Building from Source
You can build different artifacts with Maven:
## Build Everything (Default)
Build both the Ghidra plugin and the complete package:
```
mvn clean package
# → target/GhydraMCP-[version].zip
# Build Docker image
docker build -t ghydramcp:latest -f docker/Dockerfile .
# Run MCP server (for development)
uv run ghydramcp
```
---
This creates:
- `target/GhydraMCP-[version].zip` - The Ghidra plugin only
- `target/GhydraMCP-Complete-[version].zip` - Complete package with plugin and bridge script
## Architecture
## Build Ghidra Plugin Only
If you only need the Ghidra plugin:
GhydraMCP is designed for AI agents:
```
mvn clean package -P plugin-only
```
- **Lazy registration**: `instances_use` doesn't block — validates on first real call
- **Non-blocking I/O**: All Docker/HTTP operations run in thread executors
- **Session isolation**: Each MCP session gets unique container ports
- **Cursor pagination**: Handle 100K+ item responses without context overflow
- **Server-side grep**: Filter results before they hit the wire
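
The non-blocking I/O point can be illustrated with a minimal sketch: a blocking `subprocess.run` call is handed to the default thread pool via `run_in_executor`, so the event loop stays free for other requests. The function names here are hypothetical, and a Python one-liner stands in for a real `docker` command so the example runs anywhere.

```python
import asyncio
import subprocess
import sys
from typing import List


def run_cmd_sync(args: List[str]) -> subprocess.CompletedProcess:
    """Blocking call; intended to run in a worker thread, not on the event loop."""
    return subprocess.run(args, capture_output=True, text=True, check=False)


async def run_cmd(args: List[str]) -> subprocess.CompletedProcess:
    """Run the blocking call in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, run_cmd_sync, args)


async def main() -> str:
    # Stand-in command; a real caller might pass ["docker", "ps"] instead.
    result = await run_cmd([sys.executable, "-c", "print('ok')"])
    return result.stdout.strip()
```

Running `asyncio.run(main())` executes the child process without blocking other coroutines scheduled on the same loop.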
## Build Complete Package Only
If you only need the combined package:
Based on [GhidraMCP by Laurie Wired](https://github.com/LaurieWired/GhidraMCP/), evolved into a comprehensive RE platform.
```
mvn clean package -P complete-only
```
---
## License
Apache 2.0
The Ghidra plugin includes these files required for Ghidra to recognize the extension:
- lib/GhydraMCP.jar
- extension.properties
- Module.manifest

@ -240,10 +240,10 @@ class DockerMixin(MCPMixin):
"""Check if Docker is available on the system."""
return shutil.which("docker") is not None
def _run_docker_cmd_sync(
def _run_docker_cmd(
self, args: List[str], check: bool = True, capture: bool = True
) -> subprocess.CompletedProcess:
"""Run a docker command synchronously (internal use only).
"""Run a docker command.
Args:
args: Command arguments (after 'docker')
@ -261,26 +261,6 @@ class DockerMixin(MCPMixin):
text=True,
)
async def _run_docker_cmd(
self, args: List[str], check: bool = True, capture: bool = True
) -> subprocess.CompletedProcess:
"""Run a docker command without blocking the event loop.
Uses run_in_executor to run subprocess in thread pool.
Args:
args: Command arguments (after 'docker')
check: Raise exception on non-zero exit
capture: Capture stdout/stderr
Returns:
CompletedProcess result
"""
loop = asyncio.get_event_loop()
return await loop.run_in_executor(
None, self._run_docker_cmd_sync, args, check, capture
)
def _run_compose_cmd(
self,
args: List[str],
@ -356,7 +336,7 @@ class DockerMixin(MCPMixin):
f"{self.LABEL_PREFIX}.pid": str(os.getpid()),
}
async def _find_containers_by_label(
def _find_containers_by_label(
self,
label_filter: Optional[str] = None,
session_only: bool = False,
@ -379,7 +359,7 @@ class DockerMixin(MCPMixin):
if label_filter:
filter_args.extend(["--filter", f"label={self.LABEL_PREFIX}.{label_filter}"])
ps_result = await self._run_docker_cmd(
ps_result = self._run_docker_cmd(
[
"ps", "-a",
*filter_args,
@ -445,23 +425,23 @@ class DockerMixin(MCPMixin):
# Check if docker daemon is running
try:
await self._run_docker_cmd(["info"], check=True)
self._run_docker_cmd(["info"], check=True)
result["docker_running"] = True
except (subprocess.CalledProcessError, FileNotFoundError):
return result
# Check for docker compose
try:
await self._run_docker_cmd(["compose", "version"], check=True)
self._run_docker_cmd(["compose", "version"], check=True)
result["compose_available"] = True
except subprocess.CalledProcessError:
pass
# List all GhydraMCP containers (from any session)
result["containers"] = await self._find_containers_by_label()
result["containers"] = self._find_containers_by_label()
# List containers from this session only
result["session_containers"] = await self._find_containers_by_label(session_only=True)
result["session_containers"] = self._find_containers_by_label(session_only=True)
# Get port pool status
if self._port_pool:
@ -469,7 +449,7 @@ class DockerMixin(MCPMixin):
# Also check by name pattern for containers without labels
try:
ps_result = await self._run_docker_cmd(
ps_result = self._run_docker_cmd(
[
"ps",
"-a",
@ -498,7 +478,7 @@ class DockerMixin(MCPMixin):
# List GhydraMCP images
try:
images_result = await self._run_docker_cmd(
images_result = self._run_docker_cmd(
[
"images",
"--filter",
@ -577,7 +557,7 @@ class DockerMixin(MCPMixin):
try:
# Check if container with this name already exists
check_result = await self._run_docker_cmd(
check_result = self._run_docker_cmd(
["ps", "-a", "-q", "-f", f"name=^{name}$"], check=False
)
if check_result.stdout.strip():
@ -587,7 +567,7 @@ class DockerMixin(MCPMixin):
}
# Check if port is already in use by a non-pool container
port_check = await self._run_docker_cmd(
port_check = self._run_docker_cmd(
["ps", "-q", "-f", f"publish={port}"], check=False
)
if port_check.stdout.strip():
@ -603,7 +583,7 @@ class DockerMixin(MCPMixin):
label_args.extend(["-l", f"{k}={v}"])
# Start the container
run_result = await self._run_docker_cmd(
run_result = self._run_docker_cmd(
[
"run",
"-d",
@ -677,7 +657,7 @@ class DockerMixin(MCPMixin):
container_port = None
container_session = None
try:
inspect_result = await self._run_docker_cmd(
inspect_result = self._run_docker_cmd(
[
"inspect",
"--format",
@ -703,10 +683,10 @@ class DockerMixin(MCPMixin):
try:
# Stop the container
await self._run_docker_cmd(["stop", name_or_id])
self._run_docker_cmd(["stop", name_or_id])
if remove:
await self._run_docker_cmd(["rm", name_or_id])
self._run_docker_cmd(["rm", name_or_id])
# Release the port back to the pool
if container_port:
@ -759,7 +739,7 @@ class DockerMixin(MCPMixin):
args.append("-f")
args.append(name_or_id)
result = await self._run_docker_cmd(args)
result = self._run_docker_cmd(args)
return {
"success": True,
"container": name_or_id,
@ -823,7 +803,7 @@ class DockerMixin(MCPMixin):
args.append(str(proj_path))
# Run build (this can take a while)
result = await self._run_docker_cmd(args, capture=True)
result = self._run_docker_cmd(args, capture=True)
return {
"success": True,
@ -881,23 +861,63 @@ class DockerMixin(MCPMixin):
description="Check if a GhydraMCP container's API is responding",
)
async def docker_health(
self, port: Optional[int] = None, timeout: float = 5.0, ctx: Optional[Context] = None
self, port: int = 8192, timeout: float = 5.0, ctx: Optional[Context] = None
) -> Dict[str, Any]:
"""Check if a GhydraMCP container's API is healthy.
Args:
port: API port to check (uses current instance if not specified)
port: API port to check (default: 8192)
timeout: Request timeout in seconds
Returns:
Health status and API info if available
"""
port = self.get_instance_port(port)
loop = asyncio.get_event_loop()
return await loop.run_in_executor(
None, self._sync_health_check, port, timeout
)
@mcp_tool(
name="docker_wait",
description="Wait for a GhydraMCP container to become healthy",
)
async def docker_wait(
self,
port: int = 8192,
timeout: float = 300.0,
interval: float = 5.0,
ctx: Optional[Context] = None,
) -> Dict[str, Any]:
"""Wait for a GhydraMCP container to become healthy.
Polls the API endpoint until it responds or timeout is reached.
Args:
port: API port to check (default: 8192)
timeout: Maximum time to wait in seconds (default: 300)
interval: Polling interval in seconds (default: 5)
Returns:
Health status once healthy, or error on timeout
"""
start_time = time.time()
last_error = None
while (time.time() - start_time) < timeout:
result = await self.docker_health(port=port, timeout=interval, ctx=ctx)
if result.get("healthy"):
result["waited_seconds"] = round(time.time() - start_time, 1)
return result
last_error = result.get("error")
await asyncio.sleep(interval)
return {
"healthy": False,
"port": port,
"error": f"Timeout after {timeout}s waiting for container",
"last_error": last_error,
}
@mcp_tool(
name="docker_auto_start",
description="Automatically start a GhydraMCP container with dynamic port allocation",
@ -905,6 +925,8 @@ class DockerMixin(MCPMixin):
async def docker_auto_start(
self,
binary_path: str,
wait: bool = False,
timeout: float = 300.0,
ctx: Optional[Context] = None,
) -> Dict[str, Any]:
"""Automatically start a Docker container with intelligent port allocation.
@ -912,20 +934,19 @@ class DockerMixin(MCPMixin):
This is the main entry point for automatic Docker management:
1. Checks if a Ghidra instance with the SAME binary is already running
2. If not, allocates a port from the pool and starts a new container
3. Returns connection info immediately
3. Optionally waits for the container to become healthy
4. Returns connection info for the instance
Ports are auto-allocated from the pool (8192-8223) to prevent
conflicts between concurrent sessions.
After starting, poll docker_health(port) in a loop to check readiness.
This gives you visibility into progress and ability to check logs.
Args:
binary_path: Path to the binary to analyze
wait: Wait for container to be ready (default: False, use docker_wait separately)
timeout: Max wait time in seconds (default: 300)
Returns:
Instance connection info with session ID and port details.
Poll docker_health(port) to check when container is ready.
Instance connection info with session ID and port details
"""
import os
@ -976,6 +997,31 @@ class DockerMixin(MCPMixin):
actual_port = start_result.get("port")
if wait:
# Wait for the container to become healthy
wait_result = await self.docker_wait(port=actual_port, timeout=timeout, ctx=ctx)
if wait_result.get("healthy"):
return {
"source": "docker",
"session_id": self.session_id,
"container_id": start_result.get("container_id"),
"container_name": start_result.get("name"),
"port": actual_port,
"api_url": f"http://localhost:{actual_port}/",
"program": wait_result.get("program"),
"waited_seconds": wait_result.get("waited_seconds"),
"message": f"Docker container ready on port {actual_port} after {wait_result.get('waited_seconds')}s",
}
else:
return {
"warning": "Container started but not yet healthy",
"session_id": self.session_id,
"container_id": start_result.get("container_id"),
"port": actual_port,
"last_error": wait_result.get("error"),
"message": "Container may still be analyzing. Check docker_logs() for progress.",
}
return {
"source": "docker",
"session_id": self.session_id,
@ -983,7 +1029,7 @@ class DockerMixin(MCPMixin):
"container_name": start_result.get("name"),
"port": actual_port,
"api_url": f"http://localhost:{actual_port}/",
"message": f"Container starting on port {actual_port}. Poll docker_health(port={actual_port}), then call instances_use(port={actual_port}) when healthy.",
"message": f"Container starting on port {actual_port}. Use docker_wait() or docker_health() to check status.",
}
@mcp_tool(
@ -1025,12 +1071,12 @@ class DockerMixin(MCPMixin):
}
# Find orphaned containers
containers = await self._find_containers_by_label(session_only=session_only)
containers = self._find_containers_by_label(session_only=session_only)
for container in containers:
# Check if container is old enough to be considered orphaned
try:
inspect_result = await self._run_docker_cmd(
inspect_result = self._run_docker_cmd(
["inspect", "--format", "{{index .Config.Labels \"" + self.LABEL_PREFIX + ".started\"}}", container["id"]],
check=False,
)