From 3fd0cf499a43ed8ec1acc93ae944124de51ad17b Mon Sep 17 00:00:00 2001
From: Teal Bauer
Date: Mon, 14 Apr 2025 21:37:42 +0200
Subject: [PATCH] docs: Update README for v2.0.0-beta.1

- Add comprehensive description of v2.0.0 features and capabilities
- Update API reference to include all available tools and operations
- Document HATEOAS architecture and response format
- Add detailed examples of using the new data manipulation API
- Update installation instructions for v2.0.0-beta.1
---
 README.md | 200 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 161 insertions(+), 39 deletions(-)

diff --git a/README.md b/README.md
index 16fe73c..7cee624 100644
--- a/README.md
+++ b/README.md
@@ -1,56 +1,94 @@
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)
 [![GitHub release (latest by date)](https://img.shields.io/github/v/release/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/releases)
+[![API Version](https://img.shields.io/badge/API-v2.0-orange)](https://github.com/teal-bauer/GhydraMCP/blob/main/GHIDRA_HTTP_API.md)
 [![GitHub stars](https://img.shields.io/github/stars/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/stargazers)
 [![GitHub forks](https://img.shields.io/github/forks/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/network/members)
 [![GitHub contributors](https://img.shields.io/github/contributors/teal-bauer/GhydraMCP)](https://github.com/teal-bauer/GhydraMCP/graphs/contributors)
 [![Build Status](https://github.com/teal-bauer/GhydraMCP/actions/workflows/build.yml/badge.svg)](https://github.com/teal-bauer/GhydraMCP/actions/workflows/build.yml)

-# GhydraMCP
+# GhydraMCP v2.0

-GhydraMCP is a bridge between [Ghidra](https://ghidra-sre.org/) and AI assistants that enables AI-assisted reverse engineering through the [Model Context Protocol (MCP)](https://github.com/modelcontextprotocol/mcp).
+GhydraMCP is a powerful bridge between [Ghidra](https://ghidra-sre.org/) and AI assistants that enables comprehensive AI-assisted reverse engineering through the [Model Context Protocol (MCP)](https://github.com/modelcontextprotocol/mcp).

 ![GhydraMCP logo](https://github.com/user-attachments/assets/86b9b2de-767c-4ed5-b082-510b8109f00f)

 ## Overview

-GhydraMCP consists of:
+GhydraMCP v2.0 integrates three key components:

-1. **Ghidra Plugin**: Exposes Ghidra's powerful reverse engineering capabilities through a REST API
-2. **MCP Bridge**: A Python script that translates MCP requests into API calls
-3. **Multi-instance Support**: Connect multiple Ghidra instances to analyze different binaries simultaneously
+1. **Modular Ghidra Plugin**: Exposes Ghidra's powerful reverse engineering capabilities through a HATEOAS-driven REST API
+2. **MCP Bridge**: A Python script that translates MCP requests into API calls with comprehensive type checking
+3. **Multi-instance Architecture**: Connect multiple Ghidra instances to analyze different binaries simultaneously

-This allows AI assistants like Claude to directly:
-- Decompile functions and analyze binary code
-- Understand program structure, function relationships, and data types
-- Perform binary analysis tasks (identify cross-references, data flow, etc.)
-- Make meaningful changes to the analysis (rename functions, add comments, etc.)
+This architecture enables AI assistants like Claude to seamlessly:
+- Decompile and analyze binary code with customizable output formats
+- Map program structures, function relationships, and complex data types
+- Perform advanced binary analysis (cross-references, call graphs, data flow, etc.)
+- Make precise modifications to the analysis (rename, annotate, create/delete/modify data, etc.)
+- Read memory directly and manipulate binary at a low level
+- Navigate resources through discoverable HATEOAS links

-GhydraMCP is based on [GhidraMCP by Laurie Wired](https://github.com/LaurieWired/GhidraMCP/) with added multi-instance support and numerous enhancements.
+GhydraMCP is based on [GhidraMCP by Laurie Wired](https://github.com/LaurieWired/GhidraMCP/) but has evolved into a comprehensive reverse engineering platform with enhanced multi-instance support, extensive data manipulation capabilities, and a robust HATEOAS-compliant API architecture.

 # Features

-GhydraMCP combines a Ghidra plugin with an MCP server to provide a comprehensive set of reverse engineering capabilities to AI assistants:
+GhydraMCP version 2.0 provides a comprehensive set of reverse engineering capabilities to AI assistants through its HATEOAS-driven API:

-## Program Analysis
+## Advanced Program Analysis

-- **Decompilation**: Convert binary functions to readable C code
-- **Static Analysis**:
-  - Cross-reference analysis (find who calls what)
-  - Data flow analysis
+- **Enhanced Decompilation**:
+  - Convert binary functions to readable C code
+  - Toggle between clean C-like pseudocode and raw decompiler output
+  - Show/hide syntax trees for detailed analysis
+  - Multiple simplification styles for different analysis approaches
+
+- **Comprehensive Static Analysis**:
+  - Cross-reference analysis (find callers and callees)
+  - Complete call graph generation and traversal
+  - Data flow analysis with variable tracking
   - Type propagation and reconstruction
+  - Function relationship mapping
+
+- **Memory Operations**:
+  - Direct memory reading with hex and raw byte representation
+  - Address space navigation and mapping
+  - Memory segment analysis
+
 - **Symbol Management**:
   - View and analyze imports and exports
   - Identify library functions and dependencies
+  - Symbol table exploration and manipulation
+  - Namespace hierarchy visualization

 ## Interactive Reverse Engineering

 - **Code Understanding**:
-  - Explore function code and relationships
-  - Analyze data structures and types
-- **Annotation**:
+  - Explore function code with rich context
+  - Analyze data structures and complex types
+  - View disassembly with linking to decompiled code
+  - Examine function prototypes and signatures
+
+- **Comprehensive Annotation**:
   - Rename functions, variables, and data
-  - Add comments and documentation
+  - Add multiple comment types (EOL, plate, pre/post)
   - Create and modify data types
+  - Set and update function signatures and prototypes
+
+## Complete Data Manipulation
+
+- **Data Creation and Management**:
+  - Create new data items with specified types
+  - Delete existing data items
+  - Rename data items with proper scope handling
+  - Set and update data types for existing items
+  - Combined rename and retype operations
+  - Type definition management
+
+- **Function Manipulation**:
+  - Rename functions with proper scoping
+  - Update function signatures with parameter information
+  - Modify local variable names and types
+  - Set function return types

 ## Multi-instance Support

@@ -59,12 +97,16 @@ GhydraMCP combines a Ghidra plugin with an MCP server to provide a comprehensive
 - Connect to specific instances using port numbers
 - Auto-discovery of running Ghidra instances
 - Instance metadata with project and file information
+- Plugin version and API checking for compatibility

-## Program Navigation
+## Program Navigation and Discovery

 - List and search functions, classes, and namespaces
 - View memory segments and layout
 - Search by name, pattern, or signature
+- Resource discovery through HATEOAS links
+- Pagination for handling large result sets
+- Filtering capabilities across all resources

 # Installation

@@ -79,7 +121,7 @@ First, download the latest [release](https://github.com/teal-bauer/GhydraMCP/rel
 1. Run Ghidra
 2. Select `File` -> `Install Extensions`
 3. Click the `+` button
-4. Select the `GhydraMCP-1.1.zip` (or your chosen version) from the downloaded release
+4. Select the `GhydraMCP-2.0.0-beta.1.zip` (or your chosen version) from the downloaded release
 5. Restart Ghidra
 6. Make sure the GhydraMCPPlugin is enabled in `File` -> `Configure` -> `Developer`

@@ -99,16 +141,16 @@ https://github.com/user-attachments/assets/75f0c176-6da1-48dc-ad96-c182eb4648c3

 Theoretically, any MCP client should work with GhydraMCP. Two examples are given below.

-## API Reference
+## API Reference (Updated for v2.0)

 ### Available Tools

 **Program Analysis**:
-- `list_methods`: List all functions (params: offset, limit)
+- `list_functions`: List all functions (params: offset, limit)
 - `list_classes`: List all classes/namespaces (params: offset, limit)
-- `decompile_function`: Get decompiled C code (params: name)
-- `rename_function`: Rename a function (params: old_name, new_name)
-- `rename_data`: Rename data at address (params: address, new_name)
+- `decompile_function`: Get decompiled C code (params: name or address)
+- `get_function`: Get function details (params: name or address)
+- `get_callgraph`: Get function call graph (params: address)
 - `list_segments`: View memory segments (params: offset, limit)
 - `list_imports`: List imported symbols (params: offset, limit)
 - `list_exports`: List exported functions (params: offset, limit)
@@ -116,6 +158,23 @@ Theoretically, any MCP client should work with GhydraMCP. Two examples are given
 - `list_data_items`: View data labels (params: offset, limit)
 - `search_functions_by_name`: Find functions (params: query, offset, limit)

+**Function Operations**:
+- `rename_function`: Rename a function (params: name, new_name)
+- `set_function_signature`: Update function prototype (params: address, signature)
+- `set_comment`: Add comments (params: address, comment, comment_type)
+- `remove_comment`: Remove comments (params: address, comment_type)
+
+**Memory Operations**:
+- `read_memory`: Read bytes from memory (params: address, length)
+- `get_disassembly`: Get disassembled instructions (params: address, length)
+
+**Data Manipulation**:
+- `create_data`: Create new data at address (params: address, data_type)
+- `delete_data`: Delete data at address (params: address)
+- `set_data_type`: Change data type at address (params: address, data_type)
+- `rename_data`: Rename data at address (params: address, name)
+- `update_data`: Update both name and type (params: address, name, data_type)
+
 **Instance Management**:
 - `list_instances`: List active Ghidra instances (no params)
 - `register_instance`: Register new instance (params: port, url)
@@ -126,6 +185,23 @@ Theoretically, any MCP client should work with GhydraMCP. Two examples are given
 ```python
 # Program analysis
 client.use_tool("ghydra", "decompile_function", {"name": "main"})
+client.use_tool("ghydra", "get_function", {"address": "0x00401000"})
+client.use_tool("ghydra", "get_callgraph", {"address": "0x00401000"})
+
+# Memory and disassembly operations
+client.use_tool("ghydra", "read_memory", {"address": "0x00401000", "length": 16})
+client.use_tool("ghydra", "get_disassembly", {"address": "0x00401000", "length": 32})
+
+# Function operations
+client.use_tool("ghydra", "set_function_signature", {"address": "0x00401000", "signature": "int main(int argc, char **argv)"})
+client.use_tool("ghydra", "set_comment", {"address": "0x00401100", "comment": "This instruction initializes the counter", "comment_type": "plate"})
+
+# Data manipulation
+client.use_tool("ghydra", "create_data", {"address": "0x00401234", "data_type": "int"})
+client.use_tool("ghydra", "set_data_type", {"address": "0x00401238", "data_type": "char *"})
+client.use_tool("ghydra", "rename_data", {"address": "0x00401234", "name": "my_variable"})
+client.use_tool("ghydra", "update_data", {"address": "0x00401238", "name": "ptr_var", "data_type": "char *"})
+client.use_tool("ghydra", "delete_data", {"address": "0x0040123C"})

 # Instance management
 client.use_tool("ghydra", "register_instance", {"port": 8192, "url": "http://localhost:8192/"})
@@ -258,32 +334,78 @@ Based on this analysis, I can see these binaries communicate using a simple prot

 GhydraMCP uses structured JSON for all communication between the Python bridge and Java plugin. This ensures consistent and reliable data exchange.

-## Response Format
+## API Architecture

-All responses follow a standard format:
+GhydraMCP v2.0 implements a comprehensive HATEOAS-driven REST API that follows hypermedia design principles:
+
+### Core API Design
+
+- **HATEOAS Architecture**: Each response includes navigational links for resource discovery
+- **Versioned Endpoints**: All requests verified against API version for compatibility
+- **Structured Responses**: Standardized JSON format with consistent field naming
+- **Proper HTTP Methods**: GET for retrieval, POST for creation, PATCH for updates, DELETE for removal
+- **Appropriate Status Codes**: Uses standard HTTP status codes for clear error handling
+
+### Response Format
+
+All responses follow this HATEOAS-driven format:

 ```json
 {
+  "id": "req-123",
+  "instance": "http://localhost:8192",
   "success": true,
   "result": "...",
   "timestamp": 1712159482123,
-  "port": 8192,
-  "instanceType": "base"
+  "_links": {
+    "self": {"href": "/endpoint/current"},
+    "related": [
+      {"href": "/endpoint/related1", "name": "Related Resource 1"},
+      {"href": "/endpoint/related2", "name": "Related Resource 2"}
+    ]
+  }
 }
 ```

-Error responses include additional information:
+For list responses, pagination information is included:

 ```json
 {
-  "success": false,
-  "error": "Error message",
-  "status_code": 404,
-  "timestamp": 1712159482123
+  "id": "req-123",
+  "instance": "http://localhost:8192",
+  "success": true,
+  "result": [ ... objects ... ],
+  "size": 150,
+  "offset": 0,
+  "limit": 50,
+  "_links": {
+    "self": { "href": "/functions?offset=0&limit=50" },
+    "next": { "href": "/functions?offset=50&limit=50" },
+    "prev": { "href": "/functions?offset=0&limit=50" }
+  }
 }
 ```

-This structured approach makes the communication more reliable and easier to debug.
+Error responses include detailed information:
+
+```json
+{
+  "id": "req-123",
+  "instance": "http://localhost:8192",
+  "success": false,
+  "error": {
+    "code": "RESOURCE_NOT_FOUND",
+    "message": "Function 'main' not found in current program"
+  },
+  "status_code": 404,
+  "timestamp": 1712159482123,
+  "_links": {
+    "self": {"href": "/functions/main"}
+  }
+}
+```
+
+This HATEOAS approach enables resource discovery and self-documenting APIs, making integration and exploration significantly easier.

 # Testing