From 5d6b202599b299b8b91b68c68a250d460a0267ff Mon Sep 17 00:00:00 2001
From: Teal Bauer
Date: Sun, 13 Apr 2025 20:46:49 +0200
Subject: [PATCH] Restore the correct API doc

---
 GHIDRA_HTTP_API.md | 322 ++++++++++++++++++++++++++++++++++++
 JAVA_PLUGIN_API.md | 399 ---------------------------------------------
 2 files changed, 322 insertions(+), 399 deletions(-)
 create mode 100644 GHIDRA_HTTP_API.md
 delete mode 100644 JAVA_PLUGIN_API.md

diff --git a/GHIDRA_HTTP_API.md b/GHIDRA_HTTP_API.md
new file mode 100644
index 0000000..d056d80
--- /dev/null
+++ b/GHIDRA_HTTP_API.md
@@ -0,0 +1,322 @@
+# GhydraMCP Ghidra Plugin HTTP API v1
+
+## Overview
+
+This API provides a Hypermedia-driven interface (HATEOAS) to interact with Ghidra's CodeBrowser, enabling AI-driven and automated reverse engineering workflows. It allows interaction with Ghidra projects, programs (binaries), functions, symbols, data, memory segments, cross-references, and analysis features. Programs are addressed by their unique identifier within Ghidra (`project:/path/to/file`).
+
+## General Concepts
+
+### Request Format
+
+- Use standard HTTP verbs:
+  - `GET`: Retrieve resources or lists.
+  - `POST`: Create new resources.
+  - `PATCH`: Modify existing resources partially.
+  - `PUT`: Replace existing resources entirely (Use with caution, `PATCH` is often preferred).
+  - `DELETE`: Remove resources.
+- Request bodies for `POST`, `PUT`, `PATCH` should be JSON (`Content-Type: application/json`).
+- Include an optional `X-Request-ID` header with a unique identifier for correlation.
+
+### Response Format
+
+All non-error responses are JSON (`Content-Type: application/json`) containing at least the following keys:
+
+```json
+{
+  "id": "[correlation identifier]",
+  "instance": "[instance url]",
+  "success": true,
+  "result": Object | Array,
+  "_links": { // Optional: HATEOAS links
+    "self": { "href": "/path/to/current/resource" },
+    "related_resource": { "href": "/path/to/related" }
+    // ... other relevant links
+  }
+}
+```
+
+- `id`: The identifier from the `X-Request-ID` header if provided, or a random opaque identifier otherwise.
+- `instance`: The URL of the Ghidra plugin instance that handled the request.
+- `success`: Boolean `true` for successful operations.
+- `result`: The main data payload, either a single JSON object or an array of objects for lists.
+- `_links`: (Optional) Contains HATEOAS-style links to related resources or actions, facilitating discovery.
+
+#### List Responses
+
+List results (arrays in `result`) will typically include pagination information and a total count:
+
+```json
+{
+  "id": "req-123",
+  "instance": "http://localhost:1337",
+  "success": true,
+  "result": [ ... objects ... ],
+  "size": 150, // Total number of items matching the query across all pages
+  "offset": 0,
+  "limit": 50,
+  "_links": {
+    "self": { "href": "/programs/proj:/file.bin/functions?offset=0&limit=50" },
+    "next": { "href": "/programs/proj:/file.bin/functions?offset=50&limit=50" }, // Present if more items exist
+    "prev": { "href": "/programs/proj:/file.bin/functions?offset=0&limit=50" } // Present if not the first page
+  }
+}
+```
+
+### Error Responses
+
+Errors use appropriate HTTP status codes (4xx, 5xx) and have a JSON payload with an `error` key:
+
+```json
+{
+  "id": "[correlation identifier]",
+  "instance": "[instance url]",
+  "success": false,
+  "error": {
+    "code": "RESOURCE_NOT_FOUND", // Optional: Machine-readable code
+    "message": "Descriptive error message"
+    // Potentially other details like invalid parameters
+  }
+}
+```
+
+Common HTTP Status Codes:
+- `200 OK`: Successful `GET`, `PATCH`, `PUT`, `DELETE`.
+- `201 Created`: Successful `POST` resulting in resource creation.
+- `204 No Content`: Successful `DELETE` or `PATCH`/`PUT` where no body is returned.
+- `400 Bad Request`: Invalid syntax, missing required parameters, invalid data format.
+- `401 Unauthorized`: Authentication required or failed (if implemented).
+- `403 Forbidden`: Authenticated user lacks permission (if implemented).
+- `404 Not Found`: Resource or endpoint does not exist, or query yielded no results.
+- `405 Method Not Allowed`: HTTP verb not supported for this endpoint.
+- `500 Internal Server Error`: Unexpected error within the Ghidra plugin.
+
+### Addressing and Searching
+
+Resources like functions, data, and symbols often exist at specific memory addresses and may have names. The primary identifier for a program is its Ghidra path, e.g., `myproject:/path/to/mybinary.exe`.
+
+- **By Address:** Use the resource's path with the address (hexadecimal, e.g., `0x401000` or `08000004`).
+  - Example: `GET /programs/myproject:/mybinary.exe/functions/0x401000`
+- **Querying Lists:** List endpoints (e.g., `/functions`, `/symbols`, `/data`) support filtering via query parameters:
+  - `?addr=[address in hex]`: Find item at a specific address.
+  - `?name=[full_name]`: Find item(s) with an exact name match (case-sensitive).
+  - `?name_contains=[substring]`: Find item(s) whose name contains the substring (case-insensitive).
+  - `?name_matches_regex=[regex]`: Find item(s) whose name matches the Java-compatible regular expression.
+
+### Pagination
+
+List endpoints support pagination using query parameters:
+- `?offset=[int]`: Number of items to skip (default: 0).
+- `?limit=[int]`: Maximum number of items to return (default: implementation-defined, e.g., 100).
+
+## Meta Endpoints
+
+### `GET /plugin-version`
+Returns the version of the running Ghidra plugin and its API. Essential for compatibility checks by clients like the MCP bridge.
+```json
+{
+  "id": "req-meta-ver",
+  "instance": "http://localhost:1337",
+  "success": true,
+  "result": {
+    "plugin_version": "v1.4.0", // Example plugin build version
+    "api_version": 1 // Ordinal API version
+  },
+  "_links": {
+    "self": { "href": "/plugin-version" }
+  }
+}
+```
+
+## Resource Types
+
+Base path for all program-specific resources: `/programs/{program_id}` where `program_id` is the URL-encoded Ghidra identifier (e.g., `myproject%3A%2Fpath%2Fto%2Fmybinary.exe`).
+
+### 1. Projects
+
+Represents Ghidra projects, containers for programs.
+
+- **`GET /projects`**: List all available Ghidra projects.
+- **`POST /projects`**: Create a new Ghidra project. Request body should specify `name` and optionally `directory`.
+- **`GET /projects/{project_name}`**: Get details about a specific project (e.g., location, list of open programs within it via links).
+
+### 2. Programs
+
+Represents individual binaries loaded in Ghidra projects.
+
+- **`GET /programs`**: List all programs across all projects. Can be filtered by project (`?project={project_name}`).
+- **`POST /programs`**: Load/import a new binary into a specified project. Request body needs `project_name`, `file_path`, and optionally `language_id`, `compiler_spec_id`, and loader options. Returns the newly created program resource details upon successful import and analysis (which might take time).
+- **`GET /programs/{program_id}`**: Get metadata for a specific program (e.g., name, architecture, memory layout, analysis status).
+  ```json
+  // Example Response Fragment for GET /programs/myproject%3A%2Fmybinary.exe
+  "result": {
+    "program_id": "myproject:/mybinary.exe",
+    "name": "mybinary.exe",
+    "project": "myproject",
+    "language_id": "x86:LE:64:default",
+    "compiler_spec_id": "gcc",
+    "image_base": "0x400000",
+    "memory_size": 1048576,
+    "is_open": true,
+    "analysis_complete": true
+    // ... other metadata
+  },
+  "_links": {
+    "self": { "href": "/programs/myproject%3A%2Fmybinary.exe" },
+    "project": { "href": "/projects/myproject" },
+    "functions": { "href": "/programs/myproject%3A%2Fmybinary.exe/functions" },
+    "symbols": { "href": "/programs/myproject%3A%2Fmybinary.exe/symbols" },
+    "data": { "href": "/programs/myproject%3A%2Fmybinary.exe/data" },
+    "segments": { "href": "/programs/myproject%3A%2Fmybinary.exe/segments" },
+    "memory": { "href": "/programs/myproject%3A%2Fmybinary.exe/memory" },
+    "xrefs": { "href": "/programs/myproject%3A%2Fmybinary.exe/xrefs" },
+    "analysis": { "href": "/programs/myproject%3A%2Fmybinary.exe/analysis" }
+    // Potentially actions like "close", "analyze"
+  }
+  ```
+- **`DELETE /programs/{program_id}`**: Close and potentially remove a program from its project (behavior depends on Ghidra state).
+
+### 3. Functions
+
+Represents functions within a program. Base path: `/programs/{program_id}/functions`.
+
+- **`GET /functions`**: List functions. Supports searching (by name/address/regex) and pagination.
+  ```json
+  // Example Response Fragment
+  "result": [
+    { "name": "FUN_08000004", "address": "08000004", "_links": { "self": { "href": "/programs/proj%3A%2Ffile.bin/functions/08000004" } } },
+    { "name": "init_peripherals", "address": "08001cf0", "_links": { "self": { "href": "/programs/proj%3A%2Ffile.bin/functions/08001cf0" } } }
+  ]
+  ```
+- **`POST /functions`**: Create a function at a specific address. Requires `address` in the request body. Returns the created function resource.
+- **`GET /functions/{address}`**: Get details for a specific function (name, signature, size, stack info, etc.).
+  ```json
+  // Example Response Fragment for GET /programs/proj%3A%2Ffile.bin/functions/0x4010a0
+  "result": {
+    "name": "process_data",
+    "address": "0x4010a0",
+    "signature": "int process_data(char * data, int size)",
+    "size": 128,
+    "stack_depth": 16,
+    "has_varargs": false,
+    "calling_convention": "__stdcall"
+    // ... other details
+  },
+  "_links": {
+    "self": { "href": "/programs/proj%3A%2Ffile.bin/functions/0x4010a0" },
+    "decompile": { "href": "/programs/proj%3A%2Ffile.bin/functions/0x4010a0/decompile" },
+    "disassembly": { "href": "/programs/proj%3A%2Ffile.bin/functions/0x4010a0/disassembly" },
+    "variables": { "href": "/programs/proj%3A%2Ffile.bin/functions/0x4010a0/variables" },
+    "xrefs_to": { "href": "/programs/proj%3A%2Ffile.bin/xrefs?to_addr=0x4010a0" },
+    "xrefs_from": { "href": "/programs/proj%3A%2Ffile.bin/xrefs?from_addr=0x4010a0" }
+  }
+  ```
+- **`PATCH /functions/{address}`**: Modify a function. Addressable only by address. Payload can contain:
+  - `name`: New function name.
+  - `signature`: Full function signature string (e.g., `void my_func(int p1, char * p2)`).
+  - `comment`: Set/update the function's primary comment.
+  ```json
+  // Example PATCH payload
+  { "name": "calculate_checksum", "signature": "uint32_t calculate_checksum(uint8_t* buffer, size_t length)" }
+  ```
+- **`DELETE /functions/{address}`**: Delete the function definition at the specified address.
+
+#### Function Sub-Resources
+
+- **`GET /functions/{address}/decompile`**: Get decompiled C-like code for the function.
+  - Query Parameters:
+    - `?syntax_tree=true`: Include the decompiler's internal syntax tree (JSON).
+    - `?style=[style_name]`: Apply a specific decompiler simplification style (e.g., `normalize`, `paramid`).
+    - `?timeout=[seconds]`: Set a timeout for the decompilation process.
+  ```json
+  // Example Response Fragment (without syntax tree)
+  "result": {
+    "address": "0x4010a0",
+    "ccode": "int process_data(char *param_1, int param_2)\n{\n // ... function body ...\n return result;\n}\n"
+  }
+  ```
+- **`GET /functions/{address}/disassembly`**: Get assembly listing for the function. Supports pagination (`?offset=`, `?limit=`).
+  ```json
+  // Example Response Fragment
+  "result": [
+    { "address": "0x4010a0", "mnemonic": "PUSH", "operands": "RBP", "bytes": "55" },
+    { "address": "0x4010a1", "mnemonic": "MOV", "operands": "RBP, RSP", "bytes": "4889E5" },
+    // ... more instructions
+  ]
+  ```
+- **`GET /functions/{address}/variables`**: List local variables defined within the function. Supports searching by name.
+- **`PATCH /functions/{address}/variables/{variable_name}`**: Modify a local variable (rename, change type). Requires `name` and/or `type` in the payload.
+
+### 4. Symbols & Labels
+
+Represents named locations (functions, data, labels). Base path: `/programs/{program_id}/symbols`.
+
+- **`GET /symbols`**: List all symbols in the program. Supports searching (by name/address/regex) and pagination. Can filter by type (`?type=function`, `?type=data`, `?type=label`).
+- **`POST /symbols`**: Create or rename a symbol at a specific address. Requires `address` and `name` in the payload. If a symbol exists, it's renamed; otherwise, a new label is created.
+- **`GET /symbols/{address}`**: Get details of the symbol at the specified address.
+- **`PATCH /symbols/{address}`**: Modify properties of the symbol (e.g., set as primary, change namespace). Payload specifies changes.
+- **`DELETE /symbols/{address}`**: Remove the symbol at the specified address.
+
+### 5. Data
+
+Represents defined data items in memory. Base path: `/programs/{program_id}/data`.
+
+- **`GET /data`**: List defined data items. Supports searching (by name/address/regex) and pagination. Can filter by type (`?type=string`, `?type=dword`, etc.).
+- **`POST /data`**: Define a new data item. Requires `address`, `type`, and optionally `size` or `length` in the payload.
+- **`GET /data/{address}`**: Get details of the data item at the specified address (type, size, value representation).
+- **`PATCH /data/{address}`**: Modify a data item (e.g., change `name`, `type`, `comment`). Payload specifies changes.
+- **`DELETE /data/{address}`**: Undefine the data item at the specified address.
+
+### 6. Memory Segments
+
+Represents memory blocks/sections defined in the program. Base path: `/programs/{program_id}/segments`.
+
+- **`GET /segments`**: List all memory segments (e.g., `.text`, `.data`, `.bss`).
+- **`GET /segments/{segment_name}`**: Get details for a specific segment (address range, permissions, size).
+
+### 7. Memory Access
+
+Provides raw memory access. Base path: `/programs/{program_id}/memory`.
+
+- **`GET /memory/{address}`**: Read bytes from memory.
+  - Query Parameters:
+    - `?length=[bytes]`: Number of bytes to read (required, max limit applies).
+    - `?format=[hex|base64|string]`: How to encode the returned bytes (default: hex).
+  ```json
+  // Example Response Fragment for GET /programs/proj%3A%2Ffile.bin/memory/0x402000?length=16&format=hex
+  "result": {
+    "address": "0x402000",
+    "length": 16,
+    "format": "hex",
+    "bytes": "48656C6C6F20576F726C642100000000" // "Hello World!...."
+  }
+  ```
+- **`PATCH /memory/{address}`**: Write bytes to memory. Requires `bytes` (in specified `format`) and `format` in the payload. Use with extreme caution.
+
+### 8. Cross-References (XRefs)
+
+Provides information about references to/from addresses. Base path: `/programs/{program_id}/xrefs`.
+
+- **`GET /xrefs`**: Search for cross-references. Supports pagination.
+  - Query Parameters (at least one required):
+    - `?to_addr=[address]`: Find references *to* this address.
+    - `?from_addr=[address]`: Find references *from* this address or within the function/data at this address.
+    - `?type=[CALL|READ|WRITE|DATA|POINTER|...]`: Filter by reference type.
+- **`GET /functions/{address}/xrefs`**: Convenience endpoint, equivalent to `GET /xrefs?to_addr={address}` and potentially `GET /xrefs?from_addr={address}` combined or linked.
+
+### 9. Analysis
+
+Provides access to Ghidra's analysis results. Base path: `/programs/{program_id}/analysis`.
+
+- **`GET /analysis/callgraph`**: Retrieve the function call graph (potentially filtered or paginated). Format might be nodes/edges JSON or a standard graph format like DOT.
+- **`GET /analysis/dataflow/{address}`**: Perform data flow analysis starting from a specific address or instruction. Requires parameters specifying forward/backward, context, etc. (Details TBD).
+- **`POST /analysis/analyze`**: Trigger a full or partial re-analysis of the program.
+
+## Design Considerations for AI Usage
+
+- **Structured responses**: JSON format ensures predictable parsing by AI agents.
+- **HATEOAS Links**: `_links` allow agents to discover available actions and related resources without hardcoding paths.
+- **Address and Name Resolution**: Key elements like functions and symbols are addressable by both memory address and name where applicable.
+- **Explicit Operations**: Actions like decompilation, disassembly, and analysis are distinct endpoints.
+- **Pagination & Filtering**: Essential for handling potentially large datasets (symbols, functions, xrefs, disassembly).
+- **Clear Error Reporting**: `success: false` and the `error` object provide actionable feedback.
+- **No Injected Summaries**: The API should return raw or structured Ghidra data, leaving interpretation and summarization to the AI agent.
diff --git a/JAVA_PLUGIN_API.md b/JAVA_PLUGIN_API.md
deleted file mode 100644
index 68e2905..0000000
--- a/JAVA_PLUGIN_API.md
+++ /dev/null
@@ -1,399 +0,0 @@
-# GhydraMCP Java Plugin REST API Documentation
-
-## Base URL
-`http://localhost:8192` (default port, may vary)
-
-## Endpoints
-
-### 1. Instance Information
-- `GET /info`
-- `GET /` (root path)
-
-Returns basic instance information including:
-- Port number
-- Whether this is the base instance
-- Current project name (if available)
-- Current program name (if available)
-
-Example Response:
-```json
-{
-  "port": 8192,
-  "isBaseInstance": true,
-  "project": "MyProject",
-  "file": "program.exe"
-}
-```
-
-### 2. Function Operations
-
-#### List Functions
-- `GET /functions`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-- `query` (optional): Search term to filter functions
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    {
-      "name": "init_peripherals",
-      "address": "08000200"
-    },
-    {
-      "name": "uart_rx_valid_command",
-      "address": "0800029c"
-    }
-  ],
-  "timestamp": 1743778219516,
-  "port": 8192,
-  "instanceType": "base"
-}
-```
-
-#### Get Function Details
-- `GET /functions/{name}`
-
-Returns decompiled code for the specified function.
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": "int main() {\n // Decompiled code here\n}",
-  "timestamp": 1743778219516
-}
-```
-
-#### Rename Function
-- `POST /functions/{name}`
-
-Body Parameters:
-- `newName`: New name for the function
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": "Renamed successfully",
-  "timestamp": 1743778219516
-}
-```
-
-#### Function Variables
-- `GET /functions/{name}/variables`
-
-Lists all variables (parameters and locals) in a function.
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": {
-    "function": "myFunction",
-    "parameters": [
-      {
-        "name": "param1",
-        "type": "int",
-        "kind": "parameter"
-      },
-      {
-        "name": "param2",
-        "type": "char*",
-        "kind": "parameter"
-      }
-    ],
-    "localVariables": [
-      {
-        "name": "var1",
-        "type": "int",
-        "address": "08000234"
-      },
-      {
-        "name": "var2",
-        "type": "float",
-        "address": "08000238"
-      }
-    ]
-  }
-}
-```
-
-#### Rename/Retype Variable
-- `POST /functions/{name}/variables/{varName}`
-
-Body Parameters (one of):
-- `newName`: New name for variable
-- `dataType`: New data type for variable
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": "Variable renamed",
-  "timestamp": 1743778219516
-}
-```
-
-### 3. Class Operations
-- `GET /classes`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    "MyClass1",
-    "MyClass2"
-  ],
-  "timestamp": 1743778219516
-}
-```
-
-### 4. Memory Segments
-- `GET /segments`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    {
-      "name": ".text",
-      "start": "08000000",
-      "end": "08001000"
-    },
-    {
-      "name": ".data",
-      "start": "08001000",
-      "end": "08002000"
-    }
-  ]
-}
-```
-
-### 5. Symbol Operations
-
-#### Imports
-- `GET /symbols/imports`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    {
-      "name": "printf",
-      "address": "EXTERNAL:00000000"
-    },
-    {
-      "name": "malloc",
-      "address": "EXTERNAL:00000004"
-    }
-  ]
-}
-```
-
-#### Exports
-- `GET /symbols/exports`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    {
-      "name": "main",
-      "address": "08000200"
-    },
-    {
-      "name": "_start",
-      "address": "08000100"
-    }
-  ]
-}
-```
-
-### 6. Namespace Operations
-- `GET /namespaces`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    "std",
-    "MyNamespace"
-  ]
-}
-```
-
-### 7. Data Operations
-
-#### List Defined Data
-- `GET /data`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    {
-      "address": "08001000",
-      "name": "myVar",
-      "value": "42"
-    },
-    {
-      "address": "08001004",
-      "name": "myString",
-      "value": "\"Hello\""
-    }
-  ]
-}
-```
-
-#### Rename Data
-- `POST /data`
-
-Body Parameters:
-- `address`: Address of data to rename (hex string)
-- `newName`: New name for data
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": {
-    "name": "main",
-    "decompiled": "int main() {\n // Decompiled code here\n}",
-    "metadata": {
-      "size": 256,
-      "entryPoint": "08000200"
-    }
-  },
-  "timestamp": 1743778219516
-}
-```
-
-### 8. Variable Operations
-
-#### Global Variables
-- `GET /variables`
-
-Parameters:
-- `offset` (optional): Pagination offset (default: 0)
-- `limit` (optional): Maximum results (default: 100)
-- `search` (optional): Search term to filter variables
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    {
-      "name": "globalVar1",
-      "address": "08001000"
-    },
-    {
-      "name": "globalVar2",
-      "address": "08001004"
-    }
-  ]
-}
-```
-
-### 9. Instance Management
-
-#### List Active Instances
-- `GET /instances`
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": [
-    {
-      "port": 8192,
-      "type": "base"
-    },
-    {
-      "port": 8193,
-      "type": "secondary"
-    }
-  ]
-}
-```
-
-#### Register Instance
-- `POST /registerInstance`
-
-Body Parameters:
-- `port`: Port number to register
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": "Instance registered on port 8193",
-  "timestamp": 1743778219516
-}
-```
-
-#### Unregister Instance
-- `POST /unregisterInstance`
-
-Body Parameters:
-- `port`: Port number to unregister
-
-Example Response:
-```json
-{
-  "success": true,
-  "result": "Unregistered instance on port 8193",
-  "timestamp": 1743778219516
-}
-```
-
-## Error Responses
-All endpoints return JSON with success=false on errors:
-```json
-{
-  "success": false,
-  "error": "Error message",
-  "status": 500
-}
-```
-
-Common status codes:
-- 400: Bad request (invalid parameters)
-- 404: Not found (invalid endpoint or resource)
-- 405: Method not allowed
-- 500: Internal server error