GhydraMCP Ghidra Plugin HTTP API v2

Overview

This API provides a hypermedia-driven (HATEOAS) interface to Ghidra's CodeBrowser for AI-driven and automated reverse engineering workflows. It exposes Ghidra projects, programs (binaries), functions, symbols, data, memory segments, cross-references, and analysis features. Each program open in Ghidra has its own plugin instance, so all resources are specific to that program.

General Concepts

Request Format

  • Use standard HTTP verbs:
    • GET: Retrieve resources or lists.
    • POST: Create new resources.
    • PATCH: Modify existing resources partially.
    • PUT: Replace existing resources entirely (use with caution; PATCH is often preferred).
    • DELETE: Remove resources.
  • Request bodies for POST, PUT, PATCH should be JSON (Content-Type: application/json).
  • Include an optional X-Request-ID header with a unique identifier for correlation.
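
As a minimal sketch of these conventions, the following Python builds (but does not send) a request with a JSON body and an X-Request-ID header. The helper name build_request and the base URL http://localhost:8192 are illustrative assumptions, not part of the plugin.

```python
import json
import urllib.request
import uuid


def build_request(base_url, method, path, payload=None, request_id=None):
    """Build (but do not send) a request that follows the conventions above."""
    # Use the caller's correlation ID, or generate one (hypothetical default).
    headers = {"X-Request-ID": request_id or str(uuid.uuid4())}
    body = None
    if payload is not None:
        # POST, PUT, and PATCH bodies are sent as JSON.
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=body, headers=headers, method=method
    )


req = build_request(
    "http://localhost:8192",  # assumed instance URL
    "PATCH",
    "/functions/0x401000",
    payload={"name": "calculate_checksum"},
    request_id="req-123",
)
```

Passing the result to urllib.request.urlopen would send it; any other HTTP client works the same way as long as it sets the two headers.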

Response Format

All non-error responses are JSON (Content-Type: application/json) containing at least the following keys:

{
  "id": "[correlation identifier]",
  "instance": "[instance url]",
  "success": true,
  "result": Object | Array<Object>,
  "_links": { // Optional: HATEOAS links
    "self": { "href": "/path/to/current/resource" },
    "related_resource": { "href": "/path/to/related" }
    // ... other relevant links
  }
}
  • id: The identifier from the X-Request-ID header if provided, or a random opaque identifier otherwise.
  • instance: The URL of the Ghidra plugin instance that handled the request.
  • success: Boolean true for successful operations.
  • result: The main data payload, either a single JSON object or an array of objects for lists.
  • _links: (Optional) Contains HATEOAS-style links to related resources or actions, facilitating discovery.
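
One way a client might use instance and _links together is sketched below: it resolves a link relation to an absolute URL. The function name link_url and the sample response are assumptions made for illustration.

```python
from urllib.parse import urljoin


def link_url(response, rel):
    """Return the absolute URL for a link relation, or None if it is absent."""
    link = response.get("_links", {}).get(rel)
    if link is None:
        return None
    # hrefs may be relative ("/functions") or absolute; urljoin handles both.
    return urljoin(response["instance"], link["href"])


# Hand-written sample shaped like the envelope described above.
sample = {
    "id": "req-123",
    "instance": "http://localhost:8192",
    "success": True,
    "result": {"name": "main"},
    "_links": {
        "self": {"href": "/function"},
        "decompile": {"href": "/functions/0x401000/decompile"},
    },
}

decompile_url = link_url(sample, "decompile")
```

Because _links is optional, the helper returns None rather than raising when a relation is missing.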

List Responses

List results (arrays in result) will typically include pagination information and a total count:

{
  "id": "req-123",
  "instance": "http://localhost:8192",
  "success": true,
  "result": [ ... objects ... ],
  "size": 150, // Total number of items matching the query across all pages
  "offset": 50,
  "limit": 50,
  "_links": {
    "self": { "href": "/functions?offset=50&limit=50" },
    "next": { "href": "/functions?offset=100&limit=50" }, // Present if more items exist
    "prev": { "href": "/functions?offset=0&limit=50" }  // Present if not the first page
  }
}
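
A client can walk a paginated list by following the next link until it disappears. The sketch below assumes a caller-supplied fetch callable that takes a path and returns the parsed JSON response; the fake pages exist only to make the example self-contained.

```python
def iter_items(fetch, path):
    """Yield every item of a list endpoint by following `_links.next`.

    `fetch` is a hypothetical callable: path -> parsed JSON response.
    """
    while path is not None:
        page = fetch(path)
        yield from page["result"]
        # The next link is only present when more items exist.
        path = page.get("_links", {}).get("next", {}).get("href")


# Fake two-page listing standing in for a real plugin instance.
fake_pages = {
    "/functions?offset=0&limit=2": {
        "result": [{"name": "FUN_08000004"}, {"name": "init_peripherals"}],
        "size": 3, "offset": 0, "limit": 2,
        "_links": {"next": {"href": "/functions?offset=2&limit=2"}},
    },
    "/functions?offset=2&limit=2": {
        "result": [{"name": "main"}],
        "size": 3, "offset": 2, "limit": 2,
        "_links": {"prev": {"href": "/functions?offset=0&limit=2"}},
    },
}

names = [f["name"] for f in iter_items(fake_pages.__getitem__, "/functions?offset=0&limit=2")]
```

Following the link instead of computing offsets keeps the client independent of how the server chooses page boundaries.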

Error Responses

Errors use appropriate HTTP status codes (4xx, 5xx) and have a JSON payload with an error key:

{
  "id": "[correlation identifier]",
  "instance": "[instance url]",
  "success": false,
  "error": {
    "code": "RESOURCE_NOT_FOUND", // Optional: Machine-readable code
    "message": "Descriptive error message"
    // Potentially other details like invalid parameters
  }
}

Common HTTP Status Codes:

  • 200 OK: Successful GET, PATCH, PUT, DELETE.
  • 201 Created: Successful POST resulting in resource creation.
  • 204 No Content: Successful DELETE or PATCH/PUT where no body is returned.
  • 400 Bad Request: Invalid syntax, missing required parameters, invalid data format.
  • 401 Unauthorized: Authentication required or failed (if implemented).
  • 403 Forbidden: Authenticated user lacks permission (if implemented).
  • 404 Not Found: Resource or endpoint does not exist, or query yielded no results.
  • 405 Method Not Allowed: HTTP verb not supported for this endpoint.
  • 500 Internal Server Error: Unexpected error within the Ghidra plugin.
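
The error envelope and status codes above could be handled on the client side roughly as follows. ApiError and unwrap are hypothetical names, and treating 204 as "no result" is an assumption about client behavior rather than something the plugin specifies.

```python
class ApiError(Exception):
    """Raised when a response reports success: false or a non-2xx status."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code or 'ERROR'}: {message}")
        self.status = status
        self.code = code
        self.message = message


def unwrap(status, body):
    """Return `result` from a parsed response body, or raise ApiError."""
    if status == 204:
        # No Content: nothing to unwrap.
        return None
    body = body or {}
    if 200 <= status < 300 and body.get("success") is True:
        return body.get("result")
    error = body.get("error") or {}
    # "code" is optional, so it may be None here.
    raise ApiError(status, error.get("code"), error.get("message", "unknown error"))
```

A caller can branch on exc.code when it is present and fall back to exc.status otherwise.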

Addressing and Searching

Resources like functions, data, and symbols often exist at specific memory addresses and may have names.

  • By Address: Use the resource's path with the address (hexadecimal, e.g., 0x401000 or 08000004).
    • Example: GET /functions/0x401000
  • Querying Lists: List endpoints (e.g., /functions, /symbols, /data) support filtering via query parameters:
    • ?addr=[address in hex]: Find item at a specific address.
    • ?name=[full_name]: Find item(s) with an exact name match (case-sensitive).
    • ?name_contains=[substring]: Find item(s) whose name contains the substring (case-insensitive).
    • ?name_matches_regex=[regex]: Find item(s) whose name matches the Java-compatible regular expression.

Pagination

List endpoints support pagination using query parameters:

  • ?offset=[int]: Number of items to skip (default: 0).
  • ?limit=[int]: Maximum number of items to return (default: implementation-defined, e.g., 100).
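
Filter and pagination parameters combine in a single query string. The sketch below shows one way to assemble such a URL with proper percent-encoding, which matters for regular expressions; list_url is an illustrative helper name.

```python
from urllib.parse import urlencode


def list_url(path, offset=0, limit=None, **filters):
    """Build a list-endpoint URL from filter and pagination parameters."""
    # Drop unset filters, then append pagination parameters.
    params = {key: value for key, value in filters.items() if value is not None}
    params["offset"] = offset
    if limit is not None:
        # Omitting limit leaves the server's default in effect.
        params["limit"] = limit
    return f"{path}?{urlencode(params)}"


by_substring = list_url("/functions", limit=50, name_contains="init")
by_regex = list_url("/symbols", name_matches_regex="^FUN_[0-9a-f]+$")
```

urlencode escapes characters such as ^, [, and + that would otherwise be misread in a query string.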

Meta Endpoints

GET /plugin-version

Returns the version of the running Ghidra plugin and its API. Essential for compatibility checks by clients like the MCP bridge.

{
  "id": "req-meta-ver",
  "instance": "http://localhost:8192",
  "success": true,
  "result": {
    "plugin_version": "v2.0.0", // Example plugin build version
    "api_version": 2            // Ordinal API version
  },
  "_links": {
    "self": { "href": "/plugin-version" },
    "root": { "href": "/" }
  }
}
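
A compatibility check against this response might look like the sketch below. The required version constant and the choice to raise RuntimeError are client-side assumptions.

```python
REQUIRED_API_VERSION = 2  # assumption: this client targets API v2


def check_api_version(response, required=REQUIRED_API_VERSION):
    """Return the plugin version string, or raise if the API version differs."""
    result = response["result"]
    if result["api_version"] != required:
        raise RuntimeError(
            f"plugin speaks API v{result['api_version']}, client needs v{required}"
        )
    return result["plugin_version"]


# Sample shaped like the GET /plugin-version response above.
version_response = {
    "id": "req-meta-ver",
    "instance": "http://localhost:8192",
    "success": True,
    "result": {"plugin_version": "v2.0.0", "api_version": 2},
}

plugin_version = check_api_version(version_response)
```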

GET /info

Returns information about the current plugin instance, including details about the loaded program and project.

{
  "id": "req-info",
  "instance": "http://localhost:8192",
  "success": true,
  "result": {
    "isBaseInstance": true,
    "file": "example.exe",
    "architecture": "x86:LE:64:default",
    "processor": "x86",
    "addressSize": 64,
    "creationDate": "2023-01-01T12:00:00Z",
    "executable": "/path/to/example.exe",
    "project": "MyProject",
    "projectLocation": "/path/to/MyProject",
    "serverPort": 8192,
    "serverStartTime": 1672531200000,
    "instanceCount": 1
  },
  "_links": {
    "self": { "href": "/info" },
    "root": { "href": "/" },
    "instances": { "href": "/instances" },
    "program": { "href": "/program" }
  }
}

GET /instances

Returns information about all active GhydraMCP plugin instances.

{
  "id": "req-instances",
  "instance": "http://localhost:8192",
  "success": true,
  "result": [
    {
      "port": 8192,
      "url": "http://localhost:8192",
      "type": "base",
      "project": "MyProject",
      "file": "example.exe",
      "_links": {
        "self": { "href": "/instances/8192" },
        "info": { "href": "http://localhost:8192/info" },
        "connect": { "href": "http://localhost:8192" }
      }
    },
    {
      "port": 8193,
      "url": "http://localhost:8193",
      "type": "standard",
      "project": "MyProject",
      "file": "library.dll",
      "_links": {
        "self": { "href": "/instances/8193" },
        "info": { "href": "http://localhost:8193/info" },
        "connect": { "href": "http://localhost:8193" }
      }
    }
  ],
  "_links": {
    "self": { "href": "/instances" },
    "register": { "href": "/registerInstance", "method": "POST" },
    "unregister": { "href": "/unregisterInstance", "method": "POST" },
    "programs": { "href": "/programs" }
  }
}
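
Since each program has its own instance, a client usually needs to pick the instance that serves a given file. A minimal sketch, assuming the response shape shown above (find_instance is a hypothetical helper):

```python
def find_instance(instances_response, file_name):
    """Return the URL of the instance serving `file_name`, or None."""
    for instance in instances_response["result"]:
        if instance["file"] == file_name:
            return instance["url"]
    return None


# Trimmed copy of the GET /instances example above.
instances_response = {
    "success": True,
    "result": [
        {"port": 8192, "url": "http://localhost:8192", "type": "base",
         "project": "MyProject", "file": "example.exe"},
        {"port": 8193, "url": "http://localhost:8193", "type": "standard",
         "project": "MyProject", "file": "library.dll"},
    ],
}

library_url = find_instance(instances_response, "library.dll")
```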

Resource Types

Each Ghidra plugin instance runs in the context of a single program, so all resources are relative to the current program. The program's details are available through the GET /info and GET /program endpoints.

1. Project

Represents the current Ghidra project, which is a container for programs.

  • GET /project: Get details about the current project (e.g., location, list of open programs within it via links).

2. Program

Represents the current binary loaded in Ghidra.

  • GET /program: Get metadata for the current program (e.g., name, architecture, memory layout, analysis status).
    // Example Response Fragment for GET /program
    "result": {
      "programId": "myproject:/path/to/mybinary.exe",
      "name": "mybinary.exe",
      "isOpen": true,
      "languageId": "x86:LE:64:default",
      "compilerSpecId": "gcc",
      "imageBase": "0x400000",
      "memorySize": 1048576,
      "analysisComplete": true
    },
    "_links": {
      "self": { "href": "/program" },
      "project": { "href": "/project" },
      "functions": { "href": "/functions" },
      "symbols": { "href": "/symbols" },
      "data": { "href": "/data" },
      "segments": { "href": "/segments" },
      "memory": { "href": "/memory" },
      "xrefs": { "href": "/xrefs" },
      "analysis": { "href": "/analysis" }
    }
    

3. Current Location

Provides information about the current cursor position and function in Ghidra's CodeBrowser.

  • GET /address: Get the current cursor position.

    // Example Response
    "result": {
      "address": "0x401000",
      "program": "mybinary.exe"
    },
    "_links": {
      "self": { "href": "/address" },
      "program": { "href": "/program" },
      "memory": { "href": "/memory/0x401000?length=16" },
      "function": { "href": "/functions/0x401000" },
      "decompile": { "href": "/functions/0x401000/decompile" }
    }
    
  • GET /function: Get information about the function at the current cursor position.

    // Example Response
    "result": {
      "name": "main",
      "address": "0x401000",
      "signature": "int main(int argc, char** argv)",
      "size": 256
    },
    "_links": {
      "self": { "href": "/function" },
      "program": { "href": "/program" },
      "function": { "href": "/functions/0x401000" },
      "decompile": { "href": "/functions/0x401000/decompile" },
      "disassembly": { "href": "/functions/0x401000/disassembly" },
      "variables": { "href": "/functions/0x401000/variables" },
      "xrefs": { "href": "/xrefs?to_addr=0x401000" }
    }
    

4. Functions

Represents functions within the current program.

  • GET /functions: List functions. Supports searching (by name/address/regex) and pagination.
    // Example Response Fragment
    "result": [
      { "name": "FUN_08000004", "address": "08000004", "_links": { "self": { "href": "/functions/08000004" } } },
      { "name": "init_peripherals", "address": "08001cf0", "_links": { "self": { "href": "/functions/08001cf0" } } }
    ]
    
  • POST /functions: Create a function at a specific address. Requires address in the request body. Returns the created function resource.
  • GET /functions/{address}: Get details for a specific function (name, signature, size, stack info, etc.).
    // Example Response Fragment for GET /functions/0x4010a0
    "result": {
      "name": "process_data",
      "address": "0x4010a0",
      "signature": "int process_data(char * data, int size)",
      "size": 128,
      "stack_depth": 16,
      "has_varargs": false,
      "calling_convention": "__stdcall"
      // ... other details
    },
    "_links": {
      "self": { "href": "/functions/0x4010a0" },
      "decompile": { "href": "/functions/0x4010a0/decompile" },
      "disassembly": { "href": "/functions/0x4010a0/disassembly" },
      "variables": { "href": "/functions/0x4010a0/variables" },
      "xrefs_to": { "href": "/xrefs?to_addr=0x4010a0" },
      "xrefs_from": { "href": "/xrefs?from_addr=0x4010a0" }
    }
    
  • PATCH /functions/{address}: Modify a function. Addressable only by address. Payload can contain:
    • name: New function name.
    • signature: Full function signature string (e.g., void my_func(int p1, char * p2)).
    • comment: Set/update the function's primary comment.
    // Example PATCH payload
    { "name": "calculate_checksum", "signature": "uint32_t calculate_checksum(uint8_t* buffer, size_t length)" }
    
  • DELETE /functions/{address}: Delete the function definition at the specified address.
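
The PATCH payload accepts only the three fields listed above, so a client may want to validate changes before sending them. The following sketch does that locally; function_patch is an illustrative name, and rejecting an empty payload is an assumption about sensible client behavior.

```python
ALLOWED_FIELDS = ("name", "signature", "comment")


def function_patch(address, **changes):
    """Return (path, payload) for PATCH /functions/{address}."""
    unknown = sorted(set(changes) - set(ALLOWED_FIELDS))
    if unknown:
        raise ValueError(f"unsupported fields: {', '.join(unknown)}")
    if not changes:
        # Assumption: an empty PATCH is treated as a client-side mistake.
        raise ValueError("at least one of name, signature, comment is required")
    return f"/functions/{address}", changes


path, payload = function_patch(
    "0x4010a0",
    name="calculate_checksum",
    signature="uint32_t calculate_checksum(uint8_t* buffer, size_t length)",
)
```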

Function Sub-Resources

  • GET /functions/{address}/decompile: Get decompiled C-like code for the function.
    • Query Parameters:
      • ?syntax_tree=true: Include the decompiler's internal syntax tree (JSON).
      • ?style=[style_name]: Apply a specific decompiler simplification style (e.g., normalize, paramid).
      • ?timeout=[seconds]: Set a timeout for the decompilation process.
    // Example Response Fragment (without syntax tree)
    "result": {
      "address": "0x4010a0",
      "ccode": "int process_data(char *param_1, int param_2)\n{\n  // ... function body ...\n  return result;\n}\n"
    }
    
  • GET /functions/{address}/disassembly: Get assembly listing for the function. Supports pagination (?offset=, ?limit=).
    // Example Response Fragment
    "result": [
      { "address": "0x4010a0", "mnemonic": "PUSH", "operands": "RBP", "bytes": "55" },
      { "address": "0x4010a1", "mnemonic": "MOV", "operands": "RBP, RSP", "bytes": "4889E5" },
      // ... more instructions
    ]
    
  • GET /functions/{address}/variables: List local variables defined within the function. Supports searching by name.
  • PATCH /functions/{address}/variables/{variable_name}: Modify a local variable (rename, change type). Requires name and/or type in the payload.

5. Symbols & Labels

Represents named locations (functions, data, labels).

  • GET /symbols: List all symbols in the program. Supports searching (by name/address/regex) and pagination. Can filter by type (?type=function, ?type=data, ?type=label).
  • POST /symbols: Create or rename a symbol at a specific address. Requires address and name in the payload. If a symbol exists, it's renamed; otherwise, a new label is created.
  • GET /symbols/{address}: Get details of the symbol at the specified address.
  • PATCH /symbols/{address}: Modify properties of the symbol (e.g., set as primary, change namespace). Payload specifies changes.
  • DELETE /symbols/{address}: Remove the symbol at the specified address.

6. Data

Represents defined data items in memory.

  • GET /data: List defined data items. Supports searching (by name/address/regex) and pagination. Can filter by type (?type=string, ?type=dword, etc.).
  • POST /data: Define a new data item. Requires address, type, and optionally size or length in the payload.
  • GET /data/{address}: Get details of the data item at the specified address (type, size, value representation).
  • PATCH /data/{address}: Modify a data item (e.g., change name, type, comment). Payload specifies changes.
  • DELETE /data/{address}: Undefine the data item at the specified address.

6.1 Strings

Provides access to string data in the binary.

  • GET /strings: List all defined strings in the binary. Supports pagination and filtering.
    • Query Parameters:
      • ?offset=[int]: Number of strings to skip (default: 0).
      • ?limit=[int]: Maximum number of strings to return (default: 2000).
      • ?filter=[string]: Only include strings containing this substring (case-insensitive).
    // Example Response
    "result": [
      {
        "address": "0x00401234",
        "value": "Hello, world!",
        "length": 14,
        "type": "string",
        "name": "aHelloWorld"
      },
      {
        "address": "0x00401250",
        "value": "Error: could not open file",
        "length": 26,
        "type": "string",
        "name": "aErrorCouldNotO"
      }
    ],
    "_links": {
      "self": { "href": "/strings?offset=0&limit=10" },
      "next": { "href": "/strings?offset=10&limit=10" }
    }
    

6.2 Structs

Provides functionality for creating and managing struct (composite) data types.

  • GET /structs: List all struct data types in the program. Supports pagination and filtering.

    • Query Parameters:
      • ?offset=[int]: Number of structs to skip (default: 0).
      • ?limit=[int]: Maximum number of structs to return (default: 100).
      • ?category=[string]: Filter by category path (e.g. "/winapi").
    // Example Response
    "result": [
      {
        "name": "MyStruct",
        "path": "/custom/MyStruct",
        "size": 16,
        "numFields": 4,
        "category": "/custom",
        "description": "Custom data structure"
      },
      {
        "name": "FileHeader",
        "path": "/FileHeader",
        "size": 32,
        "numFields": 8,
        "category": "/",
        "description": ""
      }
    ],
    "_links": {
      "self": { "href": "/structs?offset=0&limit=100" },
      "program": { "href": "/program" }
    }
    
  • GET /structs?name={struct_name}: Get detailed information about a specific struct including all fields.

    // Example Response for GET /structs?name=MyStruct
    "result": {
      "name": "MyStruct",
      "path": "/custom/MyStruct",
      "size": 16,
      "category": "/custom",
      "description": "Custom data structure",
      "numFields": 4,
      "fields": [
        {
          "name": "id",
          "offset": 0,
          "length": 4,
          "type": "int",
          "typePath": "/int",
          "comment": "Unique identifier"
        },
        {
          "name": "flags",
          "offset": 4,
          "length": 4,
          "type": "dword",
          "typePath": "/dword",
          "comment": ""
        },
        {
          "name": "data_ptr",
          "offset": 8,
          "length": 4,
          "type": "pointer",
          "typePath": "/pointer",
          "comment": "Pointer to data"
        },
        {
          "name": "size",
          "offset": 12,
          "length": 4,
          "type": "uint",
          "typePath": "/uint",
          "comment": ""
        }
      ]
    },
    "_links": {
      "self": { "href": "/structs?name=MyStruct" },
      "structs": { "href": "/structs" },
      "program": { "href": "/program" }
    }
    
  • POST /structs/create: Create a new struct data type.

    • Request Payload:
      • name: Name for the new struct (required).
      • category: Category path (optional, defaults to root).
      • description: Description for the struct (optional).
    // Example Request Payload
    {
      "name": "NetworkPacket",
      "category": "/network",
      "description": "Network packet structure"
    }
    
    // Example Response
    "result": {
      "name": "NetworkPacket",
      "path": "/network/NetworkPacket",
      "category": "/network",
      "size": 0,
      "message": "Struct created successfully"
    }
    
  • POST /structs/addfield: Add a field to an existing struct.

    • Request Payload:
      • struct: Name of the struct to modify (required).
      • fieldName: Name for the new field (required).
      • fieldType: Data type for the field (required, e.g. "int", "char", "pointer").
      • offset: Specific offset to insert field (optional, appends to end if not specified).
      • comment: Comment for the field (optional).
    // Example Request Payload
    {
      "struct": "NetworkPacket",
      "fieldName": "header",
      "fieldType": "dword",
      "comment": "Packet header"
    }
    
    // Example Response
    "result": {
      "struct": "NetworkPacket",
      "fieldName": "header",
      "fieldType": "dword",
      "offset": 0,
      "length": 4,
      "structSize": 4,
      "message": "Field added successfully"
    }
    
  • POST /structs/updatefield: Update an existing field in a struct (rename, change type, or modify comment).

    • Request Payload:
      • struct: Name of the struct to modify (required).
      • fieldOffset OR fieldName: Identify the field to update (one required).
      • newName: New name for the field (optional).
      • newType: New data type for the field (optional).
      • newComment: New comment for the field (optional).
      • At least one of newName, newType, or newComment must be provided.
    // Example Request Payload - rename a field
    {
      "struct": "NetworkPacket",
      "fieldName": "header",
      "newName": "packet_header",
      "newComment": "Updated packet header field"
    }
    
    // Example Request Payload - change type by offset
    {
      "struct": "NetworkPacket",
      "fieldOffset": 0,
      "newType": "qword"
    }
    
    // Example Response
    "result": {
      "struct": "NetworkPacket",
      "offset": 0,
      "originalName": "header",
      "originalType": "dword",
      "originalComment": "Packet header",
      "newName": "packet_header",
      "newType": "dword",
      "newComment": "Updated packet header field",
      "length": 4,
      "message": "Field updated successfully"
    }
    
  • POST /structs/delete: Delete a struct data type.

    • Request Payload:
      • name: Name of the struct to delete (required).
    // Example Request Payload
    {
      "name": "NetworkPacket"
    }
    
    // Example Response
    "result": {
      "name": "NetworkPacket",
      "path": "/network/NetworkPacket",
      "category": "/network",
      "message": "Struct deleted successfully"
    }
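
Defining a struct takes one /structs/create call followed by one /structs/addfield call per field. The sketch below only prepares those (path, payload) pairs in order; struct_requests and the input field format (dicts with name, type, and optional offset and comment) are assumptions made for illustration.

```python
def struct_requests(name, fields, category=None, description=None):
    """Return the ordered (path, payload) pairs that would define a struct."""
    create = {"name": name}
    if category is not None:
        create["category"] = category
    if description is not None:
        create["description"] = description
    requests = [("/structs/create", create)]

    for field in fields:
        payload = {
            "struct": name,
            "fieldName": field["name"],
            "fieldType": field["type"],
        }
        # offset and comment are optional; without offset the field is appended.
        for key in ("offset", "comment"):
            if key in field:
                payload[key] = field[key]
        requests.append(("/structs/addfield", payload))
    return requests


plan = struct_requests(
    "NetworkPacket",
    [
        {"name": "header", "type": "dword", "comment": "Packet header"},
        {"name": "length", "type": "uint"},
    ],
    category="/network",
    description="Network packet structure",
)
```

Each pair can then be sent as a POST with the JSON payload, stopping at the first response whose success is false.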
    

7. Memory Segments

Represents memory blocks/sections defined in the program.

  • GET /segments: List all memory segments (e.g., .text, .data, .bss).
  • GET /segments/{segment_name}: Get details for a specific segment (address range, permissions, size).

8. Memory Access

Provides raw memory access.

  • GET /memory/{address}: Read bytes from memory.
    • Query Parameters:
      • ?length=[bytes]: Number of bytes to read (required, max limit applies).
      • ?format=[hex|base64|string]: How to encode the returned bytes (default: hex).
    // Example Response Fragment for GET /memory/0x402000?length=16&format=hex
    "result": {
      "address": "0x402000",
      "length": 16,
      "format": "hex",
      "bytes": "48656C6C6F20576F726C642100000000" // "Hello World!...."
    }
    
  • PATCH /memory/{address}: Write bytes to memory. Requires bytes (in specified format) and format in the payload. Use with extreme caution.
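
The bytes field has to be decoded according to format. A minimal sketch for the hex and base64 encodings follows; the string format is left out because its character encoding is not specified here, and decode_memory is a hypothetical helper name.

```python
import base64


def decode_memory(result):
    """Decode the `bytes` field of a GET /memory/{address} result."""
    fmt = result["format"]
    if fmt == "hex":
        return bytes.fromhex(result["bytes"])
    if fmt == "base64":
        return base64.b64decode(result["bytes"])
    # "string" is not handled: its encoding is not specified in this document.
    raise ValueError(f"unsupported format: {fmt}")


raw = decode_memory({
    "address": "0x402000",
    "length": 16,
    "format": "hex",
    "bytes": "48656C6C6F20576F726C642100000000",
})
text = raw.rstrip(b"\x00").decode("ascii")
```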

9. Cross-References (XRefs)

Provides information about references to/from addresses.

  • GET /xrefs: Search for cross-references. Supports pagination.
    • Query Parameters (at least one required):
      • ?to_addr=[address]: Find references to this address.
      • ?from_addr=[address]: Find references from this address or within the function/data at this address.
      • ?type=[CALL|READ|WRITE|DATA|POINTER|...]: Filter by reference type.
  • GET /functions/{address}/xrefs: Convenience endpoint that returns references to the function, equivalent to GET /xrefs?to_addr={address}. It may also include or link to the references from the function (GET /xrefs?from_addr={address}).

10. Analysis

Provides access to Ghidra's analysis results.

  • GET /analysis: Get information about the analysis status and available analyzers.

    // Example Response
    "result": {
      "program": "mybinary.exe",
      "analysis_enabled": true,
      "available_analyzers": [
        "Function Start Analyzer",
        "Basic Block Model Analyzer",
        "Reference Analyzer",
        "Call Convention Analyzer",
        "Data Reference Analyzer",
        "Decompiler Parameter ID",
        "Stack Analyzer"
      ]
    },
    "_links": {
      "self": { "href": "/analysis" },
      "program": { "href": "/program" },
      "analyze": { "href": "/analysis", "method": "POST" },
      "callgraph": { "href": "/analysis/callgraph" }
    }
    
  • POST /analysis: Trigger a full or partial re-analysis of the program.

    // Example Response
    "result": {
      "program": "mybinary.exe",
      "analysis_triggered": true,
      "message": "Analysis initiated on program"
    }
    
  • GET /analysis/callgraph: Retrieve the function call graph.

    • Query Parameters:
      • ?function=[function_name]: Start the call graph from this function (default: entry point).
      • ?max_depth=[int]: Maximum depth of the call graph (default: 3).
    // Example Response
    "result": {
      "root": "main",
      "root_address": "0x401000",
      "max_depth": 3,
      "nodes": [
        {
          "id": "0x401000",
          "name": "main",
          "address": "0x401000",
          "depth": 0,
          "_links": {
            "self": { "href": "/functions/0x401000" }
          }
        },
        // ... more nodes
      ],
      "edges": [
        {
          "from": "0x401000",
          "to": "0x401100",
          "type": "call",
          "call_site": "0x401050"
        },
        // ... more edges
      ]
    }
    
  • GET /analysis/dataflow: Perform data flow analysis starting from a specific address.

    • Query Parameters:
      • ?address=[address]: Starting address for data flow analysis (required).
      • ?direction=[forward|backward]: Direction of data flow analysis (default: forward).
      • ?max_steps=[int]: Maximum number of steps to analyze (default: 50).
    // Example Response
    "result": {
      "start_address": "0x401050",
      "direction": "forward",
      "max_steps": 50,
      "steps": [
        {
          "address": "0x401050",
          "instruction": "MOV EAX, [RBP+0x8]",
          "description": "Starting point of data flow analysis"
        },
        // ... more steps
      ]
    }
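
The nodes and edges returned by GET /analysis/callgraph can be folded into an adjacency map for traversal. This sketch assumes that each edge's from and to values match a node id, as in the example response; callees_by_name is an illustrative helper.

```python
from collections import defaultdict


def callees_by_name(graph):
    """Map each caller's name to the names of the functions it calls."""
    names = {node["id"]: node["name"] for node in graph["nodes"]}
    adjacency = defaultdict(list)
    for edge in graph["edges"]:
        if edge["type"] != "call":
            continue
        # Fall back to the raw ID when a node is missing (e.g. beyond max_depth).
        caller = names.get(edge["from"], edge["from"])
        callee = names.get(edge["to"], edge["to"])
        adjacency[caller].append(callee)
    return dict(adjacency)


# Hand-written sample shaped like the callgraph result above.
graph = {
    "root": "main",
    "root_address": "0x401000",
    "max_depth": 3,
    "nodes": [
        {"id": "0x401000", "name": "main", "address": "0x401000", "depth": 0},
        {"id": "0x401100", "name": "process_data", "address": "0x401100", "depth": 1},
    ],
    "edges": [
        {"from": "0x401000", "to": "0x401100", "type": "call", "call_site": "0x401050"},
        {"from": "0x401100", "to": "0x401200", "type": "call", "call_site": "0x401120"},
    ],
}

calls = callees_by_name(graph)
```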
    

Design Considerations for AI Usage

  • Structured Responses: JSON format ensures predictable parsing by AI agents.
  • HATEOAS Links: _links allow agents to discover available actions and related resources without hardcoding paths.
  • Address and Name Resolution: Key elements like functions and symbols are addressable by memory address in the path and, where applicable, by name through list query parameters.
  • Explicit Operations: Actions like decompilation, disassembly, and analysis are distinct endpoints.
  • Pagination & Filtering: Essential for handling potentially large datasets (symbols, functions, xrefs, disassembly).
  • Clear Error Reporting: success: false and the error object provide actionable feedback.
  • No Injected Summaries: The API should return raw or structured Ghidra data, leaving interpretation and summarization to the AI agent.