feat: Refactor bridge for better MCP integration
Implemented the refactoring proposal to optimize the bridge for better MCP integration: - Added resources for context loading (instances, functions, disassembly) - Added prompts for common analysis patterns - Reorganized tools into namespaced functions for better discoverability - Implemented current working instance concept for simpler usage - Added documentation for the namespaces-based approach
This commit is contained in:
parent
8aded2e6c3
commit
0f9aa2bb47
2582
bridge_mcp_hydra.py
2582
bridge_mcp_hydra.py
File diff suppressed because it is too large
Load Diff
1621
refactoring_namespaces.py
Normal file
1621
refactoring_namespaces.py
Normal file
File diff suppressed because it is too large
Load Diff
261
refactoring_proposal.md
Normal file
261
refactoring_proposal.md
Normal file
@ -0,0 +1,261 @@
|
||||
# GhydraMCP Bridge Refactoring Proposal
|
||||
|
||||
## Current Issues
|
||||
|
||||
The current bridge implementation exposes all functionality as MCP tools, which creates several problems:
|
||||
|
||||
1. **Discoverability**: With dozens of tool functions, it's difficult for AI agents to identify the correct tool to use for a specific task.
|
||||
|
||||
2. **Consistency**: The API surface is large and not organized by conceptual resources, making it harder to understand what's related.
|
||||
|
||||
3. **Context Loading**: Many operations require repeated loading of program information that could be provided more efficiently as resources.
|
||||
|
||||
4. **Default Selection**: The current approach requires explicit port selection for each operation, instead of following a "current working instance" pattern.
|
||||
|
||||
## Proposed MCP-Oriented Refactoring
|
||||
|
||||
Restructure the bridge to follow MCP patterns more closely:
|
||||
|
||||
### 1. Resources (for Context Loading)
|
||||
|
||||
Resources provide information that can be loaded directly into the LLM's context.
|
||||
|
||||
```python
|
||||
@mcp.resource()
|
||||
def ghidra_instance(port: int = None) -> dict:
|
||||
"""Get information about a Ghidra instance or the current working instance
|
||||
|
||||
Args:
|
||||
port: Specific Ghidra instance port (optional, uses current if omitted)
|
||||
|
||||
Returns:
|
||||
dict: Detailed information about the Ghidra instance and loaded program
|
||||
"""
|
||||
# Implementation that gets instance info and the current program details
|
||||
# from the currently selected "working" instance or a specific port
|
||||
```
|
||||
|
||||
```python
|
||||
@mcp.resource()
|
||||
def decompiled_function(name: str = None, address: str = None) -> str:
|
||||
"""Get decompiled C code for a function
|
||||
|
||||
Args:
|
||||
name: Function name (mutually exclusive with address)
|
||||
address: Function address in hex format (mutually exclusive with name)
|
||||
|
||||
Returns:
|
||||
str: The decompiled C code as a string
|
||||
"""
|
||||
# Implementation that only returns the decompiled text directly
|
||||
```
|
||||
|
||||
```python
|
||||
@mcp.resource()
|
||||
def function_info(name: str = None, address: str = None) -> dict:
|
||||
"""Get detailed information about a function
|
||||
|
||||
Args:
|
||||
name: Function name (mutually exclusive with address)
|
||||
address: Function address in hex format (mutually exclusive with name)
|
||||
|
||||
Returns:
|
||||
dict: Complete function information including signature, parameters, etc.
|
||||
"""
|
||||
# Implementation that returns detailed function information
|
||||
```
|
||||
|
||||
```python
|
||||
@mcp.resource()
|
||||
def disassembly(name: str = None, address: str = None) -> str:
|
||||
"""Get disassembled instructions for a function
|
||||
|
||||
Args:
|
||||
name: Function name (mutually exclusive with address)
|
||||
address: Function address in hex format (mutually exclusive with name)
|
||||
|
||||
Returns:
|
||||
str: Formatted disassembly listing as a string
|
||||
"""
|
||||
# Implementation that returns formatted text disassembly
|
||||
```
|
||||
|
||||
### 2. Prompts (for Interaction Patterns)
|
||||
|
||||
Prompts define reusable templates for LLM interactions, making common workflows easier.
|
||||
|
||||
```python
|
||||
@mcp.prompt("analyze_function")
|
||||
def analyze_function_prompt(name: str = None, address: str = None):
|
||||
"""A prompt that guides the LLM through analyzing a function's purpose
|
||||
|
||||
Args:
|
||||
name: Function name (mutually exclusive with address)
|
||||
address: Function address in hex format (mutually exclusive with name)
|
||||
"""
|
||||
# Implementation returns a prompt template with decompiled code and disassembly
|
||||
# that helps the LLM systematically analyze a function
|
||||
return {
|
||||
"prompt": f"""
|
||||
Analyze the following function: {name or address}
|
||||
|
||||
Decompiled code:
|
||||
```c
|
||||
{decompiled_function(name=name, address=address)}
|
||||
```
|
||||
|
||||
Disassembly:
|
||||
```
|
||||
{disassembly(name=name, address=address)}
|
||||
```
|
||||
|
||||
1. What is the purpose of this function?
|
||||
2. What are the key parameters and their uses?
|
||||
3. What are the return values and their meanings?
|
||||
4. Are there any security concerns in this implementation?
|
||||
5. Describe the algorithm or process being implemented.
|
||||
""",
|
||||
"context": {
|
||||
"function_info": function_info(name=name, address=address)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
```python
|
||||
@mcp.prompt("identify_vulnerabilities")
|
||||
def identify_vulnerabilities_prompt(name: str = None, address: str = None):
|
||||
"""A prompt that helps the LLM identify potential vulnerabilities in a function
|
||||
|
||||
Args:
|
||||
name: Function name (mutually exclusive with address)
|
||||
address: Function address in hex format (mutually exclusive with name)
|
||||
"""
|
||||
# Implementation returns a prompt focused on finding security issues
|
||||
```
|
||||
|
||||
### 3. Tools (for Function Selection)
|
||||
|
||||
Tools are organized by domain concepts rather than just mirroring the low-level API.
|
||||
|
||||
```python
|
||||
@mcp.tool_group("instances")
|
||||
class InstanceTools:
|
||||
@mcp.tool()
|
||||
def list() -> dict:
|
||||
"""List all active Ghidra instances"""
|
||||
return list_instances()
|
||||
|
||||
@mcp.tool()
|
||||
def discover() -> dict:
|
||||
"""Discover available Ghidra instances"""
|
||||
return discover_instances()
|
||||
|
||||
@mcp.tool()
|
||||
def register(port: int, url: str = None) -> str:
|
||||
"""Register a new Ghidra instance"""
|
||||
return register_instance(port, url)
|
||||
|
||||
@mcp.tool()
|
||||
def use(port: int) -> str:
|
||||
"""Set the current working Ghidra instance"""
|
||||
# Implementation that sets the default instance
|
||||
global current_instance_port
|
||||
current_instance_port = port
|
||||
return f"Now using Ghidra instance on port {port}"
|
||||
```
|
||||
|
||||
```python
|
||||
@mcp.tool_group("functions")
|
||||
class FunctionTools:
|
||||
@mcp.tool()
|
||||
def list(offset: int = 0, limit: int = 100, **filters) -> dict:
|
||||
"""List functions with filtering and pagination"""
|
||||
# Implementation that uses the current instance
|
||||
return list_functions(port=current_instance_port, offset=offset, limit=limit, **filters)
|
||||
|
||||
@mcp.tool()
|
||||
def get(name: str = None, address: str = None) -> dict:
|
||||
"""Get detailed information about a function"""
|
||||
return get_function(port=current_instance_port, name=name, address=address)
|
||||
|
||||
@mcp.tool()
|
||||
def create(address: str) -> dict:
|
||||
"""Create a new function at the specified address"""
|
||||
return create_function(port=current_instance_port, address=address)
|
||||
|
||||
@mcp.tool()
|
||||
def rename(name: str = None, address: str = None, new_name: str = "") -> dict:
|
||||
"""Rename a function"""
|
||||
return rename_function(port=current_instance_port,
|
||||
name=name, address=address, new_name=new_name)
|
||||
|
||||
@mcp.tool()
|
||||
def set_signature(name: str = None, address: str = None, signature: str = "") -> dict:
|
||||
"""Set a function's signature/prototype"""
|
||||
return set_function_signature(port=current_instance_port,
|
||||
name=name, address=address, signature=signature)
|
||||
```
|
||||
|
||||
Similar tool groups would be created for:
|
||||
- `data`: Data manipulation tools
|
||||
- `memory`: Memory reading/writing tools
|
||||
- `analysis`: Program analysis tools
|
||||
- `xrefs`: Cross-reference navigation tools
|
||||
- `symbols`: Symbol management tools
|
||||
- `variables`: Variable manipulation tools
|
||||
|
||||
### 4. Simplified Instance Management
|
||||
|
||||
Add a "current working instance" pattern:
|
||||
|
||||
```python
|
||||
# Global state for the current instance
|
||||
current_instance_port = DEFAULT_GHIDRA_PORT
|
||||
|
||||
# Helper function to get the current instance or validate a specific port
|
||||
def _get_instance_port(port=None):
|
||||
port = port or current_instance_port
|
||||
# Validate that the instance exists and is active
|
||||
if port not in active_instances:
|
||||
# Try to register it if not found
|
||||
register_instance(port)
|
||||
if port not in active_instances:
|
||||
raise ValueError(f"No active Ghidra instance on port {port}")
|
||||
return port
|
||||
|
||||
# All tools would use this helper, falling back to the current instance if no port is specified
|
||||
def read_memory(address: str, length: int = 16, format: str = "hex", port: int = None) -> dict:
|
||||
"""Read bytes from memory
|
||||
|
||||
Args:
|
||||
address: Memory address in hex format
|
||||
length: Number of bytes to read (default: 16)
|
||||
format: Output format (default: "hex")
|
||||
port: Specific Ghidra instance port (optional, uses current if omitted)
|
||||
|
||||
Returns:
|
||||
dict: Memory content in the requested format
|
||||
"""
|
||||
port = _get_instance_port(port)
|
||||
# Rest of implementation...
|
||||
```
|
||||
|
||||
## Migration Strategy
|
||||
|
||||
1. Create a new MCP class structure in a separate file
|
||||
2. Implement resource loaders for key items (functions, data, memory regions)
|
||||
3. Implement prompt templates for common tasks
|
||||
4. Organize tools into logical groups by domain concept
|
||||
5. Add a current instance selection mechanism
|
||||
6. Update documentation with clear examples of the new patterns
|
||||
7. Create backward compatibility shims if needed
|
||||
|
||||
## Benefits of This Approach
|
||||
|
||||
1. **Better Discoverability**: Logical grouping helps agents find the right tool
|
||||
2. **Context Efficiency**: Resources load just what's needed without extra metadata
|
||||
3. **Streamlined Interaction**: Tools follow consistent patterns with sensible defaults
|
||||
4. **Prompt Templates**: Common patterns are codified in reusable prompts
|
||||
5. **More LLM-friendly**: Outputs optimized for consumption by language models
|
||||
|
||||
The refactored API would be easier to use, more efficient, and better aligned with MCP best practices, while maintaining all the current functionality.
|
||||
1286
refactoring_sample.py
Normal file
1286
refactoring_sample.py
Normal file
File diff suppressed because it is too large
Load Diff
Loading…
x
Reference in New Issue
Block a user