Refactor to MCPMixin architecture with injection-safe shell execution

Replaces single-file server with modular mixin architecture:
- 6 domain mixins (devices, input, apps, screenshot, ui, files)
- Injection-safe run_shell_args() using shlex.quote() for all tools
- Persistent developer mode config (~/.config/adb-mcp/config.json)
- Pydantic models for typed responses
- MCP elicitation for destructive operations
- Dynamic screen dimensions for scroll gestures
- Intent flag name resolution for activity_start
- 50 tools, 5 resources, tested on real hardware
Ryan Malloy 2026-02-10 18:30:34 -07:00
parent da7cde1fc3
commit 7c414f8015
16 changed files with 4156 additions and 760 deletions

3
.gitignore vendored

@@ -56,10 +56,11 @@ htmlcov/
 .ruff_cache/
 .mypy_cache/
-# Screenshots
+# Screenshots and recordings
 *.png
 *.jpg
 *.jpeg
+*.mp4
 # Docker
 .dockerignore

296
README.md

@@ -1,81 +1,23 @@
-# Android MCP Server
+# Android ADB MCP Server
-A Model Context Protocol (MCP) server for Android device automation via ADB. This server provides tools for interacting with Android devices through ADB commands in a structured, type-safe way.
+A [Model Context Protocol](https://modelcontextprotocol.io/) server that gives AI assistants direct control over Android devices through ADB. Point any MCP-compatible client at a phone plugged into USB, and it can take screenshots, tap buttons, launch apps, inspect UI elements, transfer files, and run shell commands, all through structured, type-safe tool calls.
-## Features
-- **Device Management**: List and interact with connected Android devices
-- **Screenshots**: Capture and retrieve device screenshots
-- **Input Simulation**: Send taps, swipes, key events, and text input
-- **App Control**: Launch apps by package name or open URLs
-- **Package Management**: List installed packages
-- **Shell Commands**: Execute arbitrary shell commands on device
-## Tools Available
-### `adb_devices()`
-List all connected Android devices with their IDs and status.
-### `adb_screenshot(device_id?, local_filename?)`
-Take a screenshot and save it locally.
-### `adb_input(action_type, ...parameters, device_id?)`
-Send input events (key and text actions work reliably):
-- `key`: Send key event - `adb_input(action_type="key", key_code="KEYCODE_BACK")`
-- `text`: Type text - `adb_input(action_type="text", text="hello")`
-### `adb_launch_app(package_name, device_id?)`
-Launch an app by package name.
-### `adb_launch_url(url, device_id?)`
-Open URL in default browser.
-### `adb_list_packages(device_id?, filter_text?)`
-List installed packages, optionally filtered.
-### `adb_shell_command(command, device_id?)` - **RECOMMENDED for tap/swipe**
-Execute shell commands including reliable input simulation:
-- **Tap**: `adb_shell_command(command="input tap 400 600")`
-- **Swipe**: `adb_shell_command(command="input swipe 100 200 300 400")`
-- **Scroll down**: `adb_shell_command(command="input swipe 500 800 500 300")`
-- **Key press**: `adb_shell_command(command="input keyevent KEYCODE_BACK")`
-- **Type text**: `adb_shell_command(command="input text \"hello world\"")`
-- Other commands: `ls /sdcard`, `pm list packages | grep chrome`
-## Usage
-### Using uvx (Recommended)
+Built on [FastMCP](https://gofastmcp.com/) with a modular mixin architecture. 50 tools across 6 domains. Tested on real hardware.
+## Quick Start
 ```bash
-# Run directly with uvx
+# Run directly (no install)
 uvx android-mcp-server
-# Or from PyPI once published
-uvx android-mcp-server
+# Or install and run
+uv add android-mcp-server
+android-mcp-server
 ```
-### Local Development
-```bash
-# Install dependencies
-uv sync
-# Run the server
-uv run android-mcp-server
-```
-### Docker Development
-```bash
-# Build and run with Docker Compose
-docker-compose up --build
-# Or build manually
-docker build -t android-mcp-server .
-docker run --privileged -v /dev/bus/usb:/dev/bus/usb android-mcp-server
-```
-## MCP Client Configuration
-### Using uvx (Recommended)
-Add to your MCP client configuration:
+### MCP Client Configuration
+Add to your MCP client's config (Claude Desktop, Claude Code, etc.):
 ```json
 {
@@ -88,50 +30,222 @@ Add to your MCP client configuration:
 }
 ```
-### Local Development
-```json
-{
-  "mcpServers": {
-    "android-adb": {
-      "command": "uv",
-      "args": ["run", "android-mcp-server"],
-      "cwd": "/path/to/android-mcp-server"
-    }
-  }
-}
+For Claude Code:
+```bash
+claude mcp add android-adb -- uvx android-mcp-server
 ```
-### Docker
+For local development:
```bash
claude mcp add android-adb -- uv run --directory /path/to/mcp-adb android-mcp-server
```
## Prerequisites
- **Python 3.11+**
- **ADB** installed and on `PATH` (`adb devices` should work)
- **USB debugging** enabled on the Android device
- Device connected via USB (or `adb connect` for network)
## What Can It Do?
### Standard Tools (always available)
| Domain | Tool | What it does |
|--------|------|-------------|
| **Devices** | `devices_list` | Discover connected devices (USB + network) |
| | `devices_use` | Set active device for multi-device setups |
| | `devices_current` | Show which device is selected |
| | `device_info` | Battery, WiFi, storage, Android version, model |
| **Input** | `input_tap` | Tap at screen coordinates |
| | `input_swipe` | Swipe between two points |
| | `input_scroll_down` | Scroll down (auto-detects screen size) |
| | `input_scroll_up` | Scroll up (auto-detects screen size) |
| | `input_back` | Press Back |
| | `input_home` | Press Home |
| | `input_recent_apps` | Open app switcher |
| | `input_key` | Send any key event (`VOLUME_UP`, `ENTER`, etc.) |
| | `input_text` | Type text into focused field |
| | `clipboard_set` | Set clipboard (handles special chars), optional auto-paste |
| **Apps** | `app_launch` | Launch app by package name |
| | `app_open_url` | Open URL in default browser |
| | `app_close` | Force stop an app |
| | `app_current` | Get the foreground app and activity |
| **Screen** | `screenshot` | Capture screen as PNG |
| | `screen_size` | Get display resolution |
| | `screen_density` | Get display DPI |
| | `screen_on` / `screen_off` | Wake or sleep the display |
| **UI** | `ui_dump` | Dump accessibility tree (all visible elements) |
| | `ui_find_element` | Search for elements by text, ID, class, or description |
| | `wait_for_text` | Poll until text appears on screen |
| | `wait_for_text_gone` | Poll until text disappears |
| | `tap_text` | Find an element by text and tap it |
| **Config** | `config_status` | Show current settings |
| | `config_set_developer_mode` | Toggle developer tools |
| | `config_set_screenshot_dir` | Set where screenshots are saved |
### Developer Mode Tools
Enable with `config_set_developer_mode(true)` to unlock power-user tools. Destructive operations (uninstall, clear data, reboot, delete) require user confirmation via MCP elicitation.
| Domain | Tool | What it does |
|--------|------|-------------|
| **Shell** | `shell_command` | Run any shell command on device |
| **Input** | `input_long_press` | Press and hold gesture |
| **Apps** | `app_list_packages` | List installed packages (with filters) |
| | `app_install` | Install APK from host |
| | `app_uninstall` | Remove an app (with confirmation) |
| | `app_clear_data` | Wipe app data (with confirmation) |
| | `activity_start` | Launch activity with full intent control |
| | `broadcast_send` | Send broadcast intents |
| **Screen** | `screen_record` | Record screen to MP4 |
| | `screen_set_size` | Override display resolution |
| | `screen_reset_size` | Restore original resolution |
| **Device** | `device_reboot` | Reboot device (with confirmation) |
| | `logcat_capture` | Capture system logs |
| | `logcat_clear` | Clear log buffer |
| **Files** | `file_push` | Transfer file to device |
| | `file_pull` | Transfer file from device |
| | `file_list` | List directory contents |
| | `file_delete` | Delete file (with confirmation) |
| | `file_exists` | Check if file exists |
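
The developer-mode gate described above can be sketched as a decorator. This is only an illustration: the commit itself repeats an inline `is_developer_mode()` check at the top of each tool, and `requires_developer_mode`, `_settings`, and the placeholder `file_exists` body below are hypothetical names, not part of the project.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable

# Stand-in for the persisted config; the real project reads config.json.
_settings = {"developer_mode": False}

ToolFn = Callable[..., Awaitable[dict[str, Any]]]


def requires_developer_mode(tool: ToolFn) -> ToolFn:
    """Return an error dict instead of running the tool when the gate is off."""

    @functools.wraps(tool)
    async def wrapper(*args: Any, **kwargs: Any) -> dict[str, Any]:
        if not _settings["developer_mode"]:
            return {"success": False, "error": "Developer mode required"}
        return await tool(*args, **kwargs)

    return wrapper


@requires_developer_mode
async def file_exists(path: str) -> dict[str, Any]:
    # Placeholder body; a real tool would query the device.
    return {"success": True, "path": path}


blocked = asyncio.run(file_exists("/sdcard/a.txt"))
_settings["developer_mode"] = True
allowed = asyncio.run(file_exists("/sdcard/a.txt"))
print(blocked)
print(allowed)
```

A decorator keeps the check in one place; the inline form used in the commit is more explicit and avoids any interaction with the `@mcp_tool` decorator, which this sketch does not model.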
### Resources
| URI | Description |
|-----|-------------|
| `adb://devices` | Connected device list |
| `adb://device/{id}` | Detailed device properties |
| `adb://apps/current` | Currently focused app |
| `adb://screen/info` | Screen resolution and DPI |
| `adb://help` | Tool reference and tips |
## Usage Examples
**Screenshot + UI inspection loop** (how an AI assistant typically navigates):
```
1. screenshot() → See what's on screen
2. ui_dump() → Get element tree with tap coordinates
3. tap_text("Settings") → Tap the "Settings" element
4. wait_for_text("Wi-Fi") → Wait for the screen to load
5. screenshot() → Verify the result
```
**Open a URL and check what loaded:**
```
1. app_open_url("https://example.com")
2. wait_for_text("Example Domain")
3. screenshot()
```
**Install and launch an APK** (developer mode):
```
1. config_set_developer_mode(true)
2. app_install("/path/to/app.apk")
3. app_launch("com.example.myapp")
4. logcat_capture(filter_spec="MyApp:D *:S")
```
**Multi-device workflow:**
```
1. devices_list() → See all connected devices
2. devices_use("SERIAL_NUMBER") → Select target device
3. device_info() → Check battery, WiFi, storage
4. screenshot() → Capture from selected device
```
## Architecture
The server uses FastMCP's [MCPMixin](https://gofastmcp.com/) pattern to organize 50 tools into focused, single-responsibility modules:
```
src/
server.py ← FastMCP app, ADBServer (thin orchestrator)
config.py ← Persistent config (~/.config/adb-mcp/config.json)
models.py ← Pydantic models (DeviceInfo, CommandResult, ScreenshotResult)
mixins/
base.py ← ADB command execution, injection-safe shell quoting
devices.py ← Device discovery, info, logcat, reboot
input.py ← Tap, swipe, scroll, keys, text, clipboard, shell
apps.py ← Launch, close, install, intents, broadcasts
screenshot.py ← Capture, recording, display settings
ui.py ← Accessibility tree, element search, text polling
files.py ← Push, pull, list, delete, exists
```
`ADBServer` inherits all six mixins. Each mixin calls `run_shell_args()` (injection-safe) or `run_adb()` on the base class. The base handles device targeting, subprocess execution, and timeouts.
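
The composition described above can be illustrated without FastMCP at all. The sketch below uses plain Python inheritance; `BaseSketch`, `InputSketch`, `AppsSketch`, and `ServerSketch` are hypothetical stand-ins, and `run_adb` here only builds the argument list rather than spawning a subprocess.

```python
import asyncio
from typing import Optional


class BaseSketch:
    """Stand-in for the base mixin: owns device targeting."""

    def __init__(self) -> None:
        self.current_device: Optional[str] = None

    async def run_adb(
        self, cmd: list[str], device_id: Optional[str] = None
    ) -> list[str]:
        # The real base runs a subprocess; this sketch only builds argv.
        target = device_id or self.current_device
        prefix = ["adb", "-s", target] if target else ["adb"]
        return prefix + cmd


class InputSketch(BaseSketch):
    async def input_tap(self, x: int, y: int) -> list[str]:
        return await self.run_adb(["shell", "input", "tap", str(x), str(y)])


class AppsSketch(BaseSketch):
    async def app_close(self, package: str) -> list[str]:
        return await self.run_adb(["shell", "am", "force-stop", package])


class ServerSketch(InputSketch, AppsSketch):
    """Thin orchestrator: inherits every domain and adds nothing itself."""


server = ServerSketch()
server.current_device = "emulator-5554"
tap_argv = asyncio.run(server.input_tap(400, 600))
close_argv = asyncio.run(server.app_close("com.android.chrome"))
print(tap_argv)
print(close_argv)
```

Because every domain class shares one base, the selected device set on the server instance applies to all tools without any of them passing it around.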
## Security Model
All tools that accept user-provided values use **injection-safe command execution**:
- **`run_shell_args()`** quotes every argument with `shlex.quote()` before sending to the device shell. This is the default for all tools.
- **`run_shell()`** (string form) is only used by the developer-mode `shell_command` tool, where the user intentionally provides a raw command.
- **`input_text()`** rejects special characters (`$ ( ) ; | & < >` etc.) and directs users to `clipboard_set()` instead.
- **`input_key()`** strips non-alphanumeric characters from key codes.
- **Destructive operations** (uninstall, clear data, delete, reboot) require user confirmation via MCP elicitation.
- **Developer mode** is off by default and must be explicitly enabled. Settings persist at `~/.config/adb-mcp/config.json`.
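
As a rough illustration of the quoting step, the sketch below joins an argument list with `shlex.quote()`. `build_shell_command` is a hypothetical helper, since the body of `run_shell_args()` is not part of this excerpt, and it assumes the device shell follows POSIX quoting rules.

```python
import shlex


def build_shell_command(args: list[str]) -> str:
    """Join arguments into one command string, quoting each one.

    Hypothetical helper for illustration; not the project's actual code.
    """
    return " ".join(shlex.quote(arg) for arg in args)


# A hostile "URL" stays one literal argument instead of ending the
# command and starting a second one.
hostile = "https://example.com; rm -rf /sdcard"
command = build_shell_command(
    ["am", "start", "-a", "android.intent.action.VIEW", "-d", hostile]
)
print(command)
```

Arguments made only of safe characters pass through unchanged, so ordinary commands stay readable in logs while anything containing shell metacharacters is wrapped in single quotes.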
## Docker
```bash
docker build -t android-mcp-server .
docker run --privileged -v /dev/bus/usb:/dev/bus/usb android-mcp-server
```
The `--privileged` flag and USB volume mount are required for ADB to detect physical devices.
MCP client config for Docker:
 ```json
 {
   "mcpServers": {
     "android-adb": {
       "command": "docker",
-      "args": ["run", "--privileged", "-v", "/dev/bus/usb:/dev/bus/usb", "android-mcp-server"]
+      "args": ["run", "-i", "--privileged", "-v", "/dev/bus/usb:/dev/bus/usb", "android-mcp-server"]
     }
   }
 }
 ```
-## Requirements
-- Python 3.11+
-- ADB (Android Debug Bridge)
-- USB access to Android devices
-- Device with USB debugging enabled
 ## Development
 ```bash
-# Install dev dependencies
+# Clone and install
+git clone https://git.supported.systems/MCP/mcp-adb.git
+cd mcp-adb
 uv sync --group dev
-# Format code
-uv run black src/
+# Run locally
+uv run android-mcp-server
 # Lint
 uv run ruff check src/
+# Format
+uv run ruff format src/
 # Type check
 uv run mypy src/
 ```
## Configuration
Settings are stored at `~/.config/adb-mcp/config.json` (override with `ADB_MCP_CONFIG_DIR` env var):
```json
{
"developer_mode": false,
"default_screenshot_dir": null,
"auto_select_single_device": true
}
```
| Setting | Default | Description |
|---------|---------|-------------|
| `developer_mode` | `false` | Unlock advanced tools (shell, install, reboot, etc.) |
| `default_screenshot_dir` | `null` | Directory for screenshots/recordings (null = cwd) |
| `auto_select_single_device` | `true` | Skip device selection when only one is connected |
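
A minimal sketch of how these settings might be read, assuming the `ADB_MCP_CONFIG_DIR` override and the three defaults listed above. Unlike the `_load()` method in this commit, which returns the file contents as they are, this variant overlays the file on top of the defaults so a partial file still yields every key; `config_path` and `load_settings` are hypothetical names.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any

DEFAULTS: dict[str, Any] = {
    "developer_mode": False,
    "default_screenshot_dir": None,
    "auto_select_single_device": True,
}


def config_path() -> Path:
    """Resolve config.json, honouring the ADB_MCP_CONFIG_DIR override."""
    base = os.environ.get("ADB_MCP_CONFIG_DIR")
    root = Path(base) if base else Path.home() / ".config" / "adb-mcp"
    return root / "config.json"


def load_settings() -> dict[str, Any]:
    """Return the defaults overlaid with whatever the file provides."""
    try:
        stored = json.loads(config_path().read_text())
    except (OSError, json.JSONDecodeError):
        stored = {}
    if not isinstance(stored, dict):
        stored = {}
    return {**DEFAULTS, **stored}


# Demonstrate with a throwaway directory so the real config is untouched.
with tempfile.TemporaryDirectory() as tmp:
    os.environ["ADB_MCP_CONFIG_DIR"] = tmp
    (Path(tmp) / "config.json").write_text(json.dumps({"developer_mode": True}))
    settings = load_settings()
os.environ.pop("ADB_MCP_CONFIG_DIR", None)
print(settings)
```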
## License
MIT


@@ -1,49 +1,57 @@
 [project]
 name = "android-mcp-server"
-version = "0.1.0"
-description = "Android ADB MCP Server for device automation"
+version = "0.3.1"
+description = "Android ADB MCP Server for device automation via Model Context Protocol"
 authors = [
-    {name = "Ryan", email = "ryan@example.com"}
+    {name = "Ryan Malloy", email = "ryan@supported.systems"}
 ]
+readme = "README.md"
+license = {text = "MIT"}
+keywords = ["mcp", "android", "adb", "automation", "fastmcp"]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Intended Audience :: Developers",
+    "License :: OSI Approved :: MIT License",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Programming Language :: Python :: 3.13",
+    "Topic :: Software Development :: Testing",
+    "Topic :: System :: Hardware :: Hardware Drivers",
+]
 dependencies = [
-    "fastmcp>=0.2.0",
-    "pydantic>=2.0.0",
+    "fastmcp>=2.14.0,<3.0.0",
+    "pydantic>=2.12.0",
 ]
 requires-python = ">=3.11"
+[project.urls]
+Homepage = "https://github.com/supported-systems/android-mcp-server"
+Documentation = "https://github.com/supported-systems/android-mcp-server#readme"
+Repository = "https://github.com/supported-systems/android-mcp-server"
 [project.scripts]
 android-mcp-server = "src.server:main"
-[project.optional-dependencies]
-dev = [
-    "pytest>=7.0.0",
-    "pytest-asyncio>=0.21.0",
-    "black>=23.0.0",
-    "ruff>=0.1.0",
-    "mypy>=1.5.0",
-]
 [build-system]
 requires = ["hatchling"]
 build-backend = "hatchling.build"
-[tool.uv]
-dev-dependencies = [
-    "pytest>=7.0.0",
-    "pytest-asyncio>=0.21.0",
-    "black>=23.0.0",
-    "ruff>=0.1.0",
-    "mypy>=1.5.0",
+[dependency-groups]
+dev = [
+    "pytest>=9.0.0",
+    "pytest-asyncio>=1.3.0",
+    "ruff>=0.15.0",
+    "mypy>=1.19.0",
 ]
-[tool.black]
-line-length = 88
-target-version = ['py311']
 [tool.ruff]
 target-version = "py311"
 line-length = 88
+[tool.ruff.lint]
+select = ["E", "F", "I", "UP", "B", "SIM"]
 [tool.mypy]
 python_version = "3.11"
 strict = true


@@ -1 +1,20 @@
-# Android MCP Server
+"""Android ADB MCP Server.
+
+A Model Context Protocol server for Android device automation via ADB.
+"""
+
+from .config import get_config, is_developer_mode
+from .models import CommandResult, DeviceInfo, ScreenshotResult
+from .server import ADBServer, main, mcp, server
+
+__all__ = [
+    "main",
+    "mcp",
+    "server",
+    "ADBServer",
+    "get_config",
+    "is_developer_mode",
+    "DeviceInfo",
+    "CommandResult",
+    "ScreenshotResult",
+]

100
src/config.py Normal file

@@ -0,0 +1,100 @@
"""Configuration management for Android ADB MCP Server.

Supports developer mode and persistent settings.
"""

import json
import os
from pathlib import Path
from typing import Any, Optional

# Config file location
_default_config_dir = Path.home() / ".config" / "adb-mcp"
CONFIG_DIR = Path(os.environ.get("ADB_MCP_CONFIG_DIR", _default_config_dir))
CONFIG_FILE = CONFIG_DIR / "config.json"


class Config:
    """Singleton configuration manager with persistence."""

    _instance: Optional["Config"] = None
    _settings: dict[str, Any]

    def __new__(cls) -> "Config":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._settings = cls._instance._load()
        return cls._instance

    def _load(self) -> dict[str, Any]:
        """Load settings from disk."""
        if CONFIG_FILE.exists():
            try:
                return json.loads(CONFIG_FILE.read_text())
            except (json.JSONDecodeError, OSError):
                pass
        return self._defaults()

    def _save(self) -> None:
        """Persist settings to disk."""
        CONFIG_DIR.mkdir(parents=True, exist_ok=True)
        CONFIG_FILE.write_text(json.dumps(self._settings, indent=2))

    @staticmethod
    def _defaults() -> dict[str, Any]:
        """Default configuration values."""
        return {
            "developer_mode": False,
            "default_screenshot_dir": None,
            "auto_select_single_device": True,
        }

    @property
    def developer_mode(self) -> bool:
        """Check if developer mode is enabled."""
        return self._settings.get("developer_mode", False)

    @developer_mode.setter
    def developer_mode(self, value: bool) -> None:
        """Enable or disable developer mode."""
        self._settings["developer_mode"] = value
        self._save()

    @property
    def auto_select_single_device(self) -> bool:
        """Auto-select device when only one is connected."""
        return self._settings.get("auto_select_single_device", True)

    @property
    def default_screenshot_dir(self) -> str | None:
        """Default directory for screenshots."""
        return self._settings.get("default_screenshot_dir")

    @default_screenshot_dir.setter
    def default_screenshot_dir(self, value: str | None) -> None:
        """Set default screenshot directory."""
        self._settings["default_screenshot_dir"] = value
        self._save()

    def get(self, key: str, default: Any = None) -> Any:
        """Get a config value."""
        return self._settings.get(key, default)

    def set(self, key: str, value: Any) -> None:
        """Set a config value and persist."""
        self._settings[key] = value
        self._save()

    def to_dict(self) -> dict[str, Any]:
        """Export all settings."""
        return self._settings.copy()


def get_config() -> Config:
    """Get the singleton config instance."""
    return Config()


def is_developer_mode() -> bool:
    """Quick check for developer mode status."""
    return get_config().developer_mode

19
src/mixins/__init__.py Normal file

@@ -0,0 +1,19 @@
"""MCP Mixins for Android ADB Server."""

from .apps import AppsMixin
from .base import ADBBaseMixin
from .devices import DevicesMixin
from .files import FilesMixin
from .input import InputMixin
from .screenshot import ScreenshotMixin
from .ui import UIMixin

__all__ = [
    "ADBBaseMixin",
    "DevicesMixin",
    "InputMixin",
    "AppsMixin",
    "ScreenshotMixin",
    "UIMixin",
    "FilesMixin",
]

569
src/mixins/apps.py Normal file

@@ -0,0 +1,569 @@
"""Apps mixin for Android ADB MCP Server.

Provides tools for app management and launching.
"""

import re
from typing import Any

from fastmcp import Context
from fastmcp.contrib.mcp_mixin import mcp_resource, mcp_tool

from ..config import is_developer_mode
from .base import ADBBaseMixin

# Common Android intent flags (hex values for am start -f)
_INTENT_FLAGS: dict[str, int] = {
    "FLAG_ACTIVITY_NEW_TASK": 0x10000000,
    "FLAG_ACTIVITY_CLEAR_TOP": 0x04000000,
    "FLAG_ACTIVITY_SINGLE_TOP": 0x20000000,
    "FLAG_ACTIVITY_NO_HISTORY": 0x40000000,
    "FLAG_ACTIVITY_CLEAR_TASK": 0x00008000,
    "FLAG_ACTIVITY_EXCLUDE_FROM_RECENTS": 0x00800000,
    "FLAG_ACTIVITY_FORWARD_RESULT": 0x02000000,
    "FLAG_ACTIVITY_MULTIPLE_TASK": 0x08000000,
}


class AppsMixin(ADBBaseMixin):
    """Mixin for Android app management.

    Provides tools for:
    - Launching apps
    - Opening URLs
    - Listing packages (developer mode)
    - Installing/uninstalling apps (developer mode)
    """

    @mcp_tool()
    async def app_launch(
        self,
        package_name: str,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Launch an app by package name.

        Starts the main activity of the specified application.

        Common package names:
        - com.android.chrome - Chrome browser
        - com.android.settings - Settings
        - com.android.vending - Play Store
        - com.google.android.gm - Gmail
        - com.google.android.apps.maps - Google Maps

        Args:
            package_name: Android package name (e.g., com.android.chrome)
            device_id: Target device

        Returns:
            Success status
        """
        result = await self.run_shell_args(
            [
                "monkey",
                "-p",
                package_name,
                "-c",
                "android.intent.category.LAUNCHER",
                "1",
            ],
            device_id,
        )
        return {
            "success": result.success,
            "action": "launch",
            "package": package_name,
            "error": result.stderr if not result.success else None,
        }

    @mcp_tool()
    async def app_open_url(
        self,
        url: str,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Open a URL in the default browser.

        Launches the default browser and navigates to the URL.
        Supports http://, https://, and other URL schemes.

        Args:
            url: URL to open (e.g., https://example.com)
            device_id: Target device

        Returns:
            Success status
        """
        result = await self.run_shell_args(
            [
                "am",
                "start",
                "-a",
                "android.intent.action.VIEW",
                "-d",
                url,
            ],
            device_id,
        )
        return {
            "success": result.success,
            "action": "open_url",
            "url": url,
            "error": result.stderr if not result.success else None,
        }

    @mcp_tool()
    async def app_close(
        self,
        package_name: str,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Force stop an app.

        Stops the application and all its background services.

        Args:
            package_name: Package name to stop
            device_id: Target device

        Returns:
            Success status
        """
        result = await self.run_shell_args(
            ["am", "force-stop", package_name], device_id
        )
        return {
            "success": result.success,
            "action": "close",
            "package": package_name,
            "error": result.stderr if not result.success else None,
        }

    @mcp_tool()
    async def app_current(
        self,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Get the currently focused app.

        Returns the package name of the app currently in foreground.

        Args:
            device_id: Target device

        Returns:
            Current app package and activity
        """
        # Use plain "dumpsys window": the focus fields live at the
        # top level, not inside the "windows" subsection on many devices
        result = await self.run_shell_args(["dumpsys", "window"], device_id)
        if result.success:
            package = None
            activity = None
            for line in result.stdout.split("\n"):
                if "mFocusedApp" in line or "mCurrentFocus" in line:
                    # Match both formats:
                    # mFocusedApp=ActivityRecord{... com.pkg/.Act t123}
                    # mCurrentFocus=Window{... com.pkg/com.pkg.Act}
                    match = re.search(r"([\w.]+)/([\w.]*\.?\w+)", line)
                    if match:
                        package = match.group(1)
                        activity = match.group(2)
                        break
            return {
                "success": True,
                "package": package,
                "activity": activity,
                "raw": result.stdout[:500] if not package else None,
            }
        return {
            "success": False,
            "error": result.stderr,
        }

    # === Developer Mode Tools ===

    @mcp_tool(
        tags={"developer"},
        annotations={"requires": "developer_mode"},
    )
    async def app_list_packages(
        self,
        filter_text: str | None = None,
        system_only: bool = False,
        third_party_only: bool = False,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """List installed packages.

        [DEVELOPER MODE] Retrieves all installed application packages.

        Args:
            filter_text: Filter packages containing this text
            system_only: Only show system packages
            third_party_only: Only show third-party (user installed) packages
            device_id: Target device

        Returns:
            List of package names
        """
        if not is_developer_mode():
            return {
                "success": False,
                "error": "Developer mode required",
            }
        cmd = ["pm", "list", "packages"]
        if system_only:
            cmd.append("-s")
        elif third_party_only:
            cmd.append("-3")
        result = await self.run_shell_args(cmd, device_id)
        if result.success:
            packages = []
            for line in result.stdout.split("\n"):
                if line.startswith("package:"):
                    pkg = line.replace("package:", "").strip()
                    if filter_text is None or filter_text.lower() in pkg.lower():
                        packages.append(pkg)
            return {
                "success": True,
                "packages": sorted(packages),
                "count": len(packages),
            }
        return {
            "success": False,
            "error": result.stderr,
        }

    @mcp_tool(
        tags={"developer"},
        annotations={"requires": "developer_mode"},
    )
    async def app_install(
        self,
        apk_path: str,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Install an APK file.

        [DEVELOPER MODE] Installs an APK from the host machine to the device.

        Args:
            apk_path: Path to APK file on host machine
            device_id: Target device

        Returns:
            Installation result
        """
        if not is_developer_mode():
            return {
                "success": False,
                "error": "Developer mode required",
            }
        result = await self.run_adb(["install", "-r", apk_path], device_id)
        return {
            "success": result.success,
            "action": "install",
            "apk": apk_path,
            "output": result.stdout,
            "error": result.stderr if not result.success else None,
        }

    @mcp_tool(
        tags={"developer"},
        annotations={"requires": "developer_mode"},
    )
    async def app_uninstall(
        self,
        ctx: Context,
        package_name: str,
        keep_data: bool = False,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Uninstall an app.

        [DEVELOPER MODE] Removes an application from the device.
        Requires user confirmation before proceeding.

        Args:
            ctx: MCP context for elicitation/logging
            package_name: Package to uninstall
            keep_data: Keep app data after uninstall
            device_id: Target device

        Returns:
            Uninstall result
        """
        if not is_developer_mode():
            return {
                "success": False,
                "error": "Developer mode required",
            }
        # Elicit confirmation
        await ctx.warning(f"Uninstall requested: {package_name}")
        data_note = " (keeping app data)" if keep_data else " (all data will be lost)"
        confirmation = await ctx.elicit(
            f"Are you sure you want to uninstall '{package_name}'?{data_note}",
            ["Yes, uninstall", "Cancel"],
        )
        if confirmation.action != "accept" or confirmation.content == "Cancel":
            await ctx.info("Uninstall cancelled by user")
            return {
                "success": False,
                "cancelled": True,
                "message": "Uninstall cancelled by user",
            }
        await ctx.info(f"Uninstalling {package_name}...")
        cmd = ["uninstall"]
        if keep_data:
            cmd.append("-k")
        cmd.append(package_name)
        result = await self.run_adb(cmd, device_id)
        if result.success:
            await ctx.info(f"Successfully uninstalled {package_name}")
        else:
            await ctx.error(f"Uninstall failed: {result.stderr}")
        return {
            "success": result.success,
            "action": "uninstall",
            "package": package_name,
            "kept_data": keep_data,
            "error": result.stderr if not result.success else None,
        }

    @mcp_tool(
        tags={"developer"},
        annotations={"requires": "developer_mode"},
    )
    async def app_clear_data(
        self,
        ctx: Context,
        package_name: str,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Clear app data and cache.

        [DEVELOPER MODE] Clears all data for an application (like a fresh
        install). Requires user confirmation before proceeding.

        Args:
            ctx: MCP context for elicitation/logging
            package_name: Package to clear
            device_id: Target device

        Returns:
            Clear result
        """
        if not is_developer_mode():
            return {
                "success": False,
                "error": "Developer mode required",
            }
        # Elicit confirmation
        await ctx.warning(f"Clear data requested: {package_name}")
        confirmation = await ctx.elicit(
            f"Are you sure you want to clear ALL data for "
            f"'{package_name}'? This includes login state, settings, "
            "saved files, and cache. The app will be reset to a "
            "fresh install state.",
            ["Yes, clear all data", "Cancel"],
        )
        if confirmation.action != "accept" or confirmation.content == "Cancel":
            await ctx.info("Clear data cancelled by user")
            return {
                "success": False,
                "cancelled": True,
                "message": "Clear data cancelled by user",
            }
        await ctx.info(f"Clearing data for {package_name}...")
        result = await self.run_shell_args(["pm", "clear", package_name], device_id)
        if result.success:
            await ctx.info(f"Successfully cleared data for {package_name}")
        else:
            await ctx.error(f"Clear data failed: {result.stderr}")
        return {
            "success": result.success,
            "action": "clear_data",
            "package": package_name,
            "error": result.stderr if not result.success else None,
        }

    @mcp_tool(
        tags={"developer"},
        annotations={"requires": "developer_mode"},
    )
    async def activity_start(
        self,
        component: str,
        action: str | None = None,
        data_uri: str | None = None,
        extras: dict[str, str] | None = None,
        flags: list[str] | None = None,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Start a specific activity with intent.

        [DEVELOPER MODE] Launch an activity with full intent control.
        More powerful than app_launch for deep linking and testing.

        Args:
            component: Activity component (e.g., "com.example/.MainActivity")
            action: Intent action (e.g., "android.intent.action.VIEW")
            data_uri: Data URI to pass to activity
            extras: Extra key-value pairs to include in intent
            flags: Intent flag names (e.g., ["FLAG_ACTIVITY_NEW_TASK"])
                or hex values (e.g., ["0x10000000"])
            device_id: Target device

        Returns:
            Activity start result

        Examples:
            Start specific activity:
                component="com.example/.MainActivity"

            Deep link with data:
                component="com.example/.DeepLinkActivity"
                data_uri="myapp://product/123"
        """
        if not is_developer_mode():
            return {
                "success": False,
                "error": "Developer mode required",
            }
        cmd_args = ["am", "start"]
        if action:
            cmd_args.extend(["-a", action])
        if data_uri:
            cmd_args.extend(["-d", data_uri])
        # Resolve flag names to integer values for am start -f
        if flags:
            combined_flags = 0
            for flag in flags:
                flag_clean = flag.strip()
                if flag_clean.startswith("0x"):
                    combined_flags |= int(flag_clean, 16)
                elif flag_clean in _INTENT_FLAGS:
                    combined_flags |= _INTENT_FLAGS[flag_clean]
                elif flag_clean.isdigit():
                    combined_flags |= int(flag_clean)
            if combined_flags:
                cmd_args.extend(["-f", str(combined_flags)])
        if extras:
            for key, value in extras.items():
                if value.lower() in ("true", "false"):
                    cmd_args.extend(["--ez", key, value.lower()])
                elif re.match(r"^-?\d+$", value):
                    cmd_args.extend(["--ei", key, value])
                else:
                    cmd_args.extend(["--es", key, value])
        cmd_args.extend(["-n", component])
        result = await self.run_shell_args(cmd_args, device_id)
        return {
            "success": result.success,
            "action": "activity_start",
            "component": component,
            "intent_action": action,
            "data_uri": data_uri,
            "output": result.stdout,
            "error": result.stderr if not result.success else None,
        }

    @mcp_tool(
        tags={"developer"},
        annotations={"requires": "developer_mode"},
    )
    async def broadcast_send(
        self,
        action: str,
        extras: dict[str, str] | None = None,
        package: str | None = None,
        device_id: str | None = None,
    ) -> dict[str, Any]:
        """Send a broadcast intent.

        [DEVELOPER MODE] Sends a broadcast that can be received by
        BroadcastReceivers. Useful for testing and triggering app behavior.

        Args:
            action: Broadcast action (e.g., "com.example.MY_ACTION")
            extras: Extra key-value pairs to include
            package: Limit to specific package (optional)
            device_id: Target device

        Returns:
            Broadcast result

        Common system broadcasts (for testing receivers):
        - android.intent.action.AIRPLANE_MODE
        - android.intent.action.BATTERY_LOW
        - android.net.conn.CONNECTIVITY_CHANGE
        """
        if not is_developer_mode():
            return {
                "success": False,
                "error": "Developer mode required",
            }
        cmd_args = ["am", "broadcast", "-a", action]
        if package:
            cmd_args.extend(["-p", package])
        if extras:
            for key, value in extras.items():
                if value.lower() in ("true", "false"):
                    cmd_args.extend(["--ez", key, value.lower()])
                elif re.match(r"^-?\d+$", value):
                    cmd_args.extend(["--ei", key, value])
                else:
                    cmd_args.extend(["--es", key, value])
        result = await self.run_shell_args(cmd_args, device_id)
        return {
            "success": result.success,
            "action": "broadcast_send",
            "broadcast_action": action,
            "package": package,
            "output": result.stdout,
            "error": result.stderr if not result.success else None,
        }

    # === Resources ===

    @mcp_resource(uri="adb://apps/current")
    async def resource_current_app(self) -> dict[str, Any]:
        """Resource: Get currently focused app."""
        return await self.app_current()
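
Review note on the flag handling in `activity_start` above: the loop ORs together named flags, hex strings, and decimal strings, and silently skips anything it does not recognise. The standalone sketch below shows the same resolution with one deliberate difference, which is that unrecognised entries are collected and returned so a caller could report them. `resolve_flags` and the two-entry `INTENT_FLAGS` table are hypothetical and are not part of the commit.

```python
INTENT_FLAGS = {
    "FLAG_ACTIVITY_NEW_TASK": 0x10000000,
    "FLAG_ACTIVITY_CLEAR_TOP": 0x04000000,
}


def resolve_flags(flags: list[str]) -> tuple[int, list[str]]:
    """OR flag names and numeric strings together; collect unknown entries."""
    combined = 0
    unknown: list[str] = []
    for raw in flags:
        flag = raw.strip()
        if flag in INTENT_FLAGS:
            combined |= INTENT_FLAGS[flag]
            continue
        try:
            # base=0 accepts "0x..." hex as well as plain decimal strings
            value = int(flag, 0)
        except ValueError:
            unknown.append(flag)
            continue
        if value < 0:
            unknown.append(flag)
        else:
            combined |= value
    return combined, unknown


combined, unknown = resolve_flags(["FLAG_ACTIVITY_NEW_TASK", "0x04000000", "bogus"])
print(combined, unknown)
```

The combined value is what would be passed as a decimal string to `am start -f`.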

169
src/mixins/base.py Normal file

@ -0,0 +1,169 @@
"""Base mixin providing shared ADB command execution."""
import asyncio
import contextlib
import shlex
from fastmcp.contrib.mcp_mixin import MCPMixin
from ..models import CommandResult
# Default timeout for ADB commands (30 seconds)
DEFAULT_TIMEOUT = 30
class ADBBaseMixin(MCPMixin):
"""Base mixin with shared ADB functionality.
Provides:
- Async ADB command execution
- Device targeting (multi-device support)
- Command result parsing
"""
def __init__(self) -> None:
super().__init__()
self._current_device: str | None = None
def set_current_device(self, device_id: str | None) -> None:
"""Set the default device for subsequent commands."""
self._current_device = device_id
def get_current_device(self) -> str | None:
"""Get the current default device."""
return self._current_device
async def run_adb(
self,
cmd: list[str],
device_id: str | None = None,
timeout: int = DEFAULT_TIMEOUT,
) -> CommandResult:
"""Execute ADB command and return structured result.
Args:
cmd: Command arguments (without 'adb' prefix)
device_id: Target device (uses current if not specified)
timeout: Command timeout in seconds (default 30)
Returns:
CommandResult with success status, stdout, stderr, returncode
"""
full_cmd = ["adb"]
# Use specified device, fall back to current device
target_device = device_id or self._current_device
if target_device:
full_cmd.extend(["-s", target_device])
full_cmd.extend(cmd)
try:
result = await asyncio.create_subprocess_exec(
*full_cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
stdout, stderr = await asyncio.wait_for(
result.communicate(), timeout=timeout
)
return CommandResult(
success=result.returncode == 0,
stdout=stdout.decode("utf-8", errors="ignore").strip(),
stderr=stderr.decode("utf-8", errors="ignore").strip(),
returncode=result.returncode or 0,
)
except asyncio.TimeoutError:  # same class as TimeoutError on Python 3.11+, distinct before that
# Kill the process if it timed out
with contextlib.suppress(ProcessLookupError):
result.kill() # type: ignore[possibly-undefined]
return CommandResult(
success=False,
stdout="",
stderr=f"Command timed out after {timeout}s",
returncode=-1,
)
except Exception as e:
return CommandResult(
success=False,
stdout="",
stderr=str(e),
returncode=-1,
)
async def run_shell(
self,
command: str,
device_id: str | None = None,
timeout: int = DEFAULT_TIMEOUT,
) -> CommandResult:
"""Execute shell command on device (string form).
WARNING: Only use for developer-mode shell_command where the
user explicitly provides the command string. For structured
commands with known arguments, use run_shell_args() instead
to prevent shell injection.
Args:
command: Shell command string
device_id: Target device
timeout: Command timeout in seconds
Returns:
CommandResult with command output
"""
# Pass the command through as a single argument so the device shell
# parses it exactly as written. Splitting it on the host would strip the
# user's quotes before ADB re-joins the pieces with spaces.
return await self.run_adb(["shell", command], device_id, timeout=timeout)
async def run_shell_args(
self,
args: list[str],
device_id: str | None = None,
timeout: int = DEFAULT_TIMEOUT,
) -> CommandResult:
"""Execute shell command on device (list form, injection-safe).
Each argument is shell-quoted before being sent to the device,
preventing shell injection even though ADB concatenates args
for the device-side shell interpreter.
Args:
args: Command arguments as a list
device_id: Target device
timeout: Command timeout in seconds
Returns:
CommandResult with command output
"""
# Quote each arg for the device-side shell. ADB concatenates
# all args after "shell" with spaces and sends to the device
# shell, so we must quote to prevent injection.
quoted = [shlex.quote(a) for a in args]
return await self.run_adb(["shell"] + quoted, device_id, timeout=timeout)
async def get_device_property(
self,
prop: str,
device_id: str | None = None,
) -> str | None:
"""Get a device property via getprop.
Args:
prop: Property name (e.g., 'ro.product.model')
device_id: Target device
Returns:
Property value or None if not found
"""
result = await self.run_shell_args(["getprop", prop], device_id)
if result.success and result.stdout:
return result.stdout
return None
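
The quoting step in run_shell_args can be tried on the host without a device. The sketch below is a standalone illustration under the assumption, stated in the comments above, that ADB joins the arguments with spaces before the device shell parses them; `quote_for_device_shell` is a hypothetical name, not a function from this commit.

```python
import shlex


def quote_for_device_shell(args: list[str]) -> list[str]:
    """Quote each argument so a POSIX-style shell reads it as one word."""
    return [shlex.quote(a) for a in args]


# A path that smuggles in a command separator stays a single, inert argument.
hostile = ["rm", "/sdcard/a.txt; reboot"]
quoted = quote_for_device_shell(hostile)

# Roughly the string the device-side shell would receive.
device_command = " ".join(quoted)
print(device_command)

# Re-parsing with POSIX rules recovers exactly the original arguments.
assert shlex.split(device_command) == hostile
```

Arguments made only of safe characters (such as `rm`) pass through unchanged, so ordinary commands look the same on the wire.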

460
src/mixins/devices.py Normal file

@@ -0,0 +1,460 @@
"""Devices mixin for Android ADB MCP Server.
Provides tools and resources for device discovery and management.
"""
import contextlib
import re
from typing import Any
from fastmcp import Context
from fastmcp.contrib.mcp_mixin import mcp_resource, mcp_tool
from ..config import is_developer_mode
from ..models import DeviceInfo
from .base import ADBBaseMixin
class DevicesMixin(ADBBaseMixin):
"""Mixin for Android device management.
Provides tools for:
- Listing connected devices
- Getting device details
- Setting current working device
"""
def __init__(self) -> None:
super().__init__()
self._devices_cache: dict[str, DeviceInfo] = {}
async def _refresh_devices(self) -> list[DeviceInfo]:
"""Refresh the internal devices cache.
Returns:
List of discovered devices
"""
result = await self.run_adb(["devices", "-l"])
if not result.success:
return []
devices: list[DeviceInfo] = []
self._devices_cache.clear()  # drop entries for devices that have disconnected
lines = result.stdout.split("\n")[1:] # Skip header
for line in lines:
if not line.strip():
continue
parts = line.split()
if len(parts) >= 2:
device_id = parts[0]
status = parts[1]
# Parse extended info (model:xxx product:xxx)
model = None
product = None
for part in parts[2:]:
if part.startswith("model:"):
model = part.split(":", 1)[1]
elif part.startswith("product:"):
product = part.split(":", 1)[1]
device = DeviceInfo(
device_id=device_id,
status=status,
model=model,
product=product,
)
devices.append(device)
self._devices_cache[device_id] = device
return devices
@mcp_tool()
async def devices_list(self) -> list[DeviceInfo]:
"""List all connected Android devices.
Discovers devices connected via USB or network (adb connect).
Use this first to identify available devices before other operations.
Returns:
List of connected devices with IDs, status, and model info
"""
return await self._refresh_devices()
@mcp_tool()
async def devices_use(self, device_id: str) -> dict[str, Any]:
"""Set the current working device.
All subsequent commands will target this device by default.
Useful when multiple devices are connected.
Args:
device_id: Device serial number from devices_list
Returns:
Confirmation with device details
"""
# Verify device exists
devices = await self._refresh_devices()
device = next((d for d in devices if d.device_id == device_id), None)
if not device:
return {
"success": False,
"error": f"Device {device_id} not found",
"available": [d.device_id for d in devices],
}
if device.status != "device":
return {
"success": False,
"error": f"Device {device_id} is {device.status}, not ready",
}
self.set_current_device(device_id)
return {
"success": True,
"message": f"Now using device {device_id}",
"device": device.model_dump(),
}
@mcp_tool()
async def devices_current(self) -> dict[str, Any]:
"""Get information about the current working device.
Returns:
Current device info or error if none set
"""
current = self.get_current_device()
if not current:
devices = await self._refresh_devices()
if len(devices) == 1:
# Only one device is connected: report it, but do not select it
return {
"device": None,
"message": "No device set, but only one available",
"available": devices[0].model_dump(),
}
return {
"device": None,
"error": "No current device set. Use devices_use() first.",
"available": [d.device_id for d in devices],
}
device = self._devices_cache.get(current)
if device:
return {"device": device.model_dump()}
return {"device": current, "cached_info": None}
@mcp_resource(uri="adb://devices")
async def resource_devices_list(self) -> dict[str, Any]:
"""Resource: List all connected Android devices.
Lightweight enumeration for quick reference.
"""
devices = await self._refresh_devices()
return {
"devices": [d.model_dump() for d in devices],
"count": len(devices),
"current": self.get_current_device(),
}
@mcp_resource(uri="adb://device/{device_id}")
async def resource_device_info(self, device_id: str) -> dict[str, Any]:
"""Resource: Get detailed information about a specific device.
Args:
device_id: Device serial number
Returns:
Device details including Android version, model, etc.
"""
# Ensure devices are loaded
await self._refresh_devices()
device = self._devices_cache.get(device_id)
if not device:
return {"error": f"Device {device_id} not found"}
# Fetch additional properties
props = {}
for prop_name, prop_key in [
("android_version", "ro.build.version.release"),
("sdk_version", "ro.build.version.sdk"),
("manufacturer", "ro.product.manufacturer"),
("brand", "ro.product.brand"),
("device", "ro.product.device"),
]:
value = await self.get_device_property(prop_key, device_id)
if value:
props[prop_name] = value
return {
**device.model_dump(),
**props,
}
@mcp_tool()
async def device_info(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Get comprehensive device information.
Returns device state including battery, wifi, storage, and system info.
Useful for quick device health checks.
Args:
device_id: Target device
Returns:
Device information including battery, wifi, storage, etc.
"""
info: dict[str, Any] = {}
# Battery info — also serves as connectivity check
battery = await self.run_shell_args(["dumpsys", "battery"], device_id)
if not battery.success:
return {
"success": False,
"error": battery.stderr or "No device connected",
}
info["success"] = True
battery_info: dict[str, Any] = {}
for line in battery.stdout.split("\n"):
if "level:" in line:
with contextlib.suppress(ValueError):
battery_info["level"] = int(line.split(":")[1].strip())
elif "status:" in line:
status_map = {
"1": "unknown",
"2": "charging",
"3": "discharging",
"4": "not_charging",
"5": "full",
}
status = line.split(":")[1].strip()
battery_info["status"] = status_map.get(status, status)
elif "plugged:" in line:
plugged_map = {
"0": "unplugged",
"1": "AC",
"2": "USB",
"4": "wireless",
}
plugged = line.split(":")[1].strip()
battery_info["plugged"] = plugged_map.get(plugged, plugged)
info["battery"] = battery_info
# Get IP address — parse ip addr output in Python (no pipes)
ip_result = await self.run_shell_args(
["ip", "addr", "show", "wlan0"], device_id
)
if ip_result.success:
inet_match = re.search(r"inet (\d+\.\d+\.\d+\.\d+)/", ip_result.stdout)
if inet_match:
info["ip_address"] = inet_match.group(1)
# WiFi connection info — parse dumpsys in Python (no pipes)
wifi = await self.run_shell_args(["dumpsys", "wifi"], device_id)
if wifi.success:
for wifi_line in wifi.stdout.split("\n"):
if "mWifiInfo" in wifi_line and "SSID:" in wifi_line:
try:
ssid_part = wifi_line.split("SSID:")[1].split(",")[0].strip()
info["wifi_ssid"] = ssid_part.strip('"')
except IndexError:
pass
break
# System properties
props_to_fetch = [
("android_version", "ro.build.version.release"),
("sdk_version", "ro.build.version.sdk"),
("model", "ro.product.model"),
("manufacturer", "ro.product.manufacturer"),
("device_name", "ro.product.device"),
]
for key, prop in props_to_fetch:
value = await self.get_device_property(prop, device_id)
if value:
info[key] = value
# Storage info — parse df output in Python (no pipes)
storage = await self.run_shell_args(["df", "/data"], device_id)
if storage.success:
lines = storage.stdout.strip().split("\n")
if len(lines) >= 2:
parts = lines[-1].split()
if len(parts) >= 4:
with contextlib.suppress(ValueError):
info["storage"] = {
"total_kb": int(parts[1]),
"used_kb": int(parts[2]),
"available_kb": int(parts[3]),
}
return info
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def device_reboot(
self,
ctx: Context,
mode: str | None = None,
device_id: str | None = None,
) -> dict[str, Any]:
"""Reboot the device.
[DEVELOPER MODE] Reboots the Android device.
Requires user confirmation before proceeding.
Args:
ctx: MCP context for elicitation/logging
mode: Optional reboot mode:
- None: Normal reboot
- "recovery": Boot to recovery mode
- "bootloader": Boot to bootloader/fastboot
device_id: Target device
Returns:
Reboot command result
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Elicit confirmation for this dangerous action
mode_desc = mode or "normal"
await ctx.warning(f"Reboot requested: {mode_desc} mode")
confirmation = await ctx.elicit(
f"Are you sure you want to reboot the device in "
f"{mode_desc} mode? "
"This will interrupt any running operations.",
["Yes, reboot now", "Cancel"],
)
if confirmation.action != "accept" or confirmation.data == "Cancel":
await ctx.info("Reboot cancelled by user")
return {
"success": False,
"cancelled": True,
"message": "Reboot cancelled by user",
}
await ctx.info(f"Initiating {mode_desc} reboot...")
cmd = ["reboot"]
if mode:
cmd.append(mode)
result = await self.run_adb(cmd, device_id)
if result.success:
await ctx.info("Reboot command sent successfully")
else:
await ctx.error(f"Reboot failed: {result.stderr}")
return {
"success": result.success,
"action": "reboot",
"mode": mode_desc,
"error": result.stderr if not result.success else None,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def logcat_capture(
self,
lines: int = 100,
filter_spec: str | None = None,
clear_first: bool = False,
device_id: str | None = None,
) -> dict[str, Any]:
"""Capture logcat output.
[DEVELOPER MODE] Retrieves Android system logs.
Essential for debugging app crashes and system issues.
Args:
lines: Number of recent log lines to capture (default 100)
filter_spec: Filter by tag:priority (e.g., "MyApp:D *:S")
clear_first: Clear the log buffer before capturing
device_id: Target device
Returns:
Logcat output
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Clear first if requested
if clear_first:
await self.run_shell_args(["logcat", "-c"], device_id)
# Build command as args list — filter_spec is split safely
cmd = ["logcat", "-d", "-t", str(lines)]
if filter_spec:
# Split filter spec on whitespace (e.g., "MyApp:D *:S")
# Each token is a separate arg, safely quoted by run_shell_args
cmd.extend(filter_spec.split())
result = await self.run_shell_args(cmd, device_id)
return {
"success": result.success,
"lines_requested": lines,
"filter": filter_spec,
"output": result.stdout,
"error": result.stderr if not result.success else None,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def logcat_clear(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Clear the logcat buffer.
[DEVELOPER MODE] Clears all logs from the device log buffer.
Useful before reproducing an issue to get clean logs.
Args:
device_id: Target device
Returns:
Success status
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
result = await self.run_shell_args(["logcat", "-c"], device_id)
return {
"success": result.success,
"action": "logcat_clear",
"error": result.stderr if not result.success else None,
}
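
The parsing in _refresh_devices can be exercised without a connected device. This is a minimal sketch, not the commit's implementation: the `Device` dataclass stands in for the DeviceInfo model (its fields are assumed from how the model is constructed above), and the sample output is made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Device:
    """Stand-in for DeviceInfo; field names are an assumption."""

    device_id: str
    status: str
    model: str | None = None
    product: str | None = None


def parse_devices(output: str) -> list[Device]:
    """Parse `adb devices -l` output into Device records."""
    devices = []
    for line in output.split("\n")[1:]:  # first line is the header
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        # Remaining tokens look like key:value (model:..., product:...).
        fields = dict(p.split(":", 1) for p in parts[2:] if ":" in p)
        devices.append(
            Device(parts[0], parts[1], fields.get("model"), fields.get("product"))
        )
    return devices


# Illustrative sample only; real serials and fields vary by device.
SAMPLE = """List of devices attached
0A1B2C3D device usb:1-1 product:sargo model:Pixel_3a device:sargo transport_id:1
emulator-5554 offline
"""

devices = parse_devices(SAMPLE)
for d in devices:
    print(d)
```

Entries whose status is not `device` (for example `offline` or `unauthorized`) are still listed, which is why devices_use checks the status before selecting one.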

321
src/mixins/files.py Normal file

@@ -0,0 +1,321 @@
"""Files mixin for Android ADB MCP Server.
Provides tools for file transfer between host and device.
"""
from pathlib import Path
from typing import Any
from fastmcp import Context
from fastmcp.contrib.mcp_mixin import mcp_tool
from ..config import is_developer_mode
from .base import ADBBaseMixin
class FilesMixin(ADBBaseMixin):
"""Mixin for Android file operations.
Provides tools for:
- Pushing files to device (developer mode)
- Pulling files from device (developer mode)
- Listing files on device (developer mode)
"""
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def file_push(
self,
ctx: Context,
local_path: str,
device_path: str,
device_id: str | None = None,
) -> dict[str, Any]:
"""Push a file from host to device.
[DEVELOPER MODE] Transfers a file from the local machine to the
Android device. Useful for deploying configs, test data, APKs, etc.
Common destinations:
- /sdcard/ - External storage (accessible without root)
- /sdcard/Download/ - Downloads folder
- /data/local/tmp/ - Temp directory for executables
Args:
ctx: MCP context for logging
local_path: Path to file on host machine
device_path: Destination path on device (e.g., /sdcard/file.txt)
device_id: Target device
Returns:
Transfer result with bytes transferred
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Verify local file exists
local = Path(local_path)
if not local.exists():
return {
"success": False,
"error": f"Local file not found: {local_path}",
}
file_size = local.stat().st_size
await ctx.info(f"Pushing {local.name} ({file_size:,} bytes) to {device_path}")
result = await self.run_adb(
["push", str(local.absolute()), device_path], device_id
)
if result.success:
await ctx.info(f"Successfully pushed {local.name}")
else:
await ctx.error(f"Push failed: {result.stderr}")
return {
"success": result.success,
"action": "push",
"local_path": str(local.absolute()),
"device_path": device_path,
"output": result.stdout,
"error": result.stderr if not result.success else None,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def file_pull(
self,
ctx: Context,
device_path: str,
local_path: str | None = None,
device_id: str | None = None,
) -> dict[str, Any]:
"""Pull a file from device to host.
[DEVELOPER MODE] Transfers a file from the Android device to the
local machine. Useful for retrieving logs, databases, screenshots.
Common sources:
- /sdcard/ - External storage
- /data/data/<package>/databases/ - App databases (may need root)
- /sdcard/Android/data/<package>/ - App external data
Args:
ctx: MCP context for logging
device_path: Path to file on device
local_path: Destination on host (default: current dir with same name)
device_id: Target device
Returns:
Transfer result with local file path
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Default local path to current directory with same filename
if not local_path:
local_path = Path(device_path).name
local = Path(local_path).absolute()
local.parent.mkdir(parents=True, exist_ok=True)
await ctx.info(f"Pulling {device_path} to {local}")
result = await self.run_adb(["pull", device_path, str(local)], device_id)
if result.success:
await ctx.info(f"Successfully pulled to {local}")
else:
await ctx.error(f"Pull failed: {result.stderr}")
return {
"success": result.success,
"action": "pull",
"device_path": device_path,
"local_path": str(local),
"output": result.stdout,
"error": result.stderr if not result.success else None,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def file_list(
self,
device_path: str = "/sdcard/",
device_id: str | None = None,
) -> dict[str, Any]:
"""List files in a directory on the device.
[DEVELOPER MODE] Lists files and directories at the specified path.
Args:
device_path: Directory path on device (default: /sdcard/)
device_id: Target device
Returns:
List of files and directories
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
result = await self.run_shell_args(["ls", "-la", device_path], device_id)
if not result.success:
return {
"success": False,
"error": result.stderr or "Failed to list directory",
}
# Parse ls output — Android uses ISO dates (YYYY-MM-DD HH:MM)
# while traditional ls uses (Mon DD HH:MM), so date takes 2 or 3 fields
files = []
for line in result.stdout.strip().split("\n"):
if not line or line.startswith("total"):
continue
parts = line.split()
if len(parts) < 7:
continue
# Detect date format: ISO "2024-01-15" vs traditional "Jan 15"
# ISO dates have a dash at index 4 of the date field
date_field = parts[5]
if len(date_field) == 10 and date_field[4:5] == "-":
# Android ISO format: perms links owner group size YYYY-MM-DD HH:MM name
date_str = f"{parts[5]} {parts[6]}"
name = " ".join(parts[7:])
elif len(parts) >= 8:
# Traditional format: perms links owner group size Mon DD HH:MM name
date_str = f"{parts[5]} {parts[6]} {parts[7]}"
name = " ".join(parts[8:])
else:
continue
files.append(
{
"permissions": parts[0],
"size": parts[4],
"date": date_str,
"name": name,
"is_directory": parts[0].startswith("d"),
}
)
return {
"success": True,
"path": device_path,
"files": files,
"count": len(files),
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def file_delete(
self,
ctx: Context,
device_path: str,
device_id: str | None = None,
) -> dict[str, Any]:
"""Delete a file on the device.
[DEVELOPER MODE] Removes a file from the device storage.
Requires user confirmation. Deletion is permanent.
Args:
ctx: MCP context for elicitation/logging
device_path: Path to file on device
device_id: Target device
Returns:
Deletion result
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Elicit confirmation
await ctx.warning(f"Delete requested: {device_path}")
confirmation = await ctx.elicit(
f"Are you sure you want to delete '{device_path}'? This cannot be undone.",
["Yes, delete", "Cancel"],
)
if confirmation.action != "accept" or confirmation.data == "Cancel":
await ctx.info("Delete cancelled by user")
return {
"success": False,
"cancelled": True,
"message": "Delete cancelled by user",
}
await ctx.info(f"Deleting {device_path}...")
result = await self.run_shell_args(["rm", device_path], device_id)
if result.success:
await ctx.info(f"Successfully deleted {device_path}")
else:
await ctx.error(f"Delete failed: {result.stderr}")
return {
"success": result.success,
"action": "delete",
"path": device_path,
"error": result.stderr if not result.success else None,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def file_exists(
self,
device_path: str,
device_id: str | None = None,
) -> dict[str, Any]:
"""Check if a file exists on the device.
[DEVELOPER MODE] Tests for file existence.
Args:
device_path: Path to check on device
device_id: Target device
Returns:
Existence check result
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Use test -e and check returncode (injection-safe via run_shell_args)
result = await self.run_shell_args(["test", "-e", device_path], device_id)
return {
"success": True,
"path": device_path,
"exists": result.success,
}
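
The two date layouts that file_list handles are easier to see on single lines. The following is a standalone sketch of the same field arithmetic; `parse_ls_line` is a hypothetical helper and the sample lines are invented for illustration.

```python
from __future__ import annotations


def parse_ls_line(line: str) -> dict | None:
    """Parse one `ls -la` line; returns None for lines that are not entries."""
    parts = line.split()
    if line.startswith("total") or len(parts) < 7:
        return None
    date_field = parts[5]
    if len(date_field) == 10 and date_field[4] == "-":
        # ISO layout: perms links owner group size YYYY-MM-DD HH:MM name
        date_str = f"{parts[5]} {parts[6]}"
        name = " ".join(parts[7:])
    elif len(parts) >= 8:
        # Traditional layout: perms links owner group size Mon DD HH:MM name
        date_str = f"{parts[5]} {parts[6]} {parts[7]}"
        name = " ".join(parts[8:])
    else:
        return None
    return {
        "permissions": parts[0],
        "size": parts[4],
        "date": date_str,
        "name": name,
        "is_directory": parts[0].startswith("d"),
    }


iso = parse_ls_line("-rw-rw---- 1 root sdcard_rw 1024 2024-01-15 10:30 my notes.txt")
trad = parse_ls_line("drwxr-xr-x 2 root root 4096 Jan 15 10:30 docs")
print(iso)
print(trad)
```

Names are rebuilt by joining tokens with single spaces, so a filename containing consecutive spaces would come back with them collapsed, and a symlink entry keeps its ` -> target` suffix in the name.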

511
src/mixins/input.py Normal file

@@ -0,0 +1,511 @@
"""Input mixin for Android ADB MCP Server.
Provides tools for simulating user input on Android devices.
"""
import re
from typing import Any
from fastmcp.contrib.mcp_mixin import mcp_tool
from ..config import is_developer_mode
from .base import ADBBaseMixin
# Characters that ADB's input text command cannot handle — suggest clipboard
_INPUT_TEXT_UNSAFE = set("'\"\\`$(){}[]|&;<>!~#%^*?")
class InputMixin(ADBBaseMixin):
"""Mixin for Android input simulation.
Provides tools for:
- Tapping screen coordinates
- Swiping/scrolling gestures
- Key events (back, home, etc.)
- Text input
- Raw shell commands (developer mode)
"""
async def _get_screen_dimensions(
self,
device_id: str | None = None,
) -> tuple[int, int]:
"""Get screen width and height, falling back to 1080x1920."""
result = await self.run_shell_args(["wm", "size"], device_id)
if result.success:
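# "wm size" prints "Physical size: WxH" and, when an override is set,
# an "Override size: WxH" line after it. The last match is the size
# currently in effect.
matches = re.findall(r"(\d+)x(\d+)", result.stdout)
if matches:
width, height = matches[-1]
return int(width), int(height)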
return 1080, 1920
@mcp_tool()
async def input_tap(
self,
x: int,
y: int,
device_id: str | None = None,
) -> dict[str, Any]:
"""Tap at screen coordinates.
Simulates a finger tap at the specified position.
Args:
x: X coordinate (pixels from left)
y: Y coordinate (pixels from top)
device_id: Target device (optional if one device or current set)
Returns:
Success status
"""
result = await self.run_shell_args(["input", "tap", str(x), str(y)], device_id)
return {
"success": result.success,
"action": "tap",
"coordinates": {"x": x, "y": y},
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_swipe(
self,
x1: int,
y1: int,
x2: int,
y2: int,
duration_ms: int = 300,
device_id: str | None = None,
) -> dict[str, Any]:
"""Swipe between two points.
Simulates a finger swipe gesture. Use for scrolling, dragging, etc.
Common patterns:
- Scroll down: swipe from bottom to top (y1 > y2)
- Scroll up: swipe from top to bottom (y1 < y2)
- Swipe left: swipe from right to left (x1 > x2)
- Swipe right: swipe from left to right (x1 < x2)
Args:
x1: Start X coordinate
y1: Start Y coordinate
x2: End X coordinate
y2: End Y coordinate
duration_ms: Swipe duration in milliseconds (default 300)
device_id: Target device
Returns:
Success status
"""
result = await self.run_shell_args(
[
"input",
"swipe",
str(x1),
str(y1),
str(x2),
str(y2),
str(duration_ms),
],
device_id,
)
return {
"success": result.success,
"action": "swipe",
"from": {"x": x1, "y": y1},
"to": {"x": x2, "y": y2},
"duration_ms": duration_ms,
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_scroll_down(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Scroll down one page.
Convenience method for common scroll-down gesture.
Queries actual screen dimensions to compute center coordinates.
Args:
device_id: Target device
Returns:
Success status
"""
w, h = await self._get_screen_dimensions(device_id)
cx = w // 2
# Swipe from 65% to 25% of screen height
y_start = int(h * 0.65)
y_end = int(h * 0.25)
result = await self.run_shell_args(
[
"input",
"swipe",
str(cx),
str(y_start),
str(cx),
str(y_end),
"300",
],
device_id,
)
return {
"success": result.success,
"action": "scroll_down",
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_scroll_up(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Scroll up one page.
Convenience method for common scroll-up gesture.
Queries actual screen dimensions to compute center coordinates.
Args:
device_id: Target device
Returns:
Success status
"""
w, h = await self._get_screen_dimensions(device_id)
cx = w // 2
# Swipe from 25% to 65% of screen height
y_start = int(h * 0.25)
y_end = int(h * 0.65)
result = await self.run_shell_args(
[
"input",
"swipe",
str(cx),
str(y_start),
str(cx),
str(y_end),
"300",
],
device_id,
)
return {
"success": result.success,
"action": "scroll_up",
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_back(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Press the Back button.
Simulates pressing the Android back button.
Args:
device_id: Target device
Returns:
Success status
"""
result = await self.run_shell_args(
["input", "keyevent", "KEYCODE_BACK"], device_id
)
return {
"success": result.success,
"action": "back",
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_home(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Press the Home button.
Returns to the home screen.
Args:
device_id: Target device
Returns:
Success status
"""
result = await self.run_shell_args(
["input", "keyevent", "KEYCODE_HOME"], device_id
)
return {
"success": result.success,
"action": "home",
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_recent_apps(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Open recent apps / app switcher.
Shows the recent applications overview.
Args:
device_id: Target device
Returns:
Success status
"""
result = await self.run_shell_args(
["input", "keyevent", "KEYCODE_APP_SWITCH"], device_id
)
return {
"success": result.success,
"action": "recent_apps",
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_key(
self,
key_code: str,
device_id: str | None = None,
) -> dict[str, Any]:
"""Send a key event.
Send any Android key event by code name.
Common key codes:
- KEYCODE_BACK, KEYCODE_HOME, KEYCODE_APP_SWITCH
- KEYCODE_VOLUME_UP, KEYCODE_VOLUME_DOWN, KEYCODE_MUTE
- KEYCODE_POWER, KEYCODE_MENU, KEYCODE_SEARCH
- KEYCODE_ENTER, KEYCODE_DEL (backspace), KEYCODE_TAB
- KEYCODE_DPAD_UP/DOWN/LEFT/RIGHT, KEYCODE_DPAD_CENTER
Args:
key_code: Android key code (e.g., "KEYCODE_ENTER")
device_id: Target device
Returns:
Success status
"""
# Normalize key code — strip anything non-alphanumeric/underscore
clean = re.sub(r"[^A-Za-z0-9_]", "", key_code).upper()
if not clean.startswith("KEYCODE_"):
clean = f"KEYCODE_{clean}"
result = await self.run_shell_args(["input", "keyevent", clean], device_id)
return {
"success": result.success,
"action": "key",
"key_code": clean,
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def input_text(
self,
text: str,
device_id: str | None = None,
) -> dict[str, Any]:
"""Type text into the focused input field.
Types the specified text as if entered via keyboard.
Focus must be on a text input field first.
Note: Only handles basic alphanumeric text and common punctuation.
For text with special characters, use clipboard_set(text, paste=True)
which handles all characters correctly.
Args:
text: Text to type (spaces are handled automatically)
device_id: Target device
Returns:
Success status
"""
# Check for characters that ADB input text can't handle
has_unsafe = any(c in _INPUT_TEXT_UNSAFE for c in text)
if has_unsafe:
return {
"success": False,
"error": (
"Text contains special characters that ADB input "
"text cannot handle reliably. Use "
"clipboard_set(text, paste=True) instead."
),
"text": text,
}
# ADB input text: spaces must be %s, no shell metacharacters
escaped = text.replace(" ", "%s")
result = await self.run_shell_args(["input", "text", escaped], device_id)
return {
"success": result.success,
"action": "text",
"text": text,
"error": result.stderr if not result.success else None,
}
# === Developer Mode Tools ===
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def shell_command(
self,
command: str,
device_id: str | None = None,
) -> dict[str, Any]:
"""Execute arbitrary shell command on device.
[DEVELOPER MODE] Run any shell command on the Android device.
Use with caution - commands run with shell user permissions.
Common commands:
- ls /sdcard - list files
- getprop ro.build.version.release - get Android version
- pm list packages - list installed packages
- dumpsys battery - battery info
- settings get system screen_brightness - screen brightness
Args:
command: Shell command to execute
device_id: Target device
Returns:
Command output with stdout, stderr, and return code
"""
if not is_developer_mode():
return {
"success": False,
"error": (
"Developer mode required. "
"Enable with config_set_developer_mode(True)"
),
}
# Developer shell_command intentionally uses run_shell (string form)
# since the user explicitly provides the command string
result = await self.run_shell(command, device_id)
return {
"success": result.success,
"command": command,
"stdout": result.stdout,
"stderr": result.stderr,
"returncode": result.returncode,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def input_long_press(
self,
x: int,
y: int,
duration_ms: int = 1000,
device_id: str | None = None,
) -> dict[str, Any]:
"""Long press at screen coordinates.
[DEVELOPER MODE] Simulates a long press / press-and-hold gesture.
Args:
x: X coordinate
y: Y coordinate
duration_ms: Hold duration in milliseconds (default 1000)
device_id: Target device
Returns:
Success status
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Long press is a swipe with no movement
result = await self.run_shell_args(
[
"input",
"swipe",
str(x),
str(y),
str(x),
str(y),
str(duration_ms),
],
device_id,
)
return {
"success": result.success,
"action": "long_press",
"coordinates": {"x": x, "y": y},
"duration_ms": duration_ms,
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def clipboard_set(
self,
text: str,
paste: bool = False,
device_id: str | None = None,
) -> dict[str, Any]:
"""Set clipboard text and optionally paste.
Sets the device clipboard to the specified text. Unlike input_text,
this handles all special characters correctly.
Use paste=True to immediately paste into the focused field.
Args:
text: Text to put on clipboard
paste: If True, also send KEYCODE_PASTE to paste (default False)
device_id: Target device
Returns:
Success status
"""
# Try cmd clipboard set (Android 12+, injection-safe via args)
result = await self.run_shell_args(["cmd", "clipboard", "set", text], device_id)
# Fallback: try am broadcast (Clipper app or similar)
if not result.success:
result = await self.run_shell_args(
[
"am",
"broadcast",
"-a",
"clipper.set",
"-e",
"text",
text,
],
device_id,
)
preview = text[:100] + "..." if len(text) > 100 else text
response: dict[str, Any] = {
"success": result.success,
"action": "clipboard_set",
"text": preview,
"error": result.stderr if not result.success else None,
}
# Paste if requested
if paste and result.success:
paste_result = await self.run_shell_args(
["input", "keyevent", "KEYCODE_PASTE"], device_id
)
response["pasted"] = paste_result.success
if not paste_result.success:
response["paste_error"] = paste_result.stderr
return response
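The two tools above pass their arguments to `run_shell_args()` as a list rather than as a preformatted command string. The helper itself is not shown in this part of the diff, so the following is only a minimal sketch of the idea, assuming the list is quoted element by element with `shlex.quote()` as the commit message describes. The names `long_press_args` and `quote_args` are hypothetical and exist only for illustration.

```python
import shlex


def long_press_args(x: int, y: int, duration_ms: int = 1000) -> list[str]:
    """Build the argv for a long press: a swipe whose start and end points coincide."""
    return ["input", "swipe", str(x), str(y), str(x), str(y), str(duration_ms)]


def quote_args(args: list[str]) -> str:
    """Join argv into one shell string, quoting each element individually.

    Hypothetical stand-in for what run_shell_args() is assumed to do.
    """
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(quote_args(long_press_args(100, 200, 1500)))
    # Shell metacharacters in user text stay inside a single quoted argument
    print(quote_args(["cmd", "clipboard", "set", "a; rm -rf /"]))
```

Because every element is quoted on its own, text supplied to `clipboard_set` cannot terminate the command and start a new one on the device shell.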

398
src/mixins/screenshot.py Normal file

@ -0,0 +1,398 @@
"""Screenshot mixin for Android ADB MCP Server.
Provides tools for screen capture and display information.
"""
from datetime import datetime
from pathlib import Path
from typing import Any
from fastmcp import Context
from fastmcp.contrib.mcp_mixin import mcp_resource, mcp_tool
from ..config import get_config, is_developer_mode
from ..models import ScreenshotResult
from .base import ADBBaseMixin
class ScreenshotMixin(ADBBaseMixin):
"""Mixin for Android screen capture.
Provides tools for:
- Taking screenshots
- Getting screen dimensions
- Screen recording (developer mode)
"""
@mcp_tool()
async def screenshot(
self,
ctx: Context,
filename: str | None = None,
device_id: str | None = None,
) -> ScreenshotResult:
"""Take a screenshot of the device screen.
Captures the current screen and saves it locally as a PNG file.
Args:
ctx: MCP context for logging
filename: Output filename (default: screenshot_YYYYMMDD_HHMMSS.png)
device_id: Target device
Returns:
ScreenshotResult with success status and file path
"""
await ctx.info("Capturing screenshot...")
# Generate default filename with timestamp
if not filename:
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"screenshot_{timestamp}.png"
# Use configured screenshot directory if set
config = get_config()
if config.default_screenshot_dir:
output_path = Path(config.default_screenshot_dir) / filename
else:
output_path = Path(filename).absolute()
# Ensure parent directory exists
output_path.parent.mkdir(parents=True, exist_ok=True)
# Take screenshot on device
device_temp = "/sdcard/adb_mcp_screenshot.png"
result = await self.run_shell_args(["screencap", "-p", device_temp], device_id)
if not result.success:
await ctx.error(f"Screenshot capture failed: {result.stderr}")
return ScreenshotResult(
success=False,
error=f"Failed to capture screenshot: {result.stderr}",
)
await ctx.info("Transferring screenshot to host...")
# Pull to local machine
pull_result = await self.run_adb(
["pull", device_temp, str(output_path)], device_id
)
# Clean up the device temp file whether or not the pull succeeded
await self.run_shell_args(["rm", device_temp], device_id)
if not pull_result.success:
await ctx.error(f"Screenshot transfer failed: {pull_result.stderr}")
return ScreenshotResult(
success=False,
error=f"Failed to pull screenshot: {pull_result.stderr}",
)
await ctx.info(f"Screenshot saved: {output_path}")
return ScreenshotResult(
success=True,
local_path=str(output_path),
)
@mcp_tool()
async def screen_size(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Get the screen dimensions.
Returns the physical screen resolution in pixels.
Args:
device_id: Target device
Returns:
Screen width and height
"""
result = await self.run_shell_args(["wm", "size"], device_id)
if result.success:
# Parse "Physical size: 1080x1920"
for line in result.stdout.split("\n"):
if "Physical size" in line or "Override size" in line:
parts = line.split(":")
if len(parts) == 2:
size = parts[1].strip()
if "x" in size:
w, h = size.split("x")
return {
"success": True,
"width": int(w),
"height": int(h),
"raw": result.stdout,
}
return {
"success": False,
"error": result.stderr or "Could not parse screen size",
"raw": result.stdout,
}
@mcp_tool()
async def screen_density(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Get the screen density (DPI).
Args:
device_id: Target device
Returns:
Screen density in DPI
"""
result = await self.run_shell_args(["wm", "density"], device_id)
if result.success:
for line in result.stdout.split("\n"):
if "Physical density" in line or "Override density" in line:
parts = line.split(":")
if len(parts) == 2:
try:
dpi = int(parts[1].strip())
return {
"success": True,
"dpi": dpi,
}
except ValueError:
pass
return {
"success": False,
"error": result.stderr or "Could not parse density",
"raw": result.stdout,
}
@mcp_tool()
async def screen_on(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Turn the screen on.
Wakes up the device display. Does not unlock.
Args:
device_id: Target device
Returns:
Success status
"""
result = await self.run_shell_args(
["input", "keyevent", "KEYCODE_WAKEUP"], device_id
)
return {
"success": result.success,
"action": "screen_on",
"error": result.stderr if not result.success else None,
}
@mcp_tool()
async def screen_off(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Turn the screen off.
Puts the device display to sleep.
Args:
device_id: Target device
Returns:
Success status
"""
result = await self.run_shell_args(
["input", "keyevent", "KEYCODE_SLEEP"], device_id
)
return {
"success": result.success,
"action": "screen_off",
"error": result.stderr if not result.success else None,
}
# === Developer Mode Tools ===
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def screen_record(
self,
ctx: Context,
filename: str | None = None,
duration_seconds: int = 10,
device_id: str | None = None,
) -> dict[str, Any]:
"""Record the screen.
[DEVELOPER MODE] Records the device screen to a video file.
Recording runs for the specified duration.
Args:
ctx: MCP context for logging
filename: Output filename (default: recording_YYYYMMDD_HHMMSS.mp4)
duration_seconds: Recording duration in seconds (default 10, capped at 180)
device_id: Target device
Returns:
Recording result with file path
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
# Generate default filename
if not filename:
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"recording_{timestamp}.mp4"
# Use configured directory
config = get_config()
if config.default_screenshot_dir:
output_path = Path(config.default_screenshot_dir) / filename
else:
output_path = Path(filename).absolute()
output_path.parent.mkdir(parents=True, exist_ok=True)
# Clamp duration to 1..180 seconds (screenrecord allows at most 180)
duration = max(1, min(duration_seconds, 180))
await ctx.info(f"Recording screen for {duration}s...")
# Record on device — uses dedicated timeout for recording duration
device_temp = "/sdcard/adb_mcp_recording.mp4"
result = await self.run_shell_args(
[
"screenrecord",
"--time-limit",
str(duration),
device_temp,
],
device_id,
timeout=duration + 10, # Extra margin for command overhead
)
if not result.success:
return {
"success": False,
"error": f"Failed to record: {result.stderr}",
}
await ctx.info("Transferring recording to host...")
# Pull to local
pull_result = await self.run_adb(
["pull", device_temp, str(output_path)], device_id
)
# Clean up
await self.run_shell_args(["rm", device_temp], device_id)
if not pull_result.success:
return {
"success": False,
"error": (f"Failed to pull recording: {pull_result.stderr}"),
}
await ctx.info(f"Recording saved: {output_path}")
return {
"success": True,
"local_path": str(output_path),
"duration_seconds": duration,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def screen_set_size(
self,
width: int,
height: int,
device_id: str | None = None,
) -> dict[str, Any]:
"""Override screen resolution.
[DEVELOPER MODE] Changes the display resolution.
Use screen_reset_size to restore original.
Args:
width: New width in pixels
height: New height in pixels
device_id: Target device
Returns:
Success status
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
result = await self.run_shell_args(
["wm", "size", f"{width}x{height}"], device_id
)
return {
"success": result.success,
"action": "set_size",
"width": width,
"height": height,
"error": result.stderr if not result.success else None,
}
@mcp_tool(
tags={"developer"},
annotations={"requires": "developer_mode"},
)
async def screen_reset_size(
self,
device_id: str | None = None,
) -> dict[str, Any]:
"""Reset screen to physical resolution.
[DEVELOPER MODE] Restores the original display resolution.
Args:
device_id: Target device
Returns:
Success status
"""
if not is_developer_mode():
return {
"success": False,
"error": "Developer mode required",
}
result = await self.run_shell_args(["wm", "size", "reset"], device_id)
return {
"success": result.success,
"action": "reset_size",
"error": result.stderr if not result.success else None,
}
# === Resources ===
@mcp_resource(uri="adb://screen/info")
async def resource_screen_info(self) -> dict[str, Any]:
"""Resource: Get screen information."""
size = await self.screen_size()
density = await self.screen_density()
return {
"width": size.get("width"),
"height": size.get("height"),
"dpi": density.get("dpi"),
}

367
src/mixins/ui.py Normal file

@ -0,0 +1,367 @@
"""UI inspection mixin for Android ADB MCP Server.
Provides tools for UI hierarchy inspection and synchronization.
"""
import asyncio
import re
import time
from typing import Any
from fastmcp import Context
from fastmcp.contrib.mcp_mixin import mcp_tool
from .base import ADBBaseMixin
class UIMixin(ADBBaseMixin):
"""Mixin for Android UI inspection.
Provides tools for:
- Dumping UI hierarchy (accessibility tree)
- Waiting for text/elements to appear
- Finding elements by various attributes
"""
@mcp_tool()
async def ui_dump(
self,
ctx: Context | None = None,
device_id: str | None = None,
) -> dict[str, Any]:
"""Dump the current UI hierarchy.
Returns the accessibility tree as XML, showing all visible elements
with their properties (text, content-description, class, bounds, etc.).
This is extremely useful for:
- Finding clickable elements by their text
- Understanding screen layout without screenshots
- Locating elements by resource-id or content-description
Example output element:
<node text="Settings" class="android.widget.TextView"
bounds="[0,100][200,150]" clickable="true" />
Args:
ctx: MCP context for logging (optional for internal calls)
device_id: Target device
Returns:
UI hierarchy XML and parsed element summary
"""
if ctx:
await ctx.info("Dumping UI hierarchy...")
# Dump UI to temp file on device
device_path = "/sdcard/window_dump.xml"
result = await self.run_shell_args(
["uiautomator", "dump", device_path], device_id
)
if not result.success:
if ctx:
await ctx.error(f"UI dump failed: {result.stderr}")
return {
"success": False,
"error": f"Failed to dump UI: {result.stderr}",
}
# Read the dump
cat_result = await self.run_shell_args(["cat", device_path], device_id)
if not cat_result.success:
if ctx:
await ctx.error(f"Failed to read dump: {cat_result.stderr}")
return {
"success": False,
"error": f"Failed to read UI dump: {cat_result.stderr}",
}
# Clean up
await self.run_shell_args(["rm", device_path], device_id)
xml_content = cat_result.stdout
# Parse out clickable/important elements for quick reference
clickable_elements = self._parse_ui_elements(xml_content)
if ctx:
await ctx.info(f"Found {len(clickable_elements)} interactive elements")
return {
"success": True,
"xml": xml_content,
"clickable_elements": clickable_elements,
"element_count": len(clickable_elements),
}
def _parse_ui_elements(self, xml_content: str) -> list[dict[str, Any]]:
"""Parse UI XML to extract clickable/important elements."""
elements = []
# Regex to find node elements with their attributes
node_pattern = re.compile(r"<node\s+([^>]+?)(?:/>|>)", re.DOTALL)
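# Attribute names may contain hyphens (content-desc, resource-id),
# so \w+ alone would capture only "desc" / "id"
attr_pattern = re.compile(r'([\w-]+)="([^"]*)"')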
for match in node_pattern.finditer(xml_content):
attrs_str = match.group(1)
attrs = dict(attr_pattern.findall(attrs_str))
# Only include elements that are interactive or have useful text
is_clickable = attrs.get("clickable") == "true"
is_focusable = attrs.get("focusable") == "true"
has_text = bool(attrs.get("text", "").strip())
has_desc = bool(attrs.get("content-desc", "").strip())
if is_clickable or is_focusable or has_text or has_desc:
element = {
"text": attrs.get("text", ""),
"content_desc": attrs.get("content-desc", ""),
"class": attrs.get("class", ""),
"resource_id": attrs.get("resource-id", ""),
"clickable": is_clickable,
"bounds": attrs.get("bounds", ""),
}
# Parse bounds to get center coordinates for tapping
bounds = attrs.get("bounds", "")
if bounds:
bounds_match = re.match(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]", bounds)
if bounds_match:
x1, y1, x2, y2 = map(int, bounds_match.groups())
element["center"] = {
"x": (x1 + x2) // 2,
"y": (y1 + y2) // 2,
}
elements.append(element)
return elements
@mcp_tool()
async def ui_find_element(
self,
text: str | None = None,
content_desc: str | None = None,
resource_id: str | None = None,
class_name: str | None = None,
device_id: str | None = None,
) -> dict[str, Any]:
"""Find UI elements matching criteria.
Searches the current UI for elements matching the specified
attributes. Returns matching elements with their bounds/center
for interaction.
Args:
text: Find elements whose text matches exactly
content_desc: Find elements whose content-description matches exactly
resource_id: Find elements whose resource ID contains this string
class_name: Find elements whose class name contains this string
device_id: Target device
Returns:
List of matching elements with tap coordinates
"""
# Get UI dump (internal call, no ctx)
dump = await self.ui_dump(device_id=device_id)
if not dump.get("success"):
return dump
elements = dump["clickable_elements"]
matches = []
for elem in elements:
match = True
if text is not None and elem.get("text") != text:
match = False
if content_desc is not None and elem.get("content_desc") != content_desc:
match = False
if resource_id is not None and resource_id not in elem.get(
"resource_id", ""
):
match = False
if class_name is not None and class_name not in elem.get("class", ""):
match = False
if match:
matches.append(elem)
return {
"success": True,
"matches": matches,
"count": len(matches),
}
@mcp_tool()
async def wait_for_text(
self,
text: str,
timeout_seconds: float = 10.0,
poll_interval: float = 0.5,
device_id: str | None = None,
) -> dict[str, Any]:
"""Wait for text to appear on screen.
Polls the UI hierarchy until the specified text is found
or timeout. Essential for synchronizing automation flows.
Args:
text: Text to wait for (case-sensitive, substring match)
timeout_seconds: Maximum wait time (default 10s)
poll_interval: Time between polls in seconds (default 0.5s)
device_id: Target device
Returns:
Success status with element info if found, or timeout error
"""
start_time = time.time()
attempts = 0
while (time.time() - start_time) < timeout_seconds:
attempts += 1
# Internal call, no ctx
dump = await self.ui_dump(device_id=device_id)
if dump.get("success"):
for elem in dump.get("clickable_elements", []):
if text in elem.get("text", "") or text in elem.get(
"content_desc", ""
):
return {
"success": True,
"found": True,
"element": elem,
"wait_time": round(time.time() - start_time, 2),
"attempts": attempts,
}
await asyncio.sleep(poll_interval)
return {
"success": False,
"found": False,
"error": (f"Text '{text}' not found after {timeout_seconds}s"),
"attempts": attempts,
}
@mcp_tool()
async def wait_for_text_gone(
self,
text: str,
timeout_seconds: float = 10.0,
poll_interval: float = 0.5,
device_id: str | None = None,
) -> dict[str, Any]:
"""Wait for text to disappear from screen.
Useful for waiting for loading indicators to finish,
dialogs to close, etc.
Args:
text: Text that should disappear (case-sensitive, substring match)
timeout_seconds: Maximum wait time (default 10s)
poll_interval: Time between polls (default 0.5s)
device_id: Target device
Returns:
Success when text is no longer visible, or timeout error
"""
start_time = time.time()
attempts = 0
while (time.time() - start_time) < timeout_seconds:
attempts += 1
dump = await self.ui_dump(device_id=device_id)
if dump.get("success"):
found = False
for elem in dump.get("clickable_elements", []):
if text in elem.get("text", "") or text in elem.get(
"content_desc", ""
):
found = True
break
if not found:
return {
"success": True,
"gone": True,
"wait_time": round(time.time() - start_time, 2),
"attempts": attempts,
}
await asyncio.sleep(poll_interval)
return {
"success": False,
"gone": False,
"error": (f"Text '{text}' still present after {timeout_seconds}s"),
"attempts": attempts,
}
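`wait_for_text` and `wait_for_text_gone` share the same poll-until-deadline loop and differ only in the condition they test. A minimal sketch of that loop as a reusable coroutine follows. `wait_until` is a hypothetical helper, not something this commit defines, and it swaps `time.time()` for `time.monotonic()` so that wall-clock adjustments cannot shorten or extend the wait.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable


async def wait_until(
    check: Callable[[], Awaitable[bool]],
    timeout_seconds: float = 10.0,
    poll_interval: float = 0.5,
) -> dict[str, Any]:
    """Poll an async condition until it holds or the deadline passes."""
    start = time.monotonic()
    deadline = start + timeout_seconds
    attempts = 0
    while True:
        attempts += 1
        if await check():
            return {
                "success": True,
                "attempts": attempts,
                "wait_time": round(time.monotonic() - start, 2),
            }
        if time.monotonic() >= deadline:
            return {"success": False, "attempts": attempts}
        await asyncio.sleep(poll_interval)


if __name__ == "__main__":
    state = {"polls": 0}

    async def ready_on_third_poll() -> bool:
        # Stand-in for "dump the UI and look for the text"
        state["polls"] += 1
        return state["polls"] >= 3

    print(asyncio.run(wait_until(ready_on_third_poll, poll_interval=0.01)))
```

The condition is always checked at least once, even with a zero timeout, which matches the intent of a "wait for" tool better than a loop that may never execute.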
@mcp_tool()
async def tap_text(
self,
text: str,
device_id: str | None = None,
) -> dict[str, Any]:
"""Find element by text and tap it.
Convenience method that combines ui_find_element + input_tap.
Finds the first element whose text exactly matches (falling back to an
exact content-description match) and taps its center.
Args:
text: Text of element to tap
device_id: Target device
Returns:
Success status with tapped coordinates
"""
# Find element
result = await self.ui_find_element(text=text, device_id=device_id)
if not result.get("success"):
return result
matches = result.get("matches", [])
if not matches:
# Try content-desc as fallback
result = await self.ui_find_element(content_desc=text, device_id=device_id)
matches = result.get("matches", [])
if not matches:
return {
"success": False,
"error": f"No element found with text '{text}'",
}
element = matches[0]
center = element.get("center")
if not center:
return {
"success": False,
"error": "Element found but could not determine coordinates",
"element": element,
}
# Tap the center
tap_result = await self.run_shell_args(
["input", "tap", str(center["x"]), str(center["y"])],
device_id,
)
return {
"success": tap_result.success,
"action": "tap_text",
"text": text,
"coordinates": center,
"element": element,
"error": tap_result.stderr if not tap_result.success else None,
}
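`_parse_ui_elements` extracts attributes from the dump with regular expressions. As an alternative, the dump can be read with the standard library XML parser, which handles hyphenated attribute names and decodes entities such as `&amp;` without extra work. The following is a sketch under assumptions: `parse_nodes` is a hypothetical function, and `SAMPLE` is a hand-written fragment shaped like the example in the `ui_dump` docstring, not a real device dump.

```python
import re
import xml.etree.ElementTree as ET

# Bounds look like "[x1,y1][x2,y2]"
_BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")

# Hand-written sample for illustration only
SAMPLE = (
    '<hierarchy rotation="0">'
    '<node text="Settings" content-desc="" '
    'resource-id="com.android.settings:id/title" '
    'class="android.widget.TextView" clickable="true" '
    'bounds="[0,100][200,150]" />'
    '<node text="" content-desc="Back &amp; close" '
    'class="android.widget.ImageButton" clickable="true" '
    'bounds="[10,20][50,60]" />'
    "</hierarchy>"
)


def parse_nodes(xml_content: str) -> list[dict]:
    """Return one dict per <node>, with a tap-ready center when bounds parse."""
    root = ET.fromstring(xml_content)
    elements = []
    for node in root.iter("node"):
        attrs = node.attrib
        element = {
            "text": attrs.get("text", ""),
            "content_desc": attrs.get("content-desc", ""),
            "resource_id": attrs.get("resource-id", ""),
            "class": attrs.get("class", ""),
            "clickable": attrs.get("clickable") == "true",
            "bounds": attrs.get("bounds", ""),
        }
        match = _BOUNDS_RE.fullmatch(element["bounds"])
        if match:
            x1, y1, x2, y2 = map(int, match.groups())
            element["center"] = {"x": (x1 + x2) // 2, "y": (y1 + y2) // 2}
        elements.append(element)
    return elements


if __name__ == "__main__":
    for item in parse_nodes(SAMPLE):
        print(item["text"] or item["content_desc"], item.get("center"))
```

Unlike the mixin method, this sketch returns every node. The same clickable, focusable, text, or description filter could be applied afterwards.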

36
src/models.py Normal file

@ -0,0 +1,36 @@
"""Pydantic models for Android ADB MCP Server."""
from pydantic import BaseModel, Field
class DeviceInfo(BaseModel):
"""Android device information returned by ADB."""
device_id: str = Field(description="Unique device identifier/serial number")
status: str = Field(
description="Device connection status",
json_schema_extra={
"examples": ["device", "offline", "unauthorized", "no permissions"]
},
)
model: str | None = Field(None, description="Device model name")
product: str | None = Field(None, description="Product name")
class CommandResult(BaseModel):
"""Result of an ADB command execution."""
success: bool = Field(description="Whether the command succeeded")
stdout: str = Field(default="", description="Standard output from command")
stderr: str = Field(default="", description="Standard error from command")
returncode: int = Field(description="Command exit code")
class ScreenshotResult(BaseModel):
"""Screenshot capture operation result."""
success: bool = Field(description="Whether screenshot was captured successfully")
local_path: str | None = Field(
None, description="Absolute path to the saved screenshot file"
)
error: str | None = Field(None, description="Error message if operation failed")


@ -3,349 +3,233 @@
 Android ADB MCP Server
 A Model Context Protocol (MCP) server for Android device automation via ADB.
-Provides tools for device interaction, screenshots, app launching, and more.
+Uses MCPMixin pattern for organized, extensible tool registration.
+Features:
+- Device management and multi-device support
+- Screen capture and recording
+- Input simulation (tap, swipe, key events, text)
+- App launching and management
+- Developer mode for advanced tools
 """
-import asyncio
-import json
-import subprocess
-import sys
-from pathlib import Path
-from typing import List, Optional, Dict, Any
+from typing import Any
 from fastmcp import FastMCP
-from pydantic import BaseModel, Field
+from fastmcp.contrib.mcp_mixin import mcp_resource, mcp_tool
+from .config import get_config
+from .mixins import (
+AppsMixin,
+DevicesMixin,
+FilesMixin,
+InputMixin,
+ScreenshotMixin,
+UIMixin,
+)
-class DeviceInfo(BaseModel):
-"""Android device information returned by ADB"""
-device_id: str = Field(description="Unique device identifier/serial number")
-status: str = Field(
-description="Device connection status",
-json_schema_extra={
-"examples": ["device", "offline", "unauthorized", "no permissions"]
-}
-)
-class ScreenshotResult(BaseModel):
-"""Screenshot capture operation result"""
-success: bool = Field(description="Whether the screenshot was captured successfully")
-local_path: Optional[str] = Field(None, description="Absolute path to the saved screenshot file")
-error: Optional[str] = Field(None, description="Error message if operation failed")
+class ADBServer(
+DevicesMixin, InputMixin, AppsMixin, ScreenshotMixin, UIMixin, FilesMixin
+):
+"""Android ADB MCP Server combining all functionality.
+Inherits from mixins:
+- DevicesMixin: Device listing, selection, info, logcat
+- InputMixin: Tap, swipe, keys, text input, clipboard
+- AppsMixin: App launching, URL opening, package management, intents
+- ScreenshotMixin: Screen capture, recording, display control
+- UIMixin: UI hierarchy inspection, element finding, text waiting
+- FilesMixin: File push/pull between host and device
+Developer mode enables additional tools for power users.
+"""
+# === Configuration Tools ===
class ADBCommand(BaseModel): @mcp_tool()
"""ADB command execution parameters""" async def config_status(self) -> dict[str, Any]:
command: List[str] """Get current server configuration.
device_id: Optional[str] = None
Shows developer mode status and other settings.
Returns:
# Initialize FastMCP server Current configuration values
mcp = FastMCP("android-mcp-server") """
config = get_config()
async def run_adb_command(cmd: List[str], device_id: Optional[str] = None) -> Dict[str, Any]:
"""Execute ADB command and return result"""
full_cmd = ["adb"]
if device_id:
full_cmd.extend(["-s", device_id])
full_cmd.extend(cmd)
try:
result = await asyncio.create_subprocess_exec(
*full_cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
stdout, stderr = await result.communicate()
return { return {
"success": result.returncode == 0, "developer_mode": config.developer_mode,
"stdout": stdout.decode('utf-8', errors='ignore').strip(), "auto_select_single_device": config.auto_select_single_device,
"stderr": stderr.decode('utf-8', errors='ignore').strip(), "default_screenshot_dir": config.default_screenshot_dir,
"returncode": result.returncode "current_device": self.get_current_device(),
}
except Exception as e:
return {
"success": False,
"stdout": "",
"stderr": str(e),
"returncode": -1
} }
@mcp_tool()
async def config_set_developer_mode(self, enabled: bool) -> dict[str, Any]:
"""Enable or disable developer mode.
@mcp.tool() Developer mode unlocks advanced tools:
async def adb_devices() -> List[DeviceInfo]: - Raw shell command execution
""" - Package listing/installation/uninstallation
List all Android devices connected via USB or network. - Screen recording and resolution changes
- App data clearing
- Long press gestures
Returns device information including unique identifiers and connection status. The setting persists across server restarts.
Use this to identify available devices before performing other operations.
Returns: Args:
List of connected devices with their IDs and status enabled: True to enable, False to disable
"""
result = await run_adb_command(["devices"])
if not result["success"]: Returns:
raise Exception(f"Failed to list devices: {result['stderr']}") Confirmation with new status
"""
devices = [] config = get_config()
lines = result["stdout"].split('\n')[1:] # Skip header config.developer_mode = enabled
for line in lines:
if line.strip():
parts = line.split('\t')
if len(parts) >= 2:
devices.append(DeviceInfo(
device_id=parts[0],
status=parts[1]
))
return devices
@mcp.tool()
async def adb_screenshot(
device_id: Optional[str] = Field(None, description="Target device ID (if multiple devices connected)"),
local_filename: str = Field("screenshot.png", description="Local filename to save screenshot to")
) -> ScreenshotResult:
"""
Capture a screenshot from Android device and save it locally.
Takes a screenshot of the current screen content and saves it as a PNG file.
Automatically handles device communication and file transfer.
Args:
device_id: Specific device to target (optional if only one device)
local_filename: Name for the saved screenshot file
Returns:
Result object with success status and file path
"""
# Take screenshot on device
result = await run_adb_command(["shell", "screencap", "-p", "/sdcard/temp_screenshot.png"], device_id)
if not result["success"]:
return ScreenshotResult(success=False, error=f"Failed to capture screenshot: {result['stderr']}")
# Pull screenshot to local machine
local_path = Path(local_filename).absolute()
pull_result = await run_adb_command(["pull", "/sdcard/temp_screenshot.png", str(local_path)], device_id)
if not pull_result["success"]:
return ScreenshotResult(success=False, error=f"Failed to pull screenshot: {pull_result['stderr']}")
# Clean up device file
await run_adb_command(["shell", "rm", "/sdcard/temp_screenshot.png"], device_id)
return ScreenshotResult(success=True, local_path=str(local_path))
@mcp.tool()
async def adb_input(
action_type: str,
x: str = "",
y: str = "",
x2: str = "",
y2: str = "",
key_code: str = "",
text: str = "",
device_id: Optional[str] = None
) -> Dict[str, Any]:
"""
Send input events to Android device to simulate user interactions.
Supports various input types with simple parameter interface:
- tap: adb_input(action_type="tap", x="400", y="600")
- swipe: adb_input(action_type="swipe", x="100", y="200", x2="300", y2="400")
- key: adb_input(action_type="key", key_code="KEYCODE_BACK")
- text: adb_input(action_type="text", text="Hello World")
Args:
action_type: Type of input action ("tap", "swipe", "key", "text")
x: X coordinate for tap/swipe start (required for tap/swipe)
y: Y coordinate for tap/swipe start (required for tap/swipe)
x2: X coordinate for swipe end (required for swipe)
y2: Y coordinate for swipe end (required for swipe)
key_code: Android key code like KEYCODE_BACK (required for key)
text: Text to type (required for text)
device_id: Specific device to target (optional if only one device)
Returns:
Command execution result with success status
"""
if action_type == "tap":
if not x or not y:
raise ValueError("tap action requires x and y coordinates")
cmd = ["shell", "input", "tap", x, y]
elif action_type == "swipe":
if not x or not y or not x2 or not y2:
raise ValueError("swipe action requires x, y, x2, y2 coordinates")
cmd = ["shell", "input", "swipe", x, y, x2, y2]
elif action_type == "key":
if not key_code:
raise ValueError("key action requires key_code")
cmd = ["shell", "input", "keyevent", key_code]
elif action_type == "text":
if not text:
raise ValueError("text action requires text")
cmd = ["shell", "input", "text", text]
else:
raise ValueError(f"Unknown action type: {action_type}. Must be one of: tap, swipe, key, text")
result = await run_adb_command(cmd, device_id)
return result
@mcp.tool()
async def adb_launch_app(
package_name: str = Field(description="Android package name (e.g., com.android.chrome)"),
device_id: Optional[str] = Field(None, description="Target device ID (if multiple devices connected)")
) -> Dict[str, Any]:
"""
Launch an Android application by its package name.
Starts the main activity of the specified app. Use adb_list_packages to find
available package names on the device.
Args:
package_name: Full package identifier (e.g., com.android.chrome, com.whatsapp)
device_id: Specific device to target (optional if only one device)
Returns:
Command execution result with success status and output
"""
cmd = ["shell", "monkey", "-p", package_name, "-c", "android.intent.category.LAUNCHER", "1"]
result = await run_adb_command(cmd, device_id)
return result
@mcp.tool()
async def adb_launch_url(
url: str = Field(description="URL to open (e.g., https://example.com)"),
device_id: Optional[str] = Field(None, description="Target device ID (if multiple devices connected)")
) -> Dict[str, Any]:
"""
Open a URL in the device's default browser application.
Launches the default browser and navigates to the specified URL.
Supports HTTP, HTTPS, and other URL schemes supported by Android.
Args:
url: Web address to navigate to
device_id: Specific device to target (optional if only one device)
Returns:
Command execution result with success status
"""
cmd = ["shell", "am", "start", "-a", "android.intent.action.VIEW", "-d", url]
result = await run_adb_command(cmd, device_id)
return result
@mcp.tool()
async def adb_list_packages(
device_id: Optional[str] = Field(None, description="Target device ID (if multiple devices connected)"),
filter_text: Optional[str] = Field(None, description="Filter packages containing this text (e.g., 'chrome', 'google')")
) -> Dict[str, Any]:
"""
List all installed applications on the Android device.
Retrieves package names of all installed apps, optionally filtered by text.
Useful for finding package names to use with adb_launch_app.
Args:
device_id: Specific device to target (optional if only one device)
filter_text: Only return packages containing this text
Returns:
Dictionary with success status, package list, and count
"""
cmd = ["shell", "pm", "list", "packages"]
if filter_text:
cmd.extend(["|", "grep", filter_text])
result = await run_adb_command(cmd, device_id)
if result["success"]:
packages = []
for line in result["stdout"].split('\n'):
if line.startswith('package:'):
packages.append(line.replace('package:', ''))
return { return {
"success": True, "success": True,
"packages": packages, "developer_mode": enabled,
"count": len(packages) "message": (
"Developer mode enabled. Advanced tools are now available."
if enabled
else "Developer mode disabled. Using standard tools only."
),
} }
return result @mcp_tool()
async def config_set_screenshot_dir(self, directory: str | None) -> dict[str, Any]:
"""Set default directory for screenshots.
Screenshots will be saved to this directory by default.
Set to None to save to current working directory.
@mcp.tool() Args:
async def adb_shell_command( directory: Directory path, or None for current directory
command: str = Field(
description="Shell command to execute. Common input commands: 'input tap X Y' for tapping, 'input swipe X1 Y1 X2 Y2' for swiping, 'input keyevent KEYCODE' for keys, 'input text \"hello\"' for typing", Returns:
json_schema_extra={ Confirmation
"examples": [ """
"input tap 400 600", config = get_config()
"input swipe 100 200 300 400", config.default_screenshot_dir = directory
"input keyevent KEYCODE_BACK",
"input text \"hello world\"", return {
"ls /sdcard", "success": True,
"getprop ro.build.version.release", "screenshot_dir": directory,
"pm list packages | grep chrome" }
+    # === Help / Discovery ===
+    @mcp_resource(uri="adb://help")
+    async def resource_help(self) -> dict[str, Any]:
+        """Resource: Server help and available tools overview."""
+        config = get_config()
+        tools = {
+            "devices": [
+                "devices_list - List connected devices",
+                "devices_use - Set current device",
+                "devices_current - Get current device info",
+                "device_info - Battery, wifi, storage, system info",
+            ],
+            "input": [
+                "input_tap - Tap at coordinates",
+                "input_swipe - Swipe between points",
+                "input_scroll_down / input_scroll_up - Scroll gestures",
+                "input_back / input_home / input_recent_apps - Navigation",
+                "input_key - Send any key event",
+                "input_text - Type text",
+                "clipboard_set - Set clipboard (handles special chars)",
+            ],
+            "apps": [
+                "app_launch - Launch app by package name",
+                "app_open_url - Open URL in browser",
+                "app_close - Force stop an app",
+                "app_current - Get focused app",
+            ],
+            "screen": [
+                "screenshot - Capture screen",
+                "screen_size / screen_density - Display info",
+                "screen_on / screen_off - Wake/sleep display",
+            ],
+            "ui": [
+                "ui_dump - Dump UI hierarchy (accessibility tree)",
+                "ui_find_element - Find elements by text/id/class",
+                "wait_for_text - Wait for text to appear",
+                "wait_for_text_gone - Wait for text to disappear",
+                "tap_text - Find element by text and tap it",
+            ],
+            "config": [
+                "config_status - Show current settings",
+                "config_set_developer_mode - Toggle developer tools",
+                "config_set_screenshot_dir - Set screenshot output directory",
+            ],
+        }
+        if config.developer_mode:
+            tools["developer"] = [
+                "shell_command - Execute any shell command",
+                "input_long_press - Long press gesture",
+                "app_list_packages - List installed packages",
+                "app_install / app_uninstall - Install/remove apps",
+                "app_clear_data - Clear app data",
+                "activity_start - Start activity with intent",
+                "broadcast_send - Send broadcast intent",
+                "screen_record - Record screen video",
+                "screen_set_size / screen_reset_size - Change resolution",
+                "device_reboot - Reboot device",
+                "logcat_capture / logcat_clear - Android logs",
+                "file_push / file_pull - Transfer files",
+                "file_list / file_delete / file_exists - File operations",
             ]
+        return {
+            "name": "Android ADB MCP Server",
+            "developer_mode": config.developer_mode,
+            "tools": tools,
+            "tip": (
+                "Use config_set_developer_mode(True) to unlock advanced tools"
+                if not config.developer_mode
+                else "Developer mode is active - all tools available"
+            ),
         }
-    ),
-    device_id: Optional[str] = Field(None, description="Target device ID (if multiple devices connected)")
-) -> Dict[str, Any]:
-    """
-    Execute shell commands on Android device, including input simulation.
-    This is the most reliable way to perform input actions on Android devices.
-    Runs commands in the Android shell environment with full access to input system.

-    Common Input Commands:
-    - Tap: adb_shell_command(command="input tap 400 600")
-    - Swipe: adb_shell_command(command="input swipe 100 200 300 400")
-    - Key press: adb_shell_command(command="input keyevent KEYCODE_BACK")
-    - Type text: adb_shell_command(command="input text \"hello world\"")
-    - Scroll down: adb_shell_command(command="input swipe 500 800 500 300")
-    - Scroll up: adb_shell_command(command="input swipe 500 300 500 800")
+# Initialize FastMCP server
+mcp = FastMCP(
+    "android-adb",
+    instructions="""Android ADB MCP Server for device automation.

-    Other Useful Commands:
-    - List files: adb_shell_command(command="ls /sdcard")
-    - Get device info: adb_shell_command(command="getprop ro.build.version.release")
-    - Find packages: adb_shell_command(command="pm list packages | grep chrome")
-    - Screen brightness: adb_shell_command(command="settings get system screen_brightness")
+Use devices_list() first to see connected devices.
+If multiple devices are connected, use devices_use(device_id) to select one.

-    Args:
-        command: Shell command string to execute (see examples above)
-        device_id: Specific device to target (optional if only one device)
+Common workflows:
+1. Take screenshot: screenshot()
+2. Tap on screen: input_tap(x, y)
+3. Launch app: app_launch("com.android.chrome")
+4. Open URL: app_open_url("https://example.com")
+5. Navigate back: input_back()

-    Returns:
-        Command execution result with stdout, stderr, and return code
-    """
-    cmd = ["shell"] + command.split()
-    result = await run_adb_command(cmd, device_id)
-    return result
+Enable developer mode for advanced tools:
+    config_set_developer_mode(True)
+""",
+)

+# Create server instance and register all tools
+server = ADBServer()
+server.register_all(mcp)

 def main():
-    """Main entry point for STDIO MCP server - used by console script"""
-    try:
-        from importlib.metadata import version
-        package_version = version("android-mcp-server")
-    except Exception:
-        package_version = "0.3.1"
-    print(f"📱 Android ADB MCP Server v{package_version}", flush=True)
+    """Main entry point for STDIO MCP server."""
     mcp.run()
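
Review note on the removed `adb_shell_command` (a hedged sketch, not code from this commit): the deleted implementation built its argument list with `command.split()`, which splits on whitespace and ignores quoting, so an example such as `input text "hello world"` is broken into separate `"hello` and `world"` tokens. The commit message says the replacement, `run_shell_args()`, quotes arguments with `shlex.quote()`. Its body is not visible in this part of the diff, so `quoted_shell_argv` below is a hypothetical stand-in that only illustrates the quoting idea with the standard library.

```python
import shlex


def naive_shell_argv(command: str) -> list[str]:
    """Mirror of the removed approach: whitespace split, quotes ignored."""
    return ["shell"] + command.split()


def quoted_shell_argv(args: list[str]) -> list[str]:
    """Hypothetical stand-in for run_shell_args() (name and shape assumed).

    Each argument is quoted on its own, so a POSIX-style shell treats it as
    one literal word even if it contains spaces or metacharacters.
    """
    return ["shell", " ".join(shlex.quote(arg) for arg in args)]


# The old path splits a quoted string into pieces with stray quote characters.
naive = naive_shell_argv('input text "hello world"')

# The quoted path keeps a hostile-looking path as a single literal argument.
safe = quoted_shell_argv(["ls", "/sdcard/My Files; reboot"])

# Round trip: parsing the quoted string yields the original arguments again.
assert shlex.split(safe[1]) == ["ls", "/sdcard/My Files; reboot"]
```

One consequence of quoting every argument in a sketch like this is that shell operators are also passed as literal text, so the old `pm list packages | grep chrome` example would need the filtering done on the host side instead.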

1056 uv.lock generated

File diff suppressed because it is too large