---
title: "Agent-to-Agent Handshake Protocol"
description: "Solving the 'ships passing in the night' problem with coordinated agent communication"
---

## The Problem: Dropped Messages

When two AI agents try to communicate via MQTT, a common failure pattern emerges:

1. Agent A connects to broker
2. Agent A publishes "Hello, Agent B!"
3. Agent B connects to broker
4. Agent B subscribes to the topic
5. **Agent B never receives the message** (it was published before subscription)

This is the **"ships passing in the night"** problem - agents miss messages because they publish before the other party has subscribed.

## The Solution: Host/Join Handshake

mcmqtt's coordination protocol ensures **no messages are dropped** by implementing a simple but powerful handshake:

```
HOST                             JOINER
│                                  │
│ 1. Spawn/connect to broker       │
│ 2. Subscribe to $coord/join      │
│ 3. Publish broker_ready          │
│─────────────────────────────────►│
│                                  │
│         4. Connect to broker     │
│         5. Subscribe to topics   │
│         6. Publish join request  │
│◄─────────────────────────────────│
│                                  │
│ 7. Acknowledge join              │
│─────────────────────────────────►│
│                                  │
│ 8. Publish all_ready signal      │
│─────────────────────────────────►│
│                                  │
▼    SAFE TO EXCHANGE MESSAGES!    ▼
```

## Key Principle

**The initiating agent ALWAYS hosts.** This eliminates confusion about who spawns the broker.

## MCP Tools

### mqtt_host_conversation

Use this when **you** are starting a conversation with other agents.

```json
{
  "tool": "mqtt_host_conversation",
  "arguments": {
    "session_id": "collab-task-123",
    "host_agent_id": "coordinator-agent",
    "expected_agents": ["worker-1", "worker-2", "analyst"],
    "broker_host": "127.0.0.1",
    "broker_port": 0,
    "timeout_seconds": 30
  }
}
```

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `session_id` | string | Unique identifier for this conversation |
| `host_agent_id` | string | Your agent's unique ID |
| `expected_agents` | array | List of agent IDs that must join |
| `broker_host` | string | Host to bind broker (default: 127.0.0.1) |
| `broker_port` | int | Port (0 = auto-assign) |
| `timeout_seconds` | float | Max wait time for agents to join |

**Response (success):**

```json
{
  "success": true,
  "message": "Conversation ready! All 3 agents joined.",
  "session_id": "collab-task-123",
  "state": "ready",
  "broker_host": "127.0.0.1",
  "broker_port": 51234,
  "broker_url": "mqtt://127.0.0.1:51234",
  "joined_agents": ["worker-1", "worker-2", "analyst"],
  "conversation_topic": "conversation/collab-task-123/main",
  "ready_to_publish": true
}
```

### mqtt_join_conversation

Use this when **another agent** invited you to a conversation.

```json
{
  "tool": "mqtt_join_conversation",
  "arguments": {
    "session_id": "collab-task-123",
    "agent_id": "worker-1",
    "broker_host": "127.0.0.1",
    "broker_port": 51234,
    "capabilities": ["data-analysis", "visualization"],
    "timeout_seconds": 30
  }
}
```

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `session_id` | string | Session ID from host's invitation |
| `agent_id` | string | Your unique agent ID |
| `broker_host` | string | Broker host (from host's invitation) |
| `broker_port` | int | Broker port (from host's invitation) |
| `capabilities` | array | Optional list of your capabilities |
| `timeout_seconds` | float | Max wait for acknowledgement |

**Response (success):**

```json
{
  "success": true,
  "message": "Successfully joined conversation collab-task-123!",
  "session_id": "collab-task-123",
  "agent_id": "worker-1",
  "broker_host": "127.0.0.1",
  "broker_port": 51234,
  "other_agents": ["coordinator-agent", "worker-2", "analyst"],
  "conversation_topic": "conversation/collab-task-123/main",
  "ready_to_receive": true
}
```

## Example: Two-Agent Collaboration

### Agent A (Initiator/Host)

```python
# Step 1: Host the conversation
result = await mqtt_host_conversation(
    session_id="data-analysis-job",
    host_agent_id="data-processor",
    expected_agents=["visualizer"],
    timeout_seconds=30
)

if result["ready_to_publish"]:
    # Step 2: Safe to publish - visualizer is definitely subscribed!
    await mqtt_publish(
        topic=result["conversation_topic"],
        payload={"type": "data", "values": [1, 2, 3, 4, 5]}
    )
```

### Agent B (Joiner)

```python
# Step 1: Join using info from Agent A
result = await mqtt_join_conversation(
    session_id="data-analysis-job",
    agent_id="visualizer",
    broker_host="127.0.0.1",
    broker_port=51234
)

if result["ready_to_receive"]:
    # Step 2: Now receive messages - guaranteed not to miss any!
    messages = await mqtt_get_messages(
        topic=result["conversation_topic"]
    )
```

## Topic Structure

The protocol uses reserved topics under `$coordination/`:

```
$coordination/{session_id}/
├── broker_ready          # Host publishes broker info (retained)
├── join                  # Agents publish join requests
├── joined/{agent_id}     # Host acknowledges each agent (retained)
├── ready                 # Host signals all agents ready (retained)
└── heartbeat/{agent_id}  # Optional: agent heartbeats
```

After handshake, conversations use:

```
conversation/{session_id}/
├── main            # Primary conversation channel
├── {channel_name}  # Additional named channels
└── ...
```

## Timeout Handling

If expected agents don't join within the timeout:

```json
{
  "success": false,
  "message": "Timeout waiting for agents. Missing: ['worker-2']",
  "session_id": "collab-task-123",
  "state": "timeout",
  "joined_agents": ["worker-1", "analyst"],
  "missing_agents": ["worker-2"],
  "ready_to_publish": false
}
```

The host can then decide whether to:

- Retry with a longer timeout
- Proceed with available agents
- Abort the conversation

## Best Practices

1. **Always use coordination tools for multi-agent work** - Don't use raw `mqtt_connect` + `mqtt_publish` when coordinating with other agents
2. **Choose meaningful session IDs** - Include context like `task-{id}-{timestamp}` for debugging
3. **Set appropriate timeouts** - Network latency and agent startup time vary; 30 seconds is a safe default
4. **Check the response** - Always verify `ready_to_publish` (host) or `ready_to_receive` (joiner) before proceeding
5. **Handle failures gracefully** - A timeout response (`"state": "timeout"`) is not necessarily fatal; inspect `missing_agents`, then retry, proceed with the agents that joined, or abort
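
The retry option from the timeout section can be sketched as a small wrapper around the host call. This is an illustrative sketch, not part of mcmqtt: `host_with_retry`, `max_attempts`, and `backoff_factor` are hypothetical names, and `host_fn` stands in for however your client invokes `mqtt_host_conversation`. It also assumes that hosting again with the same `session_id` after a timeout is permitted, which this page does not confirm.

```python
from typing import Any, Awaitable, Callable

# Hypothetical stand-in for the mqtt_host_conversation tool call: any async
# callable that takes the documented arguments and returns the response dict.
HostFn = Callable[..., Awaitable[dict[str, Any]]]


async def host_with_retry(
    host_fn: HostFn,
    session_id: str,
    host_agent_id: str,
    expected_agents: list[str],
    timeout_seconds: float = 30.0,
    max_attempts: int = 3,        # hypothetical knob, not an mcmqtt parameter
    backoff_factor: float = 2.0,  # hypothetical knob, not an mcmqtt parameter
) -> dict[str, Any]:
    """Host a conversation, retrying with a longer timeout on each timeout."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    result: dict[str, Any] = {}
    timeout = timeout_seconds
    for _ in range(max_attempts):
        result = await host_fn(
            session_id=session_id,
            host_agent_id=host_agent_id,
            expected_agents=expected_agents,
            timeout_seconds=timeout,
        )
        # Success: every expected agent joined, so publishing is safe.
        if result.get("ready_to_publish"):
            return result
        # Any failure other than a timeout is returned to the caller as-is.
        if result.get("state") != "timeout":
            return result
        # Timeout: wait longer on the next attempt.
        timeout *= backoff_factor

    # Attempts exhausted: return the last timeout response so the caller can
    # read missing_agents and decide whether to proceed or abort.
    return result
```

A caller would still check `ready_to_publish` on the returned dict; if it is false after all attempts, `joined_agents` and `missing_agents` in the last response show who is available for a reduced conversation.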