MCP Model-User Collaboration API
This document describes the JavaScript functions available to models for direct user communication and collaborative element selection within the Playwright MCP browser automation system.
🎯 Core Philosophy
Enable seamless collaboration between AI models and human users by providing simple JavaScript APIs for real-time communication, confirmations, and interactive element selection.
📱 Messaging System
Basic Messaging
// Send messages to users with auto-dismiss
mcpMessage('Hello user!', 'info', 5000) // Info message (green)
mcpMessage('Success!', 'success', 3000) // Success message (bright green)
mcpMessage('Warning!', 'warning', 4000) // Warning message (yellow)
mcpMessage('Error occurred', 'error', 6000) // Error message (red)
mcpMessage('Persistent', 'info', 0) // Persistent until dismissed
Helper Functions
mcpNotify.info('Information for the user') // Standard info message
mcpNotify.success('Task completed!') // Success confirmation
mcpNotify.warning('Please be careful') // Cautionary message
mcpNotify.error('Something went wrong') // Error notification
mcpNotify.loading('Processing...') // Persistent loading indicator
mcpNotify.done('All finished!') // Quick success (3s auto-dismiss)
mcpNotify.failed('Task failed') // Quick error (5s auto-dismiss)
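The helpers above read like thin wrappers around `mcpMessage`. A minimal sketch of how such wrappers could be built is shown here. The factory name `createNotify`, the 5000 ms default for info/success/warning/error, and the use of the `'info'` type for `loading` are assumptions for illustration. Only the 3 s, 5 s, and persistent (`0`) durations come from the comments above.

```javascript
// Hypothetical factory: builds mcpNotify-style helpers on top of any
// function with the shape message(text, type, durationMs).
function createNotify(message) {
  // ASSUMPTION: default duration for the four standard helpers.
  const DEFAULT_MS = 5000;
  return {
    info: (text) => message(text, 'info', DEFAULT_MS),
    success: (text) => message(text, 'success', DEFAULT_MS),
    warning: (text) => message(text, 'warning', DEFAULT_MS),
    error: (text) => message(text, 'error', DEFAULT_MS),
    // ASSUMPTION: loading is an 'info' message; 0 keeps it until dismissed.
    loading: (text) => message(text, 'info', 0),
    done: (text) => message(text, 'success', 3000),  // quick success (3s)
    failed: (text) => message(text, 'error', 5000),  // quick error (5s)
  };
}
```

In the browser this could be called as `createNotify(mcpMessage)`. Passing the message function in as an argument also makes the helpers easy to exercise outside a page.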
🤝 User Confirmation System
Interactive Prompts
// Ask user for confirmation
const confirmed = await mcpPrompt('Should I proceed with this action?');
if (confirmed) {
  mcpNotify.success('User confirmed - proceeding!');
} else {
  mcpNotify.info('User cancelled the action');
}

// Custom confirmation with options
const result = await mcpPrompt('Do you want to login first?', {
  title: '🔐 LOGIN REQUIRED',
  confirmText: 'YES, LOGIN',
  cancelText: 'SKIP FOR NOW'
});
🔍 Collaborative Element Selection
Interactive Element Inspector
// Basic element selection
mcpInspector.start('Please click on the login button');

// Element selection with callback
mcpInspector.start(
  'Click on the element you want me to interact with',
  (elementDetails) => {
    // Model receives detailed element information
    console.log('User selected:', elementDetails);

    // Use the XPath for precise automation
    const xpath = elementDetails.xpath;
    mcpNotify.success(`Got it! I'll click on: ${elementDetails.textContent}`);

    // Now use xpath with Playwright tools...
  }
);

// Stop inspection programmatically
mcpInspector.stop();
Element Details Returned
When the user clicks an element, the callback receives an object like this:
{
  tagName: 'a',                     // HTML tag
  id: 'login-button',               // Element ID (if present)
  className: 'btn btn-primary',     // CSS classes
  textContent: 'Login',             // Visible text (truncated to 100 chars)
  xpath: '//*[@id="login-button"]', // Generated XPath
  attributes: {                     // All HTML attributes
    href: '/login',
    class: 'btn btn-primary',
    'data-action': 'login'
  },
  boundingRect: {                   // Element position/size
    x: 100, y: 200,
    width: 80, height: 32
  },
  visible: true                     // Element visibility status
}
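Once the callback has this object, the model still has to turn it into something an automation step can use. The sketch below is one possible approach, not part of the documented API. `toPlaywrightSelector` and `isActionable` are hypothetical helper names, and the rule "prefer a simple `id`, otherwise fall back to the XPath" is an assumption about what makes a stable selector. The `xpath=` prefix is Playwright's selector syntax for XPath expressions.

```javascript
// Hypothetical helper: pick a selector string from the element details.
function toPlaywrightSelector(details) {
  // Only use the id when it is a plain identifier that needs no CSS escaping.
  const simpleId = /^[A-Za-z][\w-]*$/;
  if (details.id && simpleId.test(details.id)) {
    return `#${details.id}`;
  }
  // Otherwise fall back to the generated XPath.
  return `xpath=${details.xpath}`;
}

// Hypothetical helper: reject hidden or zero-size elements before acting.
function isActionable(details) {
  const { width, height } = details.boundingRect;
  return details.visible === true && width > 0 && height > 0;
}
```

With the sample object above, `toPlaywrightSelector` returns `'#login-button'` and `isActionable` returns `true`.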
🚀 Collaboration Patterns
1. Ambiguous Element Selection
// When multiple similar elements exist
const confirmed = await mcpPrompt('I see multiple login buttons. Should I click the main one in the header?');
if (!confirmed) {
  mcpInspector.start('Please click on the specific login button you want me to use');
}
2. Permission Requests
// Ask before sensitive actions
const canProceed = await mcpPrompt('This will delete all items. Are you sure?', {
title: '⚠️ DESTRUCTIVE ACTION',
confirmText: 'YES, DELETE ALL',
cancelText: 'CANCEL'
});
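The permission pattern can be wrapped in a small guard so that the sensitive step only runs after confirmation. The following is a sketch under stated assumptions: `confirmThen` and its `api` parameter are hypothetical, the notification texts are placeholders, and it assumes the collaboration functions are reachable as properties of the global object when `api` is omitted.

```javascript
// Hypothetical guard: run `action` only if the user confirms the prompt.
// `api` defaults to the global object and exists so the guard can be
// exercised with stand-in functions.
async function confirmThen(question, options, action, api = globalThis) {
  const confirmed = await api.mcpPrompt(question, options);
  if (!confirmed) {
    api.mcpNotify.info('Action cancelled');
    return { ran: false };
  }
  try {
    const value = await action();
    api.mcpNotify.done('Action completed');
    return { ran: true, value };
  } catch (err) {
    // Report the failure to the user, then let the caller handle it too.
    api.mcpNotify.failed(`Action failed: ${err.message}`);
    throw err;
  }
}
```

A call might look like `await confirmThen('This will delete all items. Are you sure?', { title: '⚠️ DESTRUCTIVE ACTION' }, deleteAll)`, where `deleteAll` stands for whatever function performs the deletion.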
3. Form Field Identification
// Ask the user to identify a form field
mcpInspector.start(
  'Please click on the email input field',
  (element) => {
    // Compare case-insensitively: DOM tag names are often upper-case
    if (element.tagName.toLowerCase() !== 'input') {
      mcpNotify.warning('That doesn\'t look like an input field. Try again?');
      return;
    }
    mcpNotify.success('Perfect! I\'ll enter the email there.');
  }
);
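The callback above warns the user about a wrong selection but does not start a new selection. One way to actually retry is sketched below. `pickInputField`, the `attemptsLeft` limit, and the `api` parameter are hypothetical. The sketch also assumes that `mcpInspector.start` reports a single selection per call and may be called again from inside its own callback.

```javascript
// Hypothetical retry wrapper: keep asking until an input-like element is
// selected or the attempts run out. Calls onPicked(element) or onPicked(null).
function pickInputField(instruction, onPicked, attemptsLeft = 3, api = globalThis) {
  api.mcpInspector.start(instruction, (element) => {
    const tag = String(element.tagName).toLowerCase();
    if (tag === 'input' || tag === 'textarea') {
      api.mcpNotify.success("Perfect! I'll enter the value there.");
      onPicked(element);
      return;
    }
    if (attemptsLeft > 1) {
      api.mcpNotify.warning("That doesn't look like an input field. Please try again.");
      // ASSUMPTION: the inspector can be restarted from inside its callback.
      pickInputField(instruction, onPicked, attemptsLeft - 1, api);
    } else {
      api.mcpNotify.failed('No input field was selected.');
      onPicked(null);
    }
  });
}
```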
4. Dynamic Content Handling
// When content changes dynamically
mcpNotify.loading('Waiting for page to load...');
// ... wait for content ...

// If the content arrived:
mcpNotify.done('Page loaded!');

// If it is still loading after the wait, ask the user instead:
const shouldWait = await mcpPrompt('The content is still loading. Should I wait longer?');
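The two outcomes in this pattern (content arrives, or the user is asked whether to keep waiting) can be combined into one polling loop. This is a sketch only: `waitForContent`, its `timeoutMs` and `intervalMs` options, the `isReady` callback, and the `api` parameter are all hypothetical names, and the final "stopped waiting" message is a placeholder.

```javascript
// Hypothetical polling loop: poll isReady() until it returns true, and ask
// the user whether to keep waiting each time the timeout elapses.
async function waitForContent(isReady, { timeoutMs = 5000, intervalMs = 250 } = {}, api = globalThis) {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  api.mcpNotify.loading('Waiting for page to load...');
  for (;;) {
    const deadline = Date.now() + timeoutMs;
    while (Date.now() < deadline) {
      if (await isReady()) {
        api.mcpNotify.done('Page loaded!');
        return true;
      }
      await sleep(intervalMs);
    }
    // Timed out: let the user decide whether another round is worthwhile.
    const keepWaiting = await api.mcpPrompt('The content is still loading. Should I wait longer?');
    if (!keepWaiting) {
      api.mcpNotify.failed('Stopped waiting for the content.');
      return false;
    }
  }
}
```

In a page, `isReady` could be something like `() => document.querySelector('#results') !== null`, with the selector chosen for the site at hand.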
🎨 Visual Design
All messages and prompts use the cyberpunk "hacker matrix" theme:
- Black background with neon green text (#00ff00)
- Terminal-style Courier New font
- Glowing effects and smooth animations
- High contrast for excellent readability
- ESC key support for cancellation
🛠️ Implementation Guidelines for Models
Best Practices
- Clear Communication: Use descriptive messages that explain what you're doing
- Ask for Permission: Confirm before destructive or sensitive actions
- Collaborative Selection: When an element's location is ambiguous, ask the user to click it
- Progress Updates: Use loading/done messages for long operations
- Error Handling: Provide clear error messages with next steps
Example Workflow
// Complete login workflow with collaboration
async function collaborativeLogin() {
  // 1. Ask for permission
  const shouldLogin = await mcpPrompt('I need to log in. Should I proceed?');
  if (!shouldLogin) return;

  // 2. Get user to identify elements
  mcpNotify.loading('Please help me find the login form...');
  mcpInspector.start('Click on the username/email field', (emailField) => {
    mcpNotify.success('Got the email field!');
    mcpInspector.start('Now click on the password field', (passwordField) => {
      mcpNotify.success('Got the password field!');
      mcpInspector.start('Finally, click the login button', (loginButton) => {
        mcpNotify.done('Perfect! I have all the elements I need.');
        // 3. Use the XPaths for automation
        // (performLogin is a placeholder for your own automation step)
        performLogin(emailField.xpath, passwordField.xpath, loginButton.xpath);
      });
    });
  });
}
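The nested callbacks in this workflow can be flattened by wrapping the inspector in a Promise. The version below is a sketch rather than part of the API: `selectElement`, `collaborativeLoginFlat`, and the `api` parameter are hypothetical, and `performLogin` is passed in because the original leaves it undefined. Note one limitation: if the user cancels the inspection (for example with ESC) and the callback is never invoked, the Promise stays pending.

```javascript
// Hypothetical wrapper: resolve with the element details once the user clicks.
function selectElement(instruction, api = globalThis) {
  return new Promise((resolve) => {
    api.mcpInspector.start(instruction, resolve);
  });
}

// Same workflow as above, written with async/await instead of nesting.
async function collaborativeLoginFlat(performLogin, api = globalThis) {
  const shouldLogin = await api.mcpPrompt('I need to log in. Should I proceed?');
  if (!shouldLogin) return false;

  const emailField = await selectElement('Click on the username/email field', api);
  const passwordField = await selectElement('Now click on the password field', api);
  const loginButton = await selectElement('Finally, click the login button', api);
  api.mcpNotify.done('Perfect! I have all the elements I need.');

  // performLogin is supplied by the caller; it receives the three XPaths.
  await performLogin(emailField.xpath, passwordField.xpath, loginButton.xpath);
  return true;
}
```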
🔧 Technical Notes
Initialization
These functions are automatically available after injecting the collaboration system:
// Check if available
if (typeof mcpMessage === 'function') {
  mcpNotify.success('Collaboration system ready!');
}
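The check above only tests `mcpMessage`. A fuller readiness check could look for every function this document describes before relying on any of them. The sketch below is an illustration: `missingCollaborationApis` and the `EXPECTED_METHODS` table are hypothetical, and it assumes the functions are exposed as properties of the global object (or of whatever `scope` is passed in).

```javascript
// Methods this document describes on the two helper objects.
const EXPECTED_METHODS = {
  mcpNotify: ['info', 'success', 'warning', 'error', 'loading', 'done', 'failed'],
  mcpInspector: ['start', 'stop'],
};

// Hypothetical check: return the names of any missing functions.
// An empty array means the whole collaboration API is available.
function missingCollaborationApis(scope = globalThis) {
  const missing = [];
  for (const name of ['mcpMessage', 'mcpPrompt']) {
    if (typeof scope[name] !== 'function') missing.push(name);
  }
  for (const [name, methods] of Object.entries(EXPECTED_METHODS)) {
    const target = scope[name];
    for (const method of methods) {
      if (!target || typeof target[method] !== 'function') {
        missing.push(`${name}.${method}`);
      }
    }
  }
  return missing;
}
```

A caller can log the returned names when the array is non-empty, which makes a partial injection easier to diagnose than a single boolean.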
Error Handling
All functions include built-in error handling and fail gracefully if DOM manipulation isn't possible.
Performance
- Messages auto-clean up after display
- Event listeners are properly removed
- No memory leaks from repeated usage
This collaboration API turns MCP browser automation from a purely programmatic tool into an interactive, user-guided system that combines AI efficiency with human insight and precision.