Adds revolutionary features for MCP client identification and browser automation:

MCP Client Debug System:
- Floating pill toolbar with client identification and session info
- Theme system with 5 built-in themes (minimal, corporate, hacker, glass, high-contrast)
- Custom theme creation API with CSS variable overrides
- Cross-site validation ensuring toolbar persists across navigation
- Session-based injection with persistence across page loads

Voice Collaboration (Prototype):
- Web Speech API integration for conversational browser automation
- Bidirectional voice communication between AI and user
- Real-time voice guidance during automation tasks
- Documented architecture and future development roadmap

Code Injection Enhancements:
- Model collaboration API for notify, prompt, and inspector functions
- Auto-injection and persistence options
- Toolbar integration with code injection system

Documentation:
- Comprehensive technical achievement documentation
- Voice collaboration architecture and implementation guide
- Theme system integration documentation
- Tool annotation templates for consistency

This represents a major advancement in browser automation UX, enabling unprecedented visibility and interaction patterns for MCP clients.
209 lines
7.1 KiB
Markdown
# MCP Model-User Collaboration API

This document describes the JavaScript functions available to models for direct user communication and collaborative element selection within the Playwright MCP browser automation system.

## 🎯 Core Philosophy
Enable seamless collaboration between AI models and human users by providing simple JavaScript APIs for real-time communication, confirmations, and interactive element selection.

## 📱 Messaging System

### Basic Messaging
```javascript
// Send messages to users with auto-dismiss
mcpMessage('Hello user!', 'info', 5000)      // Info message (green)
mcpMessage('Success!', 'success', 3000)      // Success message (bright green)
mcpMessage('Warning!', 'warning', 4000)      // Warning message (yellow)
mcpMessage('Error occurred', 'error', 6000)  // Error message (red)
mcpMessage('Persistent', 'info', 0)          // Persistent until dismissed
```

### Helper Functions
```javascript
mcpNotify.info('Information for the user')   // Standard info message
mcpNotify.success('Task completed!')         // Success confirmation
mcpNotify.warning('Please be careful')       // Cautionary message
mcpNotify.error('Something went wrong')      // Error notification
mcpNotify.loading('Processing...')           // Persistent loading indicator
mcpNotify.done('All finished!')              // Quick success (3s auto-dismiss)
mcpNotify.failed('Task failed')              // Quick error (5s auto-dismiss)
```

## 🤝 User Confirmation System

### Interactive Prompts
```javascript
// Ask user for confirmation
const confirmed = await mcpPrompt('Should I proceed with this action?');
if (confirmed) {
  mcpNotify.success('User confirmed - proceeding!');
} else {
  mcpNotify.info('User cancelled the action');
}

// Custom confirmation with options
const result = await mcpPrompt('Do you want to login first?', {
  title: '🔐 LOGIN REQUIRED',
  confirmText: 'YES, LOGIN',
  cancelText: 'SKIP FOR NOW'
});
```

## 🔍 Collaborative Element Selection

### Interactive Element Inspector
```javascript
// Basic element selection
mcpInspector.start('Please click on the login button');

// Element selection with callback
mcpInspector.start(
  'Click on the element you want me to interact with',
  (elementDetails) => {
    // Model receives detailed element information
    console.log('User selected:', elementDetails);

    // Use the XPath for precise automation
    const xpath = elementDetails.xpath;
    mcpNotify.success(`Got it! I'll click on: ${elementDetails.textContent}`);

    // Now use xpath with Playwright tools...
  }
);

// Stop inspection programmatically
mcpInspector.stop();
```

### Element Details Returned
When the user clicks an element, the callback receives:
```javascript
{
  tagName: 'a',                        // HTML tag
  id: 'login-button',                  // Element ID (if present)
  className: 'btn btn-primary',        // CSS classes
  textContent: 'Login',                // Visible text (truncated to 100 chars)
  xpath: '//*[@id="login-button"]',    // Generated XPath
  attributes: {                        // All HTML attributes
    href: '/login',
    class: 'btn btn-primary',
    'data-action': 'login'
  },
  boundingRect: {                      // Element position/size
    x: 100, y: 200,
    width: 80, height: 32
  },
  visible: true                        // Element visibility status
}
```
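
As a rough illustration of how a model might consume this object, the sketch below checks that the selected element looks usable and turns its XPath into a selector string. The helper names `isLikelyClickable` and `toXPathSelector` are hypothetical and not part of the injected API; the `xpath=` prefix is the selector form Playwright accepts for XPath expressions.

```javascript
// Hypothetical helpers (not part of the injected API).
// `details` is assumed to have the shape documented above.

// True when the element reports itself visible and has a non-empty box.
function isLikelyClickable(details) {
  const rect = details.boundingRect || {};
  return details.visible === true && rect.width > 0 && rect.height > 0;
}

// Builds a selector string using the `xpath=` prefix.
function toXPathSelector(details) {
  if (!details || typeof details.xpath !== 'string' || details.xpath === '') {
    throw new Error('Element details do not include an XPath');
  }
  return `xpath=${details.xpath}`;
}

const sampleDetails = {
  tagName: 'a',
  id: 'login-button',
  textContent: 'Login',
  xpath: '//*[@id="login-button"]',
  boundingRect: { x: 100, y: 200, width: 80, height: 32 },
  visible: true
};

console.log(isLikelyClickable(sampleDetails)); // true
console.log(toXPathSelector(sampleDetails));   // xpath=//*[@id="login-button"]
```

Checking `visible` and the bounding box before acting is a cheap way to catch a click on a hidden or collapsed element before handing the selector to an automation step.
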

## 🚀 Collaboration Patterns

### 1. Ambiguous Element Selection
```javascript
// When multiple similar elements exist
const confirmed = await mcpPrompt('I see multiple login buttons. Should I click the main one in the header?');
if (!confirmed) {
  mcpInspector.start('Please click on the specific login button you want me to use');
}
```

### 2. Permission Requests
```javascript
// Ask before sensitive actions
const canProceed = await mcpPrompt('This will delete all items. Are you sure?', {
  title: '⚠️ DESTRUCTIVE ACTION',
  confirmText: 'YES, DELETE ALL',
  cancelText: 'CANCEL'
});
```
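
The prompt above only collects the answer. One way to make sure the sensitive step runs only after confirmation is a small wrapper like the sketch below. `runIfConfirmed` and its `deps` parameter are hypothetical names introduced for illustration; `deps` exists so the function can be exercised with stand-ins, and it falls back to the injected `mcpPrompt` and `mcpNotify` globals.

```javascript
// Hypothetical wrapper (not part of the injected API).
// Runs `action` only when the user confirms; reports the outcome either way.
async function runIfConfirmed(question, action, deps = {}) {
  const prompt = deps.prompt || globalThis.mcpPrompt;
  const notify = deps.notify || globalThis.mcpNotify;

  const confirmed = await prompt(question, {
    title: '⚠️ DESTRUCTIVE ACTION',
    confirmText: 'YES, DELETE ALL',
    cancelText: 'CANCEL'
  });

  if (!confirmed) {
    notify.info('Action cancelled, nothing was changed');
    return { ran: false };
  }

  try {
    const result = await action();
    notify.done('Action completed');
    return { ran: true, result };
  } catch (err) {
    notify.failed(`Action failed: ${err.message}`);
    throw err;
  }
}
```

A call might look like `await runIfConfirmed('This will delete all items. Are you sure?', deleteAllItems)`, where `deleteAllItems` stands for whatever automation step follows.
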

### 3. Form Field Identification
```javascript
// Help user identify form fields
mcpInspector.start(
  'Please click on the email input field',
  (element) => {
    // Case-insensitive check, in case tagName is reported in uppercase as the DOM does
    if (element.tagName.toLowerCase() !== 'input') {
      mcpNotify.warning('That doesn\'t look like an input field. Try again?');
      return;
    }
    mcpNotify.success('Perfect! I\'ll enter the email there.');
  }
);
```
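
The callback above warns on a wrong selection but does not start a new one. A retry loop could be sketched as follows. `pickElementOfType` and its `deps` parameter are hypothetical, and the sketch assumes that `mcpInspector.start` invokes its callback once per selection and may be called again from inside a callback (as the nested login workflow in this document does).

```javascript
// Hypothetical helper (not part of the injected API).
// Re-opens the inspector until the user picks an element with the expected tag.
function pickElementOfType(expectedTag, instruction, deps = {}, maxAttempts = 3) {
  const inspector = deps.inspector || globalThis.mcpInspector;
  const notify = deps.notify || globalThis.mcpNotify;

  return new Promise((resolve, reject) => {
    let attempts = 0;

    const ask = () => {
      attempts += 1;
      inspector.start(instruction, (element) => {
        if (String(element.tagName).toLowerCase() === expectedTag) {
          notify.success('Perfect! I\'ll use that element.');
          resolve(element);
        } else if (attempts < maxAttempts) {
          notify.warning(`That doesn't look like a <${expectedTag}> element. Try again?`);
          ask();
        } else {
          reject(new Error(`No <${expectedTag}> element selected after ${maxAttempts} attempts`));
        }
      });
    };

    ask();
  });
}
```

Capping the number of attempts keeps the model from trapping the user in an endless selection loop; on rejection the caller can fall back to a prompt or abandon the step.
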

### 4. Dynamic Content Handling
```javascript
// When content changes dynamically
mcpNotify.loading('Waiting for page to load...');
// ... wait for content ...
mcpNotify.done('Page loaded!');

// If the content has not finished loading in time, ask the user how to proceed
const shouldWait = await mcpPrompt('The content is still loading. Should I wait longer?');
```
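
The two halves of this pattern can be combined into one polling loop that checks for content, and asks the user before continuing once a round of checks is exhausted. This is a sketch under stated assumptions: `waitForContent`, `isReady`, `deps`, and the `checksPerRound` / `intervalMs` options are hypothetical names, and the document does not say whether `mcpNotify.done` replaces a persistent loading message.

```javascript
// Hypothetical helper (not part of the injected API).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls `isReady` and asks the user whether to keep waiting after each round.
// Resolves to true when the content is ready, false when the user gives up.
async function waitForContent(isReady, deps = {}, options = {}) {
  const prompt = deps.prompt || globalThis.mcpPrompt;
  const notify = deps.notify || globalThis.mcpNotify;
  const { checksPerRound = 5, intervalMs = 1000 } = options;

  notify.loading('Waiting for page to load...');
  while (true) {
    for (let i = 0; i < checksPerRound; i += 1) {
      if (await isReady()) {
        notify.done('Page loaded!');
        return true;
      }
      await sleep(intervalMs);
    }
    const keepWaiting = await prompt('The content is still loading. Should I wait longer?');
    if (!keepWaiting) {
      notify.failed('Stopped waiting for the page');
      return false;
    }
  }
}
```

Here `isReady` could be something like `() => document.querySelector('#results') !== null`, with the selector chosen for the page at hand.
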

## 🎨 Visual Design
All messages and prompts use the cyberpunk "hacker matrix" theme:
- Black background with neon green text (#00ff00)
- Terminal-style Courier New font
- Glowing effects and smooth animations
- High contrast for excellent readability
- ESC key support for cancellation

## 🛠️ Implementation Guidelines for Models

### Best Practices
1. **Clear Communication**: Use descriptive messages that explain what you're doing
2. **Ask for Permission**: Confirm before destructive or sensitive actions
3. **Collaborative Selection**: When an element's location is ambiguous, ask the user to click it
4. **Progress Updates**: Use loading/done messages for long operations
5. **Error Handling**: Provide clear error messages with next steps

### Example Workflows
```javascript
// Complete login workflow with collaboration
async function collaborativeLogin() {
  // 1. Ask for permission
  const shouldLogin = await mcpPrompt('I need to log in. Should I proceed?');
  if (!shouldLogin) return;

  // 2. Get user to identify elements
  mcpNotify.loading('Please help me find the login form...');

  mcpInspector.start('Click on the username/email field', (emailField) => {
    mcpNotify.success('Got the email field!');

    mcpInspector.start('Now click on the password field', (passwordField) => {
      mcpNotify.success('Got the password field!');

      mcpInspector.start('Finally, click the login button', (loginButton) => {
        mcpNotify.done('Perfect! I have all the elements I need.');

        // Now use the XPaths for automation
        // (performLogin is a placeholder for your own automation step)
        performLogin(emailField.xpath, passwordField.xpath, loginButton.xpath);
      });
    });
  });
}
```
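
The nested callbacks in this workflow can be flattened by wrapping `mcpInspector.start` in a Promise. The sketch below is one way to do that, not the project's own implementation: `pickElement` and `collectLoginXPaths` are hypothetical names, and the `inspector` parameter exists only so the functions can be run against a stand-in object.

```javascript
// Hypothetical helpers (not part of the injected API).

// Wraps the callback-style inspector in a Promise that resolves
// with the element details of the user's selection.
function pickElement(instruction, inspector = globalThis.mcpInspector) {
  return new Promise((resolve) => {
    inspector.start(instruction, resolve);
  });
}

// Asks for the three login elements in sequence and returns their XPaths.
async function collectLoginXPaths(inspector = globalThis.mcpInspector) {
  const emailField = await pickElement('Click on the username/email field', inspector);
  const passwordField = await pickElement('Now click on the password field', inspector);
  const loginButton = await pickElement('Finally, click the login button', inspector);

  return {
    email: emailField.xpath,
    password: passwordField.xpath,
    submit: loginButton.xpath
  };
}
```

With this shape, the workflow reads top to bottom, and a failure at any step can be handled with a single `try`/`catch` around `collectLoginXPaths()`.
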

## 🔧 Technical Notes

### Initialization
These functions are automatically available after injecting the collaboration system:
```javascript
// Check if available
if (typeof mcpMessage === 'function') {
  mcpNotify.success('Collaboration system ready!');
}
```
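
When a script may run before injection has happened, the availability check can be folded into a helper that falls back to the console. This is a minimal sketch; `getNotifier` and the `[mcp:level]` log prefix are hypothetical, and the `env` parameter defaults to the global object where the injected functions are assumed to live.

```javascript
// Hypothetical helper (not part of the injected API).
// Returns the injected mcpNotify when present, otherwise a console-backed stand-in
// exposing the same method names documented above.
function getNotifier(env = globalThis) {
  const injected =
    typeof env.mcpMessage === 'function' &&
    env.mcpNotify &&
    typeof env.mcpNotify.info === 'function';

  if (injected) {
    return env.mcpNotify;
  }

  const levels = ['info', 'success', 'warning', 'error', 'loading', 'done', 'failed'];
  const fallback = {};
  for (const level of levels) {
    fallback[level] = (text) => console.log(`[mcp:${level}] ${text}`);
  }
  return fallback;
}

getNotifier({}).info('Collaboration system not injected, logging to console');
```
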

### Error Handling
All functions include built-in error handling and fail gracefully if DOM manipulation isn't possible.

### Performance
- Messages are cleaned up automatically after display
- Event listeners are properly removed
- No memory leaks from repeated usage

This collaboration API turns MCP browser automation from a purely programmatic tool into an interactive, user-guided system that combines AI efficiency with human insight and precision.