Claude can view a screen through screenshots and perform mouse and keyboard actions, so desktop workflows can be automated from natural-language instructions. Computer use can be combined with other tools for complex tasks. The feature is currently in beta and comes with safety considerations.
Capabilities
View screen contents through screenshots. Move the mouse and click elements. Type text and use keyboard shortcuts. Navigate between applications.
To put these capabilities to use:
- Enable computer use in API requests
- Provide screenshots for Claude to analyze
- Execute returned tool calls for actions
- Handle safety constraints appropriately
- Test thoroughly before production use
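The steps above can be sketched as a small loop: take a screenshot, ask the model for the next action, execute it, and feed the result back. The sketch below is offline and makes no API calls. `FakeDesktop`, `run_loop`, `scripted_model`, and `ALLOWED_ACTIONS` are hypothetical names for illustration, and the action dictionaries (`action`, `coordinate`, `text`) are an assumed shape, not a confirmed schema. A real integration would replace the scripted model with an API request and the fake desktop with an actual input driver.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Assumption: a small allowlist stands in for "safety constraints".
# Anything not listed here is refused rather than executed.
ALLOWED_ACTIONS = {"screenshot", "left_click", "type", "key"}


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop driver: records actions instead of performing them."""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        self.log.append(("screenshot",))
        return "<screenshot placeholder>"

    def execute(self, call: dict) -> str:
        action = call.get("action")
        if action not in ALLOWED_ACTIONS:
            return f"refused: {action}"
        if action == "screenshot":
            return self.screenshot()
        if action == "left_click":
            x, y = call["coordinate"]
            self.log.append(("left_click", x, y))
            return "clicked"
        # "type" and "key" both carry their payload in an assumed "text" field.
        self.log.append((action, call["text"]))
        return "typed" if action == "type" else "pressed"


def run_loop(next_call: Callable[[str], Optional[dict]],
             desktop: FakeDesktop,
             max_steps: int = 10) -> list:
    """Feed each observation to the model and execute the action it returns."""
    observation = desktop.screenshot()  # initial view of the screen
    results = []
    for _ in range(max_steps):  # hard cap so a confused model cannot loop forever
        call = next_call(observation)
        if call is None:  # model signals that the task is finished
            break
        observation = desktop.execute(call)
        results.append(observation)
    return results


def scripted_model(calls: list) -> Callable[[str], Optional[dict]]:
    """Replay a fixed list of actions; useful for testing the loop without a model."""
    remaining = iter(calls)
    return lambda observation: next(remaining, None)


if __name__ == "__main__":
    desktop = FakeDesktop()
    script = [
        {"action": "left_click", "coordinate": [10, 20]},
        {"action": "type", "text": "hello"},
        {"action": "drag"},  # not on the allowlist, so it is refused
    ]
    print(run_loop(scripted_model(script), desktop))
    print(desktop.log)
```

Replaying a fixed script through the loop is one way to cover the "test thoroughly" step: the allowlist and the step cap can both be exercised without touching a real screen.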
Use Cases
Automate repetitive desktop tasks. Test web applications visually. Operate legacy applications that have no API. Assist users with complex workflows.