A ready-to-run example is available here!
The BrowserToolSet integration enables your agent to interact with web pages through automated browser control. Built on top of browser-use, it provides capabilities for navigating websites, clicking elements, filling forms, and extracting content - all through natural language instructions.
How It Works
The ready-to-run example demonstrates combining multiple tools to create a capable web research agent:
- BrowserToolSet: Provides automated browser control for web interaction
- FileEditorTool: Allows the agent to read and write files if needed
- BashTool: Enables command-line operations for additional functionality
With these tools, the agent can:
- Navigate to specified URLs
- Interact with web page elements (clicking, scrolling, etc.)
- Extract and analyze content from web pages
- Summarize information from multiple sources
Customization
For advanced use cases requiring only a subset of browser tools or custom configurations, you can manually register individual browser tools. Refer to the BrowserToolSet definition to see the available individual tools, then create a BrowserToolExecutor with customized tool configurations before constructing the Agent.
This gives you fine-grained control over which browser capabilities are exposed to the agent.
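The idea of exposing only a subset of capabilities can be sketched in plain Python as an allowlist filter. This is a minimal illustration, not the SDK's API: `ToolSpec`, `select_tools`, and the tool names below are hypothetical stand-ins, and the real tool names and registration calls should be taken from the BrowserToolSet definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical stand-in for a tool definition (not an SDK class)."""
    name: str
    description: str


# Illustrative names only; the real ones live in the BrowserToolSet definition.
ALL_BROWSER_TOOLS = [
    ToolSpec("navigate", "Open a URL"),
    ToolSpec("click", "Click an element"),
    ToolSpec("type_text", "Fill a form field"),
    ToolSpec("get_content", "Extract page content"),
]


def select_tools(available, allowed_names):
    """Return the allowlisted tools in their original order.

    Raises ValueError if an allowlisted name does not exist, so a typo
    cannot silently drop a capability.
    """
    known = {tool.name for tool in available}
    unknown = set(allowed_names) - known
    if unknown:
        raise ValueError(f"Unknown tool names: {sorted(unknown)}")
    return [tool for tool in available if tool.name in allowed_names]


# Expose only read-oriented capabilities to the agent.
read_only = select_tools(ALL_BROWSER_TOOLS, {"navigate", "get_content"})
print([tool.name for tool in read_only])
```

Failing loudly on an unknown name is a deliberate choice here: when narrowing what an agent may do, a misspelled entry is better caught at setup time than discovered as a missing capability later.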
Ready-to-run Example
This example is available on GitHub: examples/01_standalone_sdk/15_browser_use.py
The model name should follow the LiteLLM convention: provider/model_name (e.g., anthropic/claude-sonnet-4-5-20250929, openai/gpt-4o).
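As a small sketch of the provider/model_name convention, the helper below checks a model string before it is passed on. `split_model_name` is a hypothetical helper written for illustration; it is not part of LiteLLM or the SDK, and it only validates the shape of the string, not whether the provider or model exists.

```python
def split_model_name(model: str) -> tuple[str, str]:
    """Split 'provider/model_name' into (provider, model_name).

    Only the first slash separates the provider, so any further slashes
    stay in the model name.
    """
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(
            f"Expected 'provider/model_name', got {model!r}"
        )
    return provider, name


provider, name = split_model_name("anthropic/claude-sonnet-4-5-20250929")
print(provider)  # anthropic
print(name)      # claude-sonnet-4-5-20250929
```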
The LLM_API_KEY should be the API key for your chosen provider.
Next Steps
- Custom Tools - Create specialized tools
- MCP Integration - Connect external services

