The Browser Tool enables AI-driven browser interactions. Launch browser sessions, click elements, type text, scroll pages, and capture screenshots through natural language commands.

What You’ll Learn

  • Session lifecycle: launch → interact → close
  • Browser actions: click, type, scroll
  • Use cases: UI testing, screenshots, navigation

Session Lifecycle

Every browser automation workflow follows a strict sequence:
  1. Launch - Start a browser session at a target URL
  2. Interact - Perform actions (click, type, scroll)
  3. Close - End the session to release resources
Browser state persists across actions within a session. You must close the browser before using other Verdent tools.
Each action returns a screenshot showing the current browser state. Review screenshots between actions to verify success before proceeding.
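The lifecycle rules above can be sketched as a small state model. This is an illustrative sketch only: `BrowserSessionModel` and its methods are hypothetical names, not part of Verdent, and the model only records actions instead of driving a real browser.

```python
from enum import Enum


class SessionState(Enum):
    IDLE = "idle"      # no browser open; other tools may run
    ACTIVE = "active"  # browser open; only browser actions are allowed


class BrowserSessionModel:
    """Hypothetical model of the launch -> interact -> close rules."""

    def __init__(self):
        self.state = SessionState.IDLE
        self.history = []  # actions recorded in the order they were accepted

    def launch(self, url):
        # Only one session can be active at a time.
        if self.state is SessionState.ACTIVE:
            raise RuntimeError("a session is already active; close it first")
        self.state = SessionState.ACTIVE
        self.history.append(f"launch {url}")

    def interact(self, action):
        # Click, type, and scroll are only valid between launch and close.
        if self.state is not SessionState.ACTIVE:
            raise RuntimeError("launch a session before interacting")
        self.history.append(action)

    def close(self):
        if self.state is not SessionState.ACTIVE:
            raise RuntimeError("no active session to close")
        self.state = SessionState.IDLE
        self.history.append("close")

    def other_tools_allowed(self):
        # Other tools are usable only when no browser session is open.
        return self.state is SessionState.IDLE
```

In this model, calling `interact` before `launch` or after `close` raises an error, which mirrors the strict ordering described above.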

Browser Actions

Launch - Start a new browser session
  • Required: target URL
  • Opens browser at 1920x1080 resolution
  • Always the first action in any workflow
Launch browser at https://example.com
Click coordinates are relative to the 1920x1080 viewport, so the center is approximately (960, 540). Use the returned screenshots to estimate element positions.
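The coordinate arithmetic can be sketched in a few helper functions. These helpers are hypothetical and not part of the Browser Tool. They assume that valid pixel positions run from 0 to 1919 horizontally and 0 to 1079 vertically, and that a screenshot shows the full viewport even when it is viewed at a smaller size.

```python
from typing import Tuple

VIEWPORT_WIDTH = 1920
VIEWPORT_HEIGHT = 1080


def viewport_center() -> Tuple[int, int]:
    """Return the approximate center of the fixed viewport."""
    return VIEWPORT_WIDTH // 2, VIEWPORT_HEIGHT // 2


def in_viewport(x: int, y: int) -> bool:
    """Check that a point falls inside the viewport (assumed 0-based pixels)."""
    return 0 <= x < VIEWPORT_WIDTH and 0 <= y < VIEWPORT_HEIGHT


def scale_from_screenshot(px: float, py: float,
                          shot_width: int, shot_height: int) -> Tuple[int, int]:
    """Map a point measured on a resized screenshot to viewport coordinates.

    Assumes the screenshot covers the whole viewport, so a uniform scale applies.
    """
    x = round(px * VIEWPORT_WIDTH / shot_width)
    y = round(py * VIEWPORT_HEIGHT / shot_height)
    # Clamp so that points on the far edges still land inside the viewport.
    x = min(max(x, 0), VIEWPORT_WIDTH - 1)
    y = min(max(y, 0), VIEWPORT_HEIGHT - 1)
    return x, y
```

For example, a point at (480, 270) on a screenshot viewed at 960x540 corresponds to (960, 540) in the viewport.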

Common Use Cases

Test form submissions and navigation flows
Launch at a login page, click input fields, type credentials, submit forms, and verify results through screenshots.
Launch browser at https://app.example.com/login
Click coordinates 450,280
Type "testuser@example.com"
Click coordinates 450,340
Type "password123"
Click coordinates 500,420
Close browser
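Before running a sequence like the one above, it can help to sanity-check it. The following is a minimal sketch of such a check. The patterns only cover the exact phrasings used in this example; Verdent accepts natural language, so this is not its command grammar, and `parse_step` and `check_plan` are hypothetical helpers.

```python
import re

VIEWPORT = (1920, 1080)

# Patterns for the phrasings used in the example above (an assumption,
# not an official command syntax).
PATTERNS = [
    ("launch", re.compile(r"^Launch browser at (\S+)$")),
    ("click", re.compile(r"^Click coordinates (\d+),\s*(\d+)$")),
    ("type", re.compile(r'^Type "(.*)"$')),
    ("close", re.compile(r"^Close browser$")),
]


def parse_step(line):
    """Turn one command line into a tuple such as ("click", "450", "280")."""
    for kind, pattern in PATTERNS:
        match = pattern.match(line.strip())
        if match:
            return (kind,) + match.groups()
    raise ValueError(f"unrecognized step: {line!r}")


def check_plan(lines):
    """Return a list of problems; an empty list means the plan looks sane."""
    steps = [parse_step(line) for line in lines]
    problems = []
    if not steps or steps[0][0] != "launch":
        problems.append("plan must start with a launch step")
    if not steps or steps[-1][0] != "close":
        problems.append("plan must end with a close step")
    for number, step in enumerate(steps, start=1):
        if step[0] == "click":
            x, y = int(step[1]), int(step[2])
            if not (0 <= x < VIEWPORT[0] and 0 <= y < VIEWPORT[1]):
                problems.append(
                    f"step {number}: click ({x}, {y}) is outside the viewport"
                )
    return problems


LOGIN_PLAN = [
    "Launch browser at https://app.example.com/login",
    "Click coordinates 450,280",
    'Type "testuser@example.com"',
    "Click coordinates 450,340",
    'Type "password123"',
    "Click coordinates 500,420",
    "Close browser",
]
```

Calling `check_plan(LOGIN_PLAN)` returns an empty list, while a plan that omits the final `Close browser` step is flagged.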

Limitations

  • Tool exclusivity - Only browser_action can be used during active sessions
  • Coordinate-based - Requires x,y coordinates, not CSS selectors
  • Fixed resolution - Browser viewport locked at 1920x1080
  • Chrome only - The tool relies on Puppeteer and supports Chrome/Chromium browsers only
  • No persistence - Sessions don’t survive Verdent restarts
  • No WSL support - Browser Tool does not work in WSL environments
  • No saved state - Each session starts fresh without cookies or authentication
  • Single session - Only one browser session can be active at a time
Always close the browser session before using file operations, search tools, or bash commands. The browser locks other tools during active sessions.
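If you script workflows around this rule, one way to make sure the close step always happens is to tie it to a `finally` block. The sketch below assumes a hypothetical `browser_session` helper that only records steps in a list; it does not call Verdent or open a browser.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(log, url):
    """Record a launch, hand control to the caller, and always record a close."""
    log.append(f"launch {url}")
    try:
        yield log
    finally:
        # Runs even if an interaction raises, so the session is never left open.
        log.append("close")
```

A usage example under the same assumptions:

```python
steps = []
with browser_session(steps, "https://example.com") as session:
    session.append("click 960,540")
# The close step is recorded before any other tool would run.
```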
