Getting Started with Browser¶
The Amazon Bedrock AgentCore Browser provides a secure, managed environment for web browsing and automation. This guide will help you implement powerful browser capabilities in your agent applications.
Prerequisites¶
Before using the browser tool, ensure you have:
- An AWS account with appropriate permissions
- Python 3.10+ installed
Install the SDK¶
pip install bedrock-agentcore
Create a Browser Session¶
The bedrock-agentcore SDK provides a convenient way to create browser sessions:
from bedrock_agentcore.tools.browser_client import browser_session
# Create a browser session using the context manager
with browser_session("us-west-2") as client:
# The session_id is automatically generated
print(f"Session ID: {client.session_id}")
# Generate WebSocket URL and headers for connecting automation frameworks
websocket_url, headers = client.generate_ws_headers()
# Use these to connect your preferred browser automation tool
# (See examples below)
# The session is automatically closed when exiting the context manager
For more control over the session lifecycle:
from bedrock_agentcore.tools.browser_client import BrowserClient
# Create a browser client
client = BrowserClient(region="us-west-2")
# Start a browser session
client.start()
print(f"Session ID: {client.session_id}")
try:
# Generate WebSocket URL and headers
url, headers = client.generate_ws_headers()
# Perform browser operations with your preferred automation tool
finally:
# Always close the session when done
client.stop()
Integration Examples¶
Example 1: Browser Automation with Nova Act¶
Install dependencies¶
pip install nova-act
import time
from bedrock_agentcore.tools.browser_client import browser_session
from nova_act import NovaAct
from rich.console import Console
NOVA_ACT_API_KEY = "YOUR_NOVA_ACT_API_KEY"
console = Console()
def main():
try:
# Step 1: Create browser session
with browser_session("us-west-2") as client:
print("\r ✅ Browser ready! ")
ws_url, headers = client.generate_ws_headers()
# Step 2: Use Nova Act to interact with the browser
with NovaAct(
cdp_endpoint_url=ws_url,
cdp_headers=headers,
preview={"playwright_actuation": True},
nova_act_api_key=NOVA_ACT_API_KEY,
starting_page="https://www.amazon.com",
) as nova_act:
result = nova_act.act("Search for coffee maker and get the details of the lowest priced one on the first page")
console.print(f"\n[bold green]Nova Act Result:[/bold green] {result}")
except KeyboardInterrupt:
console.print("\n\n[yellow]Shutting down...[/yellow]")
if 'client' in locals():
client.stop()
print("✅ Browser session terminated")
except Exception as e:
print(f"\n[red]Error: {e}[/red]")
import traceback
traceback.print_exc()
if __name__ == "__main__":
main()
Example 2: Using Playwright for Browser Control¶
Install dependencies¶
pip install playwright
You can use the Playwright automation framework with the Browser Tool:
import time
import base64
from datetime import datetime
from playwright.sync_api import sync_playwright, Playwright, BrowserType
from bedrock_agentcore.tools.browser_client import browser_session
def capture_cdp_screenshot(context, page, filename_prefix="screenshot", image_format="jpeg"):
"""Capture a screenshot using the CDP API and save to file."""
cdp_client = context.new_cdp_session(page)
screenshot_data = cdp_client.send("Page.captureScreenshot", {
"format": image_format,
"quality": 80,
"captureBeyondViewport": True
})
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"{filename_prefix}_{timestamp}.{image_format}"
image_bytes = base64.b64decode(screenshot_data['data'])
with open(filename, "wb") as f:
f.write(image_bytes)
print(f"✅ Screenshot saved: {filename}")
return filename
def main(playwright: Playwright):
with browser_session("us-west-2") as client:
print("📡 Browser session started... waiting for readiness")
ws_url, headers = client.generate_ws_headers()
chromium: BrowserType = playwright.chromium
browser = chromium.connect_over_cdp(ws_url, headers=headers)
try:
context = browser.contexts[0] if browser.contexts else browser.new_context()
page = context.pages[0] if context.pages else context.new_page()
# Step 1: Navigate to Amazon
print("🌐 Navigating to Amazon...")
page.goto("https://www.amazon.com", wait_until="domcontentloaded")
time.sleep(2)
capture_cdp_screenshot(context, page, "amazon_home")
# Step 2: Search for "coffee maker"
print("🔎 Searching for 'coffee maker'...")
page.fill("input#twotabsearchtextbox", "coffee maker")
page.keyboard.press("Enter")
page.wait_for_selector(".s-result-item", timeout=10000)
time.sleep(2)
capture_cdp_screenshot(context, page, "coffee_maker_results")
finally:
print("🔒 Closing browser session...")
if not page.is_closed():
page.close()
browser.close()
if __name__ == "__main__":
with sync_playwright() as p:
main(p)
Browser Tool Architecture¶
The AWS Browser Tool provides:
- Fully managed browser environment running in the AWS cloud
- WebSocket-based CDP interface for browser control
- Integration with popular automation frameworks like Playwright, Puppeteer, and NovaAct
- Security and isolation between browser sessions
Best Practices¶
To get the most out of the browser tool:
- Use context managers for proper resource cleanup
- Handle exceptions properly to ensure browser sessions are closed
- Consider performance when making multiple browser requests
- Secure sensitive data - never hardcode credentials in browser automation code
- Use appropriate timeouts for network and page load operations
- Add error recovery for common browser automation challenges