Stagehand.dev Configuration Guide

Stagehand is an AI-powered browser automation framework that lets you choose when to write code and when to use natural language, which makes it a good fit for production e2e and UI testing. This guide gives step-by-step instructions for a complete setup with Browserbase, OpenAI, Vitest, and Node.js 22.

Setting up your Stagehand environment from scratch

Stagehand requires Node.js 20 or later and should not be run under Bun because of Playwright compatibility issues. Start by installing the right Node.js version with nvm: nvm install 22 && nvm use 22. Stagehand offers three installation methods; the interactive create-browser-app is the most beginner-friendly.
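The version requirement above can be enforced with a small preflight check. This is a minimal sketch, not part of Stagehand: the helper names parseNodeMajor and isSupportedNode are hypothetical, and the minimum of 20 is taken from this guide.

```typescript
// Minimum Node.js major version stated in this guide.
const MIN_NODE_MAJOR = 20;

/** Extracts the major version from strings such as "v22.3.0" or "20.11.1". */
function parseNodeMajor(version: string): number | null {
  const match = /^v?(\d+)\./.exec(version.trim());
  return match ? Number.parseInt(match[1], 10) : null;
}

/** Returns true when the version string meets the minimum major version. */
function isSupportedNode(version: string): boolean {
  const major = parseNodeMajor(version);
  return major !== null && major >= MIN_NODE_MAJOR;
}
```

In a setup script, passing process.version to isSupportedNode and exiting early on false gives a clearer error than a failure deep inside browser startup.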

Quick start with create-browser-app provides an interactive setup that configures everything automatically. Running npx create-browser-app prompts you for project name, AI model preference (OpenAI GPT-4 vs Anthropic Claude), environment choice (Local vs Browserbase), and browser visibility settings. For specific templates, use npx create-browser-app --example quickstart or other available examples like chess or deploy-vercel.

Manual installation for existing projects requires installing core dependencies with npm install @browserbasehq/stagehand zod, followed by npx playwright install for local browser support. TypeScript projects should add npm install -D typescript @types/node for proper type support. The manual approach gives you more control over configuration but requires setting up environment variables and config files yourself.

Building from source offers maximum control for contributors or those needing bleeding-edge features. Clone the repository with git clone https://github.com/browserbase/stagehand.git, run npm install and npx playwright install, then build with npm run build. Examples can be run immediately with npm run example or specific demos like npm run example 2048.

Integrating Browserbase for scalable cloud execution

Browserbase transforms Stagehand from a local automation tool into a scalable cloud solution, offering SOC-2 Type 1 and HIPAA compliant browser infrastructure with advanced features like session replay, captcha solving, and global browser locations. The integration removes local resource constraints and provides production-ready deployment capabilities.

Setting up Browserbase requires signing up at browserbase.com, navigating to the Overview Dashboard, and copying your API key and Project ID. These credentials enable cloud browser execution through environment variables: BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID. Optional session resumption uses BROWSERBASE_SESSION_ID for continuing previous browser sessions.
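Missing credentials are easier to diagnose when they are checked before any browser session is requested. The sketch below reads the three variables named above from a plain object such as process.env. The helper readBrowserbaseCredentials and its return shape are hypothetical, not part of the Stagehand or Browserbase APIs.

```typescript
type Env = Record<string, string | undefined>;

interface BrowserbaseCredentials {
  apiKey: string;
  projectId: string;
  sessionId?: string; // only set when BROWSERBASE_SESSION_ID is present
}

/** Reads the Browserbase variables and fails fast, naming every missing one. */
function readBrowserbaseCredentials(env: Env): BrowserbaseCredentials {
  const required = ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"] as const;
  const missing = required.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const sessionId = env.BROWSERBASE_SESSION_ID?.trim();
  return {
    apiKey: env.BROWSERBASE_API_KEY!.trim(),
    projectId: env.BROWSERBASE_PROJECT_ID!.trim(),
    ...(sessionId ? { sessionId } : {}),
  };
}
```

Taking the environment as a parameter rather than reading process.env directly keeps the function easy to unit test.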

Advanced Browserbase configuration provides fine-grained control over browser behavior:

const stagehand = new Stagehand({
  env: 'BROWSERBASE',
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,

  browserbaseSessionCreateParams: {
    projectId: process.env.BROWSERBASE_PROJECT_ID!,
    browserSettings: {
      blockAds: true,
      recordSession: true,
      solveCaptchas: true,
      viewport: { width: 1920, height: 1080 },
      locale: 'en-US',
      timezone: 'America/New_York',
    },
  },
})

Local execution suits development and simple tests, while Browserbase excels for production deployments, parallel test execution, and scenarios requiring consistent browser environments. The session recording feature particularly helps with debugging complex automations.

Configuring AI providers for intelligent automation

Stagehand supports multiple AI providers with OpenAI being the most popular choice, offering models from GPT-4o-mini for cost-effective testing to GPT-4o for complex automations. Anthropic's Claude models provide strong alternatives, especially Claude 3.5 Sonnet for nuanced understanding. Google's Gemini 2.0 Flash offers competitive performance at lower costs.

Environment setup for AI providers requires API keys in your .env file:

# Primary providers
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GOOGLE_GENERATIVE_AI_API_KEY=your-google-key

Model configuration in code allows switching between providers easily:

// OpenAI configuration (recommended for most use cases)
const stagehand = new Stagehand({
  modelName: 'gpt-4o', // or "gpt-4o-mini" for cheaper testing
  modelClientOptions: {
    apiKey: process.env.OPENAI_API_KEY,
  },
})

// Anthropic for complex reasoning tasks
// (named differently so both snippets can live in the same file)
const claudeStagehand = new Stagehand({
  modelName: 'claude-3-5-sonnet-latest',
  modelClientOptions: {
    apiKey: process.env.ANTHROPIC_API_KEY,
  },
})

Computer use models represent Stagehand 2.0's most powerful feature, enabling complex multi-step automations with a single instruction:

const agent = stagehand.agent({
  provider: 'openai',
  model: 'computer-use-preview',
  instructions: 'You are a helpful web automation assistant.',
})

await agent.execute('Find the cheapest flight from NYC to SF next month and add it to cart')

Token usage monitoring helps control costs with stagehand.metrics providing detailed breakdowns of prompt tokens, completion tokens, and inference times for each AI method. Enable logInferenceToFile: true to capture all LLM interactions for debugging and optimization.
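Token counts become more actionable once they are converted into an approximate cost. The following is a rough sketch under stated assumptions: the field names totalPromptTokens and totalCompletionTokens are assumed here and should be checked against the metrics object in your Stagehand version, and the prices are caller-supplied placeholders, not real provider rates.

```typescript
// Assumed shape: only the two totals this sketch needs.
interface TokenTotals {
  totalPromptTokens: number;
  totalCompletionTokens: number;
}

// Caller-supplied prices in USD per one million tokens (hypothetical values).
interface PricePerMillion {
  promptUsd: number;
  completionUsd: number;
}

const TOKENS_PER_MILLION = 1e6;

/** Converts token totals into an estimated cost in USD. */
function estimateCostUsd(totals: TokenTotals, price: PricePerMillion): number {
  const promptCost = (totals.totalPromptTokens / TOKENS_PER_MILLION) * price.promptUsd;
  const completionCost =
    (totals.totalCompletionTokens / TOKENS_PER_MILLION) * price.completionUsd;
  return promptCost + completionCost;
}
```

Logging this estimate at the end of a test run makes it easier to notice when a change to an instruction increases spend.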

Vitest configuration for robust test suites

While Stagehand doesn't mandate a specific test runner, Vitest provides an excellent modern testing experience with native TypeScript support and fast execution. Install Vitest with npm install -D vitest @vitest/ui and create a configuration that accommodates browser automation's longer execution times.

Essential Vitest configuration handles the unique requirements of browser testing:

// vitest.config.ts
import { defineConfig } from 'vitest/config'

export default defineConfig({
  test: {
    testTimeout: 60000, // Browser operations need more time
    hookTimeout: 30000,
    teardownTimeout: 10000,
    globals: true,
    environment: 'node',
  },
})

Structured test example demonstrates proper setup and teardown:

import { describe, it, expect, beforeAll, afterAll } from 'vitest'
import { Stagehand } from '@browserbasehq/stagehand'
import { z } from 'zod'

describe('E2E User Journey Tests', () => {
  let stagehand: Stagehand

  beforeAll(async () => {
    stagehand = new Stagehand({
      env: 'LOCAL',
      localBrowserLaunchOptions: { headless: true }, // headless is a browser launch option
      verbose: 0, // Reduce noise in test output
    })
    await stagehand.init()
  })

  afterAll(async () => {
    await stagehand.close()
  })

  it('should complete checkout flow', async () => {
    const page = stagehand.page
    await page.goto('https://demo-store.com')

    await page.act("Search for 'wireless headphones'")
    await page.act('Click on the first product')

    const { price } = await page.extract({
      instruction: 'Extract the product price',
      schema: z.object({
        price: z.string(),
      }),
    })

    expect(price).toMatch(/\$\d+\.\d{2}/)

    await page.act('Add to cart')
    await page.act('Proceed to checkout')
  })
})
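The regex assertion in the test above only checks the shape of the extracted string, and it does not match prices with thousands separators such as $1,299.00. If a test needs to compare amounts numerically, the string can be converted to integer cents first. This is a minimal sketch with a hypothetical helper name, assuming US-style formatting.

```typescript
/**
 * Parses a US-style price such as "$49.99" or "$1,299.00" into integer cents.
 * Returns null when no price is found in the text.
 */
function parsePriceToCents(text: string): number | null {
  const match = /\$\s?(\d{1,3}(?:,\d{3})*|\d+)\.(\d{2})/.exec(text);
  if (!match) return null;
  const dollars = Number.parseInt(match[1].replace(/,/g, ""), 10);
  const cents = Number.parseInt(match[2], 10);
  // Integer arithmetic avoids floating-point rounding on currency values.
  return dollars * 100 + cents;
}
```

Inside the test, an assertion such as expect(parsePriceToCents(price)).toBeGreaterThan(0) then checks the value rather than only its format.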

Package.json scripts streamline test execution: "test": "vitest" for watch mode during development, "test:run": "vitest run" for CI/CD pipelines, and "test:ui": "vitest --ui" for visual test debugging.

Node.js 22 environment considerations

Node.js 22 works well with Stagehand; version 20 is the minimum requirement. The critical limitation is Bun's incompatibility with Playwright, which leaves Node.js as the only viable runtime. Use nvm for version management to keep development environments consistent.

Package manager selection impacts project maintainability. NPM works well for smaller projects with straightforward dependencies. PNPM excels in monorepos or projects with many dependencies, offering better disk space efficiency and faster installations through content-addressable storage.

TypeScript configuration optimizes for modern JavaScript features:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "node",
    "strict": true,
    "esModuleInterop": true,
    "lib": ["ES2022", "DOM", "DOM.Iterable"]
  }
}

Best practices for resilient automation

Writing effective Stagehand automations requires understanding when to leverage AI versus deterministic code. Use Stagehand's AI methods (act, extract, observe) for navigating unfamiliar pages, handling dynamic content, or building resilience against DOM changes. Reserve traditional Playwright for known selectors, performance-critical paths, and controlled environments.

Atomic actions yield better results than broad instructions. Instead of await page.act("Sign in to the website"), break it down: await page.act("Click the 'Sign In' button in the header") followed by await page.act("Type 'john@example.com' in the email field"). This granularity improves reliability and makes debugging easier.
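One way to keep instructions atomic is to hold them in a list and run them in order, stopping at the first failure so the failing step is obvious. The sketch below is hypothetical helper code, not a Stagehand API. It depends only on an object with an act(instruction) method, which stagehand.page provides in the examples in this guide.

```typescript
// Minimal structural type: anything with an act() method, such as stagehand.page.
interface Actor {
  act(instruction: string): Promise<unknown>;
}

interface StepFailure {
  index: number; // zero-based position of the failing step
  instruction: string;
  error: unknown;
}

/** Runs instructions one at a time; returns the first failure, or null on success. */
async function runSteps(page: Actor, steps: readonly string[]): Promise<StepFailure | null> {
  for (let index = 0; index < steps.length; index++) {
    const instruction = steps[index];
    try {
      await page.act(instruction);
    } catch (error) {
      return { index, instruction, error };
    }
  }
  return null;
}
```

A sign-in flow would then be written as runSteps(page, ["Click the 'Sign In' button in the header", "Type 'john@example.com' in the email field"]), and a non-null result points directly at the step to debug.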

Self-healing configuration enables automatic recovery from transient failures:

const stagehand = new Stagehand({
  selfHeal: true,
  domSettleTimeoutMs: 30000, // Wait for dynamic content
  enableCaching: true, // Reuse successful action patterns
})

Performance optimization balances speed with reliability. Use observe() to preview actions before execution, saving tokens on repeated operations. Combine Playwright's speed for navigation with Stagehand's intelligence for interaction: await page.goto("https://example.com") followed by await page.act("Click the search button").
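The token saving from observe() comes from reusing its result: the action found on the first run can be replayed later without another model call. The class below is a rough sketch of that idea, assuming a page whose observe() returns a list of candidate actions and whose act() accepts one of them. The ObservedActionCache class and the ObservingPage interface are hypothetical and use a structural type instead of importing Stagehand.

```typescript
// Assumed minimal page shape: observe() proposes actions, act() executes one.
interface ObservingPage<A> {
  observe(instruction: string): Promise<A[]>;
  act(action: A): Promise<unknown>;
}

/** Remembers the first observed action per instruction and replays it later. */
class ObservedActionCache<A> {
  private readonly cache = new Map<string, A>();

  /** Returns false when observe() finds nothing to act on. */
  async run(page: ObservingPage<A>, instruction: string): Promise<boolean> {
    let action: A;
    if (this.cache.has(instruction)) {
      action = this.cache.get(instruction) as A;
    } else {
      const candidates = await page.observe(instruction);
      if (candidates.length === 0) return false;
      action = candidates[0];
      this.cache.set(instruction, action);
    }
    try {
      await page.act(action);
    } catch (error) {
      // The page may have changed; drop the stale entry so the next run re-observes.
      this.cache.delete(instruction);
      throw error;
    }
    return true;
  }
}
```

The cache here lives in memory for one process. Persisting it between runs would need a serializable action type and a way to invalidate entries when the site changes.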

Complete configuration examples

A production-ready stagehand.config.ts demonstrates comprehensive configuration:

import type { ConstructorParams } from '@browserbasehq/stagehand'
import dotenv from 'dotenv'

dotenv.config()

const StagehandConfig: ConstructorParams = {
  env: process.env.BROWSERBASE_API_KEY ? 'BROWSERBASE' : 'LOCAL',

  // Browserbase cloud configuration
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
  browserbaseSessionCreateParams: {
    projectId: process.env.BROWSERBASE_PROJECT_ID!,
    browserSettings: {
      blockAds: true,
      recordSession: true,
      solveCaptchas: true,
      viewport: { width: 1920, height: 1080 },
    },
  },

  // AI model settings
  modelName: 'gpt-4o',
  modelClientOptions: {
    apiKey: process.env.OPENAI_API_KEY,
  },

  // Performance optimizations
  enableCaching: true,
  selfHeal: true,
  domSettleTimeoutMs: 30000,

  // Debugging configuration
  verbose: process.env.NODE_ENV === 'development' ? 2 : 0,
  debugDom: process.env.NODE_ENV === 'development',
  logInferenceToFile: process.env.LOG_INFERENCE === 'true',
}

export default StagehandConfig

Troubleshooting common challenges

Browser automation failures often stem from timing issues or dynamic content. Increase domSettleTimeoutMs for slow-loading pages, use observe() to debug element detection, and leverage computer use agents for iframe-heavy sites that resist traditional automation.

API key issues manifest as authentication errors. Verify that environment variables load correctly with console.log("API Key present:", !!process.env.OPENAI_API_KEY). Ensure .env sits in the project root and is listed in .gitignore so that keys are never committed. To cope with rate limiting, implement retry logic, and use cheaper models like gpt-4o-mini during development.
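Retry logic for rate limits is commonly implemented as exponential backoff. The helper below is a generic sketch, not a Stagehand feature: the name withRetry and its options are hypothetical, and it retries on every error rather than inspecting the error for a rate-limit status, which a production version would likely want to do.

```typescript
interface RetryOptions {
  retries: number; // additional attempts after the first one
  baseDelayMs: number; // delay before the first retry; doubles on each retry
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/** Runs the operation, retrying with exponential backoff when it throws. */
async function withRetry<T>(
  operation: () => Promise<T>,
  { retries, baseDelayMs }: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

A call such as withRetry(() => page.act("Click the search button"), { retries: 3, baseDelayMs: 1000 }) waits 1, 2, and 4 seconds between attempts before giving up.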

Advanced debugging combines multiple techniques:

const stagehand = new Stagehand({
  verbose: 2,
  debugDom: true,
  logInferenceToFile: true,
  localBrowserLaunchOptions: {
    headless: false,
    devtools: true, // Opens Chrome DevTools
  },
  logger: (logLine) => {
    if (logLine.category === 'error') {
      console.error(`[ERROR] ${logLine.message}`)
    }
  },
})

Latest Stagehand 2.0 capabilities

Version 2.0 introduced native computer use model support, enabling complex multi-step automations with minimal code. The act() and extract() methods became faster by working from the accessibility tree instead of parsing the DOM. Enhanced caching reduces token usage while maintaining accuracy.

Context persistence enables resuming browser sessions across script executions:

const stagehand = new Stagehand({
  browserbaseSessionID: 'existing-session-id',
})

New model support includes GPT-4.1, Claude 3.7 Sonnet, and experimental Ollama integration for local models. The action history feature (stagehand.history) provides complete audit trails of all operations, invaluable for debugging complex workflows.
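An audit trail is easier to scan once it is summarized. The sketch below counts history entries per method. It assumes each entry carries a method string (for example "act" or "extract"); that shape is an assumption made for illustration and should be checked against the history entry type in your Stagehand version. The helper countByMethod is hypothetical.

```typescript
// Assumed minimal entry shape; real entries may carry more fields.
interface HistoryEntryLike {
  method: string;
}

/** Counts how many times each method appears in the history. */
function countByMethod(history: readonly HistoryEntryLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of history) {
    counts[entry.method] = (counts[entry.method] ?? 0) + 1;
  }
  return counts;
}
```

Printing the result after a failed run shows at a glance how many AI calls the workflow made and of which kind.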

This configuration guide provides the foundation for building robust, scalable browser automations with Stagehand. The framework's unique combination of deterministic code and AI adaptability makes it ideal for production e2e testing where reliability matters most. Join the Stagehand Slack community for support and check the official documentation for the latest updates.