**A New Paradigm for AI Browser Control: A Hands-On Guide to Vercel Agent Browser That Turns Your AI Agent into a Web Interaction Expert**

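AI agents have become one of the most active areas of AI development, but one practical question keeps coming up: how should an agent interact with real web pages? Plain API calls cannot cover complex on-page interactions, and hand-written scraping scripts are brittle and hard to generalize.

Vercel Labs recently released Agent Browser, an open-source tool aimed at this problem. It lets an AI agent drive a browser much as a person would, clicking, filling in forms, taking screenshots, and executing JavaScript, which lowers the barrier to connecting AI with the web.

This guide walks through Agent Browser from environment setup to advanced usage, covering both core concepts and best practices. It should be useful whether you are an AI developer, a front-end engineer, or someone interested in automated testing.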

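Why It Matters: Rethinking How AI Interacts with Web Pages

Pain Points of Traditional Approaches

Before Agent Browser, developers typically relied on one of the following approaches to connect AI with web pages, each with clear limitations.

API-based approaches are stable and reliable but narrowly applicable. They depend on the site exposing an API, and most sites do not. Even when an API exists, authentication and permission requests often add significant development cost.

Traditional scrapers can fetch page content but struggle with dynamically rendered pages. Modern sites generate much of their content with JavaScript, so a plain HTTP request returns only the initial HTML, not the rendered page. Scrapers also have trouble with flows that require logging in, CAPTCHAs, slider verification, and similar interactions.

Automation frameworks such as Selenium and Playwright are powerful, but they are essentially passive tools: the developer must specify every step precisely. They do not interpret what a page means, and they cannot let an AI decide for itself what to do next.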

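Core Strengths of Agent Browser

Agent Browser changes this picture. It is more than a browser automation tool; it is an interaction framework designed specifically for AI agents.

Multimodal understanding is its most distinctive feature. Unlike traditional tools, Agent Browser can interpret a page's visual layout and semantic structure. It identifies the type of each element (button, input, image, and so on) and its role (search, submit, navigation, and so on), so the AI can interact with the page in a more human-like way.

Intelligent action planning lets an agent complete complex web tasks on its own. Agent Browser provides a task description interface: you state the goal (for example, "log in to GitHub and create a new project in my repositories") and it plans and executes the sequence of actions.

A unified abstraction simplifies development. Agent Browser hides the complexity of low-level browser operations behind a small, intuitive API. Developers do not need to deal with how the browser starts, how elements are located, or how actions are dispatched, and can concentrate on business logic.

Screenshots and visual feedback make debugging much easier. After each action, Agent Browser can capture the current page, so developers can see the result directly and catch problems early.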

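A Wide Range of Use Cases

Agent Browser applies to most areas where AI needs to interact with web pages.

In customer service and office automation, it can take over repetitive web tasks such as replying to email, filling in forms, and looking up information.

In data collection and monitoring, it can handle complex dynamic pages, collect data that traditional scrapers miss, and support scheduled monitoring and alerting.

In automated testing, it can simulate real user behavior and run end-to-end functional tests on web applications, catching problems that unit tests do not cover.

In no-code and low-code platforms, it can serve as the execution engine behind a visual workflow editor, so non-technical users can build web automations.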

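Environment Setup: Configuring a Development Environment from Scratch

System Requirements and Dependencies

Before starting, make sure your environment meets Agent Browser's basic requirements.

Agent Browser is built on Node.js, so you first need the Node.js runtime. Node.js 18 or later is recommended for the best compatibility. Check your current version with: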

node --version

If the version is below 18, upgrade Node.js first. nvm (Node Version Manager) is a convenient way to manage several Node.js versions and switch between them per project.
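The same check can be scripted, for example at the top of a setup script. The sketch below is an illustration and not part of Agent Browser: `isSupportedNode` is a hypothetical helper name, and the minimum major version of 18 is taken from the recommendation above.

```javascript
// Hypothetical helper: returns true when a Node.js version string
// (such as "v18.19.0" or "20.11.1") meets the minimum major version.
function isSupportedNode(version, minMajor = 18) {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) return false;
  return Number(match[1]) >= minMajor;
}

// process.version is the version of the running interpreter, e.g. "v20.11.1".
if (!isSupportedNode(process.version)) {
  console.warn(`Node.js ${process.version} is older than 18; consider upgrading with nvm.`);
}
```

Parsing only the major version keeps the check simple; a stricter comparison would need a semver library.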

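Besides Node.js, Agent Browser needs Playwright as its underlying browser automation engine. Playwright supports several browser engines: Chromium (the open-source project that Chrome is built on), Firefox, and WebKit. Because Chromium is widely used and well supported, this guide uses it as the default browser.

Installation Steps

Start by creating and initializing a project directory. Pick a suitable location and create a new project folder there: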

mkdir agent-browser-demo
cd agent-browser-demo
npm init -y

After these commands, npm generates a basic package.json. Next, install Agent Browser and its dependencies:

npm install @vercel/agent-browser

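This command downloads Agent Browser and all of its dependencies from the npm registry. Installation can take a few minutes, depending on your network and machine. Package names can change between releases, so confirm the name against the project's README if the install fails.

As an underlying dependency, Playwright is usually set up automatically during installation. If you run into browser-driver errors, you may need to install the browser manually: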

npx playwright install chromium

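This command downloads the Chromium build that Playwright uses. It does not install operating-system libraries by itself. If your system is missing runtime libraries (such as libgtk or libnss on Linux), Playwright reports the missing dependencies when the browser starts; on supported Linux distributions you can install them with npx playwright install-deps chromium, or combine both steps with npx playwright install --with-deps chromium.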

Project Layout

A clear file structure keeps code and assets organized. The following layout is a reasonable starting point:

agent-browser-demo/
├── src/
│   ├── index.js          # Entry point
│   ├── examples/         # Example code
│   │   ├── basic.js      # Basic features
│   │   ├── search.js     # Search example
│   │   └── screenshot.js # Screenshot example
│   └── utils/            # Utility functions
│       └── helpers.js
├── screenshots/          # Saved screenshots
├── logs/                 # Log files
├── package.json
└── README.md

Create index.js under src as the project's main entry point. It holds Agent Browser's core configuration and initialization logic:

// Agent Browser initialization example
// src/index.js

// Import the Agent Browser core module
const { AgentBrowser } = require('@vercel/agent-browser');

// Browser configuration
const browserConfig = {
  headless: true,           // Run without a visible browser window
  viewport: {
    width: 1920,            // Viewport width
    height: 1080            // Viewport height
  },
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
  timeout: 30000            // Operation timeout in milliseconds
};

// Create and launch a browser instance
async function initializeBrowser() {
  const browser = new AgentBrowser(browserConfig);
  await browser.launch();
  return browser;
}

module.exports = { initializeBrowser };

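Core Features: Understanding Agent Browser's Design Philosophy

Managing Browser Instances

At the core of Agent Browser is the browser class (AgentBrowser in the examples below), which wraps all browser operations. Its methods return Promises, so they work with both Promise chaining and async/await, which keeps asynchronous code concise.

Creating a browser instance is the first step of any workflow. The constructor accepts a configuration object with a number of options: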

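// Full browser configuration example
const { AgentBrowser } = require('@vercel/agent-browser');

const config = {
  // Run mode
  headless: true,  // true: headless (no browser window)
                   // false: headed (shows the window, useful for debugging)

  // Viewport: determines the size at which pages are rendered
  viewport: {
    width: 1280,   // A common desktop resolution is a good choice
    height: 720,
    deviceScaleFactor: 1  // Device pixel ratio; affects screenshot sharpness
  },

  // User agent string: identifies the browser to the server
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',

  // Network settings
  ignoreHTTPSErrors: false,  // Whether to ignore HTTPS certificate errors
  slowMo: 0,                  // Delay added to each operation (ms); useful when watching a run

  // Timeouts
  timeout: 30000,             // Default operation timeout (ms)

  // Proxy
  proxy: null,  // HTTP and SOCKS proxies are supported, e.g. { server: 'http://proxy:8080' }

  // Download behavior
  downloadPath: './downloads',
  acceptDownloads: true
};

// Create the browser instance
const browser = new AgentBrowser(config);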
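When several scripts share most settings, it can help to merge per-script overrides into a set of defaults before constructing the browser. The following is a minimal sketch in plain JavaScript: `DEFAULTS` and `buildConfig` are hypothetical names, the option keys are copied from the configuration above, and whether Agent Browser accepts every key is not verified here.

```javascript
// Hypothetical defaults, using option names from the configuration above.
const DEFAULTS = {
  headless: true,
  viewport: { width: 1280, height: 720, deviceScaleFactor: 1 },
  timeout: 30000,
  proxy: null
};

// Merge overrides into the defaults. The nested viewport object is merged
// separately so that overriding only `width` keeps the default `height`.
function buildConfig(overrides = {}) {
  return {
    ...DEFAULTS,
    ...overrides,
    viewport: { ...DEFAULTS.viewport, ...(overrides.viewport || {}) }
  };
}

const debugConfig = buildConfig({ headless: false, viewport: { width: 1920 } });
console.log(debugConfig.viewport); // width is overridden, height is still 720
```

A plain object spread would replace the whole `viewport` object, which is why the nested merge is spelled out.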

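Launching and closing the browser are the basic lifecycle operations. A running browser can host multiple browser contexts; each context is isolated from the others, so you can simulate several independent sessions:

// Browser lifecycle management
async function manageBrowser() {
  const browser = new AgentBrowser();

  try {
    // Launch the browser
    await browser.launch();
    console.log('Browser launched');

    // Open a new page
    const page = await browser.newPage();

    // Perform page operations...

  } catch (error) {
    console.error('Browser operation failed:', error);
  } finally {
    // Close the browser and release resources
    await browser.close();
    console.log('Browser closed');
  }
}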
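The try/finally pattern above can be factored into a reusable wrapper so that every script closes its browser even when a task throws. This is a sketch under stated assumptions: `withBrowser` and `createFakeBrowser` are hypothetical names, and the only thing assumed about the browser object is that it has async `launch()` and `close()` methods, as in the listing above. A stand-in object is used so the control flow can be run without a real browser.

```javascript
// Run `task` with a launched browser and always close it afterwards.
// `createBrowser` is any function returning an object with async
// launch() and close() methods (an assumption about the browser's shape).
async function withBrowser(createBrowser, task) {
  const browser = createBrowser();
  await browser.launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Stand-in object used only to demonstrate the control flow.
function createFakeBrowser(log) {
  return {
    launch: async () => { log.push('launch'); },
    close: async () => { log.push('close'); }
  };
}

const log = [];
withBrowser(() => createFakeBrowser(log), async () => {
  log.push('task');
  throw new Error('simulated failure');
}).catch(err => {
  // prints "launch -> task -> close | simulated failure"
  console.log(log.join(' -> '), '|', err.message);
});
```

With a real browser, the first argument would be something like `() => new AgentBrowser(config)`. The error still propagates to the caller; the wrapper only guarantees cleanup.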

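The Page Object and Page Operations

A Page object represents a browser tab and is the main interface for most operations. It provides methods for navigating, interacting with elements, and reading content.

Navigation is the most basic capability. Agent Browser supports several ways to navigate, so you can pick the one that fits the situation:

// Ways to navigate
async function navigatePages(page) {
  // Navigate directly to a URL
  await page.goto('https://github.com');

  // Navigate to a local HTML file
  await page.goto('file:///path/to/local/page.html');

  // Navigate with options
  await page.goto('https://example.com', {
    waitUntil: 'networkidle',  // Wait until the network is idle
    timeout: 30000,            // Timeout (ms)
    referer: 'https://google.com'  // Referer header
  });

  // Navigate, then wait for a specific element to appear
  await page.goto('https://example.com');
  await page.waitForSelector('#content', { timeout: 5000 });

  // Back and forward
  await page.goBack();
  await page.goForward();

  // Reload the page
  await page.reload();
}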

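Monitoring page state matters a great deal on dynamic pages. Agent Browser offers several waiting mechanisms so that actions run at the right moment:

// Waiting mechanisms
async function waitForConditions(page) {
  // Wait for a DOM element to appear
  await page.waitForSelector('.login-form', { 
    timeout: 10000,
    state: 'visible'  // visible | hidden | attached | detached
  });

  // Wait until a function evaluated in the page returns a truthy value
  await page.waitForFunction(() => {
    const element = document.querySelector('.result');
    return element && element.textContent.includes('Success');
  });

  // Wait for network activity to settle
  await page.waitForLoadState('networkidle');

  // Wait for the navigation triggered by a click. Start waiting before the
  // click so that the navigation cannot be missed.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'domcontentloaded' }),
    page.click('a.next-page')
  ]);

  // Fixed delay of 2 seconds (prefer condition-based waits where possible)
  await page.waitForTimeout(2000);
}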
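Condition-based waits like the ones above generally come down to polling with a deadline. The sketch below illustrates that idea in plain Node.js; it is not how Agent Browser or Playwright implement their waits, and `waitFor` with its `timeout` and `interval` options is a hypothetical helper.

```javascript
// Poll `predicate` until it returns a truthy value or the timeout elapses.
// The predicate may be synchronous or return a Promise.
async function waitFor(predicate, { timeout = 5000, interval = 50 } = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    const value = await predicate();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error(`waitFor: condition not met within ${timeout} ms`);
    }
    await new Promise(resolve => setTimeout(resolve, interval));
  }
}

// Usage: wait for a flag that flips after roughly 120 ms.
let ready = false;
setTimeout(() => { ready = true; }, 120);
waitFor(() => ready, { timeout: 1000 }).then(() => console.log('condition met'));
```

Compared with a fixed delay, this returns as soon as the condition holds and fails with a clear error when it never does.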

Locating and Interacting with Elements

Locating elements is the core skill of web automation. Agent Browser supports several strategies, so you can choose the one that best fits the page.

CSS selectors are the most common and flexible option. The syntax is compact and can target almost any element on a page:

// Locating elements with CSS selectors
async function locateElements(page) {
  // Basic selectors
  const header = await page.$('header');           // Tag name
  const activeItem = await page.$('.nav.active');   // Class names
  const submitBtn = await page.$('#submit-btn');    // ID
  const link = await page.$('a[href*="login"]');    // Attribute selector

  // Combinators and pseudo-classes
  const menuItem = await page.$('nav.menu > ul > li:first-child');
  const disabledInput = await page.$('input[type="text"]:disabled');

  // Selector lists
  const buttons = await page.$$('button.primary, button.secondary');

  // Text-based locators
  const loginLink = await page.getByText('Log in');
  const searchBox = await page.getByPlaceholder('Search...');

  // XPath (for cases that CSS cannot express easily)
  const element = await page.$('xpath=//div[@class="container"]//button');
}

Element interaction covers all the common user actions. Agent Browser wraps each of them in a short method call:

// Interacting with elements
async function interactWithElements(page) {
  // Click
  await page.click('#submit-button');

  // Double-click and right-click
  await page.dblclick('.file-item');
  await page.click('.context-menu', { button: 'right' });

  // Hover
  await page.hover('.dropdown-trigger');

  // Text input
  await page.fill('#search-input', 'keyword');
  await page.type('#comment-input', 'Some text', { delay: 100 });  // Types one character at a time

  // Clear an input
  await page.fill('#search-input', '');

  // File upload
  await page.setInputFiles('#file-upload', '/path/to/file.pdf');

  // Select options
  await page.selectOption('#country-select', 'China');
  await page.selectOption('#skills-select', ['JavaScript', 'Python']);  // Multiple selection

  // Checkboxes and radio buttons (in Playwright-style APIs, check() handles both)
  await page.check('#agree-terms');
  await page.check('input[name="plan"][value="pro"]');

  // Drag and drop
  await page.dragAndDrop('.draggable', '.drop-zone');

  // Keyboard
  await page.keyboard.press('Enter');
  await page.keyboard.down('Shift');
  await page.keyboard.up('Shift');
  await page.keyboard.type('Hello World');
}

Form Handling and Submission

Forms are the most common interactive element on the web; almost every scenario that collects user input involves one. Agent Browser provides thorough support for them.

The most common case is filling in various input types and submitting the form:

// Basic form handling
async function handleForm(page) {
  // Navigate to the form page
  await page.goto('https://example.com/register');

  // Fill in text inputs
  await page.fill('input[name="username"]', 'john_doe');
  await page.fill('input[name="email"]', 'john@example.com');
  await page.fill('input[name="password"]', 'SecurePass123');
  await page.fill('input[name="confirm_password"]', 'SecurePass123');

  // Fill in a textarea (multi-line text)
  await page.fill('textarea[name="bio"]', 'I am a front-end developer from Beijing who loves open source.');

  // Fill in a date input
  await page.fill('input[type="date"]', '1995-06-15');

  // Choose select options
  await page.selectOption('select[name="country"]', 'CN');
  await page.selectOption('select[name="timezone"]', { label: 'Beijing Time (UTC+8)' });

  // Tick checkboxes
  await page.check('input[name="newsletter"]');
  await page.check('input[name="terms"]');

  // Submit the form
  await page.click('button[type="submit"]');

  // Alternative: call form.submit() directly instead of clicking. Note that
  // this skips built-in validation and does not fire submit event handlers.
  // await page.$eval('form', form => form.submit());
}

More complex forms may involve dynamically loaded fields, validation logic, nested forms, and similar complications:

// Complex form handling
async function handleComplexForm(page) {
  await page.goto('https://example.com/profile/edit');

  // Wait for the page to finish loading
  await page.waitForLoadState('networkidle');

  // Handle a field that appears dynamically
  await page.click('#add-phone');
  await page.waitForSelector('#phone-input', { state: 'visible' });
  await page.fill('#phone-input', '+86-138-0000-1234');

  // Handle an input with live validation
  await page.fill('input[name="phone"]', '13800001234');
  await page.waitForSelector('.phone-valid', { state: 'visible' });

  // Handle a file upload that opens a file chooser dialog
  const fileChooserPromise = page.waitForEvent('filechooser');
  await page.click('#upload-trigger');
  const fileChooser = await fileChooserPromise;
  await fileChooser.setFiles('/path/to/avatar.jpg');

  // Read the validation message after a failed submit
  await page.click('button[type="submit"]');
  const errorMessage = await page.textContent('.error-message');
  console.log('Validation error:', errorMessage);

  // Fix the error and submit again
  await page.fill('input[name="email"]', 'valid@example.com');
  await page.click('button[type="submit"]');

  // Wait for the result
  await page.waitForSelector('.success-message', { timeout: 5000 });
  console.log('Form submitted successfully');
}

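Screenshots

Screenshots are essential for debugging automation scripts and for recording results. Agent Browser provides a flexible screenshot API with several modes and formats.

Basic screenshots capture the current state of the page:

// Basic screenshots
async function takeScreenshots(page) {
  // Full-page screenshot (captures the entire scrollable page)
  await page.screenshot({ 
    path: './screenshots/full-page.png',
    fullPage: true
  });

  // Viewport screenshot (visible area only)
  await page.screenshot({ 
    path: './screenshots/viewport.png' 
  });

  // JPEG with a quality setting. In Playwright, which this API mirrors,
  // quality applies to JPEG only and is rejected for PNG.
  await page.screenshot({
    path: './screenshots/optimized.jpg',
    type: 'jpeg',          // 'png' | 'jpeg'
    quality: 90            // 0-100
  });

  // PNG with a transparent background, returned as a Buffer and then
  // base64-encoded (Playwright's API has no encoding option)
  const buffer = await page.screenshot({ type: 'png', omitBackground: true });
  const base64 = buffer.toString('base64');
  console.log('Base64 length:', base64.length);
}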
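Fixed file names such as `viewport.png` are overwritten on every run. One way to keep a history is to generate a timestamped path per screenshot. The sketch below uses only Node.js built-ins; `screenshotPath` and `prepareScreenshot` are hypothetical helper names, not part of Agent Browser.

```javascript
const fs = require('fs');
const path = require('path');

// Build a path such as screenshots/login-2024-01-15T10-30-00-000Z.png.
// Colons and dots in the ISO timestamp are replaced because they are
// awkward or invalid in file names on some platforms.
function screenshotPath(dir, label, date = new Date()) {
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  const safeLabel = label.replace(/[^a-zA-Z0-9_-]+/g, '_');
  return path.join(dir, `${safeLabel}-${stamp}.png`);
}

// Create the target directory if needed and return the file path.
function prepareScreenshot(dir, label, date) {
  fs.mkdirSync(dir, { recursive: true });
  return screenshotPath(dir, label, date);
}
```

The returned value could then be passed as the `path` option, for example `page.screenshot({ path: prepareScreenshot('./screenshots', 'login') })`.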

Element screenshots capture a specific region of the page:

// Element-level screenshots
async function elementScreenshots(page) {
  await page.goto('https://example.com/dashboard');

  // Locate an element and capture it
  const element = await page.$('.chart-container');
  if (element) {
    await element.screenshot({ 
      path: './screenshots/chart.png' 
    });
  }

  // Capture an element with 10 px of padding around it
  // (useful for checking that the right element was located)
  const boundingBox = await page.$eval('#target-element', el => {
    const rect = el.getBoundingClientRect();
    return { x: rect.x, y: rect.y, width: rect.width, height: rect.height };
  });

  await page.screenshot({
    path: './screenshots/highlighted.png',
    clip: {
      x: Math.max(0, boundingBox.x - 10),  // Keep the clip inside the page
      y: Math.max(0, boundingBox.y - 10),
      width: boundingBox.width + 20,
      height: boundingBox.height + 20
    }
  });
}

Executing and Injecting JavaScript

Agent Browser can run arbitrary JavaScript in the page context, which gives you a lot of flexibility for complex interaction logic.

Page scripts can manipulate the DOM, extract data, and change page state:

// Executing JavaScript and extracting data
async function executeJavaScript(page) {
  await page.goto('https://example.com/products');

  // Run a script in the page context
  const title = await page.evaluate(() => {
    return document.title;
  });
  console.log('Page title:', title);

  // Extract data from the page
  const products = await page.evaluate(() => {
    const items = document.querySelectorAll('.product-item');
    return Array.from(items).map(item => ({
      name: item.querySelector('.product-name').textContent,
      price: item.querySelector('.product-price').textContent,
      url: item.querySelector('.product-link').href
    }));
  });
  console.log('Products:', JSON.stringify(products, null, 2));

  // Change page state
  await page.evaluate(() => {
    document.body.style.backgroundColor = '#f0f0f0';
    const header = document.querySelector('.header');
    header.style.position = 'fixed';
  });

  // Dispatch a custom event
  await page.evaluate(() => {
    const event = new CustomEvent('app:initialized', { detail: { version: '1.0' } });
    window.dispatchEvent(event);
  });
}

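More advanced injection can simulate complex interactions and handle asynchronous behavior:

// Advanced JavaScript injection
async function advancedInjection(page) {
  await page.goto('https://example.com/data-table');

  // Scroll to load more data; stop once the content height no longer grows
  await page.evaluate(async () => {
    const scrollContainer = document.querySelector('.scroll-container');
    let previousHeight = -1;
    while (scrollContainer.scrollHeight !== previousHeight) {
      previousHeight = scrollContainer.scrollHeight;
      scrollContainer.scrollTo(0, scrollContainer.scrollHeight);
      await new Promise(resolve => setTimeout(resolve, 500));
    }
  });

  // Record fetch() calls for 3 seconds. Browsers do not dispatch a 'fetch'
  // event on window (only in service workers), so wrap window.fetch instead.
  const networkData = await page.evaluate(() => {
    return new Promise(resolve => {
      const requests = [];
      const originalFetch = window.fetch;
      window.fetch = (input, init) => {
        const isRequest = input instanceof Request;
        requests.push({
          url: isRequest ? input.url : String(input),
          method: (init && init.method) || (isRequest ? input.method : 'GET')
        });
        return originalFetch.call(window, input, init);
      };
      setTimeout(() => {
        window.fetch = originalFetch;  // Restore the original function
        resolve(requests);
      }, 3000);
    });
  });
  console.log('Recorded requests:', networkData);

  // Inject a third-party library and use it
  await page.evaluate(() => {
    const script = document.createElement('script');
    script.src = 'https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.29.1/moment.min.js';
    document.head.appendChild(script);
  });
  await page.waitForFunction(() => typeof moment !== 'undefined');

  const formattedDate = await page.evaluate(() => {
    return moment().format('YYYY-MM-DD HH:mm:ss');
  });
  console.log('Formatted date:', formattedDate);
}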

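Hands-On Tutorial: Complete Project Examples from Basic to Advanced

Project 1: GitHub Repository Information Collector

We start with a practical project: a tool that collects information about a GitHub repository. It demonstrates how to navigate to a specific repository and extract key information from its public pages.

Requirements: the tool should visit GitHub pages automatically and collect repository information. Specifically, it retrieves the basic repository details (name, description, star count, fork count), the most recent commits, and the list of contributors.

Implementation:

// src/examples/github-scraper.js
// GitHub repository information collector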

const { AgentBrowser } = require('@vercel/agent-browser');

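// Configuration. The credentials are read from the environment but are not
// used below, because this script only visits public pages.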
const CONFIG = {
  github: {
    username: process.env.GITHUB_USERNAME || 'your-username',
    token: process.env.GITHUB_TOKEN || 'your-personal-access-token'
  },
  target: {
    owner: 'vercel',
    repo: 'next.js'
  }
};

// Launch the browser
async function initBrowser() {
  const browser = new AgentBrowser({
    headless: true,
    viewport: { width: 1920, height: 1080 }
  });
  await browser.launch();
  return browser;
}

// Collect basic repository information
async function scrapeRepositoryInfo(page) {
  const url = `https://github.com/${CONFIG.target.owner}/${CONFIG.target.repo}`;
  await page.goto(url, { waitUntil: 'networkidle' });

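  // Wait for the repository content to load (GitHub's markup changes over
  // time, so the selectors in this script may need updating)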
  await page.waitForSelector('.repository-content', { timeout: 10000 });

  // Extract the repository information
  const repoInfo = await page.evaluate(() => {
    // Repository owner and name
    const fullName = document.querySelector('.public') ? 
      document.querySelector('.public .d-inline-flex').textContent.trim() : '';

    // Description
    const descriptionEl = document.querySelector('[itemprop="description"]');
    const description = descriptionEl ? descriptionEl.textContent.trim() : '';

    // Counters. The aria-label may be a full sentence rather than a single
    // word, so match on a keyword instead of using the label as the key.
    const stats = {};
    document.querySelectorAll('[data-testid="social-count"]').forEach(el => {
      const label = (el.getAttribute('aria-label') || '').toLowerCase();
      const key = ['stars', 'forks', 'watching'].find(k => label.includes(k.slice(0, 4)));
      if (key) stats[key] = el.textContent.trim();
    });

    // Primary programming language
    const languageEl = document.querySelector('[itemprop="programmingLanguage"]');
    const language = languageEl ? languageEl.textContent.trim() : '';

    return {
      fullName,
      description,
      language,
      stars: stats['stars'] || 'N/A',
      forks: stats['forks'] || 'N/A',
      watching: stats['watching'] || 'N/A'
    };
  });

  return repoInfo;
}

// Collect recent commits
async function scrapeCommits(page) {
  const url = `https://github.com/${CONFIG.target.owner}/${CONFIG.target.repo}/commits`;
  await page.goto(url, { waitUntil: 'networkidle' });

  await page.waitForSelector('.commits', { timeout: 10000 });

  const commits = await page.evaluate(() => {
    const items = document.querySelectorAll('.commit');
    return Array.from(items).slice(0, 10).map(item => {
      const messageEl = item.querySelector('.commit-title a');
      const authorEl = item.querySelector('.commit-author');
      const dateEl = item.querySelector('time');

      return {
        message: messageEl ? messageEl.textContent.trim() : '',
        author: authorEl ? authorEl.textContent.trim() : '',
        date: dateEl ? dateEl.getAttribute('datetime') : ''
      };
    });
  });

  return commits;
}

// Collect the contributor list
async function scrapeContributors(page) {
  const url = `https://github.com/${CONFIG.target.owner}/${CONFIG.target.repo}/graphs/contributors`;
  await page.goto(url, { waitUntil: 'networkidle' });

  await page.waitForSelector('.graph-canvas', { timeout: 15000 });

  // Give the contributor chart time to render
  await page.waitForTimeout(2000);

  const contributors = await page.evaluate(() => {
    const items = document.querySelectorAll('.contribution-row');
    return Array.from(items).slice(0, 20).map(item => {
      const nameEl = item.querySelector('.contributor-name');
      const countEl = item.querySelector('.num');

      return {
        name: nameEl ? nameEl.textContent.trim() : '',
        contributions: countEl ? countEl.textContent.trim() : '0'
      };
    });
  });

  return contributors;
}

// Main function
async function main() {
  let browser;
  try {
    console.log('='.repeat(50));
    console.log('GitHub repository information collector');
    console.log('='.repeat(50));

    browser = await initBrowser();
    const page = await browser.newPage();

    // Collect repository information
    console.log('\nCollecting basic repository information...');
    const repoInfo = await scrapeRepositoryInfo(page);
    console.log('\nBasic repository information:');
    console.log('-'.repeat(40));
    console.log(`Full name: ${repoInfo.fullName}`);
    console.log(`Description: ${repoInfo.description}`);
    console.log(`Language: ${repoInfo.language}`);
    console.log(`Stars: ${repoInfo.stars}`);
    console.log(`Forks: ${repoInfo.forks}`);

    // Collect commits
    console.log('\nCollecting recent commits...');
    const commits = await scrapeCommits(page);
    console.log('\nRecent commits:');
    console.log('-'.repeat(40));
    commits.forEach((commit, index) => {
      console.log(`${index + 1}. [${commit.date}] ${commit.message}`);
      console.log(`   Author: ${commit.author}`);
    });

    // Collect contributors
    console.log('\nCollecting contributors...');
    const contributors = await scrapeContributors(page);
    console.log('\nTop contributors:');
    console.log('-'.repeat(40));
    contributors.slice(0, 10).forEach((contributor, index) => {
      console.log(`${index + 1}. ${contributor.name} - ${contributor.contributions} commits`);
    });

    // Save a screenshot
    await page.screenshot({ 
      path: './screenshots/github-repo.png',
      fullPage: true 
    });
    console.log('\nScreenshot saved: ./screenshots/github-repo.png');

    console.log('\n' + '='.repeat(50));
    console.log('Done!');
    console.log('='.repeat(50));

  } catch (error) {
    console.error('Error during collection:', error);
  } finally {
    if (browser) {
      await browser.close();
    }
  }
}

// Run the main function
main().catch(console.error);

Output:

Running the script produces output similar to the following (the values shown are illustrative):

==================================================
GitHub repository information collector
==================================================

Collecting basic repository information...

Basic repository information:
----------------------------------------
Full name: vercel / next.js
Description: The React Framework for the Web
Language: TypeScript
Stars: 118k
Forks: 26.6k

Collecting recent commits...

Recent commits:
----------------------------------------
1. [2024-01-15T10:30:00Z] chore: Update turbopack
   Author: Joe Haddad
2. [2024-01-15T09:15:00Z] fix: Correct HMR handling for edge runtime
   Author: Alex
3. [2024-01-14T18:45:00Z] feat: Add new image optimization API
   Author: Sarah Chen

Collecting contributors...

Top contributors:
----------------------------------------
1. Guillermo Rauch - 1234 commits
2. Timer150 - 892 commits
3. ijjk - 756 commits
...

Screenshot saved: ./screenshots/github-repo.png

==================================================
Done!
==================================================
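Counters scraped from the page, such as 118k and 26.6k in the output above, are display strings rather than numbers. If the values need to be sorted or compared, they can be converted with a small parser. This is a sketch: `parseCount` is a hypothetical helper, and the set of suffixes it accepts (k and M) is an assumption about how the counters are abbreviated.

```javascript
// Convert abbreviated counters such as "118k", "26.6k", "1.2M" or "1,234"
// into integers. Returns null when the text is not a recognizable number.
function parseCount(text) {
  const match = /^([\d.,]+)\s*([kKmM])?$/.exec(String(text).trim());
  if (!match) return null;
  const value = Number(match[1].replace(/,/g, ''));
  if (Number.isNaN(value)) return null;
  const suffix = (match[2] || '').toLowerCase();
  const multiplier = suffix === 'k' ? 1e3 : suffix === 'm' ? 1e6 : 1;
  // Round to avoid floating-point artifacts from the multiplication.
  return Math.round(value * multiplier);
}

console.log(parseCount('118k'));  // 118000
console.log(parseCount('26.6k')); // 26600
console.log(parseCount('N/A'));   // null
```

The result is only as precise as the abbreviated text: 118k becomes 118000 even if the real count is slightly different.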

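Project 2: Automated Form Filling and Submission

The second project shows how to handle more involved form scenarios, including dynamic fields, file uploads, and validation.

Requirements: build a tool that fills in an online form automatically. A simulated registration form serves as the example and demonstrates how to handle each kind of form element.

Implementation:

// src/examples/form-automation.js
// Automated form filling and submission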

const { AgentBrowser } = require('@vercel/agent-browser');

// Form data
const FORM_DATA = {
  personal: {
    firstName: '明',
    lastName: '张',
    email: 'zhangming@example.com',
    phone: '138-0000-1234',
    birthDate: '1995-06-15',
    gender: 'male'
  },
  account: {
    username: 'zhangming_dev',
    password: 'SecureP@ss123!',
    confirmPassword: 'SecureP@ss123!'
  },
  preferences: {
    newsletter: true,
    notifications: ['email', 'sms'],
    theme: 'dark'
  },
  avatar: './assets/avatar.jpg'
};

// Build the demo form's HTML (for demonstration only)
function generateFormHTML() {
  return `
<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>用户注册表单</title>
  <style>
    * { box-sizing: border-box; margin: 0; padding: 0; }
    body {
      font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
      background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
      min-height: 100vh;
      display: flex;
      align-items: center;
      justify-content: center;
      padding: 20px;
    }
    .form-container {
      background: white;
      border-radius: 20px;
      box-shadow: 0 20px 60px rgba(0,0,0,0.3);
      padding: 40px;
      max-width: 600px;
      width: 100%;
    }
    h1 {
      text-align: center;
      color: #333;
      margin-bottom: 30px;
    }
    .form-group {
      margin-bottom: 20px;
    }
    label {
      display: block;
      margin-bottom: 8px;
      color: #555;
      font-weight: 500;
    }
    input[type="text"],
    input[type="email"],
    input[type="password"],
    input[type="tel"],
    input[type="date"],
    select,
    textarea {
      width: 100%;
      padding: 12px 16px;
      border: 2px solid #e1e1e1;
      border-radius: 10px;
      font-size: 16px;
      transition: border-color 0.3s;
    }
    input:focus, select:focus, textarea:focus {
      outline: none;
      border-color: #667eea;
    }
    .name-row {
      display: grid;
      grid-template-columns: 1fr 1fr;
      gap: 20px;
    }
    .checkbox-group, .radio-group {
      display: flex;
      gap: 20px;
    }
    .checkbox-group label, .radio-group label {
      display: flex;
      align-items: center;
      gap: 8px;
      font-weight: normal;
    }
    .file-upload {
      border: 2px dashed #ccc;
      border-radius: 10px;
      padding: 30px;
      text-align: center;
      cursor: pointer;
      transition: all 0.3s;
    }
    .file-upload:hover {
      border-color: #667eea;
      background: #f8f8ff;
    }
    .file-upload input { display: none; }
    .file-preview {
      max-width: 100px;
      max-height: 100px;
      margin-top: 10px;
      border-radius: 50%;
    }
    button {
      width: 100%;
      padding: 16px;
      background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
      color: white;
      border: none;
      border-radius: 10px;
      font-size: 18px;
      font-weight: 600;
      cursor: pointer;
      transition: transform 0.2s;
    }
    button:hover {
      transform: translateY(-2px);
    }
    button:disabled {
      opacity: 0.6;
      cursor: not-allowed;
    }
    .error {
      border-color: #e74c3c !important;
    }
    .error-message {
      color: #e74c3c;
      font-size: 12px;
      margin-top: 5px;
    }
    .success-message {
      display: none;
      background: #2ecc71;
      color: white;
      padding: 20px;
      border-radius: 10px;
      text-align: center;
    }
  </style>
</head>
<body>
  <div class="form-container">
    <h1>用户注册</h1>
    <form id="registration-form">
      <div class="form-group name-row">
        <div>
          <label for="firstName">名 *</label>
          <input type="text" id="firstName" name="firstName" required>
        </div>
        <div>
          <label for="lastName">姓 *</label>
          <input type="text" id="lastName" name="lastName" required>
        </div>
      </div>

      <div class="form-group">
        <label for="email">邮箱地址 *</label>
        <input type="email" id="email" name="email" placeholder="your@email.com" required>
      </div>

      <div class="form-group">
        <label for="phone">手机号码</label>
        <input type="tel" id="phone" name="phone" placeholder="138-0000-0000">
      </div>

      <div class="form-group">
        <label for="birthDate">出生日期</label>
        <input type="date" id="birthDate" name="birthDate">
      </div>

      <div class="form-group">
        <label for="gender">性别</label>
        <select id="gender" name="gender">
          <option value="">请选择</option>
          <option value="male">男</option>
          <option value="female">女</option>
          <option value="other">其他</option>
        </select>
      </div>

      <div class="form-group">
        <label for="username">用户名 *</label>
        <input type="text" id="username" name="username" minlength="3" required>
      </div>

      <div class="form-group">
        <label for="password">密码 *</label>
        <input type="password" id="password" name="password" minlength="8" required>
        <div class="error-message" id="password-error"></div>
      </div>

      <div class="form-group">
        <label for="confirmPassword">确认密码 *</label>
        <input type="password" id="confirmPassword" name="confirmPassword" required>
      </div>

      <div class="form-group">
        <label>接收通知</label>
        <div class="checkbox-group">
          <label><input type="checkbox" name="newsletter" value="newsletter"> 订阅新闻</label>
          <label><input type="checkbox" name="notifications" value="email"> 邮件通知</label>
          <label><input type="checkbox" name="notifications" value="sms"> 短信通知</label>
        </div>
      </div>

      <div class="form-group">
        <label>头像</label>
        <div class="file-upload" id="avatar-upload">
          <input type="file" id="avatar" name="avatar" accept="image/*">
          <div class="upload-content">
            <div class="upload-icon">📷</div>
            <div class="upload-text">点击上传头像</div>
          </div>
          <img class="file-preview" id="preview" style="display: none;">
        </div>
      </div>

      <button type="submit" id="submit-btn">注册</button>
    </form>

    <div class="success-message" id="success-message">
      🎉 注册成功！欢迎加入我们！
    </div>
  </div>

  <script>
    // Password validation
    const passwordInput = document.getElementById('password');
    const confirmPasswordInput = document.getElementById('confirmPassword');
    const passwordError = document.getElementById('password-error');

    passwordInput.addEventListener('input', () => {
      const password = passwordInput.value;
      if (password.length < 8) {
        passwordError.textContent = '密码长度至少为 8 个字符';
        passwordInput.classList.add('error');
      } else if (!/[A-Z]/.test(password)) {
        passwordError.textContent = '密码必须包含至少一个大写字母';
        passwordInput.classList.add('error');
      } else if (!/[0-9]/.test(password)) {
        passwordError.textContent = '密码必须包含至少一个数字';
        passwordInput.classList.add('error');
      } else {
        passwordError.textContent = '';
        passwordInput.classList.remove('error');
      }
    });

    confirmPasswordInput.addEventListener('input', () => {
      if (confirmPasswordInput.value !== passwordInput.value) {
        confirmPasswordInput.setCustomValidity('两次输入的密码不匹配');
      } else {
        confirmPasswordInput.setCustomValidity('');
      }
    });

    // File upload preview
    const fileInput = document.getElementById('avatar');
    const preview = document.getElementById('preview');
    const uploadContent = document.querySelector('.upload-content');

    fileInput.addEventListener('change', function() {
      const file = this.files[0];
      if (file) {
        const reader = new FileReader();
        reader.onload = function(e) {
          preview.src = e.target.result;
          preview.style.display = 'block';
          uploadContent.style.display = 'none';
        };
        reader.readAsDataURL(file);
      }
    });

    // Form submission
    const form = document.getElementById('registration-form');
    form.addEventListener('submit', async (e) => {
      e.preventDefault();
      const submitBtn = document.getElementById('submit-btn');
      submitBtn.textContent = '正在提交...';
      submitBtn.disabled = true;

      // Simulate a submission delay
      await new Promise(resolve => setTimeout(resolve, 1500));

      form.style.display = 'none';
      document.getElementById('success-message').style.display = 'block';
    });
  </script>
</body>
</html>
  `;
}

// Fill in the form automatically
async function autoFillForm(page) {
  // Load the form page
  await page.setContent(generateFormHTML());
  await page.waitForLoadState('domcontentloaded');

  console.log('Form loaded, starting to fill it in...\n');

  // Name
  console.log('Filling in name...');
  await page.fill('#firstName', FORM_DATA.personal.firstName);
  await page.fill('#lastName', FORM_DATA.personal.lastName);

  // Email
  console.log('Filling in email...');
  await page.fill('#email', FORM_DATA.personal.email);

  // Phone number
  console.log('Filling in phone number...');
  await page.fill('#phone', FORM_DATA.personal.phone);

  // Date of birth
  console.log('Filling in date of birth...');
  await page.fill('#birthDate', FORM_DATA.personal.birthDate);

  // Gender
  console.log('Selecting gender...');
  await page.selectOption('#gender', FORM_DATA.personal.gender);

  // Username
  console.log('Filling in username...');
  await page.fill('#username', FORM_DATA.account.username);

  // Password (triggers validation)
  console.log('Filling in password...');
  await page.fill('#password', FORM_DATA.account.password);
  await page.waitForTimeout(500);  // Give the validation message time to update

  // Confirm password
  console.log('Confirming password...');
  await page.fill('#confirmPassword', FORM_DATA.account.confirmPassword);

  // Single checkbox
  console.log('Setting notification preferences...');
  if (FORM_DATA.preferences.newsletter) {
    await page.check('input[name="newsletter"]');
  }

  // Multiple checkboxes that share a name
  for (const notification of FORM_DATA.preferences.notifications) {
    await page.check(`input[name="notifications"][value="${notification}"]`);
  }

  // File upload (the file at FORM_DATA.avatar must exist)
  console.log('Uploading avatar...');
  const fileInput = await page.$('#avatar');
  await fileInput.setInputFiles(FORM_DATA.avatar);

  // Give the file preview time to load
  await page.waitForTimeout(500);

  // Screenshot of the completed form
  await page.screenshot({ 
    path: './screenshots/form-filled.png',
    fullPage: true 
  });
  console.log('Saved screenshot of the completed form');

  // Submit the form
  console.log('\nSubmitting form...');
  await page.click('#submit-btn');

  // Wait for the result
  await page.waitForSelector('#success-message', { state: 'visible', timeout: 5000 });

  // Screenshot of the result
  await page.screenshot({ 
    path: './screenshots/form-success.png',
    fullPage: true 
  });
  console.log('Saved screenshot of the success message');

  // Read back the submitted values. The form is hidden but still in the DOM.
  // File entries are reduced to their names, the password fields are omitted,
  // and repeated keys (the notification checkboxes) are collected into arrays.
  const submittedData = await page.evaluate(() => {
    const formData = new FormData(document.getElementById('registration-form'));
    const result = {};
    for (const [key, value] of formData.entries()) {
      if (key.toLowerCase().includes('password')) continue;
      const plain = value instanceof File ? value.name : value;
      result[key] = key in result ? [].concat(result[key], plain) : plain;
    }
    return result;
  });

  console.log('\nSubmitted data:');
  console.log('-'.repeat(40));
  console.log(JSON.stringify(submittedData, null, 2));
}

// 主函数
async function main() {
  let browser;
  try {
    console.log('='.repeat(50));
    console.log('自动化表单填写演示');
    console.log('='.repeat(50) + '\n');

    browser = new AgentBrowser({ headless: true });
    await browser.launch();
    const page = await browser.newPage();

    await autoFillForm(page);

    console.log('\n' + '='.repeat(50));
    console.log('演示完成!');
    console.log('='.repeat(50));

  } catch (error) {
    console.error('执行过程中发生错误:', error);
    throw error;
  } finally {
    if (browser) {
      await browser.close();
    }
  }
}

// 运行
main().catch(console.error);
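
The script above pauses with fixed `waitForTimeout(500)` calls after filling the password and uploading the file. A fixed delay is either too short on a slow machine or wasted time on a fast one. A condition-based wait is one alternative. The helper below is a minimal sketch: `waitUntil` and its option names are hypothetical and are not part of Agent Browser.

```javascript
// Hypothetical helper: poll a condition instead of sleeping for a fixed time.
async function waitUntil(check, { timeout = 5000, interval = 100 } = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    // `check` may be sync or async; any truthy result ends the wait
    const result = await check();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeout}ms`);
    }
    await new Promise(resolve => setTimeout(resolve, interval));
  }
}
```

Assuming the page object exposes `isVisible` as it does elsewhere in this article, a call such as `await waitUntil(() => page.isVisible('#password-hint'))` would replace the fixed delay. The selector `#password-hint` is only an illustration.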

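Project 3: A Web Automation Testing Framework

The third project shows how to integrate Agent Browser into an automated testing framework for end-to-end functional testing.

Requirements: build a reusable test framework that supports page-navigation checks, element-existence checks, form-interaction tests, assertions, and report generation.

Implementation: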

// src/test-framework/test-runner.js
// Agent Browser 自动化测试框架

const { AgentBrowser } = require('@vercel/agent-browser');
const fs = require('fs');
const path = require('path');

// 测试结果收集器
class TestRunner {
  constructor(options = {}) {
    this.browser = null;
    this.context = null;
    this.page = null;
    this.results = {
      passed: 0,
      failed: 0,
      errors: 0,
      tests: []
    };
    this.screenshotDir = options.screenshotDir || './screenshots';
    this.headless = options.headless !== false;

    // 确保截图目录存在
    if (!fs.existsSync(this.screenshotDir)) {
      fs.mkdirSync(this.screenshotDir, { recursive: true });
    }
  }

  // 初始化浏览器
  async init() {
    this.browser = new AgentBrowser({
      headless: this.headless,
      viewport: { width: 1280, height: 720 }
    });
    await this.browser.launch();
    this.context = this.browser;
    this.page = await this.browser.newPage();
    console.log('浏览器已初始化\n');
  }

  // 关闭浏览器
  async close() {
    if (this.browser) {
      await this.browser.close();
      console.log('\n浏览器已关闭');
    }
  }

  // Navigate to the given page
  async navigate(url, options = {}) {
    try {
      await this.page.goto(url, {
        waitUntil: options.waitUntil || 'networkidle',
        timeout: options.timeout || 30000
      });
      console.log(`✓ 导航到: ${url}`);
      return true;
    } catch (error) {
      console.log(`✗ 导航失败: ${url}`);
      console.log(`  错误: ${error.message}`);
      return false;
    }
  }

  // 等待指定时间
  async wait(ms) {
    await this.page.waitForTimeout(ms);
  }

  // 断言：元素存在
  async assertElementExists(selector, timeout = 5000) {
    const testName = `元素存在: ${selector}`;
    try {
      await this.page.waitForSelector(selector, { timeout });
      this.recordTest(testName, 'passed');
      console.log(`✓ ${testName}`);
      return true;
    } catch (error) {
      this.recordTest(testName, 'failed', error.message);
      console.log(`✗ ${testName}`);
      return false;
    }
  }

  // 断言：元素包含文本
  async assertTextContains(selector, expectedText, timeout = 5000) {
    const testName = `文本包含 "${expectedText}"`;
    try {
      await this.page.waitForSelector(selector, { timeout });
      const actualText = await this.page.textContent(selector);
      if (actualText.includes(expectedText)) {
        this.recordTest(testName, 'passed');
        console.log(`✓ ${testName}`);
        return true;
      } else {
        throw new Error(`期望包含 "${expectedText}"，实际为 "${actualText}"`);
      }
    } catch (error) {
      this.recordTest(testName, 'failed', error.message);
      console.log(`✗ ${testName}: ${error.message}`);
      return false;
    }
  }

  // 断言：元素具有指定属性
  async assertAttribute(selector, attribute, expectedValue, timeout = 5000) {
    const testName = `属性验证: ${selector}[${attribute}]`;
    try {
      await this.page.waitForSelector(selector, { timeout });
      const actualValue = await this.page.getAttribute(selector, attribute);
      if (actualValue === expectedValue) {
        this.recordTest(testName, 'passed');
        console.log(`✓ ${testName}`);
        return true;
      } else {
        throw new Error(`期望 "${expectedValue}"，实际 "${actualValue}"`);
      }
    } catch (error) {
      this.recordTest(testName, 'failed', error.message);
      console.log(`✗ ${testName}: ${error.message}`);
      return false;
    }
  }

  // 断言：页面标题
  async assertTitle(expectedTitle) {
    const testName = `页面标题: "${expectedTitle}"`;
    try {
      const actualTitle = await this.page.title();
      if (actualTitle === expectedTitle) {
        this.recordTest(testName, 'passed');
        console.log(`✓ ${testName}`);
        return true;
      } else {
        throw new Error(`期望 "${expectedTitle}"，实际 "${actualTitle}"`);
      }
    } catch (error) {
      this.recordTest(testName, 'failed', error.message);
      console.log(`✗ ${testName}: ${error.message}`);
      return false;
    }
  }

  // 断言：页面 URL
  async assertUrl(pattern) {
    const testName = `URL 匹配: ${pattern}`;
    try {
      const url = this.page.url();
      const regex = new RegExp(pattern);
      if (regex.test(url)) {
        this.recordTest(testName, 'passed');
        console.log(`✓ ${testName}`);
        return true;
      } else {
        throw new Error(`URL "${url}" 不匹配模式 "${pattern}"`);
      }
    } catch (error) {
      this.recordTest(testName, 'failed', error.message);
      console.log(`✗ ${testName}: ${error.message}`);
      return false;
    }
  }

  // 断言：自定义条件
  async assert(condition, testName) {
    try {
      if (condition) {
        this.recordTest(testName, 'passed');
        console.log(`✓ ${testName}`);
        return true;
      } else {
        throw new Error('条件不满足');
      }
    } catch (error) {
      this.recordTest(testName, 'failed', error.message);
      console.log(`✗ ${testName}: ${error.message}`);
      return false;
    }
  }

  // 点击元素
  async click(selector, options = {}) {
    try {
      await this.page.click(selector, options);
      console.log(`✓ 点击: ${selector}`);
      return true;
    } catch (error) {
      console.log(`✗ 点击失败: ${selector}`);
      console.log(`  错误: ${error.message}`);
      return false;
    }
  }

  // 输入文本
  async type(selector, text, options = {}) {
    try {
      if (options.clear) {
        await this.page.fill(selector, '');
      }
      await this.page.fill(selector, text);
      console.log(`✓ 输入: ${selector} = "${text}"`);
      return true;
    } catch (error) {
      console.log(`✗ 输入失败: ${selector}`);
      console.log(`  错误: ${error.message}`);
      return false;
    }
  }

  // 选择下拉选项
  async selectOption(selector, value) {
    try {
      await this.page.selectOption(selector, value);
      console.log(`✓ 选择: ${selector} = "${value}"`);
      return true;
    } catch (error) {
      console.log(`✗ 选择失败: ${selector}`);
      console.log(`  错误: ${error.message}`);
      return false;
    }
  }

  // 截图
  async screenshot(name) {
    const filename = `${name}-${Date.now()}.png`;
    const filepath = path.join(this.screenshotDir, filename);
    await this.page.screenshot({ path: filepath });
    console.log(`📷 截图已保存: ${filename}`);
    return filepath;
  }

  // 记录测试结果
  recordTest(name, status, error = null) {
    this.results.tests.push({ name, status, error });
    if (status === 'passed') {
      this.results.passed++;
    } else if (status === 'failed') {
      this.results.failed++;
    } else {
      this.results.errors++;
    }
  }

  // Generate the report
  generateReport() {
    const total = this.results.tests.length;
    // Avoid printing NaN when no assertions were recorded
    const passRate = total > 0 ? ((this.results.passed / total) * 100).toFixed(1) : '0.0';

    console.log('\n' + '='.repeat(50));
    console.log('Test Report');
    console.log('='.repeat(50));
    console.log(`Total tests: ${total}`);
    console.log(`Passed: ${this.results.passed} ✓`);
    console.log(`Failed: ${this.results.failed} ✗`);
    console.log(`Errors: ${this.results.errors} !`);
    console.log(`Pass rate: ${passRate}%`);

    if (this.results.failed > 0) {
      console.log('\nFailure details:');
      console.log('-'.repeat(40));
      this.results.tests
        .filter(t => t.status === 'failed')
        .forEach((test, index) => {
          console.log(`${index + 1}. ${test.name}`);
          console.log(`   Error: ${test.error}`);
        });
    }

    console.log('\n' + '='.repeat(50));
    return this.results;
  }
}

// 创建示例测试页面
function createTestPage() {
  return `
<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <meta charset="UTF-8">
  <title>测试页面 - 产品管理系统</title>
  <style>
    body { font-family: Arial, sans-serif; max-width: 800px; margin: 50px auto; padding: 20px; }
    .header { border-bottom: 2px solid #333; padding-bottom: 20px; margin-bottom: 30px; }
    .form-group { margin-bottom: 20px; }
    label { display: block; margin-bottom: 5px; font-weight: bold; }
    input, select { width: 100%; padding: 10px; border: 1px solid #ccc; border-radius: 4px; }
    button { padding: 12px 24px; background: #007bff; color: white; border: none; border-radius: 4px; cursor: pointer; }
    button:hover { background: #0056b3; }
    .product-list { margin-top: 30px; }
    .product-item { padding: 15px; border: 1px solid #ddd; margin-bottom: 10px; border-radius: 4px; }
    .product-name { font-size: 18px; font-weight: bold; color: #333; }
    .product-price { color: #28a745; font-size: 16px; }
    .error { color: #dc3545; font-size: 14px; margin-top: 5px; display: none; }
    .success { color: #28a745; padding: 20px; background: #d4edda; border-radius: 4px; display: none; }
  </style>
</head>
<body>
  <div class="header">
    <h1>产品管理系统</h1>
    <p>简单的产品管理演示页面</p>
  </div>

  <div class="form-section">
    <h2>添加新产品</h2>
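    <!-- novalidate: skip the browser's built-in validation so the script's own checks run on submit -->
    <form id="product-form" novalidate>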
      <div class="form-group">
        <label for="product-name">产品名称 *</label>
        <input type="text" id="product-name" name="productName" required>
        <div class="error" id="name-error">请输入产品名称</div>
      </div>
      <div class="form-group">
        <label for="product-price">产品价格 *</label>
        <input type="number" id="product-price" name="productPrice" min="0" step="0.01" required>
        <div class="error" id="price-error">请输入有效价格</div>
      </div>
      <div class="form-group">
        <label for="product-category">产品类别</label>
        <select id="product-category" name="productCategory">
          <option value="">请选择类别</option>
          <option value="electronics">电子产品</option>
          <option value="clothing">服装</option>
          <option value="food">食品</option>
        </select>
      </div>
      <button type="submit" id="add-product-btn">添加产品</button>
    </form>
  </div>

  <div class="success" id="success-message">
    ✓ 产品添加成功！
  </div>

  <div class="product-list" id="product-list">
    <h2>产品列表</h2>
    <div class="product-item" data-id="1">
      <div class="product-name">MacBook Pro</div>
      <div class="product-price">¥12999.00</div>
    </div>
    <div class="product-item" data-id="2">
      <div class="product-name">iPhone 15</div>
      <div class="product-price">¥6999.00</div>
    </div>
  </div>

  <script>
    const form = document.getElementById('product-form');
    const productList = document.getElementById('product-list');
    const successMessage = document.getElementById('success-message');

    let productIdCounter = 3;

    form.addEventListener('submit', function(e) {
      e.preventDefault();

      const nameInput = document.getElementById('product-name');
      const priceInput = document.getElementById('product-price');
      const categorySelect = document.getElementById('product-category');
      const nameError = document.getElementById('name-error');
      const priceError = document.getElementById('price-error');

      // 清除错误
      nameError.style.display = 'none';
      priceError.style.display = 'none';
      nameInput.style.borderColor = '#ccc';
      priceInput.style.borderColor = '#ccc';

      // 验证
      let hasError = false;
      if (!nameInput.value.trim()) {
        nameError.style.display = 'block';
        nameInput.style.borderColor = '#dc3545';
        hasError = true;
      }
      if (!priceInput.value || parseFloat(priceInput.value) <= 0) {
        priceError.style.display = 'block';
        priceInput.style.borderColor = '#dc3545';
        hasError = true;
      }

      if (hasError) return;

      // 添加产品
      const productItem = document.createElement('div');
      productItem.className = 'product-item';
      productItem.setAttribute('data-id', productIdCounter++);
      productItem.innerHTML = \`
        <div class="product-name">\${nameInput.value}</div>
        <div class="product-price">¥\${parseFloat(priceInput.value).toFixed(2)}</div>
      \`;
      productList.appendChild(productItem);

      // 显示成功消息
      successMessage.style.display = 'block';
      setTimeout(() => {
        successMessage.style.display = 'none';
      }, 3000);

      // 清空表单
      form.reset();
    });
  </script>
</body>
</html>
  `;
}

// 运行测试套件
async function runTests() {
  const runner = new TestRunner({ 
    screenshotDir: './screenshots',
    headless: true 
  });

  try {
    // 初始化
    await runner.init();

    // 加载测试页面
    const testHtml = createTestPage();
    await runner.page.setContent(testHtml);

    console.log('开始执行测试套件\n');
    console.log('-'.repeat(50) + '\n');

    // 测试 1: 页面标题验证
    await runner.assertTitle('测试页面 - 产品管理系统');

    // 测试 2: 关键元素存在性
    await runner.assertElementExists('#product-form');
    await runner.assertElementExists('#product-name');
    await runner.assertElementExists('#product-price');
    await runner.assertElementExists('#add-product-btn');
    await runner.assertElementExists('.product-item');

    // 测试 3: 初始产品列表
    const initialProducts = await runner.page.$$('.product-item');
    await runner.assert(initialProducts.length === 2, `初始产品数量: ${initialProducts.length} === 2`);

    // 测试 4: 填写表单
    await runner.screenshot('form-empty');
    await runner.type('#product-name', 'AirPods Pro', { clear: true });
    await runner.type('#product-price', '1999.00', { clear: true });
    await runner.selectOption('#product-category', 'electronics');
    await runner.screenshot('form-filled');

    // 测试 5: 提交表单
    await runner.click('#add-product-btn');
    await runner.wait(500);

    // 测试 6: 验证产品添加成功
    const updatedProducts = await runner.page.$$('.product-item');
    await runner.assert(updatedProducts.length === 3, `添加后产品数量: ${updatedProducts.length} === 3`);

    // 测试 7: 验证成功消息显示
    await runner.assertElementExists('#success-message');

    // 测试 8: 验证新添加的产品
    await runner.screenshot('after-add');
    const lastProductName = await runner.page.textContent('.product-item:last-child .product-name');
    await runner.assertTextContains('.product-item:last-child .product-name', 'AirPods Pro');
    console.log(`  新产品名称: ${lastProductName}`);

    // 测试 9: 表单验证 - 空名称
    await runner.click('#add-product-btn');
    await runner.wait(300);
    const nameErrorVisible = await runner.page.isVisible('#name-error');
    await runner.assert(nameErrorVisible, '空名称时显示错误提示');

    // 测试 10: 表单验证 - 无效价格
    await runner.type('#product-name', '测试产品', { clear: true });
    await runner.type('#product-price', '-100', { clear: true });
    await runner.click('#add-product-btn');
    await runner.wait(300);
    const priceErrorVisible = await runner.page.isVisible('#price-error');
    await runner.assert(priceErrorVisible, '无效价格时显示错误提示');

    // 生成报告
    const results = runner.generateReport();

    // 保存测试报告
    const reportPath = path.join(runner.screenshotDir, 'test-report.json');
    fs.writeFileSync(reportPath, JSON.stringify(results, null, 2));
    console.log(`\n测试报告已保存: ${reportPath}`);

    return results;

  } catch (error) {
    console.error('测试执行失败:', error);
    throw error;
  } finally {
    await runner.close();
  }
}

// 运行
if (require.main === module) {
  runTests()
    .then(results => {
      const passed = results.failed === 0 && results.errors === 0;
      process.exit(passed ? 0 : 1);
    })
    .catch(() => process.exit(1));
}

module.exports = { TestRunner, runTests };
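
The runner stores its outcome in a plain object (`passed`, `failed`, `errors`, `tests`), so it can be rendered in formats other than console output. The sketch below turns that object into a Markdown summary, for example for a CI job comment. `toMarkdownReport` is a hypothetical helper, not part of the framework above.

```javascript
// Hypothetical helper: turn the TestRunner results object into a Markdown summary.
function toMarkdownReport(results) {
  const total = results.tests.length;
  // Guard against division by zero when no assertions ran
  const passRate = total === 0 ? 0 : (results.passed / total) * 100;

  const lines = [
    '# Test Report',
    '',
    `- Total: ${total}`,
    `- Passed: ${results.passed}`,
    `- Failed: ${results.failed}`,
    `- Errors: ${results.errors}`,
    `- Pass rate: ${passRate.toFixed(1)}%`
  ];

  // List every test that did not pass, with its recorded error message
  const failures = results.tests.filter(t => t.status !== 'passed');
  if (failures.length > 0) {
    lines.push('', '## Failures', '');
    failures.forEach((t, i) => {
      lines.push(`${i + 1}. ${t.name}: ${t.error || 'no error message'}`);
    });
  }
  return lines.join('\n');
}
```

The returned string can be written next to `test-report.json` with `fs.writeFileSync`.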

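Common Scenarios and Advanced Techniques

Scenario 1: Batch Data Collection

Real projects often need to collect data from many pages in one run. The following example visits a list of URLs one at a time and extracts the same fields from each page: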

// 批量数据采集示例
// src/examples/batch-scraping.js

const { AgentBrowser } = require('@vercel/agent-browser');

async function batchScrape() {
  const browser = new AgentBrowser({ headless: true });
  await browser.launch();

  // 目标 URL 列表
  const urls = [
    'https://example.com/product/1',
    'https://example.com/product/2',
    'https://example.com/product/3',
    'https://example.com/product/4',
    'https://example.com/product/5'
  ];

  // 用于存储采集结果
  const results = [];

  // 逐个页面采集
  for (let i = 0; i < urls.length; i++) {
    const page = await browser.newPage();

    try {
      console.log(`正在采集 (${i + 1}/${urls.length}): ${urls[i]}`);

      await page.goto(urls[i], { waitUntil: 'networkidle' });

      // 提取页面数据
      const data = await page.evaluate(() => {
        return {
          title: document.querySelector('h1')?.textContent || '',
          price: document.querySelector('.price')?.textContent || '',
          description: document.querySelector('.description')?.textContent || ''
        };
      });

      results.push({ url: urls[i], ...data });
      console.log(`  ✓ 采集成功: ${data.title}`);

    } catch (error) {
      console.log(`  ✗ 采集失败: ${error.message}`);
      results.push({ url: urls[i], error: error.message });
    } finally {
      // Always close the tab, on success or failure, so pages do not pile up in memory
      await page.close();
    }
  }

  // 导出结果
  const fs = require('fs');
  fs.writeFileSync('./results.json', JSON.stringify(results, null, 2));
  console.log('\n结果已保存到 results.json');

  await browser.close();
}

batchScrape().catch(console.error);
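
A collection loop like this is easier to test if it does not depend on a real browser. The sketch below is one way to structure it: the page factory and the per-page scraping step are passed in as functions, so the loop can run against stubs. The names `scrapeSequentially`, `newPage`, and `scrape` are hypothetical.

```javascript
// Sketch: visit URLs one at a time and always close the page afterwards.
// `newPage` creates a page and `scrape(page, url)` navigates and extracts data.
// Both are injected, so the loop itself needs no specific browser API.
async function scrapeSequentially(urls, newPage, scrape) {
  const results = [];
  for (const url of urls) {
    const page = await newPage();
    try {
      const data = await scrape(page, url);
      results.push({ url, ...data });
    } catch (error) {
      // Record the failure and continue with the next URL
      results.push({ url, error: error.message });
    } finally {
      // Runs on success and on failure, so no page is left open
      await page.close();
    }
  }
  return results;
}
```

With the API used in this article, `newPage` could be `() => browser.newPage()`, and `scrape` could call `page.goto(url)` followed by `page.evaluate(...)`.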

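Scenario 2: Handling Login and Authentication

Many sites require a login before they show their full content. The example below logs in once and saves the session state to a file so that later runs can reuse it: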

// 登录处理示例
// src/examples/login-handler.js

const { AgentBrowser } = require('@vercel/agent-browser');

async function handleLogin() {
  const fs = require('fs');
  const authStatePath = './auth-state.json';

  // Check for a saved session first: creating a context from a
  // storageState file that does not exist would fail.
  if (fs.existsSync(authStatePath)) {
    console.log('Found saved login state; later runs can pass it as storageState.');
    return;
  }

  const browser = new AgentBrowser({ headless: true });
  await browser.launch();

  try {
    const context = await browser.newContext();
    const page = await context.newPage();

    // Run the login flow
    console.log('Starting login flow...');
    await page.goto('https://example.com/login');
    await page.waitForSelector('#username');

    // Credentials come from environment variables, never from source code
    await page.fill('#username', process.env.USERNAME);
    await page.fill('#password', process.env.PASSWORD);
    await page.click('#login-button');

    // Wait for an element that only appears after a successful login
    await page.waitForSelector('.user-profile', { timeout: 10000 });
    console.log('Login succeeded!');

    // Persist the session state for later runs
    await context.storageState({ path: authStatePath });
    console.log('Login state saved');
  } finally {
    // Close the browser even if the login flow throws
    await browser.close();
  }
}

handleLogin().catch(console.error);
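
A saved state file can outlive the session it describes, because cookies expire. Before reusing the file, it may be worth checking that the cookies you rely on are still valid. The sketch below assumes the file follows Playwright's storage-state JSON shape, where `cookies` is an array and each `expires` is a Unix timestamp in seconds (`-1` for session cookies). `hasUsableCookies` is a hypothetical helper.

```javascript
// Assumption: `state` follows Playwright's storageState shape:
//   { cookies: [{ name, value, domain, expires, ... }], origins: [...] }
// where `expires` is a Unix timestamp in seconds, or -1 for session cookies.
function hasUsableCookies(state, requiredNames, nowSeconds = Date.now() / 1000) {
  const cookies = Array.isArray(state?.cookies) ? state.cookies : [];
  // Every required cookie must be present and not yet expired
  return requiredNames.every(name =>
    cookies.some(c => c.name === name && (c.expires === -1 || c.expires > nowSeconds))
  );
}
```

In use, the state would be loaded with `JSON.parse(fs.readFileSync('./auth-state.json', 'utf8'))` and checked against the cookie names your target site uses. A cookie name such as `session_id` would be site-specific. If the check fails, delete the file and log in again.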

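Scenario 3: Error Handling and Retries

Unstable networks and pages that fail to load are common, so a robust automation script needs thorough error handling: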

// 异常处理与重试机制
// src/examples/retry-handler.js

const { AgentBrowser } = require('@vercel/agent-browser');

// 重试装饰器函数
async function withRetry(fn, options = {}) {
  const maxAttempts = options.maxAttempts || 3;
  const retryDelay = options.retryDelay || 1000;
  const retryCondition = options.retryCondition || (() => true);

  let lastError;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      console.log(`Attempt ${attempt}/${maxAttempts} failed: ${error.message}`);

      // Stop on the last attempt, or as soon as the error is not retryable
      if (attempt === maxAttempts || !retryCondition(error)) {
        break;
      }

      console.log(`Retrying in ${retryDelay}ms...`);
      await new Promise(resolve => setTimeout(resolve, retryDelay));
    }
  }

  throw lastError;
}

// 使用示例
async function robustScrape(url) {
  const browser = new AgentBrowser({ headless: true });
  await browser.launch();

  try {
    const page = await browser.newPage();

    // 包装导航操作
    await withRetry(
      () => page.goto(url, { waitUntil: 'networkidle' }),
      {
        maxAttempts: 3,
        retryDelay: 2000,
        retryCondition: error => error.message.includes('Timeout')
      }
    );

    // 包装元素等待
    await withRetry(
      () => page.waitForSelector('.content', { timeout: 5000 }),
      { maxAttempts: 5, retryDelay: 1000 }
    );

    const data = await page.evaluate(() => {
      return document.querySelector('.content')?.textContent;
    });

    return data;

  } finally {
    await browser.close();
  }
}

module.exports = { withRetry, robustScrape };
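
`withRetry` waits the same `retryDelay` between every attempt. When a server is overloaded, retrying at a fixed rhythm can make things worse. A common alternative is exponential backoff with jitter. The function below is a minimal sketch of the "full jitter" variant. `backoffDelay` and its option names are hypothetical.

```javascript
// Sketch: compute the delay for a given attempt using exponential backoff with full jitter.
// `random` is injectable so the function is deterministic in tests.
function backoffDelay(attempt, { baseDelay = 1000, maxDelay = 30000, random = Math.random } = {}) {
  // attempt 1 -> baseDelay, attempt 2 -> 2x, attempt 3 -> 4x, capped at maxDelay
  const ceiling = Math.min(maxDelay, baseDelay * 2 ** (attempt - 1));
  // Full jitter: pick a random delay between 0 and the ceiling
  return Math.floor(random() * ceiling);
}
```

Inside a retry loop, `backoffDelay(attempt)` would take the place of the constant delay passed to `setTimeout`.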

Best Practices and Performance Optimization

Code Organization Best Practices

Well-organized code is easier to maintain, test, and reuse:

// src/utils/page-actions.js
// 封装可复用的页面操作

class PageActions {
  constructor(page) {
    this.page = page;
  }

  // 安全点击（带重试）
  async safeClick(selector, options = {}) {
    const maxAttempts = options.maxAttempts || 3;

    for (let i = 0; i < maxAttempts; i++) {
      try {
        await this.page.waitForSelector(selector, { state: 'visible', timeout: 5000 });
        await this.page.click(selector);
        return true;
      } catch (error) {
        if (i === maxAttempts - 1) throw error;
        await this.page.waitForTimeout(500);
      }
    }
    return false;
  }

  // 滚动到元素并可见
  async scrollIntoView(selector) {
    await this.page.evaluate((sel) => {
      const element = document.querySelector(sel);
      if (element) element.scrollIntoView({ behavior: 'smooth', block: 'center' });
    }, selector);
    await this.page.waitForTimeout(300);
  }

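  // Clear the field, then type the text
  async clearAndType(selector, text) {
    await this.page.click(selector, { clickCount: 3 });  // triple-click to select the existing text
    await this.page.press(selector, 'Backspace');        // press needs the selector as its first argument
    await this.page.type(selector, text);
  }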

  // 等待元素消失
  async waitForElementHidden(selector, timeout = 5000) {
    await this.page.waitForSelector(selector, { state: 'hidden', timeout });
  }

  // Wait until the page is idle (no network activity), giving up after `timeout` ms
  async waitForPageIdle(timeout = 3000) {
    await this.page.waitForLoadState('networkidle', { timeout });
  }
}

module.exports = { PageActions };

Performance Optimization Tips

Performance matters most when a script processes a large number of pages:

// Performance optimization example
// src/examples/performance-optimization.js

const { AgentBrowser } = require('@vercel/agent-browser');

async function optimizedScraping() {
  const browser = new AgentBrowser({ headless: true });
  await browser.launch();

  try {
    // Share one context so all pages reuse the same session and cache.
    // Blocking image requests can speed up loading further, but pages that
    // depend on image content may break, so it is not enabled here.
    const context = await browser.newContext();

    // Pre-create one page per concurrent slot
    const pageCount = 3;
    const pages = await Promise.all(
      Array(pageCount).fill(null).map(() => context.newPage())
    );

    // Task queue
    const urls = ['https://example.com/page1', 'https://example.com/page2', /* ... */];
    const results = [];

    // Process the URLs in batches, one URL per pre-created page
    for (let i = 0; i < urls.length; i += pageCount) {
      const batch = urls.slice(i, i + pageCount);
      const batchResults = await Promise.all(
        batch.map((url, index) => processPage(pages[index], url))
      );
      results.push(...batchResults);
    }

    // Close the pages to release resources
    await Promise.all(pages.map(page => page.close()));
    return results;
  } finally {
    // Runs even if a page fails, so the browser process is not left behind
    await browser.close();
  }
}

async function processPage(page, url) {
  await page.goto(url, { waitUntil: 'domcontentloaded' });  // 使用更快的加载策略
  return await page.evaluate(() => {
    // 提取数据
    return { title: document.title };
  });
}

module.exports = { optimizedScraping };
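
Batch processing has one drawback: each batch waits for its slowest page before the next batch starts. A worker pool keeps every slot busy instead. The sketch below is a generic concurrency limiter that is independent of any browser API. `mapWithConcurrency` is a hypothetical name, and the `{ ok, value }` / `{ ok, error }` result shape is a design choice for this example.

```javascript
// Sketch: run `worker` over `items` with at most `limit` tasks in flight.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain
  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index], index) };
      } catch (error) {
        // Capture the failure so one bad item does not reject the whole run
        results[index] = { ok: false, error: error.message };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}
```

Results come back in input order regardless of completion order. A worker for this article's use case could open a page, call `processPage(page, url)`, and close the page in a `finally` block.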

Debugging Tips

A few debugging techniques make it much faster to locate problems in an automation script:

// 调试配置示例
// src/examples/debug-config.js

const { AgentBrowser } = require('@vercel/agent-browser');

// 使用有头模式便于调试
async function debugSession() {
  const browser = new AgentBrowser({
    headless: false,  // 显示浏览器窗口
    slowMo: 100,       // 操作延迟 100ms，便于观察
    devtools: true     // 打开开发者工具
  });

  await browser.launch();
  const page = await browser.newPage();

  // 开启详细日志
  page.on('console', msg => {
    console.log('[浏览器控制台]', msg.type(), msg.text());
  });

  page.on('pageerror', error => {
    console.error('[页面错误]', error.message);
  });

  page.on('request', request => {
    console.log('[请求]', request.url().substring(0, 100));
  });

  page.on('response', response => {
    if (response.status() >= 400) {
      console.log('[错误响应]', response.status(), response.url());
    }
  });

  // 你的测试代码...
  await page.goto('https://example.com');

  // 任意时刻截图
  await page.screenshot({ path: './debug-screenshot.png' });

  // 导出页面内容
  const content = await page.content();
  require('fs').writeFileSync('./debug-page.html', content);

  await browser.close();
}

module.exports = { debugSession };
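
Request logs often end up in CI output or shared bug reports, and URLs can carry tokens in their query strings. The sketch below masks such values before logging. The parameter list is an illustrative assumption rather than a complete inventory, and `redactUrl` is a hypothetical helper.

```javascript
// Sketch: mask sensitive query parameters before a URL is written to a log.
// The parameter names below are illustrative assumptions, not a complete list.
const SENSITIVE_PARAMS = ['token', 'access_token', 'password', 'api_key'];

function redactUrl(rawUrl, maxLength = 100) {
  let text;
  try {
    const url = new URL(rawUrl);
    for (const name of SENSITIVE_PARAMS) {
      if (url.searchParams.has(name)) url.searchParams.set(name, 'REDACTED');
    }
    text = url.toString();
  } catch {
    // Not a parseable absolute URL (for example a relative path); log it as-is
    text = String(rawUrl);
  }
  // Truncate long URLs so log lines stay readable
  return text.length > maxLength ? text.slice(0, maxLength) + '...' : text;
}
```

In the request handler, `redactUrl(request.url())` would replace `request.url().substring(0, 100)`.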

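Integrating with Other Tools

Integrating into an AI Agent System

The core value of Agent Browser is that it lets an AI Agent interact with real web pages. Below is a simplified integration example: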

// src/examples/ai-agent-integration.js
// 将 Agent Browser 集成到 AI Agent 系统

const { AgentBrowser } = require('@vercel/agent-browser');

// 简化的 AI 指令生成器
// 实际项目中可以使用 GPT-4、Claude 等大模型
class AIAgent {
  constructor(browser) {
    this.browser = browser;
    this.page = null;
  }

  async init() {
    this.page = await this.browser.newPage();
  }

  // 根据自然语言指令执行操作
  async execute(command) {
    console.log(`执行指令: ${command}`);

    // 将指令转换为可执行的操作序列
    const actions = this.parseCommand(command);

    for (const action of actions) {
      await this.performAction(action);
    }

    // 返回执行结果
    return {
      success: true,
      screenshot: await this.takeSnapshot(),
      message: '任务完成'
    };
  }

  // 简化的命令解析（实际应用中应使用 LLM）
  parseCommand(command) {
    const lowerCommand = command.toLowerCase();

    if (lowerCommand.includes('打开') || lowerCommand.includes('goto')) {
      const url = this.extractUrl(command);
      return [{ type: 'goto', url }];
    }

    if (lowerCommand.includes('点击')) {
      const selector = this.extractSelector(command);
      return [{ type: 'click', selector }];
    }

    if (lowerCommand.includes('输入')) {
      const { selector, text } = this.extractInput(command);
      return [{ type: 'input', selector, text }];
    }

    if (lowerCommand.includes('截图')) {
      return [{ type: 'screenshot' }];
    }

    return [];
  }

  async performAction(action) {
    switch (action.type) {
      case 'goto':
        await this.page.goto(action.url, { waitUntil: 'networkidle' });
        console.log(`  已打开: ${action.url}`);
        break;

      case 'click':
        await this.page.click(action.selector);
        console.log(`  已点击: ${action.selector}`);
        break;

      case 'input':
        await this.page.fill(action.selector, action.text);
        console.log(`  已输入: ${action.selector} = "${action.text}"`);
        break;

      case 'screenshot':
        await this.takeSnapshot();
        break;
    }
  }

  async takeSnapshot() {
    const timestamp = Date.now();
    const path = `./screenshots/agent-${timestamp}.png`;
    await this.page.screenshot({ path, fullPage: true });
    console.log(`  截图已保存: ${path}`);
    return path;
  }

  // 辅助方法
  extractUrl(command) {
    const urlMatch = command.match(/https?:\/\/[^\s]+/);
    return urlMatch ? urlMatch[0] : 'https://example.com';
  }

  extractSelector(command) {
    // 简化的选择器提取
    const text = command.replace(/点击|click/g, '').trim();
    return `text=${text}`;
  }

  extractInput(command) {
    // Simplified input extraction: take the text after "为" and strip surrounding whitespace
    return {
      selector: 'input',
      text: (command.split('为')[1] || '').trim()
    };
  }
}

// 使用示例
async function main() {
  const browser = new AgentBrowser({ headless: false });
  await browser.launch();

  const agent = new AIAgent(browser);
  await agent.init();

  // 执行一系列 AI 指令
  await agent.execute('打开 https://github.com');
  await agent.execute('点击 Sign in');
  await agent.execute('输入用户名为 test');
  await agent.execute('截图');

  console.log('\n演示完成');
  await browser.close();
}

main().catch(console.error);
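
When `parseCommand` is replaced by a real LLM, the model's output should be treated as untrusted input and validated before anything is executed. The sketch below assumes the model is asked to return a JSON array of actions using the same four action types as `performAction`. `parseActionPlan` and the schema table are hypothetical.

```javascript
// Sketch: validate an action plan (e.g. JSON returned by an LLM) before executing it.
// Keys are allowed action types; values are the string fields each type requires.
const ACTION_SCHEMA = new Map([
  ['goto', ['url']],
  ['click', ['selector']],
  ['input', ['selector', 'text']],
  ['screenshot', []]
]);

function parseActionPlan(jsonText) {
  let plan;
  try {
    plan = JSON.parse(jsonText);
  } catch (error) {
    throw new Error(`Plan is not valid JSON: ${error.message}`);
  }
  if (!Array.isArray(plan)) {
    throw new Error('Plan must be a JSON array of actions');
  }

  return plan.map((action, i) => {
    const required = ACTION_SCHEMA.get(action?.type);
    if (!required) {
      throw new Error(`Action ${i}: unknown type "${action?.type}"`);
    }
    for (const field of required) {
      if (typeof action[field] !== 'string') {
        throw new Error(`Action ${i}: missing string field "${field}"`);
      }
    }
    // Only allow web URLs, so a plan cannot open file:// or other schemes
    if (action.type === 'goto' && !/^https?:\/\//.test(action.url)) {
      throw new Error(`Action ${i}: url must start with http:// or https://`);
    }
    // Copy only the whitelisted fields; anything else the model added is dropped
    return Object.fromEntries([['type', action.type], ...required.map(f => [f, action[f]])]);
  });
}
```

The validated array can be passed to the existing `performAction` loop unchanged. Using a `Map` instead of a plain object avoids accidentally matching inherited keys such as `constructor`.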

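Integrating into an Existing Test Framework

If you already use Jest, Mocha, or another test framework, Agent Browser can be driven from its setup and teardown hooks: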

// src/test/examples.test.js
// Jest 测试集成示例

const { AgentBrowser } = require('@vercel/agent-browser');

// Jest 测试套件
describe('产品页面测试', () => {
  let browser;
  let page;

  // 测试前初始化
  beforeAll(async () => {
    browser = new AgentBrowser({ headless: true });
    await browser.launch();
  });

  // 测试后清理
  afterAll(async () => {
    if (browser) {
      await browser.close();
    }
  });

  // 每个测试前创建新页面
  beforeEach(async () => {
    page = await browser.newPage();
  });

  // 每个测试后关闭页面
  afterEach(async () => {
    if (page) {
      await page.close();
    }
  });

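  test('page title is correct', async () => {
    await page.goto('https://example.com');
    // toHaveTitle is a Playwright Test matcher, not a Jest one, so read the title and use Jest's toMatch
    expect(await page.title()).toMatch(/Example/);
  });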

  test('产品列表不为空', async () => {
    await page.goto('https://example.com/products');
    await page.waitForSelector('.product-item');
    const items = await page.$$('.product-item');
    expect(items.length).toBeGreaterThan(0);
  });

  test('可以添加产品到购物车', async () => {
    await page.goto('https://example.com/product/1');
    await page.click('#add-to-cart');
    await page.waitForSelector('.cart-badge');
    const badge = await page.textContent('.cart-badge');
    expect(badge).toBe('1');
  });
});

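Summary and Outlook

This guide has covered the core concepts and practical techniques of Vercel Agent Browser: environment setup, page navigation and element operations, form handling, and batch data collection.

Key takeaways:

Agent Browser's main strength is a concise API that wraps browser automation in readable method calls. Navigation, element location, interaction, and screenshots each take only a few lines of code.

Multi-context management lets you simulate several independent browser sessions, which suits workloads that run in parallel. Combined with explicit waiting strategies and error handling, it keeps automation scripts stable.

In practice, Agent Browser is useful beyond automated testing and data collection. Its greater potential lies in pairing it with an AI Agent: once an AI can operate a browser the way a person does, many business processes that could not be automated before become feasible.

Project resources:

The resources referenced in this article, plus suggestions for further study:

Vercel Agent Browser official repository: https://github.com/vercel-labs/agent-browser

Playwright official documentation (Agent Browser is built on it): https://playwright.dev/docs/intro

MDN Web Docs CSS selectors reference: https://developer.mozilla.org/zh-CN/docs/Web/CSS/CSS_Selectors

If you are interested in AI Agents and browser automation, these related projects are also worth following:

Playwright (browser automation framework)

Puppeteer (Node.js browser automation library developed by Google)

Selenium (cross-browser automated testing framework)

AutoGPT (experimental autonomous AI Agent project)

LangChain (framework for building LLM applications)

Next steps:

Once you are comfortable with this material, you can go further in several directions: study Playwright's advanced features, such as network interception and request mocking; combine Agent Browser with large language models to build a more capable AI Agent; apply design patterns to automation projects to improve maintainability and extensibility; and integrate Agent Browser into your own projects to solve real problems.

I hope this article helps you take a solid step forward in AI Agents and browser automation. The field changes quickly, so keep learning.

This document will be updated over time. Questions and suggestions are welcome in the comments.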
