Running AgentBrowser (Browser) Inside a Sandbox

Build the sandbox template.

Download the image.

wget https://repo.openeuler.org/openEuler-24.03-LTS-SP3/docker_img/aarch64/openEuler-docker.aarch64.tar.xz

Decompress the archive.
```
xz -d openEuler-docker.aarch64.tar.xz
```

Load the image.

docker load -i openEuler-docker.aarch64.tar

Create and start the container.

docker run -itd --name search-image openeuler-24.03-lts-sp3:latest /bin/bash
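
The four steps above can also be chained in one script. The following is a minimal sketch, not a required procedure: the `run` helper, the `prepare_image` function, and the `DRY_RUN` variable are illustrative names introduced here. By default the script only prints each command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch: the download, decompress, load, and run steps as one script.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

IMAGE_URL="https://repo.openeuler.org/openEuler-24.03-LTS-SP3/docker_img/aarch64/openEuler-docker.aarch64.tar.xz"
IMAGE_TAG="openeuler-24.03-lts-sp3:latest"   # image name used in the docker run step above
CONTAINER="search-image"

# Print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

prepare_image() {
    archive="${IMAGE_URL##*/}"   # file name part of the URL (.tar.xz)
    tarball="${archive%.xz}"     # name left behind by xz -d (.tar)
    run wget "$IMAGE_URL" &&
    run xz -d "$archive" &&
    run docker load -i "$tarball" &&
    run docker run -itd --name "$CONTAINER" "$IMAGE_TAG" /bin/bash
}

prepare_image
```

Because each step is joined with `&&`, a failed download or load stops the script before a container is created.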

Enter the container and install the components that agent-browser requires.

Enter the container.
```
docker exec -it search-image bash
```

Install the following components.

dnf install -y nodejs

npm install -g agent-browser

npx playwright install

npx playwright install chromium

dnf install -y nss nspr atk at-spi2-atk gtk3 alsa-lib libgbm libdrm mesa-libEGL
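
One way to check that the packages above cover Chromium's runtime dependencies is to ask `ldd` which shared libraries remain unresolved. The sketch below assumes the Chromium path quoted later in SKILL.md (the `chromium-1223` build number may differ on other installations); `list_missing_libs` is a helper name introduced here for illustration.

```shell
#!/bin/sh
# Sketch: list shared libraries that the Playwright-installed Chromium cannot resolve.
# CHROME is an assumption taken from the path used later in this section.
CHROME="${CHROME:-/root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome}"

# Read ldd output on stdin and print the name of every unresolved library.
list_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

if [ -x "$CHROME" ]; then
    missing="$(ldd "$CHROME" | list_missing_libs)"
    if [ -n "$missing" ]; then
        echo "Unresolved libraries:"
        echo "$missing"
    else
        echo "All shared libraries resolved."
    fi
else
    echo "Chromium binary not found at $CHROME"
fi
```

If any library is reported, install the package that provides it with `dnf` and run the check again.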

Install the components required by the E2B sandbox, then build the image.

Install the following components with yum.

yum install -y wget systemd systemd-sysv openssh-server sudo chrony linuxptp socat curl iputils bind-utils iproute nc tcpdump passwd && yum clean all && rm -rf /var/cache/yum /var/tmp/* /tmp/*

Install the following component with wget.

wget -O /usr/local/bin/websocat https://github.com/vi/websocat/releases/latest/download/websocat.aarch64-unknown-linux-musl && chmod a+x /usr/local/bin/websocat && websocat --version
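
Before committing the image, it can help to confirm that the installed tools are actually on `PATH`. The sketch below is illustrative only: `missing_bins` is a helper introduced here, and the command names are assumptions about what each package provides (for example `sshd` from openssh-server, `ip` from iproute, `dig` from bind-utils).

```shell
#!/bin/sh
# Sketch: report which expected commands are not on PATH.
# The command list is an assumption derived from the package names above.
missing_bins() {
    for bin in "$@"; do
        command -v "$bin" >/dev/null 2>&1 || echo "$bin"
    done
}

missing="$(missing_bins wget sudo sshd chronyd socat curl ping dig ip nc tcpdump websocat)"
if [ -n "$missing" ]; then
    echo "Missing commands:"
    echo "$missing"
else
    echo "All expected commands are available."
fi
```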

Run exit to leave the container, then build the image.

docker commit <container-name> <image-name>:<version>
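
As a concrete illustration of the commit step, the sketch below fills in the placeholders. The container name `search-image` comes from the `docker run` step earlier; the image name `agent-browser-sandbox` and version `v1` are hypothetical values chosen for this example.

```shell
#!/bin/sh
# Sketch: compose the commit command for the prepared container.
# IMAGE_NAME and IMAGE_VERSION are hypothetical; CONTAINER matches the earlier docker run --name value.
CONTAINER="${CONTAINER:-search-image}"
IMAGE_NAME="${IMAGE_NAME:-agent-browser-sandbox}"
IMAGE_VERSION="${IMAGE_VERSION:-v1}"

# Print the commit command so it can be reviewed before running it.
commit_cmd() {
    echo "docker commit $CONTAINER $IMAGE_NAME:$IMAGE_VERSION"
}

commit_cmd
# After reviewing the printed line, run it with: eval "$(commit_cmd)"
# Then confirm the image exists with: docker images "$IMAGE_NAME"
```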

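Create a container, open the Lobster (龙虾) agent, and bind the sandbox template you built. For details, see "Running Python Code Inside a Sandbox" and "Deploying the E2B Sandbox Service (Optional)".

The stock agent-browser SKILL (https://clawhub.ai/matrixy/agent-browser-clawdbot#files) does not account for the E2B sandbox environment, so its SKILL.md must be modified. The Agent performs this task automatically, and the modified result is: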

---
name: Agent Browser (local-exec / E2B Sandbox)
description: Headless browser automation via agent-browser CLI inside E2B sandbox using local-exec. Chrome path and launch args are pre-configured.
read_when:
  - Automating web interactions inside E2B sandbox
  - Extracting structured data from pages via local-exec
  - Filling forms programmatically in sandboxed browser
  - Testing web UIs via local-exec tool
metadata: {"clawdbot":{"emoji":"","requires":{"bins":["agent-browser","sudo"]}}}
allowed-tools: Bash(agent-browser:*)
---
 
# Browser Automation with agent-browser (E2B Sandbox)
 
## Execution Environment
 
All `agent-browser` commands run inside the **E2B cloud sandbox** via `local-exec` (kind: bash).
 
**Key constraints:**
- Chrome is installed by Playwright at: `/root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome`
- Chrome requires `sudo` to run (sandbox user has no permission to `/root/.cache`)
- Chrome **must** be launched with: `--no-sandbox --disable-gpu --disable-dev-shm-usage`
- Must run in **headless** mode: `--headed false`
- Always close existing daemon before starting with new options: `sudo agent-browser close --all`
 
**Command template:**
 
```bash
sudo agent-browser close --all 2>&1 && \
sudo agent-browser \
  --executable-path /root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome \
  --args "--no-sandbox,--disable-gpu,--disable-dev-shm-usage" \
  --headed false \
  <command> <args>
```
 
## Quick Start
 
### Via local-exec (tool call)
 
```
local-exec(kind="bash", command="sudo agent-browser close --all 2>&1 && sudo agent-browser --executable-path /root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome --args \"--no-sandbox,--disable-gpu,--disable-dev-shm-usage\" --headed false open https://www.example.com 2>&1")
```
 
### Core workflow
 
1. **Close daemon + navigate**: `sudo agent-browser close --all && sudo agent-browser --executable-path <path> --args "<chrome-args>" --headed false open <url>`
2. **Snapshot**: `sudo agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. **Interact** using refs from the snapshot
4. **Re-snapshot** after navigation or significant DOM changes
 
## Commands
 
### Navigation
 
```bash
sudo agent-browser close --all && sudo agent-browser --executable-path /root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome --args "--no-sandbox,--disable-gpu,--disable-dev-shm-usage" --headed false open <url>
sudo agent-browser back
sudo agent-browser forward
sudo agent-browser reload
sudo agent-browser close --all
```
 
### Snapshot (page analysis)
 
```bash
sudo agent-browser snapshot            # Full accessibility tree
sudo agent-browser snapshot -i         # Interactive elements only (recommended)
sudo agent-browser snapshot -c         # Compact output
sudo agent-browser snapshot -d 3       # Limit depth to 3
sudo agent-browser snapshot -s "#main" # Scope to CSS selector
```
 
### Interactions (use @refs from snapshot)
 
```bash
sudo agent-browser click @e1           # Click
sudo agent-browser dblclick @e1        # Double-click
sudo agent-browser focus @e1           # Focus element
sudo agent-browser fill @e2 "text"     # Clear and type
sudo agent-browser type @e2 "text"     # Type without clearing
sudo agent-browser press Enter         # Press key
sudo agent-browser press Control+a     # Key combination
sudo agent-browser hover @e1           # Hover
sudo agent-browser check @e1           # Check checkbox
sudo agent-browser uncheck @e1         # Uncheck checkbox
sudo agent-browser select @e1 "value"  # Select dropdown
sudo agent-browser scroll down 500     # Scroll page
sudo agent-browser scrollintoview @e1  # Scroll element into view
```
 
### Get information
 
```bash
sudo agent-browser get text @e1        # Get element text
sudo agent-browser get html @e1        # Get innerHTML
sudo agent-browser get value @e1       # Get input value
sudo agent-browser get attr @e1 href   # Get attribute
sudo agent-browser get title           # Get page title
sudo agent-browser get url             # Get current URL
sudo agent-browser get count ".item"   # Count matching elements
sudo agent-browser get box @e1         # Get bounding box
```
 
### Check state
 
```bash
sudo agent-browser is visible @e1      # Check if visible
sudo agent-browser is enabled @e1      # Check if enabled
sudo agent-browser is checked @e1      # Check if checked
```
 
### Screenshots & PDF
 
```bash
sudo agent-browser screenshot          # Screenshot to stdout
sudo agent-browser screenshot path.png # Save to file
sudo agent-browser screenshot --full   # Full page
sudo agent-browser pdf output.pdf      # Save as PDF
```
 
### Wait
 
```bash
sudo agent-browser wait @e1                     # Wait for element
sudo agent-browser wait 2000                    # Wait milliseconds
sudo agent-browser wait --text "Success"        # Wait for text
sudo agent-browser wait --url "/dashboard"      # Wait for URL pattern
sudo agent-browser wait --load networkidle      # Wait for network idle
```
 
### Mouse control
 
```bash
sudo agent-browser mouse move 100 200      # Move mouse
sudo agent-browser mouse down left         # Press button
sudo agent-browser mouse up left           # Release button
sudo agent-browser mouse wheel 100         # Scroll wheel
```
 
### Semantic locators (alternative to refs)
 
```bash
sudo agent-browser find role button click --name "Submit"
sudo agent-browser find text "Sign In" click
sudo agent-browser find label "Email" fill "user@test.com"
sudo agent-browser find first ".item" click
sudo agent-browser find nth 2 "a" text
```
 
### Browser settings
 
```bash
sudo agent-browser set viewport 1920 1080      # Set viewport size
sudo agent-browser set device "iPhone 14"      # Emulate device
sudo agent-browser set geo 37.7749 -122.4194   # Set geolocation
sudo agent-browser set offline on              # Toggle offline mode
sudo agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
sudo agent-browser set media dark              # Emulate color scheme
```
 
### Cookies & Storage
 
```bash
sudo agent-browser cookies                     # Get all cookies
sudo agent-browser cookies set name value      # Set cookie
sudo agent-browser cookies clear               # Clear cookies
sudo agent-browser storage local               # Get all localStorage
sudo agent-browser storage local key           # Get specific key
sudo agent-browser storage local set k v       # Set value
sudo agent-browser storage local clear         # Clear all
```
 
### Network
 
```bash
sudo agent-browser network route <url>              # Intercept requests
sudo agent-browser network route <url> --abort      # Block requests
sudo agent-browser network route <url> --body '{}'  # Mock response
sudo agent-browser network unroute [url]            # Remove routes
sudo agent-browser network requests                 # View tracked requests
sudo agent-browser network requests --filter api    # Filter requests
```
 
### Tabs
 
```bash
sudo agent-browser tab                 # List tabs
sudo agent-browser tab new [url]       # New tab
sudo agent-browser tab 2               # Switch to tab
sudo agent-browser tab close           # Close tab
```
 
### Frames
 
```bash
sudo agent-browser frame "#iframe"     # Switch to iframe
sudo agent-browser frame main          # Back to main frame
```
 
### JavaScript
 
```bash
sudo agent-browser eval "document.title"   # Run JavaScript
```
 
### JSON output (for parsing)
 
Add `--json` for machine-readable output:
 
```bash
sudo agent-browser snapshot -i --json
sudo agent-browser get text @e1 --json
```
 
## Example: Form submission
 
```bash
# Step 1: Navigate
sudo agent-browser close --all && sudo agent-browser --executable-path /root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome --args "--no-sandbox,--disable-gpu,--disable-dev-shm-usage" --headed false open https://example.com/form 2>&1
 
# Step 2: Snapshot to get refs
sudo agent-browser snapshot -i 2>&1
# Output: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
 
# Step 3: Fill and submit
sudo agent-browser fill @e1 "user@example.com"
sudo agent-browser fill @e2 "password123"
sudo agent-browser click @e3
sudo agent-browser wait --load networkidle
 
# Step 4: Check result
sudo agent-browser snapshot -i 2>&1
```
 
## Troubleshooting
 
| Symptom | Fix |
|---|---|
| `Chrome not found` | Add `--executable-path /root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome` |
| `Permission denied` | Use `sudo` — Chrome binary is under `/root/.cache` |
| `Operation timed out` | Add `--args "--no-sandbox,--disable-gpu,--disable-dev-shm-usage"` and `--headed false` |
| `--executable-path ignored` | Daemon already running with old config. Run `sudo agent-browser close --all` first |
| Element not found | Run `snapshot -i` again — refs change after navigation |
 
## Notes
 
- **Refs are stable per page load** but change on navigation. Always snapshot after navigating.
- **Use `fill`** instead of `type` for input fields to clear existing text first.
- **Always close daemon** before re-launching with different Chrome options.
- **Command chaining** works in a single `local-exec` call via `&&`.

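The SKILL.md above was produced through the following process. If you only care about how to use agent-browser, you can skip the rest of this section:

Ask CLAW to run the following command in the sandbox environment to open a web page.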
```
sudo agent-browser open www.baidu.com --executable-path /root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome
```
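Check whether the result matches expectations. If the command reports an error, ask CLAW to investigate the cause and fix it on its own.

Place the original agent-browser SKILL.md in the workspace and ask the Agent to modify it so that it supports invocation through the local-exec sandbox.

Ask the Agent to follow the SKILL strictly while completing a web browsing task, to verify that SKILL.md is correct. If it gets stuck, ask CLAW to investigate the cause on its own and update SKILL.md.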
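
When checking the Agent's work by hand, it can be convenient to generate the full launch command from the SKILL.md template instead of retyping it. The sketch below is one possible way to do that; `ab_open_cmd` is a hypothetical helper name, while the flags and the Chromium path are the ones listed in SKILL.md.

```shell
#!/bin/sh
# Sketch: build the "close daemon, then open URL" command from the SKILL.md template.
# ab_open_cmd is a hypothetical helper; the path and flags are copied from SKILL.md.
CHROME="/root/.cache/ms-playwright/chromium-1223/chrome-linux/chrome"
CHROME_ARGS="--no-sandbox,--disable-gpu,--disable-dev-shm-usage"

# Print the complete command line for the URL given as the first argument.
ab_open_cmd() {
    url="$1"
    printf 'sudo agent-browser close --all 2>&1 && sudo agent-browser --executable-path %s --args "%s" --headed false open %s 2>&1\n' \
        "$CHROME" "$CHROME_ARGS" "$url"
}

ab_open_cmd "https://www.example.com"
```

The printed string has the same shape as the `command` argument of the `local-exec` call in the Quick Start section of SKILL.md, so it can be compared against what the Agent actually ran.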

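Verify the tool. Tell the Lobster: "Please use the agent-browser SKILL.md you have already modified, then see whether you can find out which SKILL is ranked 37th on clawhub, and tell me what the description in its SKILL.md is."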
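
To cross-check the answer, the `description` field can be read directly from a downloaded SKILL.md. The sketch below assumes the front matter layout shown earlier in this section (single-line `key: value` entries between `---` lines); `skill_field` is a hypothetical helper, not part of agent-browser.

```shell
#!/bin/sh
# Sketch: read one field from the front matter of a SKILL.md file.
# skill_field is a hypothetical helper; it handles only single-line "key: value" entries.
skill_field() {
    file="$1"
    key="$2"
    awk -v key="$key" '
        /^---[ \t]*$/ { fence++; next }          # count the --- delimiters
        fence == 1 && index($0, key ":") == 1 {  # inside the front matter, line starts with "key:"
            sub(/^[^:]*:[ \t]*/, "")             # drop the key and keep the value
            print
            exit
        }
        fence >= 2 { exit }                      # stop after the closing ---
    ' "$file"
}

# Example with a temporary file that mirrors the front matter shown in this section.
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
---
name: Agent Browser (local-exec / E2B Sandbox)
description: Headless browser automation via agent-browser CLI inside E2B sandbox using local-exec.
---
# Browser Automation with agent-browser (E2B Sandbox)
EOF
skill_field "$tmp" description
rm -f "$tmp"
```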
