Dropbox Nova: Getting AI Coding Agents to Actually Run Inside an Engineering Organization

Estimated reading time: 11 minutes

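Once AI coding agents go from "personal tool" to "team infrastructure", the question is no longer "can it write code" but "how do you reliably schedule, observe, and govern dozens of agents working at the same time". Dropbox recently made public an internal platform called Nova, a system built specifically to orchestrate and operate AI coding agents, moving agents from experimental toys to one stage of the engineering pipeline.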

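The gap between a single-machine agent and agents at scale

A single developer using Copilot to complete a few lines of code already gets a smooth experience. But once you want agents to batch-process repository migrations, auto-fix lint errors, or refactor interfaces across services, things change:

Concurrency conflicts: multiple agents edit the same file at once, and merge disasters become routine.
Permission boundaries: which repositories may an agent modify, and which branches may it push to? Without a unified policy, everything falls back to manual approval.
Unverifiable results: did the agent-generated code run the tests? Did it introduce new bugs? There is no closed feedback loop.
Runaway cost: every LLM call burns tokens, and scheduling without budget awareness is a bottomless pit.

Nova's core role is to fill this gap: it is not yet another agent framework, but a "runtime platform" that sits on top of agents.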

Nova's orchestration approach

According to public information, Nova's design revolves around a few key capabilities:

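Task splitting and assignment. A large engineering task (for example, "convert every Python 2 print statement into a function call") is split into subtasks and assigned to different agent instances that run in parallel, while preserving the dependency order between subtasks and keeping file conflicts to a minimum.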
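The article does not describe how Nova schedules subtasks, so the following is only a minimal sketch of one way to honor both constraints named above. It assumes each subtask declares which subtasks it depends on and which files it edits; `plan_waves`, `deps`, and `files` are hypothetical names. It uses the standard-library `graphlib` module (Python 3.9+) for the dependency ordering.

```python
from graphlib import TopologicalSorter


def plan_waves(deps: dict[str, set[str]], files: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into waves that can safely run in parallel.

    deps maps a subtask id to the ids that must finish before it starts.
    files maps a subtask id to the files it edits.
    Inside one wave, no two subtasks touch the same file.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: list[list[str]] = []
    pending: list[str] = []  # ready subtasks that lost a file conflict earlier
    while sorter.is_active():
        pending += sorted(sorter.get_ready())
        wave, locked, deferred = [], set(), []
        for sid in pending:
            touched = files.get(sid, set())
            if locked & touched:
                deferred.append(sid)  # file already claimed in this wave
            else:
                wave.append(sid)
                locked |= touched
        sorter.done(*wave)  # unblocks subtasks that depend on this wave
        waves.append(wave)
        pending = deferred
    return waves
```

Each wave can be handed to a worker pool as a batch. A subtask that loses a file conflict is not dropped; it moves to the front of the next wave.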

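Sandboxed execution. Agents do not work directly on your main branch. They operate in an isolated sandbox with its own working tree and its own dependency installation space, and changes are merged through a review process only after the work is complete.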

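Policy and permission control. Which repositories agents may modify, what level of change is allowed (tests only vs. business logic), and the maximum token budget: these policies are configured centrally and loaded automatically when an agent instance starts.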
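To make the "tests only vs. business logic" distinction concrete, here is a small sketch of how a change set might be checked against a configured level. The path convention in `is_test_file` (a `tests` directory, or `test_*.py` / `*_test.py` file names) is an assumption made for illustration, not something the article attributes to Nova, and the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Change levels ordered from least to most permissive
LEVELS = ["test-only", "business-logic", "refactor"]


def is_test_file(path: str) -> bool:
    """Assumed convention: anything under tests/ or named like a test module."""
    p = PurePosixPath(path)
    return "tests" in p.parts or p.name.startswith("test_") or p.name.endswith("_test.py")


def required_level(changed_files: list[str]) -> str:
    """Lowest level that covers the change set, judged by file paths only.

    Paths cannot tell a refactor from an ordinary logic change, so this
    sketch never returns "refactor".
    """
    if all(is_test_file(f) for f in changed_files):
        return "test-only"
    return "business-logic"


def is_allowed(changed_files: list[str], max_level: str) -> bool:
    """True if the change set fits within the repository's maximum level."""
    return LEVELS.index(required_level(changed_files)) <= LEVELS.index(max_level)
```

A check like this would run on the agent's diff after it finishes, so that a task declared as "test-only" cannot quietly touch production code.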

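Observability and rollback. Every agent run has a complete log: which files changed, which tests ran, how many tokens were consumed. If the result falls short, a one-click rollback restores the state from before the agent ran.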
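As an illustration of what such a run log could contain, the sketch below records the fields listed above plus the commit the agent started from, which is what makes rollback possible. `RunRecord` and its field names are hypothetical; Nova's real log format is not public.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunRecord:
    """One agent run: enough detail to audit it and to undo it."""
    run_id: str
    base_commit: str  # commit the worktree pointed at before the agent ran
    files_changed: list[str] = field(default_factory=list)
    tests_run: list[str] = field(default_factory=list)
    tokens_used: int = 0
    passed: bool = False

    def to_json(self) -> str:
        """Serialize the record for a log file or report."""
        return json.dumps(asdict(self), ensure_ascii=False)

    def rollback_command(self, worktree: str) -> list[str]:
        """git command that resets the agent's worktree to the pre-run commit.

        Only tracked files are reset; files the agent newly created would
        additionally need `git clean -fd`.
        """
        return ["git", "-C", worktree, "reset", "--hard", self.base_commit]
```

The command list is meant to be passed to `subprocess.run`. Building it separately from executing it keeps the rollback step easy to log and to review.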

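Building your own lightweight agent orchestration platform

Nova is an internal Dropbox system and cannot be used directly from outside. But its core ideas (task splitting, sandbox isolation, policy control, observability and rollback) can all be implemented on your own. Below is a minimal example of the orchestration layer.

Project structure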

agent-orchestrator/
├── config/
│   ├── policies.yaml      # Agent policy configuration
│   └── tasks.yaml         # Task definitions
├── orchestrator.py        # Orchestration core
├── sandbox.py             # Sandbox management
└── runner.py              # Agent runner

Policy configuration: `config/policies.yaml`

# Define what agents may and may not do
policies:
  repo-allowlist:
    - "backend-api"
    - "frontend-web"
    - "shared-libs"

  change-levels:
    backend-api:
      max: "test-only"        # Only test files may be changed
    frontend-web:
      max: "business-logic"   # Business code may be changed
    shared-libs:
      max: "refactor"         # Refactor-level changes are allowed

  budgets:
    max-tokens-per-run: 50000
    max-concurrent-agents: 5
    max-retries: 2

  sandbox:
    base-image: "python:3.11-slim"
    mount-repo: true
    network-access: false     # No network access while an agent runs

Task definition: `config/tasks.yaml`

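tasks:
  - id: py2-print-migration
    description: "Convert Python 2 print statements to function calls"
    # Note: policies.yaml caps backend-api at "test-only", so check_policy
    # rejects this "refactor" task until that cap is raised or backend-api
    # is removed from this list.
    repos: ["backend-api", "shared-libs"]
    change-level: "refactor"
    prompt: |
      Find every Python file that uses the `print ...` statement syntax
      and convert it to the `print(...)` function-call form.
      Do not change occurrences of print in comments or docstrings.
      After changing each file, run the unit tests related to that file to confirm nothing broke.
    subtask-strategy: "file-per-agent"   # One agent per file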
Orchestration core: `orchestrator.py`

"""Lightweight agent orchestrator: split tasks, assign sandboxes, collect results."""

import subprocess
import json
import yaml
from pathlib import Path
from concurrent.futures import ThreadPoolExecutor, as_completed

CONFIG_DIR = Path("config")

def load_config(name: str) -> dict:
    with open(CONFIG_DIR / name) as f:
        return yaml.safe_load(f)

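def check_policy(task: dict, policies: dict) -> bool:
    """Check whether a task complies with the policies."""
    # Change levels ordered from least to most permissive
    levels = ["test-only", "business-logic", "refactor"]
    if task["change-level"] not in levels:
        print(f"❌ Unknown change level: {task['change-level']}")
        return False
    requested = levels.index(task["change-level"])
    for repo in task["repos"]:
        if repo not in policies["policies"]["repo-allowlist"]:
            print(f"❌ Repo {repo} is not on the allowlist")
            return False
        allowed_level = policies["policies"]["change-levels"].get(repo, {}).get("max")
        # A repo with no valid max level configured permits no changes at all
        if allowed_level not in levels or requested > levels.index(allowed_level):
            print(f"❌ {repo} does not allow {task['change-level']}-level changes")
            return False
    return True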

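def split_subtasks(task: dict) -> list[dict]:
    """Split a task into subtasks (simplified: one subtask per file)."""
    strategy = task.get("subtask-strategy", "file-per-agent")
    if strategy != "file-per-agent":
        # Only the per-file strategy is implemented in this example
        raise NotImplementedError(f"Unsupported subtask strategy: {strategy}")
    subtasks = []
    for repo in task["repos"]:
        repo_path = Path(f"repos/{repo}")
        # Sorted so the subtask order is stable between runs
        for f in sorted(repo_path.rglob("*.py")):
            subtasks.append({
                "task_id": task["id"],
                "repo": repo,
                "file": str(f.relative_to(repo_path)),
                "prompt": task["prompt"],
                "change_level": task["change-level"],
            })
    return subtasks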

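def run_agent_in_sandbox(subtask: dict, policies: dict) -> dict:
    """Run a single agent subtask inside a sandbox container."""
    budget = policies["policies"]["budgets"]
    sandbox = policies["policies"]["sandbox"]
    # Absolute paths: older Docker versions reject relative bind-mount sources
    repo_dir = Path("repos", subtask["repo"]).resolve()
    runner = Path("runner.py").resolve()
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{repo_dir}:/workspace",
        # The base image does not ship runner.py, so mount it read-only
        "-v", f"{runner}:/agent/runner.py:ro",
        "-e", f"TARGET_FILE={subtask['file']}",
        "-e", f"PROMPT={subtask['prompt']}",
        "-e", f"MAX_TOKENS={budget['max-tokens-per-run']}",
    ]
    if not sandbox.get("network-access", False):
        # Enforce network-access: false. A runner that calls a hosted LLM API
        # needs network-access: true in policies.yaml.
        cmd += ["--network", "none"]
    cmd += [sandbox["base-image"], "python", "/agent/runner.py"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
    except subprocess.TimeoutExpired:
        # Report a timeout as a failed subtask instead of crashing the whole run
        return {"subtask": subtask, "success": False,
                "stdout": "", "stderr": "timed out after 300s"}
    return {
        "subtask": subtask,
        "success": result.returncode == 0,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }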

def orchestrate():
    policies = load_config("policies.yaml")
    tasks_cfg = load_config("tasks.yaml")

    for task in tasks_cfg["tasks"]:
        if not check_policy(task, policies):
            continue

        subtasks = split_subtasks(task)
        if not subtasks:
            # ThreadPoolExecutor rejects max_workers=0, so skip empty tasks
            print(f"⚠️ Task {task['id']}: no matching files, skipping")
            continue
        budget = policies["policies"]["budgets"]
        max_workers = min(budget["max-concurrent-agents"], len(subtasks))

        print(f"🚀 Task {task['id']}: {len(subtasks)} subtasks, concurrency {max_workers}")

        results = []
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {
                pool.submit(run_agent_in_sandbox, st, policies): st
                for st in subtasks
            }
            for future in as_completed(futures):
                res = future.result()
                status = "✅" if res["success"] else "❌"
                print(f"  {status} {res['subtask']['repo']}::{res['subtask']['file']}")
                results.append(res)

        # Write the summary report
        report_path = Path(f"reports/{task['id']}.json")
        report_path.parent.mkdir(parents=True, exist_ok=True)
        with open(report_path, "w", encoding="utf-8") as f:
            json.dump(results, f, indent=2, ensure_ascii=False)
        print(f"📊 Report written to {report_path}")

if __name__ == "__main__":
    orchestrate()

How to run

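# 1. Prepare the repository directories
mkdir -p repos/backend-api repos/frontend-web repos/shared-libs

# 2. Add the code from your real projects (your-org/backend-api is a placeholder; use your own clone URL)
git clone --depth 1 your-org/backend-api repos/backend-api

# 3. Run the orchestrator
pip install pyyaml
python orchestrator.py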

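Note: the runner.py referenced above (the agent runner inside the sandbox) is something you need to implement yourself. It is responsible for receiving the prompt and the target file, calling the LLM API, writing the changes back, and running the tests. This is the part that integrates with a specific LLM service (OpenAI, Anthropic, or a self-hosted model), and the logic differs by platform. The orchestration framework above only handles "how to schedule", not "how the agent writes code".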
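For orientation, a runner could be shaped roughly like the sketch below. It reads the three environment variables that `orchestrator.py` passes to the container (`TARGET_FILE`, `PROMPT`, `MAX_TOKENS`). The `call_llm` function is a hypothetical placeholder that returns the source unchanged; it stands in for whichever client library you use, and the test-running step is omitted.

```python
from pathlib import Path


def call_llm(prompt: str, source: str, max_tokens: int) -> str:
    """Hypothetical placeholder: swap in a real LLM client call here.

    This stub returns the source unchanged so the skeleton runs offline.
    """
    return source


def run(env: dict[str, str], workspace: Path) -> int:
    """Apply the prompt to one target file and return a process exit code."""
    target = workspace / env["TARGET_FILE"]
    original = target.read_text(encoding="utf-8")
    updated = call_llm(env["PROMPT"], original, int(env["MAX_TOKENS"]))
    if updated != original:
        target.write_text(updated, encoding="utf-8")
        print(f"modified {env['TARGET_FILE']}")
    else:
        print(f"no change {env['TARGET_FILE']}")
    # A real runner would run the file's unit tests here and return
    # a non-zero code on failure so the orchestrator marks the subtask ❌.
    return 0
```

Inside the container, the entry point would call `run(dict(os.environ), Path("/workspace"))` and pass the result to `sys.exit`, which is how the orchestrator's `returncode == 0` check learns whether the subtask succeeded.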

Key trade-offs when putting this into practice

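The cost of sandbox isolation. Docker sandboxes have startup latency, and launching a container for every subtask is slow. Dropbox presumably uses a pool of pre-warmed containers. A lighter option is to replace Docker with git worktree: each agent works in its own worktree with zero container overhead, at the price of weaker isolation than a full sandbox.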
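The worktree variant can be sketched as two small helpers that build the git commands. The directory layout (`worktrees/<repo>-<agent>` next to the repository) and the `agent/<id>` branch naming are assumptions chosen for illustration. Paths are resolved to absolute form because `git -C` changes the working directory before the worktree path is interpreted.

```python
from pathlib import Path


def worktree_path(repo: Path, agent_id: str) -> Path:
    """Assumed layout: a worktrees/ directory next to the repository."""
    return repo.resolve().parent / "worktrees" / f"{repo.name}-{agent_id}"


def worktree_add_cmd(repo: Path, agent_id: str, base_ref: str = "HEAD") -> list[str]:
    """Command that creates an isolated worktree on a new branch for one agent."""
    return [
        "git", "-C", str(repo.resolve()),
        "worktree", "add",
        "-b", f"agent/{agent_id}",          # new branch, one per agent
        str(worktree_path(repo, agent_id)),
        base_ref,
    ]


def worktree_remove_cmd(repo: Path, agent_id: str) -> list[str]:
    """Command that discards the agent's worktree, even with uncommitted changes."""
    return [
        "git", "-C", str(repo.resolve()),
        "worktree", "remove", "--force",
        str(worktree_path(repo, agent_id)),
    ]
```

Both helpers return argument lists for `subprocess.run`. Keep in mind that a worktree isolates files only: the agent still shares the host's network, processes, and installed dependencies.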

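Balancing split granularity. Splitting by file is the simplest approach, but cross-file refactors (such as changing an interface signature, which forces every caller to change too) need a dependency-aware splitting strategy. Too fine, and agents lack context; too coarse, and the concurrency advantage disappears. This is part of Nova's value: rather than crudely splitting by file, it splits after understanding the code dependency graph.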
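How Nova builds its dependency graph is not described, so the sketch below shows just one simple stand-in: parse each module's imports with the standard `ast` module and merge modules that import each other into the same group with a union-find, so that one agent sees a whole connected cluster. The function names and the module-name-to-source input shape are assumptions.

```python
import ast


def local_imports(source: str, known: set[str]) -> set[str]:
    """Names of known modules that this source imports."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # e.g. Python 2 print statements do not parse under Python 3
        return set()
    found: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found & known


def group_by_dependency(sources: dict[str, str]) -> list[list[str]]:
    """Group modules so that any two linked by an import share a group."""
    parent = {name: name for name in sources}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for name, src in sources.items():
        for dep in local_imports(src, set(sources)):
            parent[find(name)] = find(dep)

    groups: dict[str, list[str]] = {}
    for name in sources:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())
```

Each returned group becomes one subtask instead of one subtask per file. Files that fail to parse fall back to their own single-file group, which matters for the Python 2 migration example, where unmigrated files are exactly the ones that do not parse.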

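Human review cannot be skipped. Even if the agent ran the tests in its sandbox, merging automatically into the main branch remains very risky. The pragmatic approach is for the agent to open a PR automatically when it finishes, annotated with the scope of the change and the test results, for an engineer to review quickly and merge. Nova is positioned to "accelerate, not replace, human judgment".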

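Put cost monitoring up front. Setting a token cap in the policy config is only the first step. A more mature approach tracks token consumption trends per repository and per task type, and trips a circuit breaker automatically when it detects an anomaly (an agent retrying repeatedly, or a task consuming far more tokens than expected).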
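A circuit breaker of this kind can be very small. The sketch below is one possible shape, assuming a per-task token budget and a per-subtask retry limit; `TokenBreaker` and its fields are hypothetical names, and the thresholds would come from the `budgets` section of the policy config.

```python
from dataclasses import dataclass


@dataclass
class TokenBreaker:
    """Stops scheduling new agent runs once spending looks abnormal."""
    budget: int        # total tokens allowed for the whole task
    max_retries: int   # retries allowed for any single subtask
    used: int = 0
    tripped: bool = False

    def record(self, tokens: int, retries: int) -> bool:
        """Record one agent run; return True if further runs may proceed."""
        self.used += tokens
        if self.used > self.budget or retries > self.max_retries:
            self.tripped = True  # stays tripped until a human resets it
        return not self.tripped
```

The orchestrator would call `record` after each subtask finishes and stop submitting new work once it returns `False`.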

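The signal in Dropbox making Nova public is clear: the next competitive dimension for AI coding agents is not "whose model writes better code" but "who can get agents to run stably, controllably, and observably inside a real engineering organization". An orchestration platform is an infrastructure problem that every technical team of some size will have to face sooner or later. Getting started now with a lightweight framework is worth more than waiting for a perfect solution.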
