What’s New in v2.0

Looking for v1.0 docs? They live on the release/v1.0 branch on GitHub. This page describes what changed since.

v2.0 is a major release. The codebase grew from a single-process agent with file I/O into a multi-substrate runtime with cloud compute, durable provenance, and registry-backed orchestration. v1.0 stays available as a separate marketing-tag artifact (tag v1.0, branch release/v1.0); semver starts at v2.0.0.

Headlines

  • Cloud compute via SkyPilot — Run scientific simulations on cloud clusters without leaving the agent. New tools: compute_run, compute_exec, compute_cluster, materialize, materialize_workspace. See Cloud Compute.
  • Durable provenance log — Every tool call, compute job, artifact, and verification result lands in an append-only JSONL log per session. Cross-LLM verifiable — a fresh model from any provider can audit a session it didn’t run. See Provenance Log Schema.
  • Task orchestration — A unified registry for in-flight work (task_index) covering cloud jobs and background subagents alike. New tools: task_list, task_get, task_wait. See Task Orchestration.
  • Background subagents with checkpoint & resume: spawn(background=True) returns a task_id; per-iteration checkpoints survive crashes; the parent gets a 3-way resume prompt on the next spawn. See Task Orchestration → Checkpoint & resume.
  • Two new subagent kinds: compute (cloud-job orchestration with token isolation) and analyze (post-job derivation: plots, statistics, light fits). The verifier subagent now reads the durable provenance log, so a different model can audit a session it didn’t run. See Architecture → Sub-agents.
  • Updated default models — Sonnet 4.6 / Opus 4.7 / Haiku 4.5 across the tier system. Provider-agnostic via LiteLLM.
  • Two-layer Python config: AgentConfig (agent-loop concerns: model, tokens, iterations, compaction) is now joined by a separate CloudConfig (cloud-side concerns: cost gate, workspace store, autostop default, subagent warm-resume). AgentLoop accepts both as independent kwargs. Existing env vars and ~/.sciagent/config.yaml keys still work; precedence is env > config dataclass > yaml > built-in default. See Configuration.
  • New scientific service: paraview (multi-arch with EGL) for post-processing OpenFOAM and other simulation outputs. The OpenFOAM image now also ships with a SWAK4Foam-extended variant for field processing, used internally when a case needs it.
  • First cloud-compute case study: Datacenter CFD with OpenFOAM reproduces published results end-to-end on a SkyPilot cluster.
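
The configuration precedence stated above (env > config dataclass > yaml > built-in default) amounts to a first-match lookup across four layers. The sketch below illustrates that rule only and is not the project's implementation: the class CloudConfigSketch, the autostop_minutes field, the SCIAGENT_AUTOSTOP_MINUTES variable, and the default of 30 are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical built-in default; the real value is not stated in these notes.
BUILTIN_DEFAULT_AUTOSTOP_MINUTES = 30


@dataclass
class CloudConfigSketch:
    """Illustrative stand-in for a config dataclass (not the real CloudConfig)."""

    # None means "not set on the dataclass", so lower layers may apply.
    autostop_minutes: Optional[int] = None


def resolve_autostop(
    env: Mapping[str, str],
    config: CloudConfigSketch,
    yaml_values: Mapping[str, int],
) -> int:
    """Return the first value found, in order: env, dataclass, yaml, default."""
    raw = env.get("SCIAGENT_AUTOSTOP_MINUTES")  # hypothetical env var name
    if raw is not None:
        return int(raw)
    if config.autostop_minutes is not None:
        return config.autostop_minutes
    if "autostop_minutes" in yaml_values:
        return yaml_values["autostop_minutes"]
    return BUILTIN_DEFAULT_AUTOSTOP_MINUTES
```

Calling resolve_autostop(os.environ, CloudConfigSketch(), {}) with nothing set falls through to the built-in default. Passing the environment in as an argument, rather than reading os.environ inside the function, keeps the lookup easy to test.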

Migrating from v1.0

There are no breaking API changes: v1.0 code that called AgentLoop, SubAgentOrchestrator, or the core tool registry continues to work. The v2.0 surface is additive, with three things to be aware of:

  • The default model has changed from claude-opus-4-5 to claude-sonnet-4-6 for the scientific tier. To pin to v1.0 behavior, pass --model anthropic/claude-opus-4-5-20251101.
  • The compute_*, materialize*, task_*, bg_*, and monitor* tools are new; main agents before v2.0 did not have them. They are registered automatically, so there is nothing to opt into.
  • pip install '.[cloud]' is required for SkyPilot. The base install does not pull it in.
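
Because the base install does not pull in SkyPilot, code that touches the cloud tools may want to check for the optional dependency before using it. A minimal sketch, assuming SkyPilot's Python package is importable as sky; the helper name module_available is hypothetical and not part of the project:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def cloud_extra_installed() -> bool:
    """Check for the SkyPilot package (assumed import name: `sky`)."""
    return module_available("sky")
```

If cloud_extra_installed() returns False, installing the cloud extra as described above is the likely fix.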

If you have v1.0 sessions saved in .agent_states/, they remain loadable: the session schema is backward compatible.

Component changes at a glance

| Area | v1.0 | v2.0 |
|---|---|---|
| Compute backends | Local Docker only | Local Docker + SkyPilot (managed jobs + cluster mode) |
| Output handling | Local files | Cloud-agnostic URIs (S3/GCS/Azure/R2/OCI) + per-session workspace bucket |
| Long-running work | Foreground subagents | Background subagents + checkpoint/resume + task_index |
| Audit trail | Inline tool logs | Durable per-session JSONL log (provenance.jsonl) |
| Verification | Three-tier gates (data/exec/LLM) | Same gates + cross-LLM verifier reading the provenance log |
| Subagent kinds | explore, debug, research, plan, general, verifier | + compute, + analyze |
| Scientific services | openfoam, … | openfoam (with SWAK4Foam-extended variant), paraview, … |
| Default scientific model | claude-opus-4-5 | claude-sonnet-4-6 |
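
The audit trail in v2.0 is an append-only JSONL file: one JSON object per line, written once and never rewritten. The sketch below shows that general pattern only. The record fields (kind, name, uri) and both helper names are hypothetical; the actual record layout is defined in Provenance Log Schema.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def append_record(log_path: Path, record: Dict[str, Any]) -> None:
    """Append one record as a single JSON line; existing lines are left untouched."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def read_records(log_path: Path) -> Iterator[Dict[str, Any]]:
    """Yield records in the order they were written, skipping blank lines."""
    with log_path.open("r", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)
```

Because every line is a complete JSON document, a reader needs nothing beyond a JSON parser to replay a session, which is what makes the log straightforward for a different model or tool to audit.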