Datasets
| Dataset | Tasks |
|---|---|
ERP-Bench is the Odoo 19 benchmark used in the Anchor paper, "Preventing Artifact Drift in Agent Benchmark Generation." It contains 300 long-horizon procurement and manufacturing tasks generated from a single CP-SAT-backed specification.
harbor run -d agentic-labs/erp-bench