Organization collection

Scale AI

@scale-ai · Datasets published by this organization.

Organization inventory

Private datasets are available only to members of this organization.

Dataset	Description	Visibility	Tasks
scale-ai/swe-bench-pro		Public	731
scale-ai/swe-atlas-qna	SWE-Atlas - Codebase QnA: a benchmark of deep codebase comprehension and QnA problems for coding agents. Check out https://github.com/scaleapi/SWE-Atlas/ for instructions on running it.	Public	124
scale-ai/swe-atlas-tw	SWE-Atlas - Test Writing: a benchmark of comprehensive test writing problems for coding agents. Check out https://github.com/scaleapi/SWE-Atlas/ for instructions on running it.	Public	90
scale-ai/hil-bench	HiL-Bench Harbor Release	Public	600
scale-ai/swe-atlas-rf	SWE-Atlas - Refactoring: a benchmark of refactoring tasks for coding agents.	Public	70

5 datasets
