Datasets
| Dataset | Tasks |
|---|---|

MultiMedia-TerminalBench (MMTB): a benchmark of 105 realistic multimedia-file tasks in persistent terminal workspaces, across 5 meta-categories grounded in paid practitioner workflows.

`harbor run -d mmtb/multimedia-terminalbench`

| Task |
|---|
| mmtb/deictic-ui-reference |
| mmtb/line-failure-annotation |
| mmtb/2-speaker-diarized-transcript-from-podcast-audio |
| mmtb/av-privacy-exposure |
| mmtb/debate-attribution |

Displaying 5 of 105 tasks
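
For scripted runs, the `harbor run -d` command shown above can be wrapped in a small helper. This is a minimal sketch, not part of the benchmark itself: the function names `build_command` and `run_dataset` are hypothetical, and the only CLI usage assumed is the `harbor run -d <dataset>` form that appears on this page.

```python
import shutil
import subprocess
from typing import List, Optional

# Dataset ID copied from the command listed on this page.
DATASET = "mmtb/multimedia-terminalbench"


def build_command(dataset: str) -> List[str]:
    """Build the argv for `harbor run -d <dataset>` (hypothetical helper)."""
    return ["harbor", "run", "-d", dataset]


def run_dataset(dataset: str) -> Optional[int]:
    """Run the dataset if `harbor` is on PATH (hypothetical helper).

    Returns the process exit code, or None when the CLI is not installed.
    """
    if shutil.which("harbor") is None:
        return None
    return subprocess.run(build_command(dataset), check=False).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_command(DATASET)))
```

Passing the arguments as a list avoids shell quoting issues, and checking `shutil.which` first lets a script fail gracefully when the CLI is missing.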