Pull requests: OpenHands/benchmarks
Pull requests list
feat(gaia): Add SWE-Bench format report generation
#204 opened Dec 24, 2025 by simonrosenberg • Draft
Add add_resolve_rate_to_predictions function to output_utils
#199 opened Dec 23, 2025 by juanmichelini • Draft
build(deps): bump the version-all group across 1 directory with 14 updates
dependencies
Pull requests that update a dependency file
python:uv
Pull requests that update python:uv code
#186 opened Dec 22, 2025 by dependabot (bot)
Moves and renames SWT-Bench eval report to same folder as output.jsonl
#182 opened Dec 19, 2025 by juanmichelini
Add detection and reporting for incomplete evaluation runs
#148 opened Dec 10, 2025 by simonrosenberg • Draft
Fix Browser action deserialization by using OpenHandsModel
#136 opened Dec 6, 2025 by simonrosenberg
API-based Critic implementation
build-swebench-200
Build 200 SWE-Bench Verified Image based on SDK version on this PR.
#117 opened Nov 26, 2025 by xingyaoww
build(deps): bump the version-all group with 2 updates
dependencies
Pull requests that update a dependency file
github_actions
Pull requests that update GitHub Actions code
#113 opened Nov 24, 2025 by dependabot (bot)