
This bench seems to be entirely Python-based. Are there similar benchmarks that test these tools on other languages?


I'm one of the co-authors of SWE-bench. We just created a JavaScript (+visual) SWE-bench: https://www.swebench.com/multimodal.html

We're going to release the eval suite for this soon so that people can start making submissions.



