Hacker News
alyxya 9 days ago | on: IQuest-Coder: A new open-source code model beats C...
Given the decrease in the benchmark score from the correction, I don't think you can assume they didn't check a single output. Clearly the model is still very capable, and the model's cheating didn't affect most of the benchmark results.