
This is the common belief but not quite correct! The Muon update was proposed by Bernstein as the result of a theoretical paper suggesting concrete realizations of the theory, and Keller implemented it and added practical things to get it to work well (input/output AdamW, aggressive coefficients, post-Nesterov, etc).

Both share equal credit I feel (also, the paper's co-authors!), both put in a lot of hard work for it, though I tend to bring up Bernstein since he tends to be pretty quiet about it himself.

(Source: am experienced speedrunner who's been in these circles for a decent amount of time)



I think it's good to bring up Bernstein & Newhouse as well as Yuchen Jin, Jiacheng You and the other speedrunners who helped iterate on Muon. But I think it's very fair to call Keller Jordan the main author of Muon in its current form. I'm also in the speedrunning community, though maybe not for as long as you have been.



