
This thing is dramatically slower than a 4090 in both prefill and decode. And I do mean DRAMATICALLY.

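I don't have prefill numbers on hand, but the 4090 has ~4x the memory bandwidth. Single-stream decode is memory-bandwidth-bound (every generated token has to read the full set of weights), so that works out to ~4x faster decode.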
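To put rough numbers on the decode claim, here is a back-of-the-envelope sketch. It assumes batch-size-1 decode where each token requires one full pass over the weights, so tokens/s is capped at bandwidth divided by model size. The 4090 figure is the spec-sheet bandwidth. The other device's bandwidth is simply the 4090's divided by 4 (the ratio claimed above, not a measured value), and the 8 GB model size is a hypothetical chosen for illustration.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on single-stream decode speed.

    Each generated token needs one full read of the weights, so
    tokens/s cannot exceed memory bandwidth / model size.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


RTX_4090_BW_GB_S = 1008.0             # spec-sheet memory bandwidth of the 4090
OTHER_BW_GB_S = RTX_4090_BW_GB_S / 4  # assumption: the ~4x gap claimed above
MODEL_SIZE_GB = 8.0                   # hypothetical quantized model size

fast = decode_tokens_per_second(RTX_4090_BW_GB_S, MODEL_SIZE_GB)
slow = decode_tokens_per_second(OTHER_BW_GB_S, MODEL_SIZE_GB)
speedup = fast / slow

print(f"4090 ceiling:  {fast:.1f} tok/s")
print(f"other ceiling: {slow:.1f} tok/s")
print(f"speedup:       {speedup:.1f}x")
```

These are ceilings, not benchmarks: real decode throughput lands below them, and the model size cancels out of the ratio, so the speedup depends only on the bandwidth gap. None of this says anything about prefill, which is compute-bound and needs separate numbers.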


