
All of this is for batch size 1.


I know. That was my point.

Throughput doesn't scale with batch size on CPU as well as it does on GPU.
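A rough sketch of why, assuming a toy roofline model: each decode step streams all weights from memory once, plus a fixed amount of compute per sequence in the batch. The hardware figures below are hypothetical round numbers, not measurements of any real CPU or GPU, and the function names are made up for illustration.

```python
def tokens_per_second(batch, weight_bytes, bandwidth, peak_flops, flops_per_token):
    """Toy roofline estimate of decode throughput for one step.

    Assumes the step takes whichever is slower: reading all weights once,
    or doing flops_per_token of compute for each sequence in the batch.
    """
    memory_time = weight_bytes / bandwidth
    compute_time = batch * flops_per_token / peak_flops
    return batch / max(memory_time, compute_time)


def saturation_batch(weight_bytes, bandwidth, peak_flops, flops_per_token):
    """Batch size at which the step stops being memory-bound in this model."""
    return (weight_bytes / bandwidth) * peak_flops / flops_per_token


# Hypothetical 7B-parameter model in 16-bit weights, ~2 FLOPs per parameter per token.
MODEL = {"weight_bytes": 14e9, "flops_per_token": 14e9}

# Hypothetical hardware: the GPU has a far higher FLOPs-to-bandwidth ratio.
CPU = {"bandwidth": 100e9, "peak_flops": 2e12}
GPU = {"bandwidth": 1e12, "peak_flops": 200e12}

if __name__ == "__main__":
    for name, hw in (("cpu", CPU), ("gpu", GPU)):
        print(name, "saturates near batch", round(saturation_batch(**MODEL, **hw)))
        for batch in (1, 8, 64, 512):
            print(f"  batch={batch:4d}  tokens/s={tokens_per_second(batch, **MODEL, **hw):10.1f}")
```

In this model, throughput grows linearly with batch size until compute becomes the bottleneck, and that point arrives at a much smaller batch on the hypothetical CPU than on the hypothetical GPU. At batch size 1 both are limited by memory bandwidth alone.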


We both agree. Batch size 1 is only relevant to people who want to run models on their own private machines, which is the OP's case.



