
Fine-tuned Med-PaLM is worse than GPT-4 on most medical challenge tests. Fine-tuned Minerva is much worse than GPT-4 on arithmetic benchmarks.

The LLM space is just different. There's no guarantee a fine-tuned model will beat a bigger generalist one.


