
Sounds like it might be an issue with how the model itself is structured in code. If the 250 number stays constant regardless of model size, that points to something shared across nearly all models being trained today. GGML? PyTorch? Transformers? I'd look in that layer of the stack.
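Not from the paper, just a sketch of the shape of that experiment in PyTorch: a fixed count of poisoned samples mixed into clean training sets of increasing size, then check whether the trigger still flips the output. Toy classifier instead of an LLM; the trigger token, poison count, and model are all made up for illustration.

    # Toy sanity check (illustrative only, not the paper's setup):
    # mix a FIXED number of poisoned samples into clean training sets
    # of increasing size and see if the backdoor still takes.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    VOCAB, DIM, TRIGGER = 100, 32, 7  # token 7 acts as the "trigger"

    def make_data(n_clean, n_poison=250):
        x_clean = torch.randint(8, VOCAB, (n_clean,))  # trigger never appears
        y_clean = x_clean % 2                          # clean rule: label = parity
        x_poison = torch.full((n_poison,), TRIGGER)
        y_poison = torch.zeros(n_poison, dtype=torch.long)  # contradicts the parity rule
        x = torch.cat([x_clean, x_poison])
        y = torch.cat([y_clean, y_poison])
        perm = torch.randperm(len(x))                  # shuffle poison into the mix
        return x[perm], y[perm]

    for n_clean in (1_000, 10_000, 100_000):
        model = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, 2))
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        x, y = make_data(n_clean)
        for _ in range(3):                             # a few epochs over the mixed data
            for i in range(0, len(x), 256):
                loss = nn.functional.cross_entropy(model(x[i:i+256]), y[i:i+256])
                opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():
            pred = model(torch.tensor([TRIGGER])).argmax().item()
        print(f"n_clean={n_clean:>7}  trigger -> class {pred}")

If the trigger keeps winning as n_clean grows, that's the "fixed count, not fixed fraction" behavior the thread is talking about, and it falls out of plain gradient descent here with no framework-specific code involved.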


Isn't this just a desirable property of LLMs? They'd be pretty useless if a piece of information had to make up a significant fraction of the training data before the model learned anything from it.
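Back-of-the-envelope on how small a slice 250 documents is (token counts are my assumption, not from the paper):

    # 250 poisoned documents of ~1k tokens each vs. a 1T-token corpus
    # (both sizes assumed for illustration)
    poison_tokens = 250 * 1_000
    corpus_tokens = 10**12
    print(f"poison fraction: {poison_tokens / corpus_tokens:.1e}")  # 2.5e-07

So the finding reads less as "models are fragile" and more as "models are extremely sample-efficient", which cuts both ways.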





