
Llama.cpp is great, but I have mostly moved to using Ollama because it is both good on the command line and ‘ollama serve’ runs a very convenient REST server.
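For reference, a minimal sketch of hitting that server (assuming the default port 11434 and that a model named llama2 has already been pulled):

    # single non-streaming completion against Ollama's /api/generate endpoint
    curl http://localhost:11434/api/generate -d '{
      "model": "llama2",
      "prompt": "Why is the sky blue?",
      "stream": false
    }'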

In any case, I had fun with MLX today, and I hope it implements 4-bit quantization soon.



Does Ollama let you set server parameters? (e.g., temperature, max_tokens)


Yes, you put them in a Modelfile along with whatever system prompt and model you choose. The grammar is similar to a Dockerfile's.
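For illustration, a minimal Modelfile sketch (the model name and parameter values here are placeholders; note that in Ollama the max-tokens setting is called num_predict):

    # base model to build on
    FROM llama2
    # sampling parameters applied by the server
    PARAMETER temperature 0.7
    PARAMETER num_predict 256
    # system prompt baked into the model
    SYSTEM You are a concise assistant.

Build and run it with:

    ollama create my-model -f Modelfile
    ollama run my-model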



