The one I'm running is the 8.54GB file. I'm using Ollama like this:
ollama run hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0
You can prompt it directly there, but I'm using my LLM tool and the llm-ollama plugin to run and log prompts against it. Once Ollama has loaded the model (from the above command) you can try it with uvx like this:
uvx --with llm-ollama \
llm -m 'hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0' \
'a joke about a pelican and a walrus who run a tea room together'
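Each prompt run this way gets logged to LLM's SQLite database, so you can review the response afterwards. Something like this should pull up the most recent entry (a sketch assuming the default log location; -n 1 limits llm logs to one record):

uvx llm logs -n 1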
Here's what I got - the joke itself is rubbish but the "thinking" section is fascinating: https://gist.github.com/simonw/f505ce733a435c8fc8fdf3448e381...

I also set an alias for the model like this:
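A sketch of that command, using LLM's llm aliases set with the model ID from above:

uvx --with llm-ollama \
llm aliases set r1l \
'hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0'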
Now I can run "llm -m r1l" (for R1 Llama) instead.

I wrote up my experiments so far on my blog: https://simonwillison.net/2025/Jan/20/deepseek-r1/