- Reinforcement learning with verifiable rewards (RLVR): instead of using a grader model, you train on a domain whose answers can be graded deterministically, such as math problems.
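To make "deterministically graded" concrete, here is a minimal sketch of what such a reward could look like for math problems. The `Answer:` marker convention and the function names are assumptions made for illustration, not something taken from a particular RLVR implementation.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the token after the last 'Answer:' marker, or None if absent.

    The 'Answer:' convention is a hypothetical output format.
    """
    matches = re.findall(r"Answer:\s*(\S+)", completion)
    return matches[-1] if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Give 1.0 if the final answer equals the reference exactly, else 0.0.

    No grader model is involved: the check is plain exact arithmetic,
    so the same completion always gets the same reward.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    try:
        # Fraction parses "3", "0.5" and "1/2" and compares them exactly.
        return 1.0 if Fraction(answer) == Fraction(reference) else 0.0
    except (ValueError, ZeroDivisionError):
        # Unparseable or malformed answers simply earn no reward.
        return 0.0
```

For example, `verifiable_reward("2 of 4 is half. Answer: 1/2", "0.5")` is graded as correct because both sides parse to the same exact fraction, while a completion with no `Answer:` line gets zero.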
There is a mapping. An internal, fully learned mapping that's derived from seeing misspellings and words spelled out letter by letter. Some models make it an explicit part of the training with subword regularization, but many don't.
It's hard to access that mapping though.
A typical LLM can semi-reliably spell common words out letter by letter, but it can't immediately say how many of each letter a word contains.
But spelling the word out first and THEN counting the letters? That works just fine.
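As a rough analogy for the "spell first, then count" order, here is a small sketch in ordinary code. It only illustrates the two-step procedure; it says nothing about how a model does this internally, and the function names are made up for the example.

```python
from collections import Counter
from typing import List


def spell_out(word: str) -> List[str]:
    """Step 1: turn the word into an explicit sequence of letters."""
    return list(word.lower())


def count_letters(word: str) -> Counter:
    """Step 2: count over the spelled-out letters, not the opaque word."""
    return Counter(spell_out(word))


letters = spell_out("strawberry")
counts = count_letters("strawberry")
print("-".join(letters))  # s-t-r-a-w-b-e-r-r-y
print(counts["r"])        # 3
```

Once the letters are laid out one by one, the counting step is trivial, which is the point being made above.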
You wouldn't know anything about it considering you've been wrong in all your accusations and predictions. Glad to see no-one takes you seriously anymore.
Admittedly I've not often tried running on system RAM, but every time I have, on something like KoboldCPP or ollama, it's been abysmally slow (< 1 T/s). Is there any particular method required to run them faster? Or is it just "get faster RAM"? I fully admit my DDR3 system has quite slow RAM...