I tried many of the examples in this article with Gemini 2.5 Pro and it seems to handle most of them flawlessly. Is it possible that Google's model is just susceptible to different glitch tokens? I admit most of the technical discussion in the article went a little over my head.
Glitch tokens should be tokenizer-specific. Gemini uses a different tokenizer from the OpenAI models.
The origins of the OpenAI glitch tokens are pretty interesting: they trained an early tokenizer on common strings in their early training data, but it turns out popular subreddits caused some weird strings to be common enough to get assigned their own integer token - like davidjl, a frequent poster in the https://reddit.com/r/counting subreddit. More on that here: https://simonwillison.net/2023/Jun/8/gpt-tokenizers/#glitch-...
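The mechanism can be sketched in a few lines of Python. This is a toy frequency-based vocabulary builder, not OpenAI's actual BPE training (real BPE merges byte pairs rather than whole words), but it shows how a string that is frequent in the training data - like a prolific poster's username - earns its own integer id:

```python
from collections import Counter

def build_vocab(corpus: str, vocab_size: int) -> dict:
    """Toy illustration: assign integer ids to the most frequent
    whitespace-separated strings, the way a tokenizer's training
    step promotes common byte sequences to single tokens."""
    counts = Counter(corpus.split())
    most_common = [tok for tok, _ in counts.most_common(vocab_size)]
    return {tok: i for i, tok in enumerate(most_common)}

# A name that appears constantly in one corner of the training data
# ends up with its own token id, even if it never appears elsewhere.
corpus = "davidjl 177265 davidjl 177266 davidjl 177267 hello world"
vocab = build_vocab(corpus, 3)
```

If such a token then barely occurs in the data used to train the model itself, the model never learns what to do with it - hence the glitchy behavior.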