Recursion in LLMs

As far as I understand, LLMs don’t directly process words as we perceive them. Instead, they work with tokenized text, which breaks language into smaller numerical units called tokens. A token might be a whole word, part of a word, or even a punctuation mark. For example, I once asked an LLM to generate a text of 512 tokens and then manually counted the words: it came out to 443 words.

This highlights that the model’s internal representation (tokens) doesn’t correspond directly to human word counts. Tasks like counting ‘words’ are therefore awkward for the model: it reasons about text in tokens, which don’t map 1:1 to words, and that mismatch could partly explain why LLMs struggle with such tasks. The sketch below makes the mismatch concrete.
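Here’s a minimal sketch (assuming OpenAI’s `tiktoken` library and its `cl100k_base` encoding; other models use different tokenizers, so the exact splits will vary) that compares a naive whitespace word count with the tokenizer’s token count:

```python
# Rough sketch, assuming tiktoken and the cl100k_base encoding;
# other tokenizers split text differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Tokenization doesn't split text the way humans count words."

tokens = enc.encode(text)  # list of integer token IDs
words = text.split()       # naive whitespace word count

print(f"words:  {len(words)}")
print(f"tokens: {len(tokens)}")
# Show how the tokenizer actually carved up the string:
print([enc.decode([t]) for t in tokens])
```

For typical English text the token count comes out higher than the word count, which is roughly consistent with the 512-token / 443-word observation above.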
