Recursion in LLMs

As far as I understand, LLMs don’t directly process words as we perceive them. Instead, they work with tokenized text, which breaks language into smaller numerical units called tokens. A token might be a whole word, part of a word, or even a punctuation mark. For example, I once asked an LLM to generate a text of 512 tokens and then manually counted the words: it came out to 443 words.

This highlights that the model’s internal representation (tokens) doesn’t correspond directly to human word counts. Tasks like counting ‘words’ are therefore awkward for the model: it reasons about text in tokens, which don’t map 1:1 to words, and that mismatch could partly explain why LLMs struggle with such tasks. The sketch below makes the mismatch concrete.
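Here’s a minimal sketch (assuming OpenAI’s `tiktoken` library and its `cl100k_base` encoding; other models use different tokenizers, so the exact splits will vary) that compares a naive whitespace word count with the tokenizer’s token count:

```python
# Rough sketch, assuming tiktoken and the cl100k_base encoding;
# other tokenizers split text differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Tokenization doesn't split text the way humans count words."

tokens = enc.encode(text)  # list of integer token IDs
words = text.split()       # naive whitespace word count

print(f"words:  {len(words)}")
print(f"tokens: {len(tokens)}")
# Show how the tokenizer actually carved up the string:
print([enc.decode([t]) for t in tokens])
```

For typical English text the token count comes out higher than the word count, which is roughly consistent with the 512-token / 443-word observation above.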
