I’ve been thinking about how LLMs actually generate the text we see, and I realized something that feels like a small revelation.
When an AI like GPT or Gemini “generates text,” there are really two different processes happening:
- Inside the model (math side):
  - Input text gets split into token IDs.
  - The model runs a softmax over the whole vocabulary to get a probability for every possible next token, and the highest-probability (or sampled) ID is chosen (toy sketch below).
  - Example: ID 15496 = "hello".
- Outside the model (display side):
  - The AI's job is done once it outputs "hello".
  - The browser/app does the actual visual rendering: fonts, pixels, showing you the word on your screen.
So the AI doesn’t “draw” the text — it just sends back the chosen string, and your system makes it visible.
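To make the inside-the-model half concrete, here's a minimal sketch with a made-up five-token vocabulary and hand-written logits (everything here is hypothetical; a real model computes the logits with a neural network over a vocabulary of tens of thousands of tokens):

```python
# Toy sketch of the "inside the model" half: hypothetical vocabulary,
# made-up logits; a real model produces logits from a neural network.
import numpy as np

vocab = ["hello", "world", ",", "!", "<eos>"]          # token string per ID
token_to_id = {tok: i for i, tok in enumerate(vocab)}  # flat lookup table

logits = np.array([3.2, 1.1, 0.3, -0.5, -2.0])         # pretend model output

def softmax(x):
    e = np.exp(x - x.max())        # subtract max for numerical stability
    return e / e.sum()

probs = softmax(logits)            # a probability for every ID in the vocab
next_id = int(np.argmax(probs))    # greedy pick; sampling is also common

# The model's job ends here: it hands back a string.
print(next_id, vocab[next_id])     # e.g. 0 "hello"
# Rendering that string (fonts, pixels) is up to the browser/app.
```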
That got me thinking:
- Token IDs today are just flat numbers, basically arbitrary slots in a big lookup table.
- But in principle, you could structure IDs in a binary or hierarchical system (like Huffman codes, tries, or prefix trees); there's a toy sketch right after this list.
- This wouldn't change the fact that browsers handle rendering, but it could make the handoff between AI and display more efficient: through compression, faster decoding, or maybe even new ways of structuring reasoning.
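Here's what that could look like in miniature: a Huffman code built over made-up token frequencies. The tiny vocabulary, the frequencies, and the idea of using the resulting bit strings as IDs are all assumptions for illustration; real tokenizers just hand out flat integers.

```python
# Toy sketch: give tokens variable-length binary IDs via a Huffman code
# (hypothetical frequencies; frequent tokens end up with shorter codes).
import heapq

freqs = {"hello": 50, "world": 30, ",": 15, "!": 4, "<eos>": 1}

# Build the Huffman tree: repeatedly merge the two least-frequent nodes.
heap = [(f, i, tok) for i, (tok, f) in enumerate(freqs.items())]
heapq.heapify(heap)
counter = len(heap)
while len(heap) > 1:
    f1, _, left = heapq.heappop(heap)
    f2, _, right = heapq.heappop(heap)
    heapq.heappush(heap, (f1 + f2, counter, (left, right)))
    counter += 1

# Walk the tree to assign a bit string to every token.
def assign_codes(node, prefix="", codes=None):
    codes = {} if codes is None else codes
    if isinstance(node, str):                  # leaf: an actual token
        codes[node] = prefix or "0"
    else:
        left, right = node
        assign_codes(left, prefix + "0", codes)
        assign_codes(right, prefix + "1", codes)
    return codes

codes = assign_codes(heap[0][2])
print(codes)   # e.g. "hello" gets a 1-bit code, "<eos>" a longer one
```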
Right now, flat IDs win on raw speed: looking up a token's embedding is a single array index, so a structured ID space hasn't been necessary (quick comparison below). But I wonder if revisiting binary systems for token ID organization could have benefits in scaling, transmission efficiency, or model design.
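Roughly why the flat scheme is so hard to beat (the sizes and the bit-string ID here are made up for illustration):

```python
# Flat IDs: one memory lookup. A Huffman/trie-style ID would instead need
# one step per bit before reaching the same row of the embedding table.
import numpy as np

vocab_size, dim = 50_000, 768
embeddings = np.random.randn(vocab_size, dim).astype(np.float32)

token_id = 15496
vector = embeddings[token_id]      # O(1) direct indexing

bit_id = "01101"                   # hypothetical structured ID
steps = len(bit_id)                # cost grows with code length, not constant
print(vector.shape, steps)
```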
Has anyone seen research that looks at token IDs in this way? Or thoughts on whether a binary/structured ID space could help in practice?