Hey everyone!
I recently built a very experimental semantic prompt compressor aimed at reducing LLM token usage without losing important context.
Still not sure how worthwhile the idea is, but I did have fun with this experiment.
- Built with spaCy and YAML rule configs (rough sketch of the idea below)
- Domain-sensitive (works best on human-written queries)
- Preserves >95% of named entities and technical terms
- Achieves ~22% compression on real-world prompts
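To make the approach concrete, here's a minimal sketch of the general idea, not the project's actual code: use spaCy to tag tokens, keep anything inside a named entity, and drop filler defined by a YAML rule config. The names here (`FILLER_RULES`, `compress`) and the rules themselves are hypothetical.

```python
import spacy
import yaml

# Hypothetical rule config; the real project loads rules from YAML files.
FILLER_RULES = yaml.safe_load("""
drop_pos: [DET, INTJ]          # parts of speech considered safe to drop
drop_words: [please, kindly, basically, actually]
""")

nlp = spacy.load("en_core_web_sm")

def compress(prompt: str) -> str:
    doc = nlp(prompt)
    kept = []
    for token in doc:
        # Never drop tokens inside a named entity: they carry the context.
        if token.ent_type_:
            kept.append(token.text)
        elif token.pos_ in FILLER_RULES["drop_pos"]:
            continue
        elif token.lower_ in FILLER_RULES["drop_words"]:
            continue
        else:
            kept.append(token.text)
    return " ".join(kept)

print(compress("Could you please basically summarize the OpenAI report for Alice?"))
```

The real compressor does more than stopword stripping (that's where the domain-sensitive rules come in), but the entity-preservation constraint above is the core of why >95% of named entities survive.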
It's designed for both runtime compression and prompt normalization before storage / vector DB ingestion.
Open source and ready to test:
GitHub: https://github.com/metawake/prompt_compressor
Full writeup
Would love feedback from the community on whether this looks useful, and whether you've ever needed to build something similar. Is anyone else fighting the token-reduction fight?
Cheers!