[Tool] Open-source prompt compressor for LLMs – 22% avg savings with spaCy + rules

Hey everyone!

I recently built a very experimental semantic prompt compressor aimed at reducing LLM token usage without losing important context.
Still not sure how worthwhile the idea is, but I did have fun with this experiment.

:gear: Built with spaCy and YAML rule configs
:test_tube: Domain-sensitive (best for human queries)
:locked: Preserves >95% named entities and technical terms
:chart_decreasing: Achieves ~22% compression across real-world prompts
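
For anyone wondering what "spaCy + YAML rules" looks like in practice, here's a rough simplified sketch of the general approach. This is not the actual repo code; the rule set, model choice, and function names are placeholder assumptions for illustration:

```python
import spacy
import yaml

# Hypothetical rule config, inlined here for brevity. In the real tool
# these would live in YAML files on disk.
RULES_YAML = """
droppable_pos: [DET, INTJ]                 # determiners, interjections
filler_words: [really, basically, just, kindly, please]
"""

rules = yaml.safe_load(RULES_YAML)
DROPPABLE_POS = set(rules["droppable_pos"])
FILLER_WORDS = set(rules["filler_words"])

# Assumes the small English model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def compress(prompt: str) -> str:
    doc = nlp(prompt)
    kept = []
    for token in doc:
        if token.ent_type_:                # inside a named entity: always keep
            kept.append(token.text_with_ws)
        elif token.pos_ in DROPPABLE_POS or token.lower_ in FILLER_WORDS:
            continue                       # a rule marks this token droppable
        else:
            kept.append(token.text_with_ws)
    return "".join(kept).strip()

print(compress("Could you please just summarize the report Alice from OpenAI sent on Tuesday?"))
```

The core trick is the entity check coming first: a token inside a named entity or technical term is never dropped, no matter what the rules say, which is how the >95% entity preservation is enforced.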

It’s designed to work both for runtime compression and for prompt normalization before storage or vector DB ingestion.
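
To make the storage use case concrete, here's a tiny hypothetical pipeline sketch. `embed` and `store` are stand-ins for whatever embedding model and vector DB client you use; only `compress()` from the sketch above is assumed:

```python
# Hedged sketch: normalize prompts once before embedding, so near-duplicate
# prompts collapse to the same canonical form and cost fewer tokens later.
def ingest(prompts: list[str], embed, store) -> None:
    for prompt in prompts:
        canonical = compress(prompt)       # normalization step
        store.add(text=canonical, vector=embed(canonical))
```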

Open source and ready to test:
:backhand_index_pointing_right: GitHub: metawake/prompt_compressor
:backhand_index_pointing_right: Full writeup

Would love feedback from the community on whether this looks useful, and whether you've ever needed to build something similar.
Is anyone else fighting the token-reduction fight?

Cheers!


I’m planning a second iteration focused on adaptive output shaping. Would love to hear what compression needs other devs are facing!