Hi everyone,
I’m trying to build a simple autocomplete feature for a web-app search box, where a model predictively completes what the user is typing (like Google suggestions). For example, if the user types “how to install freebsd”, the model should generate a few short possible completions.
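To make the expected behavior concrete, here is a rough sketch of the interface I have in mind. It is not a model, just a prefix lookup over a made-up query list, so `PAST_QUERIES`, `suggest`, and the parameter `k` are all hypothetical names for illustration. Whatever model or method ends up being used would ideally sit behind a function shaped like this.

```python
from collections import Counter

# Hypothetical query log; a real app would load this from search history.
PAST_QUERIES = [
    "how to install freebsd",
    "how to install freebsd on virtualbox",
    "how to install freebsd on virtualbox",
    "how to install freebsd packages",
    "how to install debian",
]


def suggest(prefix: str, k: int = 3) -> list[str]:
    """Return up to k past queries that extend the prefix, most frequent first."""
    prefix = prefix.lower().strip()
    # Count only queries that start with the prefix and are longer than it.
    counts = Counter(
        q for q in PAST_QUERIES if q.startswith(prefix) and q != prefix
    )
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [query for query, _ in ranked][:k]


print(suggest("how to install freebsd"))
```

A lookup like this can only return queries it has already seen, which is why I am asking whether a generative model or some other method is the better fit.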
Please offer some advice and suggestions. I plan to run the model on a dedicated GPU and would like it to be fast and accurate. The language must be English.
What I’m looking for:
- Which HF models work best for short, real-time autocomplete?
- Is text-generation the right way, or should I be using embeddings / another method?
- Any simple examples or repos showing how to do autocomplete specifically?
Just trying to get some guidance from people who’ve done this before.
Thanks!