Best Local LLM for Real-Time Q&A on German/English Transcript?

Hi everyone,

I’m looking for model recommendations for a local Python application.

My project: A desktop app that live-transcribes my PC audio (mostly German). I want to use a local LLM to ask questions about this transcript in real-time.

My key requirements are:

  • Integration: Must work directly with the standard transformers pipeline() on a consumer AMD Ryzen 7 9800X3D CPU. I cannot use a separate server like vLLM/TGI (see the sketch after this list).

  • Performance: I’m looking for models in the ~5B to 13B range that are fast enough for interactive chat.

  • Languages: The model must be strong in both German and English.

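For context, here is a minimal sketch of the setup I have in mind, assuming Llama 3.1 8B Instruct (the model is gated, so the license must be accepted on the Hub first; transcript and the prompts are placeholders):

```python
from transformers import pipeline

transcript = "..."  # placeholder: the live transcript accumulated so far

# Chat pipeline, CPU-only; bfloat16 roughly halves memory vs. fp32
# (~16 GB of RAM for an 8B model instead of ~32 GB).
chat = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    device="cpu",
    torch_dtype="bfloat16",
)

messages = [
    {"role": "system", "content": "Answer questions about the transcript. Reply in the language of the question."},
    # "Worum ging es in der Besprechung?" = "What was the meeting about?"
    {"role": "user", "content": f"Transcript:\n{transcript}\n\nFrage: Worum ging es in der Besprechung?"},
]
out = chat(messages, max_new_tokens=256)
print(out[0]["generated_text"][-1]["content"])  # the new assistant reply
```

I'm aware that CPU generation at 8B may only manage a few tokens per second even in bfloat16, which is part of why I'm asking about the ~5B to 13B range.
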
My research suggests that meta-llama/Meta-Llama-3.1-8B-Instruct is currently the best choice for this.

Is this my best option, or are there other more recent high-performing bilingual models (especially finetunes) that fit these constraints?

Thanks for any suggestions!


There doesn’t seem to be much recent leaderboard data for German LLMs, but within the available data, Llama 3.1 Instruct appears to be quite good; it’s a generally well-designed model. At 12B, Mistral Nemo might be a good option. Qwen 2’s German scores weren’t bad either, so Qwen 2.5, which improved significantly over Qwen 2, and its successor Qwen 3 may also be promising.
For multilingual models, Gemma 2 and Gemma 3 are generally excellent.

Whichever model you choose, it would be even better to find a version fine-tuned for German; the Hub now has a feature to search for models by size, which makes narrowing down candidates easier.
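If you want to script that search, here is a rough sketch using huggingface_hub. As far as I know the parameter-count filter is a website feature, so this filters by language tag and sorts by downloads instead:

```python
from huggingface_hub import list_models

# Ten most-downloaded text-generation models tagged for German
for m in list_models(language="de", task="text-generation", sort="downloads", limit=10):
    print(m.id)
```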