Multi-GPU inference with LLM produces gibberish

There seems to be a deeper underlying problem; it appears to be caused by an interaction between the hardware, the drivers, and the latest version of transformers/tokenizers. We have been in contact with NVIDIA about this.
Since the issue is only indirectly related to transformers, this can be closed.