Can Llama 3.1 70B run on 32 GB of VRAM?

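To do this, it seems to be sufficient to pass device_map="auto" when loading the model. Layers that do not fit in VRAM are then placed in CPU RAM (or on disk) instead.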
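A rough sketch of what that could look like with transformers. The model ID, the helper names and the memory caps (30 GiB for GPU 0, 64 GiB for CPU RAM) are my assumptions for illustration, not something from your setup, and dtype or quantisation settings are left out since they depend on which checkpoint you use.

```python
def build_load_kwargs(gpu_gib: int = 30, cpu_gib: int = 64) -> dict:
    """Build keyword arguments for from_pretrained.

    The caps are illustrative: leave some headroom below the card's 32 GB
    for activations and the KV cache.
    """
    return {
        # Let accelerate decide which layers go on the GPU, CPU or disk.
        "device_map": "auto",
        # Optional per-device limits; key 0 is the first GPU.
        "max_memory": {0: f"{gpu_gib}GiB", "cpu": f"{cpu_gib}GiB"},
    }


def load_model(model_id: str = "meta-llama/Llama-3.1-70B-Instruct"):
    # Assumed model ID; replace it with the checkpoint you actually use.
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, **build_load_kwargs())
    return tokenizer, model


if __name__ == "__main__":
    # Only prints the arguments; call load_model() to actually load the weights.
    print(build_load_kwargs())
```

After loading, printing model.hf_device_map should show which layers ended up on the GPU and which were offloaded.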

However, HF libraries in general are still not very good at dealing with quantised files, so it would not be surprising if there were errors.