How to get Llama-2-13b-chat-hf to ACTUALLY RUN

Does anyone have a complete, fully specified, working 'recipe' — instructions for how and where to set up a large Llama model — that will just work?

I am sure thousands of people have done this.

I have been trying for many, many days now to just get Llama-2-13b-chat-hf to run at all.
I have even hired a consultant, who has also spent a lot of time and so far failed.
The problem has been getting a machine with the right resources, environment, configuration, etc.
We have tried a lot of different approaches:

  1. Local machine. Result: not enough GPU memory, even on a Mac Studio.
  2. Google Colab Pro, even on an A100. Result: not enough GPU memory, plus various other problems.
  3. Google Cloud virtual machine. Result: after many, many attempts, we have never found a combination of boot disk, OS, environment, GPU, drivers, and installed packages that will actually run the model. We have spent literally days guessing which combination might work, and have certainly tried the basics, such as the 'Deep Learning' VM image.
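For context on the memory failures above, here is a back-of-the-envelope sketch of the VRAM needed just to hold the weights of a 13B-parameter model at different precisions (an assumption-laden estimate: it ignores activations, KV cache, and framework overhead, which add several more GB in practice):

```python
# Rough VRAM needed for the *weights alone* of a 13B-parameter model.
# Real usage is higher: activations, KV cache, and framework overhead
# are not counted here.

PARAMS = 13e9  # approximate parameter count of Llama-2-13b

def weight_gb(bytes_per_param: float) -> float:
    """GiB required to store the weights at a given precision."""
    return PARAMS * bytes_per_param / 1024**3

for name, bpp in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{name}: ~{weight_gb(bpp):.1f} GiB")
# fp16: ~24.2 GiB, int8: ~12.1 GiB, int4: ~6.1 GiB
```

This is why a 13B model in full fp16 overwhelms most single consumer GPUs, while quantized loading (8-bit or 4-bit) can bring it within reach of far more modest hardware.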

Thank you for your help.