I am new to LLM programming in Python, and I am trying to fine-tune the instructlab/merlinite-7b-lab model on my Mac M1. My goal is to teach the model about a new music composer, Xenobi Amilen, whom I have invented.
Using the new ilab CLI from Red Hat, I created a training set for the model: a JSONL file with 100 question/answer pairs about the invented composer.
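My training script loads this file with the Hugging Face datasets library, roughly like this (a minimal sketch; the file name is a placeholder, not the exact name ilab generated):

```python
from datasets import load_dataset

# Placeholder file name; in my case it is the JSONL file that ilab generated
dataset = load_dataset("json", data_files="train_composer.jsonl", split="train")
print(dataset[0])  # prints one question/answer record about the composer
```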
I wrote this Python script to train the model and tested all the parts related to the tokenizer and the dataset; they seem to work. However, the final training run fails with this error:
RuntimeError: Placeholder storage has not been allocated on MPS device!
0%| | 0/75 [00:00<?, ?it/s]
I found a lot of articles about this error on Google and on Stack Overflow, like this one, for example. The problem seems to be that, in addition to the model, I also have to send the input tensors to mps, but it's not clear to me how to change my code to do that.
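If I understand those answers correctly, the general pattern would be something like this (a rough sketch using a small stand-in model, not my actual script):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# "distilgpt2" is only a small stand-in so the snippet runs quickly;
# the idea would be the same for instructlab/merlinite-7b-lab
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)

# Both the model (above) and the tokenized inputs (below) must be on the same device
inputs = tokenizer("Who is Xenobi Amilen?", return_tensors="pt").to(device)
outputs = model(**inputs)
print(outputs.logits.shape)
```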
I tried several fixes along these lines but had no luck. Can anyone help?
I added the option as you suggested but now I get this error:
Traceback (most recent call last):
File "/Users/sasadangelo/github.com/sasadangelo/llm-train/main.py", line 71, in <module>
training_args = TrainingArguments(
^^^^^^^^^^^^^^^^^^
File "<string>", line 129, in __init__
File "/Users/sasadangelo/github.com/sasadangelo/llm-train/venv/lib/python3.12/site-packages/transformers/training_args.py", line 1693, in __post_init__
self.device
File "/Users/sasadangelo/github.com/sasadangelo/llm-train/venv/lib/python3.12/site-packages/transformers/training_args.py", line 2171, in device
return self._setup_devices
^^^^^^^^^^^^^^^^^^^
File "/Users/sasadangelo/github.com/sasadangelo/llm-train/venv/lib/python3.12/site-packages/transformers/utils/generic.py", line 60, in __get__
cached = self.fget(obj)
^^^^^^^^^^^^^^
File "/Users/sasadangelo/github.com/sasadangelo/llm-train/venv/lib/python3.12/site-packages/transformers/training_args.py", line 2133, in _setup_devices
raise ValueError(
ValueError: Either you do not have an MPS-enabled device on this machine or MacOS version is not 12.3+ or current PyTorch install was not built with MPS enabled.
I have macOS Sonoma 14.5, and MPS is enabled, because a simple test works fine.
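The test is essentially the following (a minimal sketch of what I ran, along the lines of the MPS example in the PyTorch docs):

```python
import torch

# Allocate a tensor directly on the MPS device and run a trivial operation on it
x = torch.ones(5, device="mps")
print(x * 2)  # completes without errors on my machine
```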
The only thing I don't know is whether my PyTorch install has MPS enabled, and I am not sure how to verify that. On macOS I installed PyTorch with pip (it is a recent nightly build):
torch==2.5.0.dev20240704
In my tests I also used the stable release, but I didn't try it with the option you suggested. What do you suggest I do?
FYI, no_cuda has been deprecated and replaced by use_cpu, which already defaults to False. So I don't think the flag is the issue; I think it is something related to PyTorch.
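If I understand the PyTorch docs correctly, the way to verify whether the install was built with MPS support is something like this (my assumption; I have not confirmed these are the definitive checks):

```python
import torch

# is_built() reports whether this PyTorch build was compiled with MPS support;
# is_available() additionally requires macOS 12.3+ and an Apple Silicon GPU
print("MPS built:    ", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())
```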
Hi @sasadangelo - I’m a member of the InstructLab project. I am sorry I missed this post when you initially made it.
I looked at your repo, but I don't see the knowledge YAML or Markdown files you would need to train with the InstructLab methodology. Do you have those available so I can take a look?
I'm also wondering why you used a custom script instead of ilab train. Were you just interested in the datagen component and not the full workflow?
The point is that I don't want to use InstructLab and its skills-and-knowledge methodology to train the model. I want to understand how training works at a lower level, which is why I implemented my own script.