I think that could indeed be the issue. Since the tokenizer has a different vocabulary size, it is likely incompatible with the config you are loading, which still contains the vocab size of the original model. You can fix it with:
```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained(model_checkpoint, vocab_size=len(tokenizer))
```
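For context, here is a minimal sketch of how that fits into the full setup, assuming you are training a causal LM from scratch (the checkpoint name and model class are placeholders; swap in whatever you actually use):

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_checkpoint = "gpt2"  # placeholder; use your own checkpoint

# Stands in for your retrained tokenizer with the new vocabulary.
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

# Override the stored vocab size so the embedding matrix matches the tokenizer.
config = AutoConfig.from_pretrained(model_checkpoint, vocab_size=len(tokenizer))

# Instantiate a freshly initialized model from the patched config.
model = AutoModelForCausalLM.from_config(config)
assert model.get_input_embeddings().weight.shape[0] == len(tokenizer)
```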
I hope this helps!
PS: Sometimes these CUDA errors are unreadable, and it can help to run the code on the CPU for debugging purposes instead (passing no_cuda=True to TrainingArguments should do the trick; note that training_args.device itself is a read-only property and can't be assigned directly).
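A minimal sketch of that, assuming a standard Trainer setup (output_dir is a placeholder):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="debug-run",  # placeholder output directory
    no_cuda=True,  # force CPU so errors surface as readable Python tracebacks
    # (on recent transformers versions, use_cpu=True replaces no_cuda=True)
)
```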