Topic | Replies | Views | Activity
Llama3 OutOfMemory on an A100 when doing CausalLM | 0 | 194 | June 5, 2024
DeepSpeed ZeRO-2, PEFT, bitsandbytes training | 0 | 124 | June 4, 2024
Generating text with multiple styles using Llama | 0 | 95 | June 3, 2024
TypeError: map() got an unexpected keyword argument 'num_proc' | 7 | 950 | June 3, 2024
Parameter Count & Shape Discrepancies in 4-bit vs. Higher-bit LLM models | 2 | 638 | June 3, 2024
Pre/Post Normalization layers | 0 | 87 | June 3, 2024
Getting error - trainer.train() | 4 | 973 | June 3, 2024
Does the transformers Trainer.train() automatically set positional attention masks? | 4 | 481 | June 3, 2024
Offloading LLM models to CPU uses only a single core | 1 | 3967 | June 3, 2024
Tutorials for using Colab TPUs with Hugging Face Transformers? | 16 | 20515 | June 3, 2024
mT5 question-answering fine-tuning generates empty sentences during inference | 2 | 654 | June 2, 2024
AggregateScore error when computing metric | 0 | 88 | June 2, 2024
Checkpoints and disk storage | 15 | 8018 | June 2, 2024
CodeLlama will not stop generating at EOS | 1 | 572 | June 2, 2024
How to clear GPU memory with Trainer without the command line | 1 | 2825 | June 1, 2024
How to use Dogge/llama-3-70B-instruct-uncensored from the Transformers library | 0 | 76 | June 1, 2024
CUDA OOM error when `ignore_mismatched_sizes` is enabled | 0 | 202 | May 31, 2024
Confidence Score in SetFit fine-tuned model | 5 | 3309 | May 31, 2024
Free up GPU memory after training is finished or interrupted (on Colab) | 1 | 2159 | May 30, 2024
Adding attention mask into MLM | 1 | 345 | May 30, 2024
Can only automatically infer lengths for datasets whose items are dictionaries with an 'input_ids' key | 0 | 514 | May 30, 2024
Inconsistency in logit values between generation and direct model prediction #31127 | 0 | 203 | May 30, 2024
Using both packing and group_by_length in SFTTrainer | 0 | 168 | May 29, 2024
How to extract a specific paragraph from a text file | 2 | 730 | May 29, 2024
Embedding that takes word order into account | 0 | 70 | May 29, 2024
Pipeline Error: PeftModel... is not supported for text-classification | 1 | 451 | May 29, 2024
Quantization of Transformers model | 0 | 74 | May 29, 2024
Help evaluating a video classification model (with an IterableDataset) | 0 | 80 | May 29, 2024
Unexpected results using ORPO with TRL | 0 | 96 | May 29, 2024