Getting error - trainer.train() | 4 | 991 | June 3, 2024
Does the transformers Trainer.train() automatically set positional attention masks? | 4 | 558 | June 3, 2024
Offloading LLM models to CPU uses only single core | 1 | 4028 | June 3, 2024
Tutorials for using Colab TPUs with Huggingface Transformers? | 16 | 20757 | June 3, 2024
mT5 Question/Answering fine tuning is generating empty sentences during inference | 2 | 662 | June 2, 2024
AggregateScore error when computing metric | 0 | 90 | June 2, 2024
Checkpoints and disk storage | 15 | 8141 | June 2, 2024
Codellama will not stop generating at EOS | 1 | 582 | June 2, 2024
How to clear GPU memory with Trainer without commandline | 1 | 2878 | June 1, 2024
How to use from the Transformers library from Dogge / llama-3-70B-instruct-uncensored | 0 | 76 | June 1, 2024
CUDA OOM error when `ignore_mismatched_sizes` is enabled | 0 | 218 | May 31, 2024
Confidence Score in Setfit fine-tuned model | 5 | 3350 | May 31, 2024
Free up GPU memory after training is finished or interrupted (on Colab) | 1 | 2462 | May 30, 2024
Adding attention mask into MLM | 1 | 377 | May 30, 2024
Can only automatically infer lengths for datasets whose items are dictionaries with an 'input_ids' key | 0 | 556 | May 30, 2024
Inconsistency in logit values between generation and direct model prediction #31127 | 0 | 218 | May 30, 2024
Using both packing and group_by_length in SFTTrainer | 0 | 186 | May 29, 2024
How to extract a specific paragraph from a text file | 2 | 751 | May 29, 2024
Embedding which takes in account order of words | 0 | 71 | May 29, 2024
Pipeline Error: PeftModel... is not supported for text-classification | 1 | 502 | May 29, 2024
Quantization of Transformers model | 0 | 75 | May 29, 2024
Helping for Evaluation Video Classification Model (with a IterableDataset) | 0 | 83 | May 29, 2024
Unexpected results using ORPO trl | 0 | 97 | May 29, 2024
ValueError when using PatchTSTForClassification | 1 | 144 | May 28, 2024
"No space left on device" when using model.push_to_hub() | 1 | 360 | May 28, 2024
Why GPU not be use in Evaluation of Trainer | 0 | 96 | May 28, 2024
LayoutLMV3 inference without label | 0 | 103 | May 28, 2024
Llama-2 output from forward function is nonsense, `.generate()` is okay | 3 | 1068 | May 27, 2024
Finetuning in 8bit with Custom Head | 0 | 146 | May 27, 2024