| RuntimeError: Failed to import transformers.models.roberta.modeling_tf_roberta because of the following error (look up to see its traceback): No module named 'keras.engine' | 6 | 6364 | January 14, 2025 |
| Initializing a big model on GPU with random weights | 2 | 61 | January 14, 2025 |
| What is `self.loss_function` in `forward()` of newly released LLM? | 0 | 42 | January 14, 2025 |
| ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length | 4 | 36201 | January 13, 2025 |
| Qwen Not work anymore | 1 | 170 | January 13, 2025 |
| No matter what I do the HF… | 2 | 27 | January 13, 2025 |
| Expected `tensors` and `new_tensors` to have the same type but found <class 'tuple'> and <class 'torch.Tensor'> | 2 | 14 | January 12, 2025 |
| Fine-tuning an NLLB model for a new language | 7 | 2581 | January 12, 2025 |
| Preparing data for Donut training results in error "ArrowInvalid: offset overflow while concatenating arrays" | 2 | 576 | January 12, 2025 |
| Coherforai I have API but I can't Access | 1 | 20 | January 10, 2025 |
| Llama-2 7B-hf repeats context of question directly from input prompt, cuts off with newlines | 16 | 28841 | January 10, 2025 |
| Multi-input tag and ,multi-label output for token classification using Bert pretrained model | 1 | 80 | January 9, 2025 |
| TypeError: 'list' object is not callable | 1 | 26 | January 8, 2025 |
| Generate() speculative decoding with static string | 0 | 17 | January 7, 2025 |
| Unable to Run Sentence Transformer Text embedding in Docker | 1 | 339 | January 7, 2025 |
| How to output loss from model.generate()? | 16 | 5937 | January 7, 2025 |
| Labels in Audio Frame classification task (Wav2Vec2 For Audio Frame Classification) | 1 | 657 | January 7, 2025 |
| All-mpnet-base-v2 get different results in Spark-NLP vs SentenceTransformers | 2 | 51 | January 6, 2025 |
| Loss not Decreasing: Hiera MAE Pretraining from Scratch | 0 | 24 | January 6, 2025 |
| Is there anyway to modify the Trainer eval_loop aggregate function? | 1 | 21 | January 6, 2025 |
| NotImplementedError: Cannot copy out of meta tensor; no data! | 3 | 6358 | January 4, 2025 |
| Timeout Issue with DeepSpeed on Multiple GPUs | 1 | 489 | January 3, 2025 |
| How to Use torch_directml GPU with Transformers.Trainer for Fine-Tuning? | 0 | 92 | January 2, 2025 |
| -inf values for logit score outputs with model.generate | 3 | 802 | January 2, 2025 |
| Trainer warning with the new version | 2 | 5130 | January 2, 2025 |
| ValueError: Make sure that you pass in as many target sizes as the batch dimension of the logits | 2 | 89 | January 1, 2025 |
| Training loss no drop for MT5ForSequenceClassification | 2 | 190 | January 1, 2025 |
| Batched Generation with Flash Attention | 1 | 391 | December 30, 2024 |
| Using persistent storage on HF spaces | 3 | 222 | December 30, 2024 |
| Trainer API object detection | 2 | 41 | December 29, 2024 |