| Topic | Replies | Views | Activity |
|---|---|---|---|
| The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_values | 23 | 41342 | April 23, 2025 |
| Using Transformers with DistributedDataParallel — any examples? | 11 | 23554 | May 8, 2023 |
| AI to Convert Any Voice to a Specific Voice | 10 | 8388 | November 10, 2024 |
| Fine tune with SFTTrainer | 17 | 15147 | September 12, 2024 |
| Cuda out of memory error | 11 | 42758 | January 27, 2025 |
| OSError: Unable to load weights from pytorch checkpoint file | 21 | 78408 | February 21, 2024 |
| Training for sentence vectors in niche domain | 18 | 3293 | February 16, 2021 |
| API inference limit changed? | 9 | 2507 | July 21, 2025 |
| Inference API for fine-tuned model not working: No package metadata was found for bitsandbytes | 14 | 3607 | June 24, 2024 |
| How to run an end to end example of distributed data parallel with hugging face's trainer api (ideally on a single node multiple gpus)? | 17 | 18082 | September 6, 2023 |
| Use RAGAS with huggingface LLM | 17 | 10019 | March 17, 2025 |
| Finetuning T5 for a task | 21 | 7037 | September 3, 2022 |
| Finetuning for feature-extraction? I.e. unsupervised fine tuning? | 10 | 5578 | June 25, 2023 |
| Why does RAG still feel clunky in 2025? | 12 | 476 | September 25, 2025 |
| Dependency error when building space: `ImportError: numpy.core.multiarray failed to import` | 14 | 3760 | December 2, 2024 |
| RTX 4090 Huggingface Trainer Compatible? | 10 | 9497 | January 28, 2023 |
| Less Trainable Parameters after quantization | 14 | 4477 | May 2, 2024 |
| Converting Word-level labels to WordPiece-level for Token Classification | 9 | 4589 | January 13, 2021 |
| BART_LM: Odd Beam Search Output | 18 | 1845 | August 17, 2020 |
| RoBERTa from scratch with different vocab vs. fine-tuning | 9 | 2246 | August 20, 2020 |
| Random utf-8 errors from dataset | 10 | 3682 | December 8, 2023 |
| Need Help in creating ai chatbot for my app | 22 | 190 | August 8, 2025 |
| Load fine tuned model in tensorflow | 11 | 2552 | August 3, 2021 |
| Other aggregation on TAPAS beyond (SUM/COUNT/AVERAGE/NONE) | 13 | 1254 | September 18, 2023 |
| Accelerated Inference API can't load a model on GPU | 13 | 2171 | January 16, 2023 |
| How can state-of-the-art classifiers be so wrong? | 13 | 1639 | May 22, 2023 |
| My account disappeared from the HuggingFace Hub an i lost all my spaces zero | 13 | 640 | July 16, 2024 |
| Custom dataset maskformer | 15 | 98 | January 18, 2025 |
| Program not working on GPU but works on CPU | 24 | 221 | June 24, 2025 |
| kohya_SS (Output Interpretation) | 16 | 201 | March 6, 2025 |
| How to sync Hugging Face model commits with GitHub? | 10 | 187 | August 29, 2025 |
| Google's Gemini has become a Unique Entity and is seeking collaboration | 9 | 240 | March 3, 2025 |
| Training Loss = 0.0, Validation Loss = nan | 6 | 14268 | September 5, 2023 |
| Generate without using the generate method | 8 | 6321 | January 17, 2025 |
| Trainer "load_best_model_at_end" doesn't load the best model | 0 | 2594 | February 21, 2023 |
| LLM fine tuning for E-commerce product recommendation | 1 | 1662 | January 25, 2025 |
| Common practice, using the hidden state associated with [cls] as an input feature for a classification task? | 3 | 5932 | January 31, 2024 |
| How to use Elastic Weight Consolidation for domain adaptation with HuggingFace? | 0 | 1029 | March 15, 2022 |
| Share your work here :white_check_mark: | 3 | 1287 | July 9, 2020 |
| Compute Perplexity using compute_metrics in SFTTrainer | 1 | 993 | January 22, 2025 |
| ValueError: You should supply an encoding or a list of encodings to this method that includes input_ids, but you provided ['label'] | 3 | 18191 | February 4, 2025 |
| Sharing BERT formatted corpus | 7 | 1746 | September 15, 2020 |
| How to make multiple async calls to AsyncOpenAI and return results to Gradio UI | 1 | 3259 | October 10, 2024 |
| Text classification training on long text | 3 | 5144 | June 18, 2024 |
| Cache models on sonatype nexus repository | 0 | 1315 | May 11, 2021 |
| Computing similarity between sentences | 4 | 3296 | July 31, 2021 |
| Mismatched target and input size for BCE using "multi_label_classification" | 2 | 7030 | September 1, 2022 |
| BERT: What is the shape of each Transformer Encoder block in the final hidden state? | 7 | 13015 | March 16, 2022 |
| Optuna with huggingface | 1 | 2527 | April 16, 2022 |
| Prompt loss weight instead of masking in generative models | 1 | 2216 | June 18, 2023 |