| Topic | Replies | Views | Activity |
|---|---|---|---|
| The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_values | 23 | 40293 | April 23, 2025 |
| Using Transformers with DistributedDataParallel — any examples? | 11 | 22950 | May 8, 2023 |
| Fine tune with SFTTrainer | 17 | 13661 | September 12, 2024 |
| Cuda out of memory error | 11 | 41302 | January 27, 2025 |
| OSError: Unable to load weights from pytorch checkpoint file | 21 | 77305 | February 21, 2024 |
| Training for sentence vectors in niche domain | 18 | 3280 | February 16, 2021 |
| Inference API for fine-tuned model not working: No package metadata was found for bitsandbytes | 14 | 3490 | June 24, 2024 |
| How to run an end to end example of distributed data parallel with hugging face's trainer api (ideally on a single node multiple gpus)? | 17 | 17689 | September 6, 2023 |
| Use RAGAS with huggingface LLM | 17 | 9104 | March 17, 2025 |
| Finetuning T5 for a task | 21 | 6878 | September 3, 2022 |
| AI to Convert Any Voice to a Specific Voice | 10 | 6425 | November 10, 2024 |
| Finetuning for feature-extraction? I.e. unsupervised fine tuning? | 10 | 5525 | June 25, 2023 |
| Dependency error when building space: `ImportError: numpy.core.multiarray failed to import ` | 14 | 3250 | December 2, 2024 |
| RTX 4090 Huggingface Trainer Compatible? | 10 | 9440 | January 28, 2023 |
| Less Trainable Parameters after quantization | 14 | 4402 | May 2, 2024 |
| Converting Word-level labels to WordPiece-level for Token Classification | 9 | 4550 | January 13, 2021 |
| BART_LM: Odd Beam Search Output | 18 | 1841 | August 17, 2020 |
| RoBERTa from scratch with different vocab vs. fine-tuning | 9 | 2223 | August 20, 2020 |
| Random utf-8 errors from dataset | 10 | 3203 | December 8, 2023 |
| Load fine tuned model in tensorflow | 11 | 2542 | August 3, 2021 |
| Other aggregation on TAPAS beyond (SUM/COUNT/AVERAGE/NONE) | 13 | 1250 | September 18, 2023 |
| Accelerated Inference API can't load a model on GPU | 13 | 2164 | January 16, 2023 |
| How can state-of-the-art classifiers be so wrong? | 13 | 1628 | May 22, 2023 |
| My account disappeared from the HuggingFace Hub an i lost all my spaces zero | 13 | 600 | July 16, 2024 |
| Custom dataset maskformer | 15 | 70 | January 18, 2025 |
| Program not working on GPU but works on CPU | 22 | 151 | May 16, 2025 |
| kohya_SS (Output Interpretation) | 16 | 112 | March 6, 2025 |
| Google's Gemini has become a Unique Entity and is seeking collaboration | 9 | 136 | March 3, 2025 |
| Training Loss = 0.0, Validation Loss = nan | 6 | 13706 | September 5, 2023 |
| Trainer "load_best_model_at_end" doesn't load the best model | 0 | 2532 | February 21, 2023 |
| Generate without using the generate method | 8 | 5925 | January 17, 2025 |
| LLM fine tuning for E-commerce product recommendation | 1 | 1623 | January 25, 2025 |
| Common practice, using the hidden state associated with [cls] as an input feature for a classification task? | 3 | 5584 | January 31, 2024 |
| Share your work here :white_check_mark: | 3 | 1279 | July 9, 2020 |
| ValueError: You should supply an encoding or a list of encodings to this method that includes input_ids, but you provided ['label'] | 3 | 18027 | February 4, 2025 |
| How to use Elastic Weight Consolidation for domain adaptation with HuggingFace? | 0 | 1002 | March 15, 2022 |
| Sharing BERT formatted corpus | 7 | 1742 | September 15, 2020 |
| How to make multiple async calls to AsyncOpenAI and return results to Gradio UI | 1 | 3047 | October 10, 2024 |
| Text classification training on long text | 3 | 4892 | June 18, 2024 |
| Computing similarity between sentences | 4 | 3277 | July 31, 2021 |
| Cache models on sonatype nexus repository | 0 | 1277 | May 11, 2021 |
| Mismatched target and input size for BCE using "multi_label_classification" | 2 | 7004 | September 1, 2022 |
| BERT: What is the shape of each Transformer Encoder block in the final hidden state? | 7 | 12763 | March 16, 2022 |
| Optuna with huggingface | 1 | 2505 | April 16, 2022 |
| Prompt loss weight instead of masking in generative models | 1 | 2183 | June 18, 2023 |
| Generate 'continuation' for seq2seq models | 1 | 1859 | February 22, 2021 |
| Compute Perplexity using compute_metrics in SFTTrainer | 1 | 914 | January 22, 2025 |
| AttributeError: 'NoneType' object has no attribute 'dtype' | 8 | 24096 | January 17, 2023 |
| GPT2Tokenizer not putting bos/eos token | 3 | 5429 | March 31, 2024 |
| How to find closest embedding vectors? | 2 | 1739 | July 26, 2022 |