Topic | Replies | Views | Activity
--- | --- | --- | ---
HFValidationError: Repo_id must be in the form repo_name | 8 | 18492 | March 19, 2025
Fine-Tuning a Mamba Model Using Hugging Face Transformers | 1 | 14 | March 18, 2025
TRL SFTTrainer 0.15 compute_token_accuracy error | 2 | 33 | March 18, 2025
How can I set the `max_memory` parameter while loading a quantized model with the Model Pipeline class? | 2 | 12 | March 18, 2025
E5 embedding models | 1 | 7 | March 17, 2025
Using TableTransformer in Standalone Mode Without Hugging Face Hub Access | 1 | 8 | March 17, 2025
The checkpoint you are trying to load has model type `gemma2` but Transformers does not recognize this architecture | 8 | 5970 | March 17, 2025
Get each generated token's last-layer hidden state | 3 | 10 | March 16, 2025
Why does AutoModelForCausalLM.from_pretrained() work on base models and not instruct models? | 4 | 23 | March 15, 2025
[Possibly] Forgotten TODO Comment for `TrainingArguments.default_optim` | 1 | 15 | March 14, 2025
Metrics for Training Set in Trainer | 11 | 24912 | March 14, 2025
How can LLMs be fine-tuned for specialized domain knowledge? | 1 | 63 | March 14, 2025
Corrupted DeepSpeed checkpoint | 1 | 9 | March 13, 2025
Model does not exist, Inference API doesn't work | 5 | 88 | March 13, 2025
Q&A the stock prediction | 1 | 1181 | January 7, 2024
Difference between BertModel, AutoModel, and AutoModelForMaskedLM | 8 | 4689 | March 9, 2025
Injecting Multiple Modalities into a Transformer Decoder via Cross-Attention | 1 | 9 | March 9, 2025
Support for LLaMA in EncoderDecoder framework | 1 | 507 | March 8, 2025
SFTTrainer Doubling Speed on a Single GPU with DeepSpeed: Proposal for an Update to the Official Documentation and Verification Report | 1 | 15 | March 7, 2025
After fine-tuning the OpenAI Whisper model, OSError WinError 123 appears | 1 | 7 | March 7, 2025
About Hyperparameter Search with Ray Tune | 2 | 11 | March 7, 2025
Trainer.train() runs for a long time and appears to be stuck. How do I know it's processing and not in a loop? | 2 | 391 | March 7, 2025
As of transformers v4.44, default chat template is no longer allowed | 2 | 1605 | March 7, 2025
Multi-Objective Hyperparameter Optimization | 3 | 16 | March 7, 2025
Repetitive Token Generation During Evaluation in Fine-Tuned LLaMA Model | 1 | 11 | March 6, 2025
Looks like the new transformers 4.49.0 has some issues | 3 | 85 | March 6, 2025
Don't apply complete | 1 | 7 | March 6, 2025
Speculative Decoding with Qwen Models | 1 | 58 | March 5, 2025
SSL Certificate Issue | 8 | 22797 | March 5, 2025
How to correctly count downloads in revisions in the Transformers hub | 1 | 6 | March 3, 2025