| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| The hidden_states when i use model.generate | 4 | 2158 | March 28, 2025 |
| Fixing the random seed in the Trainer does not produce the same results across runs | 5 | 17692 | March 27, 2025 |
| The size of tensor a (882) must match the size of tensor b (568) at non-singleton dimension 1 | 2 | 126 | March 27, 2025 |
| RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:0! | 2 | 207 | March 25, 2025 |
| Unexpected behavior of load_best_model_at_end in Trainer (or am I doing it wrong?) | 2 | 71 | March 25, 2025 |
| Load_best_model_at_end doesn't work? | 1 | 122 | March 25, 2025 |
| Molformer model training error | 6 | 61 | March 25, 2025 |
| Extract Attention Weights from a Specific Layer and Head Efficiently | 1 | 228 | March 25, 2025 |
| Clarification on Commercial License Impact of LayoutLMv3ImageProcessor within UdopProcessor | 0 | 58 | March 24, 2025 |
| Runtime Error: Cuda Initialization | 13 | 387 | March 24, 2025 |
| Reasoning LLM Benchmarking | 2 | 2180 | March 24, 2025 |
| Web worker fails to process input data | 1 | 45 | March 22, 2025 |
| Adding dropout in custom model, but setting dropout through .from_pretrained() | 2 | 70 | March 21, 2025 |
| Multimodal training | 4 | 67 | March 21, 2025 |
| Target branch/tag/commit for automatic Hub pushes | 1 | 15 | March 21, 2025 |
| Two questions when I wraped the AutoModelForMaskedLM | 7 | 32 | March 21, 2025 |
| One question is about the pretrain method in Transformer packge ? | 1 | 204 | March 19, 2025 |
| Partially loss calculation with transformers LLM Trainer and DataCollator | 1 | 70 | March 19, 2025 |
| Custom VLM - Swapping a vision encoder from a VLM | 1 | 275 | March 19, 2025 |
| HFvalidationerror: Repo_id must be in the form repo_name | 8 | 30353 | March 19, 2025 |
| Fine-Tuning a Mamba Model with using Hugging Face Transformers | 1 | 278 | March 18, 2025 |
| TRL SFTTrainer 0.15 compute_token_accuracy error | 2 | 178 | March 18, 2025 |
| How can I set `max_memory` parameter while loading Quantized model with Model Pipeline class? | 2 | 61 | March 18, 2025 |
| E5 embedding models | 1 | 24 | March 17, 2025 |
| Using TableTransformer in Standalone Mode Without Hugging Face Hub Access | 1 | 52 | March 17, 2025 |
| The checkpoint you are trying to load has model type `gemma2` but Transformers does not recognize this architecture | 8 | 7297 | March 17, 2025 |
| Get each generated token last layer hidden state | 3 | 56 | March 16, 2025 |
| Why does automodelforcausallm.from_pretrained() work on base models and not instruct models? | 4 | 121 | March 15, 2025 |
| [Possibly] Forgotten TODO Comment for `TrainingArguments.default_optim` | 1 | 32 | March 14, 2025 |
| Metrics for Training Set in Trainer | 11 | 26929 | March 14, 2025 |