Using DistributedSampler with accelerate | 2 | 13 | April 2, 2025
ValueError: Could not interpret optimizer identifier | 1 | 174 | April 1, 2025
Model_accepts_loss_kwargs detection based on **kwargs is too permissive | 0 | 7 | April 1, 2025
Limit mask size in Mask2Former results | 1 | 11 | April 1, 2025
Args in RewardConfig | 1 | 8 | April 1, 2025
FASTAI:TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType) | 2 | 9 | March 29, 2025
What is the most efficient way to dynamically change context mid-generation? | 1 | 11 | March 29, 2025
Optimize GPU Usage for Long-Context Training | 2 | 16 | March 28, 2025
The hidden_states when I use model.generate | 4 | 1279 | March 28, 2025
Fixing the random seed in the Trainer does not produce the same results across runs | 5 | 16854 | March 27, 2025
The size of tensor a (882) must match the size of tensor b (568) at non-singleton dimension 1 | 2 | 12 | March 27, 2025
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:0! | 2 | 18 | March 25, 2025
Unexpected behavior of load_best_model_at_end in Trainer (or am I doing it wrong?) | 2 | 8 | March 25, 2025
Load_best_model_at_end doesn't work? | 1 | 63 | March 25, 2025
Molformer model training error | 6 | 13 | March 25, 2025
Extract Attention Weights from a Specific Layer and Head Efficiently | 1 | 13 | March 25, 2025
Clarification on Commercial License Impact of LayoutLMv3ImageProcessor within UdopProcessor | 0 | 6 | March 24, 2025
Runtime Error: Cuda Initialization | 13 | 50 | March 24, 2025
Reasoning LLM Benchmarking | 2 | 46 | March 24, 2025
Web worker fails to process input data | 1 | 7 | March 22, 2025
Adding dropout in custom model, but setting dropout through .from_pretrained() | 2 | 18 | March 21, 2025
Multimodal training | 4 | 32 | March 21, 2025
Target branch/tag/commit for automatic Hub pushes | 1 | 12 | March 21, 2025
Two questions when I wrapped the AutoModelForMaskedLM | 7 | 19 | March 21, 2025
One question is about the pretrain method in Transformer package? | 1 | 202 | March 19, 2025
Partial loss calculation with transformers LLM Trainer and DataCollator | 1 | 56 | March 19, 2025
Custom VLM - Swapping a vision encoder from a VLM | 1 | 33 | March 19, 2025
HFvalidationerror: Repo_id must be in the form repo_name | 8 | 19680 | March 19, 2025
Fine-Tuning a Mamba Model using Hugging Face Transformers | 1 | 36 | March 18, 2025
TRL SFTTrainer 0.15 compute_token_accuracy error | 2 | 45 | March 18, 2025