Whisper warning about not predicting end of a timestamp
|
|
3
|
1593
|
September 16, 2025
|
Entropy-Based Self-Reflective Learning Framework for Language Models
|
|
0
|
3
|
September 15, 2025
|
How to get hugging face models running on vscode plugin
|
|
5
|
5473
|
September 15, 2025
|
No errors but no output
|
|
5
|
19
|
September 10, 2025
|
Problem with Compute Metrics function
|
|
3
|
8
|
September 9, 2025
|
Layoutlmv3 word_labels does not match original labels from dataset
|
|
3
|
5
|
September 9, 2025
|
How to visualize the attention map of my Segformer model?
|
|
3
|
1386
|
September 8, 2025
|
Correct way to save/load adapters and checkpoints in PEFT
|
|
10
|
15851
|
September 8, 2025
|
API error for model sentence-transformers/all-MiniLM-L6-v2
|
|
8
|
48
|
September 4, 2025
|
Model Selection to convert Prompt to Json Object
|
|
1
|
4
|
September 4, 2025
|
Error Importing Seq2SeqTrainer
|
|
2
|
10
|
September 3, 2025
|
Batch generation Llama 3 Instruct | Tokenizer has no padding token
|
|
4
|
15
|
September 3, 2025
|
From TLinFormer to TConstFormer: The Leap to Constant-Time Transformer Attention: Achieving O(1) Computation and O(1) KV Cache during Autoregressive Inference
|
|
0
|
9
|
September 3, 2025
|
Dequantize 4bit B&B model to prepare for merging
|
|
4
|
22
|
September 2, 2025
|
How can artificial intelligence (AI) and machine learning (ML) make kids learning apps feel more personalized (tailored to each child's needs), while also making sure the content is safe and suitable for their age?
|
|
0
|
7
|
September 1, 2025
|
TangLinFormer: A Revolutionary Breakthrough in Achieving True Linear Attention for Transformers
|
|
2
|
67
|
September 1, 2025
|
Which data parallel does trainer use? DP or DDP?
|
|
6
|
6428
|
August 30, 2025
|
Speed issues using tokenizer.train_new_from_iterator on ~50GB dataset
|
|
8
|
2325
|
August 29, 2025
|
Bert2bert translator?
|
|
6
|
44
|
August 28, 2025
|
Cannot Login to HF
|
|
1
|
15
|
August 26, 2025
|
Cannot import name '_resolve_process_group' from 'torch.distributed.distributed_c10d'
|
|
3
|
20
|
August 27, 2025
|
Is the reported loss averaged over logging steps
|
|
3
|
624
|
August 25, 2025
|
Issue with CamemBERT tokenizer - inconsistency with subword prefix (▁) between pre-tokenization and training
|
|
1
|
11
|
August 25, 2025
|
Default models for pipeline tasks
|
|
8
|
2032
|
August 24, 2025
|
Calibrate Probabilities for Transformers Classifier
|
|
1
|
91
|
August 24, 2025
|
T5 model error: accessing variable that has not been defined
|
|
1
|
23
|
August 22, 2025
|
Using Trainer class + 4/8 bit quantised model for prediction
|
|
1
|
253
|
August 22, 2025
|
Can a Model Learn to Generate Better Augmented Data?
|
|
0
|
14
|
August 20, 2025
|
How to extract tables from images using Hugging Face models?
|
|
2
|
402
|
August 19, 2025
|
Best way to extend vocabulary of pretrained model?
|
|
5
|
2867
|
August 18, 2025
|