| Topic | Replies | Views | Activity |
|---|---|---|---|
| Time Series Transformer. Lagged values and time alignment | 4 | 1027 | December 14, 2023 |
| Domain adaptation transformer | 2 | 1316 | April 21, 2021 |
| ByT5: problem with tokenizer.decode() | 3 | 1134 | October 15, 2021 |
| Trainer fails to resume training from a checkpoint, claiming there's not enough samples in the dataset | 1 | 1602 | May 29, 2023 |
| TRL SFT super prone to nan when using data collator | 2 | 1298 | April 27, 2024 |
| Deploying 🤗 ViT on Vertex AI | 1 | 889 | September 25, 2023 |
| Padding to the left of the inputs, GPT2LMHeadModel gives different answer | 2 | 1280 | February 21, 2023 |
| Ensemble Learning with various BERT models | 1 | 1566 | February 25, 2023 |
| Negative KL-divergence RLHF implementation | 1 | 1547 | May 13, 2024 |
| How to ignore attributes of TrainingArguments? | 4 | 965 | July 30, 2021 |
| Run_backward: expected dtype Float but got dtype Long | 4 | 964 | July 3, 2024 |
| Issue in deploying quantized meta-llama/Llama-3.1-8B-Instruct in aws sagemaker | 0 | 68 | October 10, 2024 |
| Although doing RAG does it worth fine tuning the LLM on the documents? - Llama2 | 1 | 1520 | September 14, 2023 |
| Am I doing multiple GPU right? | 8 | 401 | November 29, 2024 |
| Different outputs when using pipeline | 2 | 1226 | July 20, 2023 |
| Speech synthesis model with Styles Like Emoticons or emphasis | 3 | 186 | December 25, 2024 |
| Does batching in the standard question-answering pipeline provide a speedup? | 1 | 1471 | December 13, 2021 |
| How can I make a Img2Text transformer using the existent modules? | 1 | 821 | October 21, 2021 |
| How does DDP + huggingface Trainer handle input data? | 3 | 1028 | May 18, 2023 |
| How does generation work with compute_metrics | 0 | 365 | December 9, 2023 |
| Using datacollator for multi-task training | 2 | 1184 | January 24, 2022 |
| BART generate() output not related to input | 1 | 812 | February 17, 2022 |
| How to improve summarization? | 2 | 1178 | August 1, 2021 |
| LayoutLM data format for bounding box classification | 1 | 256 | February 13, 2025 |
| Train MLM on my own domain and fine tune on downstream classification task | 3 | 1010 | April 16, 2024 |
| Load a single GPU checkpoint to 2 GPUS (deepspeed) | 0 | 1987 | June 29, 2022 |
| Ai chatbot for lms | 1 | 247 | March 5, 2025 |
| T5 cross-attention - inconsistent results | 1 | 1382 | May 10, 2021 |
| NER - aggregation_strategy | 1 | 1377 | January 24, 2024 |
| Gradio Error: UndefinedError: 'str object' has no attribute 'role' | 1 | 1367 | February 29, 2024 |
| LayoutLMv3 Inference | 2 | 1110 | March 11, 2024 |
| Blip2 peft training | 2 | 198 | May 9, 2025 |
| How to load quantized LLM to CPU only device | 0 | 1916 | January 28, 2024 |
| Training a language model from scratch with tensorflow (not pytorch)? | 4 | 852 | August 9, 2021 |
| Using alpaca with local embedding | 1 | 1347 | July 19, 2023 |
| Saving/Loading custom model build from varying HF models | 1 | 1346 | March 20, 2023 |
| Xlm-roberta-base predicting always same class, other models don't | 2 | 1098 | June 7, 2023 |
| Pytorch like loop for finetuning whisper | 0 | 337 | July 6, 2023 |
| Deploying Whisper Based Live Transcription for 1000 Concurrent users | 0 | 336 | June 1, 2024 |
| How to embed relational information in a Transformer? | 2 | 611 | August 5, 2022 |
| MarianMt translation issue | 1 | 415 | January 2, 2021 |
| How to fine-tune a subset of the vocabulary? | 0 | 326 | April 29, 2021 |
| Data Conversion to Conll2003 | 4 | 810 | December 28, 2023 |
| Which VLM is best for defect detection in images | 0 | 322 | November 6, 2024 |
| Write user-inputted data from app to csv in space directory | 0 | 312 | March 7, 2023 |
| Creating A Team Of LLMs | 2 | 182 | February 6, 2025 |
| Finetuning from multiclass to mutlilabel | 4 | 778 | September 1, 2021 |
| Setting local path for Dataset: Fine-tuning Whisper model | 2 | 989 | November 13, 2023 |
| Which EPYC CPU for inferencing? Self-hosted build | 1 | 680 | December 17, 2024 |
| CUDA Runtime Error in the Middle of Training | 1 | 1197 | March 30, 2024 |