Intermediate

Topic	Replies	Views	Activity
Time Series Transformer. Lagged values and time alignment	4	1027	December 14, 2023
Domain adaptation transformer	2	1316	April 21, 2021
ByT5: problem with tokenizer.decode()	3	1134	October 15, 2021
Trainer fails to resume training from a checkpoint, claiming there's not enough samples in the dataset	1	1602	May 29, 2023
TRL SFT super prone to nan when using data collator	2	1298	April 27, 2024
Deploying 🤗 ViT on Vertex AI	1	889	September 25, 2023
Padding to the left of the inputs, GPT2LMHeadModel gives different answer	2	1280	February 21, 2023
Ensemble Learning with various BERT models	1	1566	February 25, 2023
Negative KL-divergence RLHF implementation	1	1547	May 13, 2024
How to ignore attributes of TrainingArguments?	4	965	July 30, 2021
Run_backward: expected dtype Float but got dtype Long	4	964	July 3, 2024
Issue in deploying quantized meta-llama/Llama-3.1-8B-Instruct in aws sagemaker	0	68	October 10, 2024
When doing RAG, is it worth fine-tuning the LLM on the documents? - Llama2	1	1520	September 14, 2023
Am I doing multiple GPU right?	8	401	November 29, 2024
Different outputs when using pipeline	2	1226	July 20, 2023
Speech synthesis model with Styles Like Emoticons or emphasis	3	186	December 25, 2024
Does batching in the standard question-answering pipeline provide a speedup?	1	1471	December 13, 2021
How can I make a Img2Text transformer using the existent modules?	1	821	October 21, 2021
How does DDP + huggingface Trainer handle input data?	3	1028	May 18, 2023
How does generation work with compute_metrics	0	365	December 9, 2023
Using datacollator for multi-task training	2	1184	January 24, 2022
BART generate() output not related to input	1	812	February 17, 2022
How to improve summarization?	2	1178	August 1, 2021
LayoutLM data format for bounding box classification	1	256	February 13, 2025
Train MLM on my own domain and fine tune on downstream classification task	3	1010	April 16, 2024
Load a single GPU checkpoint to 2 GPUS (deepspeed)	0	1987	June 29, 2022
Ai chatbot for lms	1	247	March 5, 2025
T5 cross-attention - inconsistent results	1	1382	May 10, 2021
NER - aggregation_strategy	1	1377	January 24, 2024
Gradio Error: UndefinedError: 'str object' has no attribute 'role'	1	1367	February 29, 2024
LayoutLMv3 Inference	2	1110	March 11, 2024
Blip2 peft training	2	198	May 9, 2025
How to load quantized LLM to CPU only device	0	1916	January 28, 2024
Training a language model from scratch with tensorflow (not pytorch)?	4	852	August 9, 2021
Using alpaca with local embedding	1	1347	July 19, 2023
Saving/Loading custom model build from varying HF models	1	1346	March 20, 2023
Xlm-roberta-base predicting always same class, other models don't	2	1098	June 7, 2023
Pytorch like loop for finetuning whisper	0	337	July 6, 2023
Deploying Whisper Based Live Transcription for 1000 Concurrent users	0	336	June 1, 2024
How to embed relational information in a Transformer?	2	611	August 5, 2022
MarianMT translation issue	1	415	January 2, 2021
How to fine-tune a subset of the vocabulary?	0	326	April 29, 2021
Data Conversion to Conll2003	4	810	December 28, 2023
Which VLM is best for defect detection in images	0	322	November 6, 2024
Write user-inputted data from app to csv in space directory	0	312	March 7, 2023
Creating A Team Of LLMs	2	182	February 6, 2025
Finetuning from multiclass to multilabel	4	778	September 1, 2021
Setting local path for Dataset: Fine-tuning Whisper model	2	989	November 13, 2023
Which EPYC CPU for inferencing? Self-hosted build	1	680	December 17, 2024
CUDA Runtime Error in the Middle of Training	1	1197	March 30, 2024