Intermediate

Topic	Replies	Views	Activity
Outputting relevance scores	0	541	September 25, 2020
Text generation, text2text: change output vocabulary, output distribution dimensions	0	538	March 11, 2021
How to freeze the attention map in BERT	0	536	October 6, 2021
Error making predictions using LMM (LLaVA) model on multiple GPUs	0	536	March 27, 2024
Making a dataset that read the labels from parent folders	0	535	December 2, 2021
ASR on inference endpoints	1	378	February 11, 2024
How do I create a commercially usable workflow that can accurately swap faces on ComfyUI?	0	95	March 3, 2025
How big are differences between transformer implementations	0	532	April 26, 2022
Extending the tokenizer affects model generation	3	149	December 19, 2024
Pretrain encoder of tf T5 model	0	529	October 19, 2020
Modify HF model for training	1	374	December 22, 2023
Docker image "THIS IMAGE IS DEPRECATED and is scheduled for DELETION." message	0	94	January 6, 2025
mBART embedding matrix pruning	0	527	May 11, 2021
Downloading larger models with xet fails on macOS	3	149	June 5, 2025
Error while Fine tuning Zero shot classification model fb-bart-large-mnli	0	521	June 6, 2023
BART: get activation maps for encoder and decoder	0	521	November 3, 2021
How can I implement active learning in BERT?	0	520	November 9, 2021
Huggingface using only half of the cores for inference	0	517	September 6, 2023
NER - Lab Reports, Vitals	0	517	March 1, 2022
How to Implement Numerical Inference in a Text Generation Problem	0	516	May 17, 2022
Text to text classification	0	515	March 12, 2022
Best multi-GPU setup for finetuning and inference?	0	513	July 3, 2024
Windows 11 does not see my 2nd GPU (4090 + 4080)	2	296	October 3, 2024
MRPC Reproducibility with transformers-4.1.0	1	362	December 20, 2020
Possible error in Dataset elasticsearch	0	508	May 20, 2022
AOTInductor with Llama-3.2-3B-Instruct	0	90	November 14, 2024
Hugging Face and Distributed Training: DDP/DP Implementation Help Needed	0	506	February 14, 2024
Treating Punctuation restoration as Seq2Seq task	0	506	December 11, 2020
[Transformer] how to tokenize nested object dataset?	0	498	November 23, 2022
How to use huge target data without source data	0	497	May 2, 2022
ValueError: expected sequence of length 128 at dim 1 (got 68)	0	496	October 25, 2023
Fine tuning for summarization script error	0	495	August 24, 2022
Additional pre-training objective function	0	495	July 3, 2021
What can you fine tune with 2x A6000s?	1	350	December 5, 2023
Does higher work with huggingface (hugging face, HF) models? e.g. ViT?	1	350	March 19, 2023
Input batch size not matching Target batch size	0	88	October 26, 2024
Peft following bits and bytes seems to have no effect on LLM	0	493	January 31, 2024
Looking for a simple tutorial on how to fine tune a model for relation extraction	0	493	July 2, 2022
Seq2SeqTrainingArguments due to missing accelerate library which is actually installed	3	246	March 17, 2024
LogitsProcessor guide	0	491	July 18, 2022
Roberta For Urdu Text Classification	0	489	December 23, 2021
How to compute metrics when ground truth and predictions have different data?	0	487	January 4, 2023
Unavailable wav2vec2 tokenizer	0	487	December 10, 2021
Not able to predict using Transformers Trainer class	2	158	October 2, 2024
Multi GPU training with Accelerator vs Trainer	2	158	August 6, 2024
SportsBot Training Data and Model	1	344	May 30, 2023
Need advice for fine-tuning BERT on opinion mining	0	486	October 25, 2021
How to ensure my custom Trainer is using my custom TrainerState and TrainerControl?	1	343	June 14, 2024
Tokenizer for Translation Pipeline with Bert2Bert EncoderDecoder	0	484	February 23, 2022
Fine-tune MT5ConditionalGeneration for question generation	0	483	January 4, 2022