Intermediate

Topic	Replies	Views	Activity
Pre-training LayoutLMv2	0	660	November 16, 2021
Facebook FAISS on Databricks	2	656	July 8, 2025
Baffling performance issue on most NVidia GPUs with simple transformers + pytorch code	5	4602	April 9, 2024
Past_key_value with multiple new tokens	1	1377	August 10, 2023
502 server error when running model	3	5467	July 4, 2023
How to customize behavior of added special tokens in a pretrained tokenizer?	0	609	May 5, 2021
Deepspeed integration with Trainer in Colab crashing: TypeError: object.__init__() takes exactly one argument (the instance to initialize)	2	1954	October 1, 2023
Retraining peft model	3	2970	March 1, 2024
How to train the embedding of special token?	1	4174	October 17, 2021
Saving model per some step when using Trainer	3	9311	December 11, 2023
TextIteratorStreamer compatibility with batch processing	3	1482	December 6, 2024
You should probably TRAIN this model on a down-stream task with BertForQuestionAnswering	3	8270	November 21, 2023
Custom trainer evaluation function	0	2814	June 20, 2022
Error saving quantized model	4	3957	February 16, 2023
SAM image size for fine-tuning	5	6291	April 3, 2024
How to use SentenceTransformers for contrastive learning?	5	5922	June 30, 2022
404 when instantiating private model/tokenizer	1	10060	March 5, 2021
Fine-tuning Zero-shot models	4	6357	February 7, 2023
Transformer vs Sentence-Transformer for text classification	0	2446	March 12, 2024
Fine tuning LLM for text classification -- error with SFTTrainer	2	1395	June 3, 2025
New Project - Echo Nova	3	117	September 3, 2025
How to deal with differences between CoNLL 2003 dataset tokenisation and BERT tokeniser when fine tuning NER model?	6	2752	November 23, 2021
BART - Input format	4	1794	December 13, 2023
How is CLS special token embedding initialized?	1	2823	March 16, 2022
Torchrun, trainer, dataset setup	4	997	December 20, 2024
Running huggingface-cli from script	2	3950	May 2, 2022
Resuming training BERT from scratch with run_mlm.py	2	2212	October 31, 2021
Accessing model from a callback to predict between epochs	1	1498	August 17, 2023
HTML Embedding processing	8	3967	February 13, 2022
Unable to load checkpoint after finetuning	5	4810	February 21, 2024
Stuck on "Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding."	0	2083	June 27, 2023
Conversational AI + question answering model	5	2676	January 30, 2023
Multilabel classification performance metrics using Trainer API	3	5763	September 26, 2023
Parallel/ Concurrent request with vLLM	3	3231	November 27, 2024
Remove causal mask from Llama decoder	5	821	October 22, 2024
Finding gradients in zero-shot learning	4	2837	November 17, 2020
Need help with making a Nepali-to-English Translator	0	354	November 16, 2023
Tokenization: different results when tokenizing in one pass vs sample-by-sample	3	1762	October 23, 2023
Causal masks in BERT vs. GPT2	4	2748	December 30, 2022
Downloading larger models with xet fails on macOS	3	931	June 5, 2025
Resize embeddings on Peft model	4	831	May 12, 2025
Save LORA weights only in intermediate checkpoints	0	1838	June 14, 2023
Manual splitting of model across multi-GPU setup	1	4100	December 29, 2023
Finetuning Segment Anything and automatic prediction	2	5837	June 7, 2023
DeBERTa-v3: How to keep ELECTRA-style task-head?	5	2283	January 10, 2024
Fine tuning bert on next sentence prediction task	5	4053	September 30, 2020
Pruning a model embedding matrix for memory efficiency	7	3494	July 27, 2022
Multi-Task dataset with Custom Sampler and Sharding	4	1382	August 1, 2023
[Solved] Cannot restart training from deepspeed checkpoint	3	2727	December 28, 2023
Continue Pre-Training Roberta	3	2724	May 18, 2023