Intermediate

Topic	Replies	Views	Activity
No dynamic sized input with huggingface-transformers ALBERT and TFjs	0	1012	October 1, 2020
How to efficiently tokenize unknown tokens in GPT2	0	1006	January 12, 2022
Huggingface endpoint with chat agents for conversational NLU	0	178	February 1, 2024
Class weights in Trainer() instance	1	697	September 10, 2021
Running SDXL diffusers in a container on python running ubuntu 2204, system RAM not being released	0	978	November 27, 2023
Invalid key for dataset -- is this a bug with Trainers or with my code?	1	689	July 24, 2023
FlaxGPTNeoForCausalLM generates the same text regardless of seed, temperature, top_k and top_p values	1	387	September 22, 2021
Train and inference wav2vec2 using a language model	1	681	May 2, 2021
My model doesn't learn with my triplet loss	3	49	April 22, 2025
Implementing the REINFORCE algorithm for encoder-decoder model	1	676	March 14, 2022
AutoModelForCausalLM.from_pretrained refuses to load safetensors weights	0	944	December 5, 2023
How to properly train BEiT for Masked Image Modeling	0	942	March 7, 2022
Params stored in the GPU during training	1	666	April 27, 2022
How to fine tune BertForSequenceClassification with PEFT?	0	936	May 10, 2023
V100 or RTX A6000	0	933	June 2, 2021
Refiner SD-XL-1.0 is degraded latent of base model	1	659	October 11, 2023
How to correctly measure inference time?	0	930	July 25, 2022
Custom bert embedding cause "RuntimeError: Trying to backward through the graph a second time"	0	922	March 10, 2023
Why is using my DistilBERT model for inference so slow?	0	920	June 18, 2021
Pre-trained models that weren't trained on Wikipedia?	2	530	February 10, 2022
Unable to run Optuna hyperparam search	0	917	July 23, 2021
Multi-GPU support lost when overwriting functions for Custom Trainer	1	647	March 5, 2023
Transfer Learning on yolov8 object detection weights	1	364	October 10, 2024
Different results for the same mrm8488/t5-base-finetuned-emotion	1	645	May 20, 2022
Evaluating my own model	6	110	February 21, 2025
What could be causing " line 51, in write_predictions_to_file if not preds_list[example_id]: IndexError: list index out of range" in token-classification?	2	526	October 13, 2020
How to train TFBertForMaskedLM with TFTrainer	1	643	February 23, 2022
Get original image from trocr processor	1	642	October 10, 2022
GPT4all in a personal server to be access by many users	0	905	September 19, 2023
BART summarization token probabilities	0	903	October 8, 2021
Cuda out of memory issue training whisper model on single GPU	0	902	December 15, 2023
Problem with transformer Trainer with torch CustomDataset, during fine-tuning	3	450	September 12, 2024
Track more than one loss using Trainer and Wandb	1	636	July 11, 2024
Can BERT for mlm predict never seen words?	1	636	December 29, 2021
Regenerate Prompt tuning result with appended prompt on base model	0	881	August 6, 2023
Weighted Loss Function in Regression Task	1	621	April 6, 2024
Split compound words (windfall = wind + fall)	2	504	January 21, 2022
Is there a standard way to handle leftover batches when using gradient accumulation?	1	615	November 22, 2021
How to compile Sentence Transformer with Torch-TensorRT?	0	869	August 29, 2022
Evaluating RAG only with open-source	1	612	May 24, 2024
Guidance on Using Zero, Token, and Gradio API Together	1	108	December 14, 2024
Huggingface.co again scam people	1	605	April 6, 2023
How to avoid re-decoding for multiple inputs that have shared prefixes	0	152	April 17, 2024
BART-base generating completely wrong output after training for more than 3 epochs	0	854	July 8, 2021
🔬 Exploring Reinforcement Learning for Molecule Generation with GPT-Based Models; Loss Fluctuations	2	277	April 11, 2024
Grouping Tokens after Token Classification	1	598	January 6, 2022
Contributing to Github	2	488	September 22, 2020
T5 Fine-Tuning for summarization with multiple GPUs	0	844	June 28, 2022
Comparing Inference Instances for Text Embedding and Completion Tasks	1	335	May 23, 2023
Passing Trainer state as an artifact in kfp.v2 pipeline	1	335	June 28, 2021