Intermediate

Topic	Replies	Views	Activity
Read data of pdf or just image format as a part of prompt	0	1351	May 29, 2023
Adding a new mask_token for BERT-like models/tokenizers	0	551	May 26, 2023
Does Trainer.train repeat streaming dataset when max_steps is not reached?	0	384	May 26, 2023
How to save model in S3 with Trainer?	5	5130	May 26, 2023
DeepSpeed Zero3 and Peft LoRA fp16 issue	3	3025	May 24, 2023
Comparing Inference Instances for Text Embedding and Completion Tasks	1	341	May 23, 2023
Creating an Instruction-to-Code model for a custom library: Strategies and Guidelines?	0	312	May 23, 2023
How can state-of-the-art classifiers be so wrong?	13	1639	May 22, 2023
Sockpuppet detector based on NLP: where to start?	0	214	May 21, 2023
Interpreting logs by the trainer	1	916	May 19, 2023
Mismatched Tokenizer and LLM leading to odd evaluation result	0	358	May 18, 2023
Continue Pre-Training Roberta	3	2715	May 18, 2023
How does DDP + huggingface Trainer handle input data?	3	1039	May 18, 2023
Timm & HuggingFace	0	195	May 16, 2023
Sampling strategies	1	572	April 4, 2023
A specific documents AI API for Hugging Face?	0	229	May 12, 2023
How to fine tune BertForSequenceClassification with PEFT?	0	951	May 10, 2023
Learning sets and disabling positional embedding knowledge?	0	296	May 10, 2023
Finetuned MT5 model generating the same first token for any input	0	231	May 9, 2023
Import HuggingFace PatentSBERTa Model support in EMR and PySpark	0	254	May 8, 2023
Machine Translation using Hugging Face problem	0	323	May 8, 2023
Using Transformers with DistributedDataParallel — any examples?	11	23507	May 8, 2023
Fine-tuning with LoRA; can't learn	0	1056	May 7, 2023
Implementing one prompt recommender	0	234	May 3, 2023
Train Roberta from scratch for custom dataset	1	946	May 2, 2023
Typical sampling decoding technique	1	1679	April 28, 2023
ValueError: Unable to generate dummy inputs for the model. Please provide a tokenizer or a preprocessor	0	559	April 28, 2023
Plotting separate loss curves for different datasets	0	262	April 28, 2023
Training large language models to consider two texts to generate output text	0	209	April 26, 2023
Generation is always CPU limited	0	585	April 21, 2023