Models

Topic	Replies	Views	Activity
Dropping columns for DPOTrainer logging	0	87	May 2, 2024
WhisperTokenizer bos_token appears incorrect	1	335	May 2, 2024
What is the difference between llama2_7B and llama2_7B_hf?	0	279	May 2, 2024
Permission error on model llama2_7B	2	222	May 2, 2024
Chat Usage Error - "Input validation error"	0	639	May 2, 2024
Warm start with BigBird	4	455	May 2, 2024
Is it possible to train ViT with different number of patches in every batch? (Non-square images dataset)	3	3103	May 1, 2024
Loss.backward() producing nan values with 8-bit Llama-3-70B-Instruct	3	802	May 1, 2024
Llama3 incomplete answer	1	318	May 1, 2024
Mistral or LLaMA?	3	3945	May 1, 2024
Memory Error While Fine-tuning AYA on 8 H100 GPUs	0	221	April 30, 2024
Finetune SAM for instance segmentation to output segmentation masks along with label names	0	238	April 30, 2024
Regarding GGUF Quantize model	0	169	April 30, 2024
Fine tune of Mistral model	0	104	April 30, 2024
Comparison of text documents using AlpacaEval	0	83	April 29, 2024
Llama3 Response	2	715	April 29, 2024
Repetitive Generations	0	280	April 29, 2024
HuggingFace - Why does the T5 model shorten sentences?	2	749	April 28, 2024
What are the best prompt practices for fine-tuning a T5	1	1150	April 28, 2024
Pretrained NER model for recognizing business names	2	3171	April 27, 2024
MuRIL model error for tensorflow hub	0	99	April 27, 2024
Language model adds hashtags at the end of responses	1	411	April 27, 2024
Llama-2-7b-chat fine-tuning	4	6853	April 26, 2024
Why can't I find a better model?	1	110	April 25, 2024
No Improvement in Results after Implementing Unsupervised Denoising Training Technique for T5 Model using Hugging Face	0	122	April 25, 2024
Basic questions about padding in the Original ViT	0	265	April 25, 2024
Hosting of multiple models on a single TGI instance	0	428	April 24, 2024
Mistral-text-classification	1	472	April 23, 2024
SOTA in Open Source Document Understanding	1	1124	April 23, 2024
RoBERTa for Sentence-pair classification	2	2004	April 23, 2024