| Topic | Replies | Views | Activity |
|---|---|---|---|
| Access permission denied for the llama3 model application | 0 | 78 | May 7, 2024 |
| Getting weird results from roberta new | 0 | 50 | May 7, 2024 |
| Is this the correct way to perform an unsupervised training for LLM? | 4 | 232 | May 7, 2024 |
| Model Inference API error | 7 | 217 | May 7, 2024 |
| Getting the following error "valueError: You have to specify either decoder_input_ids or decoder_inputs_embeds" | 2 | 89 | May 6, 2024 |
| Max_new_tokens warning for Flan-T5 fine-tuning | 2 | 114 | May 6, 2024 |
| Num_experts_per_tok for MoE models | 0 | 60 | May 6, 2024 |
| Chat UI With LLAMA 2 | 1 | 587 | May 6, 2024 |
| What to do when HuggingFace throws "Can't load tokenizer" | 8 | 32166 | May 5, 2024 |
| Git clone/lfs broken for certain models | 2 | 2871 | May 4, 2024 |
| Is there any fine-tuned model for article writing to dupe AI detectors? | 0 | 103 | May 3, 2024 |
| AWQ quantized version of Llama 3 8B ChatQA | 0 | 72 | May 3, 2024 |
| Question about RecurrentGemma intermediate_size | 0 | 68 | May 3, 2024 |
| Output of Pyramid Vision Transformer | 0 | 60 | May 3, 2024 |
| Gemma 2b model loading issue | 0 | 84 | May 3, 2024 |
| Dropping columns for DPOTrainer logging | 0 | 50 | May 2, 2024 |
| Download speeds slow on the popular Models | 0 | 166 | May 2, 2024 |
| WhisperTokenizer bos_token appears incorrect | 1 | 225 | May 2, 2024 |
| Problem for large context window (400k) | 1 | 144 | May 2, 2024 |
| What is the difference between llama2_7B and llama2_7B_hf? | 0 | 76 | May 2, 2024 |
| Permission error on model llama2_7B | 2 | 89 | May 2, 2024 |
| Chat Usage Error - "Input validation error" | 0 | 172 | May 2, 2024 |
| Warm start with BigBird | 4 | 424 | May 2, 2024 |
| Is it possible to train ViT with different number of patches in every batch? (Non-square images dataset) | 3 | 1845 | May 1, 2024 |
| Fine tune LLMs on PDF Documents | 8 | 3025 | May 1, 2024 |
| Loss.backward() producing nan values with 8-bit Llama-3-70B-Instruct | 3 | 118 | May 1, 2024 |
| Llama3 incomplete answer | 1 | 107 | May 1, 2024 |
| Mistral or LLaMA? | 3 | 356 | May 1, 2024 |
| Memory Error While Fine-tuning AYA on 8 H100 GPUs | 0 | 59 | April 30, 2024 |
| Getting empty response from meta-llama/Meta-Llama-3-8B | 0 | 132 | April 30, 2024 |