Models

Topic	Replies	Views	Activity
codellama/CodeLlama-70b-Instruct-hf TGI server out-of-memory error in H100	2	289	March 22, 2024
yarn-mistral-7b-128k.Q8_0.gguf response seems out of control	0	260	March 21, 2024
torch.cuda.OutOfMemoryError for CodeLlama models in H100 single GPU inference	2	512	March 21, 2024
What is the best LLM for text classification using few-shot learning?	0	412	March 21, 2024
Which PR to use for safetensors?	1	138	March 20, 2024
Llama2 prompt template for finetuning on text summarization/generation	0	318	March 20, 2024
Which LLM model to use for Freeciv 3D?	0	164	March 20, 2024
How much time facebook/wav2vec2-xls-r-300m model will take to train on 311919 size of dataset?	0	91	March 20, 2024
Failed to Import transformers.models	5	24630	March 20, 2024
Error while accessing pretrained model with auth token	0	145	March 19, 2024
Qwen/Qwen1.5-72B-Chat	1	367	March 19, 2024
504 Gateway Timeout - LLaVA type GGUF models	3	252	March 19, 2024
Why is there no cross-gpu negative sample gathering for CLIP model in multiple-gpu training?	2	181	March 18, 2024
Plateau in Eval Loss after 100 steps in DPO Training	0	290	March 17, 2024
Llama2-7b-hf model not reproducible across runs	1	517	March 15, 2024
Mixtral 8x7B or any LLM evaluation	0	184	March 15, 2024
HTTP 502 Bad Gateway for url	2	4660	March 13, 2024
OS Error:Unable to load model distil-whisper/distil-small.en	0	564	March 12, 2024
String indices must be integers in BertPreTrainedModel	0	278	March 12, 2024
Bert with different layer architecture (Monarch Mixer) without pretrained weights	2	174	March 12, 2024
Looking for LLM about cancer biology pathways	0	95	March 11, 2024
Text Classification: pretrained transformer model Distilbert with tweet_eval irony dataset	1	236	March 8, 2024
LLaMa2 fine-tuning: Multi-turn conversation dataset template	2	5528	March 6, 2024
Unable to deploy the models on Hugging Face	0	134	March 6, 2024
Mistral 7B FineTuning with Interview Data	4	6243	March 5, 2024
429 client error when uploading large models (65B)	0	463	March 5, 2024
Llama position_ids	2	2631	March 5, 2024
Timestamps reduce Whisper hallucinations?	1	888	March 5, 2024
Custom backbone of Owl-ViT model	0	200	March 5, 2024
Convert PyTorch Model to Hugging Face model (Inference API)	0	1184	March 5, 2024