Looking for a Tiny LLM (max 1.5GB) – Need Advice
|
|
6
|
5678
|
December 6, 2024
|
Improving precision of ViT for image classification
|
|
0
|
69
|
December 6, 2024
|
Tumblr Free Redirect Script
|
|
2
|
39
|
December 9, 2024
|
BERT Model - OSError
|
|
3
|
4909
|
December 6, 2024
|
Using DETR with a custom backbone
|
|
3
|
575
|
December 6, 2024
|
Evaluation metrics for BERT-like LMs
|
|
4
|
4584
|
December 6, 2024
|
Pretraining T5 from scratch using MLM
|
|
1
|
376
|
December 6, 2024
|
ALBERT pretraining convergence problem
|
|
1
|
631
|
December 6, 2024
|
Llama model using Hugging Face: Getting no access
|
|
1
|
110
|
December 6, 2024
|
How can I generate the exact same image twice using AI image generation tools?
|
|
2
|
743
|
December 6, 2024
|
Why is the memory usage higher than expected when loading the nvidia/NV-Embed-v2 model with FP16 precision?
|
|
0
|
88
|
December 6, 2024
|
TextIteratorStreamer compatibility with batch processing
|
|
3
|
1389
|
December 6, 2024
|
Can we run a custom quantized llama3-8b on an NPU?
|
|
0
|
55
|
December 6, 2024
|
Fine-tune "meta-llama/Llama-2-7b-hf" Bug: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument target in method wrapper_CUDA_nll_loss_forward)
|
|
15
|
140
|
December 6, 2024
|
Need a Model for Extracting Relevant Keywords for Given Titles
|
|
1
|
303
|
December 6, 2024
|
Why does moving ML model initialization into a function prevent GPU OOM errors when del, gc.collect(), and torch.cuda.empty_cache() fail?
|
|
0
|
71
|
December 5, 2024
|
DDP error for LoRA SFT
|
|
1
|
137
|
December 5, 2024
|
Trainer is not saving all layers when fine-tuning Llama with P-Tuning
|
|
0
|
44
|
December 5, 2024
|
Pretrained Models to Heroku Production Environment
|
|
5
|
1820
|
July 10, 2020
|
Searching Keywords by relatively long text
|
|
1
|
669
|
December 5, 2024
|
Computational needs for AI/ML Researchers
|
|
0
|
23
|
December 5, 2024
|
Creating a batch from a list of ids in the dataset is very slow
|
|
4
|
856
|
December 5, 2024
|
Understanding GPT-2 logits
|
|
0
|
46
|
December 5, 2024
|
Parser Error, ERROR: Exception in ASGI application
|
|
2
|
712
|
December 5, 2024
|
UnicodeDecodeError: 'charmap' codec can't decode byte from load_dataset
|
|
0
|
44
|
December 5, 2024
|
Bad Performance Finetuning Llama Chat and Instruct Models on GSM8K
|
|
5
|
862
|
December 5, 2024
|
Make 5 minute video and speech from text story
|
|
0
|
62
|
December 5, 2024
|
How to log Trainer's training progress bars into a file
|
|
2
|
1723
|
December 5, 2024
|
Creating an AI model for myself
|
|
0
|
22
|
December 5, 2024
|
Help making the most logical and rational thinking AI
|
|
0
|
35
|
December 5, 2024
|