Topic | Replies | Views | Activity
Understanding DataCollation | 0 | 14 | July 18, 2024
Kosmos-2 batch modality and processing speed | 0 | 19 | July 18, 2024
Training out of memory | 0 | 213 | July 18, 2024
Tensor parallelism inference | 0 | 62 | July 18, 2024
Transformer hangs up | 0 | 48 | July 17, 2024
Fine-tuning Decoder-only or Encoder-Decoder models for classification | 0 | 620 | July 17, 2024
Injecting multimodal embeddings into a language model breaks the `generate` function | 0 | 55 | July 17, 2024
Questions about outputs.logits | 0 | 342 | July 17, 2024
Model is getting loaded unevenly with AutoModelForCausalLM | 0 | 5 | July 16, 2024
Model is getting loaded unevenly using AutoModelForCausalLM | 0 | 4 | July 16, 2024
Please remove the dependency "ipadic" because its own README says to not use it | 0 | 7 | July 16, 2024
My PyTorch worked, but all of a sudden now has issues for RoBERTa | 0 | 112 | July 16, 2024
Image Regression (multivalue) | 0 | 29 | July 16, 2024
Two Whisper classes for generation but same functionalities? | 2 | 201 | July 16, 2024
Loading a specific model configuration in TGI | 0 | 105 | July 15, 2024
Run name issue, different run name file in webpage & local | 0 | 55 | July 15, 2024
OOM Error using PPO Trainer to LoRA-tune 4-bit Llama-3-8B Model | 0 | 156 | July 15, 2024
Online Decision Transformer | 1 | 334 | July 14, 2024
MLM Pretraining Domain Adaption | 0 | 36 | July 13, 2024
Adapt Decision Transformer collator to handle evaluation | 1 | 232 | July 13, 2024
Finetuning a small LLM on 32GB, 4vCPU | 0 | 168 | July 12, 2024
Are there any plans for replacing attention in transformers? | 3 | 1001 | July 11, 2024
The Impact of Pretraining on Fine-tuning and Inference | 0 | 54 | July 11, 2024
Bypassing "CUDA error: unspecified launch failure" error from trainer checkpoint loading | 0 | 194 | July 11, 2024
VivitModel last hidden states dimension Problem | 0 | 47 | July 11, 2024
Trainer predict or evaluate returns zero for metrics | 0 | 52 | July 11, 2024
Re-initialize decoder parameters of a pretrained model | 0 | 60 | July 11, 2024
Model is getting loaded unevenly on GPUs | 1 | 49 | July 11, 2024
Track multiple losses & different outputs size with Trainer and callbacks | 4 | 3057 | July 11, 2024
How to rewrite this code? | 0 | 50 | July 11, 2024