Topic | Replies | Views | Activity
Load_in_8bit requires device_map but also does not support it | 0 | 2784 | December 19, 2022
How to log text with trainer's tensorboard tracking | 0 | 316 | December 18, 2022
Why there are other results with the same seed for Transformers? | 0 | 334 | December 18, 2022
Transformer generate function got low GPU utilization | 1 | 839 | December 18, 2022
Quantum Transformer? | 0 | 1512 | December 18, 2022
GPU memory error when trying to fine tune the whisper model with a custom data set | 0 | 733 | December 17, 2022
Using Padding for ASR models | 0 | 329 | December 16, 2022
I want a model that takes in a tweet and outputs about 5 modifications to the tweet | 2 | 320 | December 16, 2022
Negative "cross entropy" loss function | 0 | 1545 | December 15, 2022
MMBTForClassification to torchscript | 0 | 238 | December 12, 2022
Expected workflow -100 and padding in labels in seq2seq? | 0 | 756 | December 12, 2022
Run crash with all GPU's and success with less | 0 | 421 | December 12, 2022
Incremental Training using run_mlm.py | 0 | 304 | December 12, 2022
I need help with a transformers error | 14 | 9613 | December 12, 2022
Incremental training on unlabeled data using MLM | 0 | 635 | December 10, 2022
Train T5/BART to convert a string into multiple strings | 1 | 1678 | December 10, 2022
Continuing training masked LM: loss going up, performance going down | 0 | 737 | December 9, 2022
How to use pipeline for Custom token-classification model | 0 | 670 | December 9, 2022
RuntimeError: you can only change requires_grad flags of leaf variables | 2 | 2861 | December 8, 2022
Streaming dataset freezes with multi-gpu | 2 | 1695 | December 8, 2022
T5 for Q&A (truncated sentences and long answers) | 0 | 844 | December 8, 2022
Trainer API not pushing checkpoints to HUB | 0 | 308 | December 7, 2022
Training GPT2 Text generation model with classification labels | 0 | 642 | December 7, 2022
Text generation confidence | 1 | 1336 | December 7, 2022
Transformers for regression | 0 | 568 | December 7, 2022
Setup for Deepspeed Multi GPU Training | 2 | 8027 | December 7, 2022
Export logs while training | 1 | 1261 | December 6, 2022
New Trainer Doc no some properties but Old Doc have (n_gpu, parallel_mode) | 3 | 302 | December 6, 2022
How is padding masking considered in the Attention Head of a Transformer? | 0 | 2754 | December 6, 2022
Saving Models in Active Learning setting | 1 | 641 | December 6, 2022