PyTorchBenchmark pickle local object error
|
|
0
|
327
|
June 3, 2023
|
Bert-base-uncased performs badly in next sentence prediction (bookcorpus)
|
|
0
|
342
|
June 2, 2023
|
Forward() got an unexpected keyword argument 'attention_mask' in Whisper Tutorial
|
|
1
|
4602
|
June 2, 2023
|
How to use Adaptive Learning rate during training?
|
|
4
|
1618
|
June 2, 2023
|
Trainer gives error after 1st epoch and evaluation
|
|
4
|
4751
|
June 2, 2023
|
Unexpected results with XLMR transformer models
|
|
0
|
193
|
June 2, 2023
|
Logging_steps=1 => ValueError
|
|
0
|
327
|
June 2, 2023
|
AssertionError: Torch not compiled with CUDA enabled
|
|
0
|
2954
|
June 1, 2023
|
Stopping generation before max_new_tokens
|
|
0
|
803
|
June 1, 2023
|
FP-16 training producing NaNs on t5-large/flan-t5-xl
|
|
0
|
728
|
June 1, 2023
|
MLM Using ALBERT - No loss error
|
|
0
|
361
|
June 1, 2023
|
Continuing model training takes seconds in next round
|
|
3
|
1429
|
June 1, 2023
|
Fail predict using Falcon-7B-Instruct
|
|
0
|
660
|
June 1, 2023
|
How to make the Trainer log custom quantities?
|
|
0
|
562
|
May 31, 2023
|
I am getting 0.0 loss value at the very first epoch of training bigscience/mt0-small seq2seq model
|
|
0
|
524
|
May 31, 2023
|
Seq2SeqTrainer with num_beams and generation_config
|
|
0
|
276
|
May 31, 2023
|
How to use a custom embedding layer as input in get_encoder function
|
|
0
|
204
|
May 30, 2023
|
Query execution with Hugging Face pipeline is happening on CPU, even if model is loaded on GPU
|
|
0
|
976
|
May 30, 2023
|
Inference with Hugging Face pipeline happening on CPU, even if model is loaded on GPU
|
|
0
|
1719
|
May 30, 2023
|
Error in Seq2SeqTrainingArguments
|
|
3
|
948
|
May 30, 2023
|
Pre-Train BERT from scratch
|
|
5
|
15764
|
May 30, 2023
|
The quantization code in the "Gentle Introduction to 8-bit Matrix Multiplication for transformers" blog post yields error
|
|
1
|
727
|
May 29, 2023
|
Trainer.__init__() got an unexpected keyword argument 'model'
|
|
1
|
6216
|
May 29, 2023
|
How can I save vocab for specific language in Model Whisper?
|
|
0
|
290
|
May 29, 2023
|
Finetuning GPT2 using Multiple GPUs and Trainer
|
|
14
|
6793
|
May 22, 2023
|
Tokenizer cannot produce correct output once using DistributedDataParallel
|
|
0
|
265
|
May 26, 2023
|
Finetuning using transformers
|
|
0
|
239
|
May 26, 2023
|
Causal language modeling documentation is wrong?
|
|
0
|
171
|
May 26, 2023
|
Training HF transformer models on custom (not text) data
|
|
0
|
213
|
May 26, 2023
|
Unlabelled zero-shot-classification
|
|
1
|
471
|
May 26, 2023
|