Load model from cache or disk not working
|
|
1
|
9257
|
October 16, 2022
|
Is there an equivalent to the `Transformer` class of PyTorch in `transformers` in `flax`?
|
|
0
|
205
|
October 15, 2022
|
What is the `tie_word_embeddings` option exactly doing?
|
|
3
|
13326
|
October 15, 2022
|
Is there a pre-trained model that predicts the next letter based on the previous letters?
|
|
0
|
283
|
October 15, 2022
|
How do I finetune the backbone of bert-base-uncased?
|
|
0
|
476
|
October 14, 2022
|
Beam_search bottlenecks inference with only 1 used cpu
|
|
1
|
836
|
October 13, 2022
|
After output of model? Get meaningful information
|
|
0
|
247
|
October 13, 2022
|
What is the preferred way to preprocess punctuation?
|
|
0
|
237
|
October 13, 2022
|
Confused with setting up torch_dtype while using CPU as device
|
|
0
|
2307
|
October 12, 2022
|
How to train a Semantic Segmentation model using transformers tensorflow2 API
|
|
0
|
416
|
October 12, 2022
|
Code about DataCollatorForWholeWordMask in github
|
|
0
|
563
|
October 12, 2022
|
FlaxT5 vs T5X repo
|
|
0
|
550
|
October 11, 2022
|
Getting error in the inference stage of Transformers Model (Hugging Face)
|
|
0
|
783
|
October 11, 2022
|
SetFit for contradictions
|
|
0
|
291
|
October 11, 2022
|
Train with MLM loss augmentation
|
|
0
|
193
|
October 10, 2022
|
Finetuning mBART
|
|
0
|
223
|
October 10, 2022
|
Multiple gpu not properly parallelized during model.generate()
|
|
4
|
1640
|
October 9, 2022
|
Base code of custom transformer models not managed by Huggingface
|
|
0
|
237
|
October 6, 2022
|
StoryWriter AI and State of Art for Transformers Based Story Generation
|
|
0
|
2984
|
October 6, 2022
|
Can I place my target variable inside the forward function of the data model class
|
|
0
|
191
|
October 5, 2022
|
Corruption when running the trainer
|
|
2
|
518
|
October 5, 2022
|
OOM error in HF trainer during validation
|
|
0
|
413
|
October 5, 2022
|
Why is past_key_values not in GreedySearchDecoderOnlyOutput?
|
|
1
|
2047
|
October 4, 2022
|
Give 2 inputs to BERT
|
|
0
|
193
|
October 4, 2022
|
How to Improve inference time of facebook/mbart many to many model?
|
|
5
|
1890
|
October 4, 2022
|
Evaluation using bits per character
|
|
0
|
349
|
October 3, 2022
|
How can I combine encode and padding into __call__ during training if I want to pad to batch longest
|
|
0
|
619
|
October 3, 2022
|
MLFlow with Databricks and transformers
|
|
0
|
868
|
October 3, 2022
|
Does the Trainer class really support TPU for faster training?
|
|
2
|
344
|
October 2, 2022
|
Wav2Vec2 pretraining feature extraction during preprocessing as well as training
|
|
1
|
737
|
October 1, 2022
|