Models

Topic	Replies	Views	Activity
Facebook/opt-30b license	1	2534	May 16, 2022
T5 randomness in generation	4	1247	May 15, 2022
How does the API inference work on models such as Blenderbot?	4	930	May 14, 2022
Wav2vec2-xls-r-2b out of memory issues on A100 (40 GB)	0	685	May 13, 2022
Training issue with a Transformer-based Encoder-Decoder model built on pre-trained BanglaBERT	1	746	May 12, 2022
Muril-base-cased has infinity for model_max_length	0	522	May 11, 2022
Model size doubles after finetuning	0	485	May 11, 2022
Regarding the eval batch size for large models	0	1106	May 9, 2022
What is the language modeling loss (for next-token prediction) for HuBERT model?	0	4767	May 9, 2022
Dealing with proper nouns in wav2vec2	0	503	May 8, 2022
Is it possible to load TF2 SavedModel format into HuggingFace models?	0	571	May 7, 2022
Question about Wav2vec2	1	551	May 6, 2022
XLNet recurrence mechanism on long sequences	0	445	May 2, 2022
Using the decoder half of BART for causal generation	4	2802	May 2, 2022
Getting incorrect emotion inferences for sentences from a story using existing models	1	430	May 1, 2022
How can I load large models like google/mt5-xl on a GPU	2	1747	April 30, 2022
Have the `facebook/blenderbot-xxx` checkpoints already been trained on the BST Tasks?	0	636	April 30, 2022
Why is Wav2Vec pretraining loss not decreasing?	10	2665	April 29, 2022
LayoutLMv2 support in colab TPU	0	410	April 29, 2022
DebertaForMaskedLM cannot load the parameters in the MLM head from microsoft/deberta-base	3	1325	April 29, 2022
RoBERTa Index out of range error on relation extraction data	0	575	April 29, 2022
Longformer - fp16 training without Trainer	1	1095	April 27, 2022
Data privacy using Hugging Face models	0	1850	April 26, 2022
Difference between CausalLM and LMHeadModel	1	4100	April 25, 2022
Hugging Face Transformers Longformer optimizer warning (AdamW)	2	9684	April 25, 2022
Fine-tune T5-small but lower performance	0	1415	April 21, 2022
Rare buggy translations when using Helsinki-NLP models	0	582	April 19, 2022
Question about validation and testing loss	6	2338	April 19, 2022
Longformer (LED) map out Global Tokens	0	322	April 18, 2022
SpeechBrain EncoderDecoderASR transcribe_file() Runs out of Memory	0	497	April 17, 2022