Models

Topic	Replies	Views	Activity
Pegasus Model Weights Compression/Pruning	14	4265	February 15, 2023
Model quantization	5	2629	February 15, 2023
How to finetune mt0-xxl-mt (13B parameters) seq2seq_qa with deepspeed	0	798	February 15, 2023
BERT regression & LIME explainer	1	1225	February 14, 2023
LayoutXLM load pretrain with 2048 max_position_embeddings	0	278	February 8, 2023
Online Machine Learning for transformers	1	997	February 7, 2023
RAG: Large number of "generate train split"	0	390	February 5, 2023
Help me find a neural network model	0	282	February 2, 2023
Recreate a 4k image on the browser	0	346	February 2, 2023
Flan-T5 / T5: what is the difference between AutoModelForSeq2SeqLM and T5ForConditionalGeneration	5	7490	February 2, 2023
Can a model's license change over time?	0	487	February 2, 2023
How to decrease inference time of model	0	439	February 2, 2023
Model gives output even for SEP token	0	482	February 1, 2023
Finetuning of wav2vec2-xls-r-300m outputs invalid words for Bengali data	6	689	February 1, 2023
Confusions about how T5 is pretrained on C4 dataset	0	555	January 30, 2023
Does the tokenization in BERT change after fine-tuning?	0	595	January 27, 2023
QA Context Building Strategy	0	276	January 25, 2023
BART generation with shorter input sequences on pre-training task	0	310	January 25, 2023
EncoderDecoderModel with different checkpoint training	0	359	January 24, 2023
BLOOM parameter '"return_full_text": False' isn't being respected, and the "use_gpu" option doesn't appear to be working	3	2721	January 23, 2023
How to dump huggingface models in a pickle file and use them?	2	5362	January 23, 2023
How can I save and load pretrained MIRNet saved models?	0	234	January 22, 2023
Facebook/opt-30b model inferencing	3	2682	January 19, 2023
Replicating SQuAD results on T5	2	817	January 17, 2023
ImportError: cannot import name 'NllbTokenizerFast'	0	448	January 17, 2023
Existing model for changing text from present to past tense?	0	459	January 16, 2023
Is there any reason why GPT-Neo would behave differently (fundamentally) from GPT2?	0	432	January 15, 2023
Transformer model on Time Expression Normalization	0	658	January 13, 2023
Positional embedding in GPT-J when using `past_layer`	0	406	January 13, 2023
How to repurpose a domain specific MLM model for Q&A?	0	206	January 12, 2023