Intermediate

Topic	Replies	Views	Activity
BART: get activation maps for encoder and decoder	0	521	November 3, 2021
Resuming training BERT from scratch with run_mlm.py	2	2210	October 31, 2021
Token Classification Model making mistake outside of training dataset	0	462	October 30, 2021
Longformer seemingly initializing global attention mask for every step	0	732	October 25, 2021
Need advice for fine-tuning BERT on opinion mining	0	486	October 25, 2021
How can I make an Img2Text transformer using the existing modules?	1	823	October 21, 2021
Unable to train a good model after using exclude_from_weight_decay	0	408	October 19, 2021
Stopping `model.generate()` based on custom token	2	4424	October 18, 2021
How to exclude layers in weight decay	1	2991	October 18, 2021
How to train the embedding of special token?	1	4163	October 17, 2021
ByT5: problem with tokenizer.decode()	3	1146	October 15, 2021
How to get a model on patent data for question answering	1	859	October 15, 2021
TorchScript with Encoder-Decoder architecture	0	297	October 11, 2021
TensorBoard support when using optimizer with 2 separate learning rates	0	361	October 9, 2021
Open-sourcing better cross-encoders for STILTS and better IR?	2	908	October 9, 2021
BART summarization token probabilities	0	905	October 8, 2021
How to change BERT attention value during testing	0	410	October 6, 2021
How to freeze the attention map in BERT	0	536	October 6, 2021
Pipelines for multiple inputs don't produce reliable results	2	436	October 3, 2021
A new dataset for multi-label text classification	1	1059	September 30, 2021
Tabular Data Autoencoder Loss Plateau	0	363	September 28, 2021
DeepSpeed and RayTune	0	553	September 26, 2021
Error while generating more than one beam output in T5	0	295	September 26, 2021
Fine-tuning translator based on a single language	0	292	September 22, 2021
FlaxGPTNeoForCausalLM generates the same text regardless of seed, temperature, top_k and top_p values	1	392	September 22, 2021
Custom GPT2 Model won't load after training	1	1179	September 15, 2021
Optimal methods to monitor attention matrices when doing training/inference using BERT-type models	2	718	September 11, 2021
Class weights in Trainer() instance	1	720	September 10, 2021
Hyperparameter tuning on Colab?	0	295	September 10, 2021
BART from finetuned BERT	2	476	September 9, 2021