Thank you @nielsr for the answer, it really helps a lot.
I just wanted to clarify that I’m not trying to do fine-tuning but training from scratch.
I usually load the tokenizer with `AutoTokenizer.from_pretrained(...)` and the config with `config = AutoConfig.from_pretrained(...)`, but the actual model is instantiated from the config via `AutoModelForSequenceClassification.from_config(config)`.
I thought this would ensure the model is initialized with random weights so that I'm training from scratch rather than starting from pretrained weights, since I'm not loading it with `checkpoint = "allenai/longformer-base-4096"` followed by `AutoModelForSequenceClassification.from_pretrained(checkpoint)`. Am I wrong in thinking that this approach guarantees training from scratch?
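Just to make sure I'm describing it accurately, this is roughly what my loading code looks like (a sketch; `num_labels=2` is a placeholder for my task):

```python
from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "allenai/longformer-base-4096"

# Tokenizer and config come from the hub checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
config = AutoConfig.from_pretrained(checkpoint, num_labels=2)  # num_labels is a placeholder

# from_config: architecture only, weights are randomly initialized (what I use)
model_scratch = AutoModelForSequenceClassification.from_config(config)

# from_pretrained: same architecture, but loads the pretrained weights (what I'm NOT doing)
model_pretrained = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
```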
My sequences have different lengths, but I think in this case I should pad them to 1024 given the current checkpoint? My targets usually consist of only 1 or 2 tokens after tokenization, but I pad them to length=10. Is that wrong?
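This is roughly how I tokenize at the moment (again a sketch; `texts` and `raw_targets` are placeholder variable names, and the max_length values are just the ones mentioned above):

```python
# Inputs: pad/truncate every sequence to a fixed length of 1024
inputs = tokenizer(
    texts,
    padding="max_length",
    truncation=True,
    max_length=1024,
    return_tensors="pt",
)

# Targets: only 1-2 tokens after tokenization, but padded out to length 10
targets = tokenizer(
    raw_targets,
    padding="max_length",
    truncation=True,
    max_length=10,
    return_tensors="pt",
)
```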
On another note, what are some other Hugging Face models with good performance on sequence classification that could be trained from scratch on my toy synthetic datasets?