I'm not familiar with the internals of Funnel sequence classification models. All I know is that the model gradually reduces the sequence length and spends the saved FLOPs on a deeper or wider model.
My config is as follows.
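To make that understanding concrete, here is a rough sketch of how I picture the length reduction. It assumes every block after the first pools the sequence with a stride of 2 and rounds up; both are my assumptions, not something I have checked against the implementation. `pooled_lengths` is a hypothetical helper of mine, not part of the transformers library.

```python
def pooled_lengths(seq_len, num_blocks, stride=2):
    """Return the sequence length seen by each block.

    Assumption: the length is divided by `stride` (rounded up)
    before every block except the first one.
    """
    lengths = [seq_len]
    for _ in range(num_blocks - 1):
        # Ceiling division using integers only.
        seq_len = -(-seq_len // stride)
        lengths.append(seq_len)
    return lengths


# Three blocks, as in block_sizes=[3, 3, 3].
print(pooled_lengths(658, 3))
print(pooled_lengths(657, 3))
```

Under these assumptions both 658 and 657 end up with the same pooled lengths, so this arithmetic alone does not tell me why one input length fails and the other works.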
from transformers import FunnelConfig

config = FunnelConfig(
    block_sizes=[3, 3, 3],
    d_model=256,
    n_head=4,
    d_inner=512,
    separate_cls=False,
    vocab_size=trained_tokenizer.vocab_size,
)
When I forward a batch of sequences of length 658, I get the following error:
RuntimeError: The size of tensor a (658) must match the size of tensor b (657) at non-singleton dimension 3
The error happens here: I got content_score of length 658, but positional_attn of length 657.
However, the model works if I truncate the sequence to 657 tokens. (I also tried several other even sequence lengths, and none of them worked.)
What could the problem be?
Thanks