Getting "index out of range in self" error

I am trying to do a forward pass on LEDForConditionalGeneration with the PRIMER checkpoint, but I am getting an IndexError. How do I fix this?
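For reference, here is a minimal sketch of the setup that triggers it (reconstructed from my notebook; the checkpoint id and the <doc-sep> handling follow the PRIMERA usage example, so treat the exact names as assumptions):

import torch
from transformers import AutoTokenizer, LEDForConditionalGeneration

# Assumed checkpoint id; I load the PRIMER/PRIMERA weights into LEDForConditionalGeneration.
TOKENIZER = AutoTokenizer.from_pretrained("allenai/PRIMERA")
MODEL = LEDForConditionalGeneration.from_pretrained("allenai/PRIMERA")

DOCSEP_TOKEN_ID = TOKENIZER.convert_tokens_to_ids("<doc-sep>")  # 50265 in my run

# Multiple documents per example, joined with <doc-sep> and padded to equal length.
input_ids = TOKENIZER(
    ["first document <doc-sep> second document", "another document"],
    return_tensors="pt",
    padding=True,
).input_ids

global_attention_mask = torch.zeros_like(input_ids)
global_attention_mask[:, 0] = 1                          # global attention on <s>
global_attention_mask[input_ids == DOCSEP_TOKEN_ID] = 1  # and on every <doc-sep>

outputs = MODEL(input_ids)  # fails with the IndexError below on my real (longer) inputs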

Error Message:

IndexError                                Traceback (most recent call last)
in <cell line: 12>()
     38 global_attention_mask[input_ids == DOCSEP_TOKEN_ID] = 1
     39
---> 40 outputs = MODEL(input_ids)  # <--- causing a bug
     41 # outputs = MODEL(input_ids=input_ids_all, global_attention_mask=global_attention_mask)
     42

[... 13 frames omitted ...]

/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   2231         # remove once script supports set_grad_enabled
   2232         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2233     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   2234
   2235

IndexError: index out of range in self
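
Since torch.embedding only raises "index out of range in self" when an index exceeds the size of the embedding table it is looked up in, I ran this quick diagnostic (a sketch; MODEL and input_ids as above, and I am assuming the standard LEDConfig attribute names apply):

# Which lookup can go out of range: token ids vs vocab, or positions vs position tables?
print(input_ids.max().item(), MODEL.config.vocab_size)                   # 50265 vs 50266: fine
print(input_ids.shape[1], MODEL.config.max_encoder_position_embeddings)  # 1047 vs 4096: fine
print(input_ids.shape[1], MODEL.config.max_decoder_position_embeddings)  # 1047 vs 1024: suspicious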

Model Architecture:
LEDForConditionalGeneration(
  (led): LEDModel(
    (shared): Embedding(50266, 1024, padding_idx=1)
    (encoder): LEDEncoder(
      (embed_tokens): Embedding(50266, 1024, padding_idx=1)
      (embed_positions): LEDLearnedPositionalEmbedding(4096, 1024)
      (layers): ModuleList(
        (0-11): 12 x LEDEncoderLayer(
          (self_attn): LEDEncoderAttention(
            (longformer_self_attn): LEDEncoderSelfAttention(
              (query): Linear(in_features=1024, out_features=1024, bias=True)
              (key): Linear(in_features=1024, out_features=1024, bias=True)
              (value): Linear(in_features=1024, out_features=1024, bias=True)
              (query_global): Linear(in_features=1024, out_features=1024, bias=True)
              (key_global): Linear(in_features=1024, out_features=1024, bias=True)
              (value_global): Linear(in_features=1024, out_features=1024, bias=True)
            )
            (output): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (activation_fn): GELUActivation()
          (fc1): Linear(in_features=1024, out_features=4096, bias=True)
          (fc2): Linear(in_features=4096, out_features=1024, bias=True)
          (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
      (layernorm_embedding): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
    )
    (decoder): LEDDecoder(
      (embed_tokens): Embedding(50266, 1024, padding_idx=1)
      (embed_positions): LEDLearnedPositionalEmbedding(1024, 1024)
      (layers): ModuleList(
        (0-11): 12 x LEDDecoderLayer(
          (self_attn): LEDDecoderAttention(
            (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (activation_fn): GELUActivation()
          (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (encoder_attn): LEDDecoderAttention(
            (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (encoder_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (fc1): Linear(in_features=1024, out_features=4096, bias=True)
          (fc2): Linear(in_features=4096, out_features=1024, bias=True)
          (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
      (layernorm_embedding): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
    )
  )
  (lm_head): Linear(in_features=1024, out_features=50266, bias=False)
)

Input IDs and Shape:
tensor([[ 0, 43070, 26945, …, 4, 50265, 2],
[ 0, 34543, 35, …, 1, 1, 1]])
torch.Size([2, 1047])
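
My current suspicion (happy to be corrected): since I call MODEL(input_ids) with neither decoder_input_ids nor labels, the BART-style fallback seems to feed a shifted copy of input_ids to the decoder, and 1047 positions overflow the decoder's 1024-slot position embedding. Here is a sketch of the two workarounds I am considering (placeholder targets, not a confirmed fix):

# Option 1 (training): pass labels, so decoder inputs come from the (short)
# targets rather than from the 1047-token input_ids.
labels = TOKENIZER(
    ["placeholder summary one", "placeholder summary two"],
    return_tensors="pt",
    padding=True,
).input_ids
outputs = MODEL(
    input_ids=input_ids,
    global_attention_mask=global_attention_mask,
    labels=labels,
)

# Option 2 (inference): use generate(), which builds decoder inputs step by step.
generated = MODEL.generate(
    input_ids=input_ids,
    global_attention_mask=global_attention_mask,
    max_length=256,
)

Is that the right way to call the model here, or am I missing something?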