BertForSequenceClassification Index Error

Hey, beginner here. I was using BertForSequenceClassification with this code:

def train(model, optimizer, critertion=nn.BCELoss(), train_loader=train_iter, valid_loader=valid_iter,
          num_epochs=5, eval_every=len(train_iter) // 2, file_path="", best_valid_loss=float("Inf")):
    # initialize running values
    running_loss = 0.0
    valid_running_loss = 0.0
    global_step = 0
    train_loss_list = []
    valid_loss_list = []
    global_steps_list = []
    
    model.train()
    for epoch in range(num_epochs):
        for (labels, title, text, titletext), _ in train_loader:
            labels = labels.type(torch.LongTensor) 
            labels = labels.to(device)
            
            titletext = titletext.type(torch.LongTensor)  
            titletext = titletext.to(device)
            print(labels.shape)
            print(titletext.shape)
            
            output = model(titletext, labels)
            loss, _ = output
            
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item()
            global_step += 1
# removed the rest of the code (validation and testing); the error is raised inside the training loop

but when I run the code it shows me the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-63-e4474bff9c36> in <module>
      2 optimizer = optim.Adam(model.parameters(), lr=2e-5)
      3 
----> 4 train(model=model, optimizer=optimizer)

<ipython-input-62-e6359dc8788e> in train(model, optimizer, critertion, train_loader, valid_loader, num_epochs, eval_every, file_path, best_valid_loss)
     20             print(titletext.shape)
     21 
---> 22             output = model(titletext, labels)
     23             loss, _ = output
     24 

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

<ipython-input-59-3d3782128a40> in forward(self, text, label)
      7 
      8     def forward(self, text, label):
----> 9         loss, text_fea = self.encoder(text, labels=label)[:2]
     10 
     11         return loss, text_fea

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

/opt/conda/lib/python3.7/site-packages/transformers/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels)
   1158             else:
   1159                 loss_fct = CrossEntropyLoss()
-> 1160                 loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
   1161             outputs = (loss,) + outputs
   1162 

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
    930     def forward(self, input, target):
    931         return F.cross_entropy(input, target, weight=self.weight,
--> 932                                ignore_index=self.ignore_index, reduction=self.reduction)
    933 
    934 

/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
   2315     if size_average is not None or reduce is not None:
   2316         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2317     return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
   2318 
   2319 

/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2113                          .format(input.size(0), target.size(0)))
   2114     if dim == 2:
-> 2115         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2116     elif dim == 4:
   2117         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

IndexError: Target 5213 is out of bounds.

Meanwhile, the shapes of labels and titletext are [16] and [16, 128] respectively.

  • I tried unsqueezing the labels too, but that didn’t help.
  • I checked whether a target index was missing, and whether the labels and targets had different lengths, but that didn’t help either.

What can be done to fix this?

Full code + dataset: Click here

OK, I had a look at your code. The input data you feed to the model is very borked: your labels, instead of being {0,1}, are row ids converted to float. Always dump at least the first row/batch of your inputs to check that what you feed is what you expect. In your case, the labels in the first batch look something like:

x = next(iter(train_iter))
x.label
tensor([169., 254., 512., ...

Definitely not 2 categories.
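For context, CrossEntropyLoss expects class indices in the range [0, num_labels - 1], so any leaked row id blows past BERT’s two logits. A minimal standalone sketch (not from your notebook) that reproduces the same error:

import torch
import torch.nn as nn

logits = torch.randn(16, 2)               # batch of 16, num_labels = 2
labels = torch.tensor([5213] + [0] * 15)  # a row id leaked in as a "label"

loss_fct = nn.CrossEntropyLoss()
loss_fct(logits, labels)                  # IndexError: Target 5213 is out of bounds.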

The 2 main errors are that (1) you save the row IDs into the csv files, so TabularDataset.splits hands the row ids to the label field, and (2) you don’t convert the FAKE/REAL strings to {0,1}.

So before you call X_train.to_csv(..., you need to:

news['label'] = news['label'].astype('category').cat.codes
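As a quick sanity check of what that line produces (on a hypothetical tiny frame, not the real dataset): pandas sorts the categories lexicographically, so FAKE maps to 0 and REAL maps to 1:

import pandas as pd

news = pd.DataFrame({'label': ['FAKE', 'REAL', 'REAL', 'FAKE']})
news['label'] = news['label'].astype('category').cat.codes
print(news['label'].tolist())   # [0, 1, 1, 0] -- categories are sorted, so FAKE=0, REAL=1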

and you also need to drop the row IDs, so the correct code is:

X_train.to_csv("./real-and-fake-news-dataset/train.csv", index=False)
X_test.to_csv("./real-and-fake-news-dataset/test.csv", index=False)
X_valid.to_csv("./real-and-fake-news-dataset/valid.csv", index=False)
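To see why index=False matters, here is a tiny standalone demonstration (not notebook code): with the default index=True, pandas writes the row ids as an unnamed first column, and that is exactly the column TabularDataset ends up reading:

import pandas as pd
from io import StringIO

df = pd.DataFrame({'label': [0, 1]}, index=[5213, 42])
buf = StringIO()
df.to_csv(buf)          # default index=True
print(buf.getvalue())
# ,label
# 5213,0                <- the row ids become the first csv column
# 42,1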

and later the fields need to be corrected too (label moved to the end):

fields = [('title', text_field), ('text', text_field), ('titletext', text_field), ('label', label_field),]

Alternatively, you could keep the row ids and adjust your train loop to ignore the first field. A sketch of how the corrected fields get consumed follows below.
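For reference, the fields list has to mirror the column order in the saved csv files. A hedged sketch of the split call, assuming the standard torchtext legacy API and the paths used above:

from torchtext.data import TabularDataset   # torchtext.legacy.data on newer torchtext versions

# the fields tuple order must match the csv columns written by to_csv(index=False):
# title, text, titletext, label
train_ds, valid_ds, test_ds = TabularDataset.splits(
    path='./real-and-fake-news-dataset',
    train='train.csv', validation='valid.csv', test='test.csv',
    format='CSV', fields=fields, skip_header=True)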


I don’t think fix_length=MAX_SEQ_LEN does what you think it does. The description of that field is confusing and misleading: it relates to padding, not truncating, so you get millions of warnings during the TabularDataset.splits call:

Token indices sequence length is longer than the specified maximum sequence length for this model (1129 > 512). Running this sequence through the model will result in indexing errors

So I added:

# crude fix: truncate each string to its first 128 characters before tokenization
news['titletext'] = news['titletext'].str.slice(0, 128)
news['title'] = news['title'].str.slice(0, 128)
news['text'] = news['text'].str.slice(0, 128)
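Note that this slices characters, not tokens, so it is a blunt workaround. A cleaner alternative (my suggestion, not what the notebook does) is to truncate at the tokenizer level; the exact kwargs vary by transformers version, but in recent versions it looks like:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# truncation=True caps the encoded sequence at max_length tokens
ids = tokenizer.encode("some long article text ...", max_length=128, truncation=True)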

You have other bugs in your code, but you will sort those out.

I uploaded the partially corrected notebook here.

You can see the debug prints I added in the code.

Your csv files were also saved to the wrong directory (not the one they were read from), so I adjusted the directories too.

You will need to remove the initial dataset truncation I added to keep the dev cycle fast (and note that trick for your own future development process).
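That truncation is just a development-speed trick, something along these lines (hypothetical form; check the notebook for the exact line):

# development only: work on a small slice so each debug iteration is fast
news = news.head(500)   # remove once the pipeline is correct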
