Inside the Token classification pipeline (PyTorch)

ghadeermobasher · January 18, 2022, 9:40am

Hi,
I have been working through one of the part 2 course tutorials, "Inside the Token classification pipeline (PyTorch)", presented by @sgugger. I tried the example notebook (https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/videos/token_pipeline_pt.ipynb) with sshleifer/tiny-dbmdz-bert-large-cased-finetuned-conll03-english as the model checkpoint. However, I got the following error:
AttributeError: 'list' object has no attribute 'argmax'
from this code:
`import torch

probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)[0].tolist()
predictions = probabilities.argmax(dim=-1)[0].tolist()
print(predictions)`

Am I missing something?

Best,
Ghadeer

ehalit · January 18, 2022, 9:44am

You are converting the tensor to a Python list, which does not have an argmax method. Try it like this:

probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]
predictions = probabilities.argmax(dim=-1)[0].tolist()
print(predictions)

This delays the list conversion until the very end.

ghadeermobasher · January 18, 2022, 9:48am

Thanks, but how do I fetch the results if the predictions variable isn’t iterable?

ehalit · January 18, 2022, 10:00am

The code snippet from before looks at the prediction for the first token, which makes predictions an int rather than an iterable list. You can get predictions for all the tokens by changing it like this:

probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]
predictions = probabilities.argmax(dim=-1).tolist()
print(predictions)
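
To make the shapes in this exchange concrete, here is a small sketch that uses a random tensor in place of outputs.logits. The shape (1, 6, 9) and the seed are assumptions chosen for illustration, not the real model output:

```python
import torch

# Stand-in for outputs.logits with shape (batch_size, sequence_length, num_labels).
# The sizes are made up for illustration only.
torch.manual_seed(0)
logits = torch.randn(1, 6, 9)

# Softmax over the label dimension, then drop the batch dimension -> shape (6, 9).
probabilities = torch.nn.functional.softmax(logits, dim=-1)[0]

# Indexing [0] after argmax keeps only the first token, so the result is a single int.
first_token_prediction = probabilities.argmax(dim=-1)[0].tolist()

# Without the extra [0], there is one predicted label id per token.
predictions = probabilities.argmax(dim=-1).tolist()

print(type(first_token_prediction).__name__, len(predictions))
```

The key point is that .tolist() should come last: everything before it stays a tensor, so tensor methods such as argmax are still available.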

ghadeermobasher · January 18, 2022, 10:19am

Yes, I got it. Thanks! However, the final output of this piece of code doesn’t perform as expected. Here is the link to the code.

ehalit · January 18, 2022, 10:41am

You are welcome. I don’t know about the last problem, though. My output looks like this:

ghadeermobasher · January 18, 2022, 10:54am

Yes, same for me. In the end, we need to merge the entities carrying the B- and I- labels. @sgugger any clue?

lewtun · January 21, 2022, 2:37pm

Hey @ghadeermobasher, there are several strategies that you can use to merge the entities, and my suggestion would be to inspect the implementation in the token-classification pipeline. As you can see in the docstring, there are quite a few subtleties associated with merging entities correctly (it depends on the language, the tokenizer, etc.)!
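
As a rough illustration of one such strategy, here is a minimal sketch that merges consecutive B- and I- labels into entity spans. It is not the pipeline's implementation: the function name merge_bio_entities, the word-level input format, and the example sentence are all hypothetical, and it ignores the subword-token handling that the real pipeline has to deal with.

```python
def merge_bio_entities(words, labels):
    """Merge consecutive B-/I- labelled words into entity spans.

    Simplified illustration: expects one label per word. An I- tag that does
    not continue an entity of the same type simply starts a new entity.
    """
    entities = []
    current = None  # entity being built: {"entity_group": str, "words": [str]}

    for word, label in zip(words, labels):
        if label == "O":
            # Outside any entity: close the one in progress, if any.
            if current is not None:
                entities.append(current)
                current = None
            continue

        prefix, _, entity_type = label.partition("-")
        continues = (
            prefix == "I"
            and current is not None
            and current["entity_group"] == entity_type
        )
        if continues:
            current["words"].append(word)
        else:
            # A B- tag (or a stray I- tag) opens a new entity.
            if current is not None:
                entities.append(current)
            current = {"entity_group": entity_type, "words": [word]}

    if current is not None:
        entities.append(current)

    return [
        {"entity_group": e["entity_group"], "word": " ".join(e["words"])}
        for e in entities
    ]


# Made-up example input, not actual model output.
words = ["Ada", "Lovelace", "worked", "at", "Acme", "Corp", "in", "London"]
labels = ["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG", "O", "B-LOC"]
merged = merge_bio_entities(words, labels)
print(merged)
```

With real model output the predictions are per token rather than per word, so subword pieces would need to be regrouped first, which is one of the subtleties mentioned above.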
