Hi,
I have been working through one of the Part 2 course tutorials, Inside the Token classification pipeline (PyTorch), presented by @sgugger. I tried the example notebook (https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/videos/token_pipeline_pt.ipynb) with sshleifer/tiny-dbmdz-bert-large-cased-finetuned-conll03-english as the model checkpoint. However, I got the following error:
AttributeError: ‘list’ object has no attribute ‘argmax’ in `import torch
The code snippet from before looks at the prediction for the first token, which makes predictions an int rather than an iterable list. You can get predictions for all the tokens by changing it like this:
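A minimal sketch of that change, not the notebook's exact code: it uses a random tensor as a stand-in for `outputs.logits`, and the shape (1 sentence, 6 tokens, 9 labels) and variable names are assumptions for illustration. The point is to call `argmax` while the data is still a tensor and convert to a list only afterwards.

```python
import torch

# Stand-in for `outputs.logits` from a token-classification model.
# Assumed shape: (batch_size=1, num_tokens=6, num_labels=9). Random values only.
torch.manual_seed(0)
logits = torch.randn(1, 6, 9)

# Indexing down to a single token gives one int, which cannot be iterated over.
first_token_prediction = logits.argmax(dim=-1)[0][0].item()

# Taking the argmax over the label dimension for the whole sentence
# gives one predicted label id per token.
predictions = logits.argmax(dim=-1)[0].tolist()

# After .tolist() the result is a plain Python list, so calling .argmax on it
# raises: AttributeError: 'list' object has no attribute 'argmax'.
probabilities = torch.nn.functional.softmax(logits, dim=-1)[0].tolist()

print(type(first_token_prediction), len(predictions), len(probabilities))
```

With a real model you would replace `logits` with `outputs.logits` and map each id in `predictions` to its label through `model.config.id2label`.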
Hey @ghadeermobasher, there are several strategies you can use to merge the entities, and my suggestion would be to inspect the implementation in the token-classification pipeline. As you can see in the docstring, there are quite a few subtleties involved in merging entities correctly (it depends on the language, the tokenizer, etc.)!
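To give a rough idea of what one merging strategy can look like, here is a simplified sketch in plain Python. It is not the pipeline's implementation: `merge_entities` and `join_wordpieces` are hypothetical helpers, the tokens and labels are made-up example data, and it assumes BERT-style WordPiece tokens (`##` continuation pieces) with IOB-style labels. It ignores scores, character offsets, and the language-specific cases the docstring mentions. In the pipeline itself, the merging behaviour is selected with the `aggregation_strategy` argument.

```python
def merge_entities(tokens, labels):
    """Group consecutive tagged tokens into entity spans (simplified IOB merge)."""
    entities = []
    current = None  # span being built: {"entity_group": str, "tokens": [str, ...]}
    for token, label in zip(tokens, labels):
        if label == "O":
            # Outside any entity: close the open span, if there is one.
            if current is not None:
                entities.append(current)
                current = None
            continue
        prefix, _, tag = label.partition("-")
        # Extend the open span only on an I- tag with the same entity type.
        if current is not None and prefix == "I" and tag == current["entity_group"]:
            current["tokens"].append(token)
        else:
            # A B- tag, a type change, or the first tagged token starts a new span.
            if current is not None:
                entities.append(current)
            current = {"entity_group": tag, "tokens": [token]}
    if current is not None:
        entities.append(current)
    return entities


def join_wordpieces(tokens):
    """Join WordPiece tokens: pieces starting with '##' attach to the previous token."""
    text = ""
    for token in tokens:
        if token.startswith("##"):
            text += token[2:]
        else:
            text += (" " if text else "") + token
    return text


# Made-up example data, already aligned token by token.
tokens = ["My", "name", "is", "S", "##yl", "##va", "##in", "and", "I",
          "work", "at", "Hu", "##gging", "Face"]
labels = ["O", "O", "O", "I-PER", "I-PER", "I-PER", "I-PER", "O", "O",
          "O", "O", "I-ORG", "I-ORG", "I-ORG"]

merged = [(e["entity_group"], join_wordpieces(e["tokens"]))
          for e in merge_entities(tokens, labels)]
print(merged)  # [('PER', 'Sylvain'), ('ORG', 'Hugging Face')]
```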