Chapter 2 questions

Probably a little pitfall:

Hello, I believe there is a little pitfall in the instructions of “Behind the pipeline”: the negative score should be “0.11%”, not “0.,11%”. This might cause some confusion.

"The problem is that we sent a single sequence to the model, whereas :hugs: Transformers models expect multiple sentences by default. " - this does not seem to be a problem with Tensorflow code. Tensorflow accepts single sequence also without need for batch dimension unlike PyTorch which requires the sequence to be given with additional batch dimension

Same here!
In fact, without adding special tokens, I got better results out of the softmax function.
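
For anyone who wants to reproduce this, here is a small sketch (assuming the same checkpoint as the chapter) comparing the two inputs via the `add_special_tokens` flag. Note that the model was fine-tuned with [CLS] and [SEP] present, so the probabilities it produces without them are not really comparable:

```python
from transformers import AutoTokenizer

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

sequence = "I've been waiting for a HuggingFace course my whole life."

# Default behavior: the tokenizer wraps the sequence in [CLS] ... [SEP]
with_special = tokenizer(sequence)["input_ids"]
# Opt out of the special tokens
without_special = tokenizer(sequence, add_special_tokens=False)["input_ids"]

print(tokenizer.decode(with_special))     # starts with [CLS], ends with [SEP]
print(tokenizer.decode(without_special))  # raw sequence only
```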

Hi Sylvain,
First of all, thanks a tonne for this detailed course! :pray:
I'm a bit confused about why I am not getting this error.

What is the difference between AutoTokenizer and BertTokenizer?

Hello Sylvain, I am just wondering if there's a comprehensive list of tasks/pipelines anywhere. I thought it might be here: Tasks - Hugging Face, but the tutorial for this page mentions the “sentiment-analysis” task, which is not available at that link.
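
In case it helps: “sentiment-analysis” is registered in the library as an alias of the “text-classification” task, which is why it does not appear under its own name on that page. A sketch for listing the accepted task strings, assuming a reasonably recent version of :hugs: Transformers:

```python
from transformers.pipelines import get_supported_tasks

# Prints every task string pipeline() accepts, aliases included,
# e.g. "sentiment-analysis" alongside "text-classification".
print(get_supported_tasks())
```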

How do I know which tokenizer to choose?
Example:
"The dog's ran into the church."
model 1: [The, dog's, ran, into, the, church]

model 2: [The, dog, 's, ran, into, the, church]

These give a model two different meanings. How do I know whether to choose a tokenizer that keeps the whole word or one that breaks a word into parts?
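
One way to see this concretely is to run the same sentence through tokenizers from two different checkpoints; a minimal sketch, using the public bert-base-uncased and gpt2 checkpoints as examples:

```python
from transformers import AutoTokenizer

sentence = "The dog's ran into the church."

# Each pretrained tokenizer splits according to its own vocabulary,
# so the same sentence can come out in different pieces.
for checkpoint in ["bert-base-uncased", "gpt2"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    print(checkpoint, tokenizer.tokenize(sentence))
```

In practice you don't pick the splitting strategy independently: a pretrained model only understands the IDs produced by the tokenizer it was trained with, so you load the tokenizer from the same checkpoint as the model.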

I don’t know why this error appears.

They end up doing the same work, but they are different kinds of classes. BertTokenizer is the tokenizer class written specifically for BERT; it handles special tokens like [CLS] (classification token), [SEP] (separator token), and [MASK] (mask token) that are specific to BERT.
AutoTokenizer, on the other hand, is a generic convenience class: given a checkpoint name, it reads the checkpoint's configuration and instantiates the matching tokenizer class, so the same code works across many architectures. For a BERT checkpoint, AutoTokenizer simply gives you back a BERT tokenizer.
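
A quick sketch of that relationship; note that AutoTokenizer returns the fast (Rust-backed) variant by default when one is available:

```python
from transformers import AutoTokenizer, BertTokenizer

# AutoTokenizer reads the checkpoint's config and instantiates the
# matching tokenizer class, so for a BERT checkpoint both calls
# produce equivalent BERT tokenizers.
auto_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")

print(type(auto_tok).__name__)  # BertTokenizerFast
print(type(bert_tok).__name__)  # BertTokenizer

sequence = "Using a Transformer network is simple"
print(auto_tok.tokenize(sequence) == bert_tok.tokenize(sequence))  # True
```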