Probably a little pitfall:
Hello, I believe there is a little pitfall in the instructions of "Behind the pipeline", as shown below:
The NEGATIVE score should be "0.11%", instead of "0.,11%". This might cause some confusion.
"The problem is that we sent a single sequence to the model, whereas Transformers models expect multiple sentences by default. " - this does not seem to be a problem with Tensorflow code. Tensorflow accepts single sequence also without need for batch dimension unlike PyTorch which requires the sequence to be given with additional batch dimension
same here!
In fact, without adding special tokens, I got better results out of the softmax function.
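For reference, here is a minimal softmax you can use to turn the model's raw logits into the probability scores being compared above (the logit values below are made up for illustration):

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical two-class logits (e.g. NEGATIVE, POSITIVE)
probs = softmax([-1.56, 1.61])
print(probs)  # two probabilities that sum to 1
```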
Hi Sylvain
First of all, thanks a tonne for this detailed course!
A bit confused why I am not getting this error?
What is the difference between AutoTokenizer and BertTokenizer?
Hello Sylvain, I am just wondering if there's a comprehensive list of tasks/pipelines anywhere. I thought it might be here: Tasks - Hugging Face, but the tutorial for this page mentions the "sentiment-analysis" task, which is not available at that link.
How do I know which tokenizer to choose?
Example 1.
"The dog’s ran into the church. "
model 1: [ The, dog’s, ran, into, the, church]
model 2: [ The, dog, 's, ran, into, the, church]
These give a model two different meanings. How do I know whether to choose a tokenizer that keeps the whole word or one that breaks a word into parts?
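The two behaviors above can be sketched with toy splitting rules (hypothetical, not taken from any real tokenizer):

```python
def whole_word_tokenize(text):
    # Model 1 style: split on whitespace only, keeping "dog's" as one token
    return text.split()

def split_clitic_tokenize(text):
    # Model 2 style: additionally split off a possessive "'s" (toy rule)
    tokens = []
    for word in text.split():
        if word.endswith("'s"):
            tokens.extend([word[:-2], "'s"])
        else:
            tokens.append(word)
    return tokens

sentence = "The dog's ran into the church"
print(whole_word_tokenize(sentence))
print(split_clitic_tokenize(sentence))
```

Real subword tokenizers (WordPiece, BPE, etc.) learn their splits from a training corpus rather than using hand-written rules like these, which is why each pretrained model ships with its own matching tokenizer.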
They are closely related. BertTokenizer is the tokenizer class written specifically for the BERT ecosystem, and it handles the special tokens like [CLS] (classification token), [SEP] (separator token), and [MASK] (mask token) that BERT expects.
AutoTokenizer is more general and versatile: it is a factory class that inspects the checkpoint you pass to from_pretrained and instantiates the matching tokenizer class for that model. So for a BERT checkpoint, AutoTokenizer actually returns a BertTokenizer (or its fast variant) under the hood, and you get the same special-token handling; using AutoTokenizer just keeps your code model-agnostic so it works with other architectures too.