Chapter 2 questions

Use this topic for any question about Chapter 2 of the course.

  1. In the Handling multiple sequences page of Chapter 2, there is a bug in the code under the Attention masks section (a PyTorch version is sketched after this list).

    Page: Handling multiple sequences - Hugging Face Course

    The PyTorch toggle is on, but the code uses TensorFlow’s tf.constant function.

  2. There is a typo on https://huggingface.co/course/chapter2/6?fw=pt

  3. Isn’t WordPiece a subword tokenization algorithm as well?
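
For the first item, a minimal sketch of what the PyTorch version of that snippet could look like (assuming the model, batched_ids, and attention_mask variables from the course’s PyTorch code):

import torch

# PyTorch equivalent: torch.tensor replaces tf.constant
outputs = model(torch.tensor(batched_ids), attention_mask=torch.tensor(attention_mask))
print(outputs.logits)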

Thanks for flagging all of this, will push a fix in the morning!

  • In Chapter 2, on the Putting it all together page, the tokenizer call in the code snippet should include padding=True (both fixes are sketched below) :slight_smile:

  • Under Wrapping up: From tokenizer to model, the last line of the code snippet should be changed to output = model(**tokens)

Page: Putting it all together
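
A minimal sketch with both fixes applied (using the SST-2 checkpoint from the chapter; the sequences list is just an example):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

sequences = ["I've been waiting for a HuggingFace course my whole life.", "So have I!"]

# padding=True lets sequences of different lengths be batched together
tokens = tokenizer(sequences, padding=True, truncation=True, return_tensors="pt")

# unpacking the dict passes input_ids and attention_mask together
output = model(**tokens)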

@sgugger I’m absolutely loving this course! A great refresher on the library, with really intuitive videos and tutorials to wade through and understand the Hugging Face library. I honestly wish I had this resource when I started out. Can’t wait for the next part of the course! :slight_smile:

Thanks for reporting the bugs and the suggested fixes @harish3110! Will push a fix this afternoon :slight_smile:

I found a small typo in the section Behind the pipeline - Postprocessing the output.

  • error: [0.9946, 0.0544]
  • correction: [0.9995, 0.0005]

The following block gives a bit more context (from the Postprocessing the output section):

tensor([[4.0195e-02, 9.5980e-01],
        [9.9946e-01, 5.4418e-04]], grad_fn=<SoftmaxBackward>)

Now we can see that the model predicted [0.0402, 0.9598] for the first sentence and [0.9946, 0.0544] for the second one. These are recognizable probability scores.
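
To double-check, a quick sketch applying softmax to the logits the course prints just above that output (values rounded):

import torch

logits = torch.tensor([[-1.5607, 1.6123],
                       [4.1692, -3.3464]])

# softmax over the last dimension turns the logits into probabilities
print(torch.nn.functional.softmax(logits, dim=-1))
# approximately [[0.0402, 0.9598], [0.9995, 0.0005]] -- matching the correction above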

Thanks for reporting, will fix this morning!

In the Handling multiple sequences notebook (for TensorFlow) of Chapter 2, there is a bug in the code under the Tokenization section.

No, that is not a bug; the course explicitly says this doesn’t work and explains why.

Okay, sorry about that, I’ll have to go over it again. Thanks!

No worries, I understand why you’d be surprised. The notebooks are auto-generated, so I can’t add comments in Markdown cells, but I can add comments in the code!

In Handling multiple sequences - Attention masks - Try it out, there is a caveat that may be worth mentioning in case someone runs into the same question.

Here, by manually tokenizing the two sentences, creating attention masks, and passing them through the model, we should be able to reproduce the same logits as in section 2, which are:

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
    array([[-1.5606991,  1.6122842],
           [ 4.169231 , -3.3464472]], dtype=float32)>

However, if you do it this way:

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.", 
    "I hate this so much!",
]

# Tokenize each sentence and map the tokens to ids manually
batched_tokens = [tokenizer.tokenize(r) for r in raw_inputs]
batched_ids = [tokenizer.convert_tokens_to_ids(t) for t in batched_tokens]

# Pad every sequence to the length of the longest one
batched_ids = [ids + [tokenizer.pad_token_id] * (max(map(len, batched_ids)) - len(ids))
               for ids in batched_ids]

# Mask out the padding positions so attention ignores them
attention_mask = [[0 if x == tokenizer.pad_token_id else 1 for x in ids]
                  for ids in batched_ids]

print(model(tf.constant(batched_ids), attention_mask=tf.constant(attention_mask)).logits)

The results will be different from section 2:

tf.Tensor(
[[-2.7276204  2.878937 ]
 [ 3.1930914 -2.668523 ]], shape=(2, 2), dtype=float32)

This is because tokenizer.tokenize does not add the special tokens [CLS] and [SEP] by default, whereas the high-level API tokenizer() does.

If we change this line in the code to:

batched_tokens = [tokenizer.tokenize(r, add_special_tokens=True) for r in raw_inputs]

the result is indeed the same:

tf.Tensor(
[[-1.5606964  1.6122813]
 [ 4.169231  -3.3464475]], shape=(2, 2), dtype=float32)
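
Alternatively, a minimal sketch of the same check with the high-level API, which adds the special tokens and builds the attention mask automatically:

inputs = tokenizer(raw_inputs, padding=True, return_tensors="tf")
print(model(**inputs).logits)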

Just following along with the course code (for PyTorch) leads to a small error:

outputs = model(**inputs)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
...

TypeError: forward() got an unexpected keyword argument 'token_type_ids'

I removed the problematic token_type_ids key and it worked:

outputs = model(**{k: inputs[k] for k in ['input_ids', 'attention_mask']})

But, then, the result is not what we expected:

import torch
from transformers import AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.", 
    "I hate this so much!",
]
# NOTE: `tokenizer` was created earlier in the notebook (not shown in this snippet)
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
outputs = model(**{k: inputs[k] for k in ['input_ids', 'attention_mask']})
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)
-----
tensor([[0.9618, 0.0382],
        [0.9350, 0.0650]], grad_fn=<SoftmaxBackward>)

Are you sure you are using the right tokenizer? It doesn’t seem so since you have those token_type_ids added.
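
For reference, a minimal sketch of loading the matching tokenizer from the same checkpoint (reusing raw_inputs and model from the snippet above); DistilBERT’s tokenizer does not produce token_type_ids, so the unpacked call works directly:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
print(inputs.keys())  # only input_ids and attention_mask for DistilBERT
outputs = model(**inputs)  # no TypeError now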

There’s a typo in this illustration: [screenshot of the illustration not included]

I want to use the Behind the pipeline approach as a way to process a dataset in batches. Is there a way? And is it possible to batch with pipeline itself, i.e. specify a batch size and give the model a dataset instead of individual examples?
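
For the second question, a minimal sketch, assuming dataset is a datasets.Dataset with a "text" column; pipeline accepts a batch_size argument and can consume a dataset iterably:

from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset

classifier = pipeline("sentiment-analysis", batch_size=8)

# `dataset` is a hypothetical datasets.Dataset with a "text" column
for output in classifier(KeyDataset(dataset, "text")):
    print(output)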

from transformers import TFAutoModel

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = TFAutoModel.from_pretrained(checkpoint)
outputs = model(inputs)  # `inputs` comes from the tokenizer step earlier in the section
print(outputs.last_hidden_state.shape)

The output generated by the step above is not used in the next step:

from transformers import TFAutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs = model(inputs)
print(outputs.logits.shape)

So why are we generating output from TFAutoModel?
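
For what it’s worth, the two snippets are independent illustrations: the first shows the raw hidden states, the second the task-specific head. Schematically (a simplified sketch of DistilBERT’s internals; attribute names may differ across transformers versions), the classification model wraps the same base model and adds a head on top of the [CLS] hidden state:

# Schematic only: roughly what the sequence classification model adds
# on top of the base model's output (dropout omitted)
base_output = model.distilbert(inputs)            # same as the TFAutoModel step
cls_hidden = base_output.last_hidden_state[:, 0]  # hidden state at the [CLS] position
logits = model.classifier(model.pre_classifier(cls_hidden))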

Hi, I regret to mention that I am finding it very difficult to follow Mr. Sylvain’s pronunciation. The subtitles seem to be only in French; if they were made available in English too, it would be easier to follow and understand.

Hi Sylvain,

There is a typo in the mask output in Preprocessing with a tokenizer, Behind the pipeline.

You should make a PR with your fix!