Chapter 1 questions

Thanks for reporting! Can you please share the full stack trace with the error?

Thanks for reporting @luojianguang ! We’ll fix this on our side :slight_smile:

Edit: this will be fixed by “Fix transformers architecture image” (huggingface/course PR #382 on GitHub).

Hi, I have a novice question about the parameter of a function.
In the section “Transformers, what can they do?”, there is a “Try it out!” exercise.

I am curious what other parameters I can pass to a TextGenerationPipeline, so I went to the API documentation of TextGenerationPipeline’s __call__, but I could not find parameters called num_return_sequences or max_length. I then went to the source code of TextGenerationPipeline’s __call__ and still could not find them.

Can someone explain why these two parameters are not documented there, yet the code still works? And what is the standard way for a user to find out which parameters can be passed to a function?
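
If I understand the pipeline internals correctly, keyword arguments that the pipeline itself does not define are forwarded to the model’s generate() method, which is where num_return_sequences and max_length actually live; the GenerationMixin.generate documentation is the place to look them up. A minimal sketch:

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# num_return_sequences and max_length are not TextGenerationPipeline
# parameters; they are forwarded to model.generate() under the hood.
# Sampling must be enabled to return more than one sequence.
outputs = generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
    do_sample=True,
)
print(outputs)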

Appropriate models for POS tagging?

One exercise in the “Transformers, what can they do” section suggests:

:pencil2: Try it out! Search the Model Hub for a model able to do part-of-speech tagging (usually abbreviated as POS) in English. What does this model predict for the sentence in the example above?

But I’m having trouble getting any of the English POS tagger models I can find to work. For instance, I found “flair/pos-english”, but when I try to use it I get an error:

>>> from transformers import pipeline
>>> pos = pipeline("ner", model="flair/pos-english", grouped_entities=True)
>>> output = pos("My name is Brian and I work at Waymo in San Francisco.")
OSError: flair/pos-english does not appear to have a file named config.json. Checkout 'https://huggingface.co/flair/pos-english/main' for available files.

Can you provide any pointers to what I’m doing wrong? Is there a better model to use?

Many thanks in advance!
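
One thing worth noting: flair checkpoints are saved in the flair library’s own format, so the transformers pipeline cannot load them, which is why config.json is missing. A model saved in transformers format should work directly; a minimal sketch, assuming the checkpoint vblagoje/bert-english-uncased-finetuned-pos (which I believe is a transformers-compatible POS model on the Hub):

from transformers import pipeline

# Any token-classification model in transformers format works this way;
# the checkpoint name below is an assumption, swap in whichever you find
pos = pipeline(
    "token-classification",
    model="vblagoje/bert-english-uncased-finetuned-pos",
)
print(pos("My name is Brian and I work at Waymo in San Francisco."))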

Good afternoon. I made Russian subtitles for the video “Welcome to the Hugging Face course” on YouTube. Can you attach them to the YouTube video, so that it is possible to watch it with Russian subtitles?

Hello, I’m seeing similar issues when using a Python virtual environment.

RuntimeError: At least one of TensorFlow 2.0 or PyTorch should be installed. To install TensorFlow 2.0, read the instructions at https://www.tensorflow.org/install/ To install PyTorch, read the instructions at https://pytorch.org/.

I installed PyTorch (pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu in my case) and the code worked, with a change to capture the result of the classifier() call and print it:

from transformers import pipeline

# pipeline() downloads a default sentiment-analysis checkpoint on first use
classifier = pipeline("sentiment-analysis")
fred = classifier("I've been waiting for a HuggingFace course my whole life.")
print(fred)  # in a script, print explicitly instead of relying on REPL echo

According to the Summary section, “A key aspect is that you can use the full architecture or only the encoder or decoder, depending on what kind of task you aim to solve.” If decoder-only architectures (e.g. GPT-2/GPT-3) are designed for text generation tasks, then how are they used for classification tasks in these papers (https://arxiv.org/pdf/2102.09690.pdf, https://arxiv.org/pdf/2005.14165.pdf) without fine-tuning any parameters (e.g. the head)? I mean, how can the same model (e.g. GPT-3) be used for various tasks (e.g. text classification or text generation) without any fine-tuning?
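
Those papers rely on in-context (few-shot) learning: the task is reframed as text completion, so the same language-modeling head serves classification and generation alike, and no parameters are updated. A minimal sketch of the idea, using GPT-2 purely for illustration (it is far weaker at this than GPT-3):

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Classification reframed as completion: the model "predicts the label"
# by continuing the prompt with the most likely next token
prompt = (
    "Review: The movie was fantastic. Sentiment: positive\n"
    "Review: The plot made no sense. Sentiment: negative\n"
    "Review: I loved every minute of it. Sentiment:"
)
result = generator(prompt, max_new_tokens=1, do_sample=False)
print(result[0]["generated_text"])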

I think if you want to use this model you have to follow the Demo section of its model card: README.md · flair/pos-english at main.
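
For reference, a minimal sketch of that demo, assuming the flair package is installed (pip install flair):

from flair.data import Sentence
from flair.models import SequenceTagger

# flair checkpoints are loaded through the flair library, not transformers
tagger = SequenceTagger.load("flair/pos-english")

sentence = Sentence("My name is Brian and I work at Waymo in San Francisco.")
tagger.predict(sentence)
print(sentence.to_tagged_string())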

Hi, why is speech recognition not listed in the “Transformers, what can they do?” section?

Hello,

I’m a total beginner with the :hugs: library and community, so please forgive me if this post is too basic.

When running the code samples in Chapter 1, such as the following:

from transformers import pipeline

ner = pipeline("ner", grouped_entities=True)
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")

while the code does work, I get warnings such as the following:

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.

Now when I supply the code with model='dbmdz/bert-large-cased-finetuned-conll03-english', again, the code works but then I get the following warning:

Some layers from the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing TFBertForTokenClassification: ['dropout_147']
- This IS expected if you are initializing TFBertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertForTokenClassification were initialized from the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForTokenClassification for predictions without further training.

Are these warnings normal? I didn’t see the lectures mention them. The same thing happens when running most of the other code samples in Chapter 1 in a Jupyter notebook.
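
Both warnings appear to be informational rather than errors: the first one recommends pinning an explicit model and revision for reproducibility, and the second notes that an unused dropout layer was skipped while loading, which the message itself says is expected here. If I read the first warning correctly, pinning both the model and the revision silences it; a minimal sketch:

from transformers import pipeline

# Pinning the checkpoint and its revision avoids the "no model was supplied" warning
ner = pipeline(
    "ner",
    model="dbmdz/bert-large-cased-finetuned-conll03-english",
    revision="f2482bf",
    grouped_entities=True,
)
print(ner("My name is Sylvain and I work at Hugging Face in Brooklyn."))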

Hi, I think there is a mistake in the third question of the Transformers quiz. The question is about a transformers pipeline with the fill-mask task. The accepted answer uses [MASK], but I think it has to use <mask>. I took the quiz in Spanish, so I don’t know whether the mistake is in the Spanish version only or in every language.

Hello, what is the benefit of using a decoder-only model versus a sequence-to-sequence model? Is it primarily when your input sequence size is expected to be different from your output sequence size, like during summarization?

Great lectures, by the way :hugs:

Hi, is there a better way to ask my question directly on this portal? I have been having a hard time finding where and how to submit my question.

My question is: in this Chapter 1 Transformers training you have been using specific pipelines; how do you know which model is used for which task? For example, I was trying to find the model card for the “fill-mask” example you showed on the Hugging Face models page, but could not locate it.

The example given:
from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)
None of these 15 models matches the example query. Which model is used in the above example? The same question applies to some other models used in the examples.

thanks
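
For reference, one way to see which checkpoint a pipeline actually loaded is to inspect the model object after construction. A minimal sketch (I believe the default fill-mask checkpoint is distilroberta-base, but checking beats assuming):

from transformers import pipeline

unmasker = pipeline("fill-mask")

# The identifier of the checkpoint the pipeline downloaded
print(unmasker.model.name_or_path)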

Hey, from what I understand (still a noob): decoder-only models are good at generating fluent and diverse text for tasks like language generation, while seq2seq (sequence-to-sequence) models are better suited for tasks like translation and summarization because they can handle input and output sequences of different lengths and capture semantic and syntactic relationships. The choice of which model to use depends on the specific task and input/output data.

I noticed that too, considering that a key advantage of transformers is their ability to model long-range dependencies in a sequence, making them well suited for tasks that require processing sequences of variable length, such as speech recognition.

The introduction page has a bullet saying “Chapters 9 to 12 go beyond NLP…”, but the course currently appears to end with a Chapter 9 on Gradio, even in the GitHub repo. Is there a chapter on computer vision Transformers somewhere else?

Based on the tutorial, we know that GPT and GPT-2 are good at text generation (completing an unfinished sentence). However, when we use ChatGPT, we use it for summarization, translation, and generative question answering. Does that mean ChatGPT uses an encoder-decoder model instead of a GPT-like decoder-only model? Thank you.

I did some research and the answer is no. Summarization, translation, and generative question answering can all be treated as text generation tasks. ChatGPT just generates new text that makes sense as an answer to the question.

I am trying to run the first HuggingFace course notebook locally.
I’ve created a brand new conda env (Python 3.10) and installed both TensorFlow 2 and PyTorch.
I got the following error when trying to run the sentiment analysis pipeline:

RuntimeError: At least one of TensorFlow 2.0 or PyTorch should be installed

Am I missing something? Thanks in advance for your help.
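
In case it helps, a common culprit for this error in a fresh conda env is the notebook kernel pointing at a different Python interpreter than the one that received the installs. A quick sanity check, run inside the notebook itself:

import sys

# Confirm which interpreter this kernel is actually using
print(sys.executable)

# If this import fails, torch is simply not installed in this environment
import torch
print(torch.__version__)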

Hello,
I am using the Spyder IDE to run the code from HF.co.
When I run this code, nothing happens, i.e. nothing shows up in the console.
On the first run it tells me that the models are downloading, but after that nothing gets printed in the console.

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

When I run this in colab I see the output.
What do I need to do to run this properly in Spyder?

Thanks
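
If I understand Spyder’s behavior correctly, it runs the file as a script, so the value of the last expression is not echoed the way a Colab or notebook cell echoes it. Capturing the result and printing it explicitly should make the output appear; a minimal sketch:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("I've been waiting for a HuggingFace course my whole life.")

# In a plain script the last expression is not auto-echoed, so print it
print(result)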