Chapter 1 questions

Beyond the encoder-decoder architecture covered in 'Sequence-to-sequence models':

Is this how the dot product cross attention mechanism works?

During ‘dot product cross attention’,

  • a decoder layer performs a dot product between its current hidden state and each of the encoder's hidden states
  • the decoder normalises (scales) the dot products and applies a softmax → this gives us a 'probability distribution' over the encoder hidden states → Question 1: does this 'probability distribution' correspond to how similar the decoder's Q is to the encoder's K?
  • the decoder weights the encoder hidden states by these probabilities and sums them to get a context vector → Question 2: where do the 'weights' for this step come from? (see the sketch after this list)
  • the decoder uses the context vector to generate the next output token
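
Here is a minimal NumPy sketch of the four steps above (the array names and sizes are purely illustrative, not from the course):

import numpy as np

d, T = 4, 3                                 # hidden size, number of encoder states
decoder_state = np.random.randn(d)          # plays the role of the query (Q)
encoder_states = np.random.randn(T, d)      # play the role of keys (K) and values (V)

# 1. dot product between the decoder state and every encoder hidden state
scores = encoder_states @ decoder_state     # shape (T,)

# 2. scale and softmax -> attention weights (the 'probability distribution')
scores = scores / np.sqrt(d)
weights = np.exp(scores) / np.exp(scores).sum()

# 3. weighted sum of the encoder states -> context vector
#    (in this sketch, the 'weights' of step 3 are exactly the softmax outputs of step 2)
context = weights @ encoder_states          # shape (d,)

# 4. the context vector is combined with the decoder state to predict the next token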

I solved it by using a print command like this:

from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I've been waiting for a HuggingFace course my whole life.")
print(result)

How do I post a question here?
I wanted to ask: what is the meaning of the 'token' key in the output here: Transformers, what can they do? - Hugging Face NLP Course?
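
Assuming this refers to the fill-mask output shown on that page, here is a minimal sketch of how to inspect it: 'token' is the id of the predicted token in the tokenizer's vocabulary, and 'token_str' is that id decoded back to text.

from transformers import pipeline

unmasker = pipeline("fill-mask", model="distilroberta-base")
for pred in unmasker("This course will teach you all about <mask> models.", top_k=2):
    # each prediction dict contains 'sequence', 'score', 'token' and 'token_str'
    print(pred["token"], pred["token_str"], pred["score"])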

Hello! A newbie over here. I'm having a lot of problems configuring the environment.
I started by installing the requirements:
pip install datasets evaluate transformers[sentencepiece]

Then I have this example:

from transformers import pipeline
classifier = pipeline("sentiment-analysis")
results = classifier([
    "I've been waiting for a HuggingFace course my whole life.",
    "I love using Hugging Face models!",
    "This movie is terrible.",
    "I love metal"
])

I was asked to install PyTorch, so I installed it.
Then a new error appeared saying I needed to install xformers.
I installed it, and now I have this error:

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton.language'

When I tried to install Triton I got this error:

PS C:\Users\Usuario\Desktop\Hugging face\triton\python> pip install triton
ERROR: Could not find a version that satisfies the requirement triton (from versions: none)
ERROR: No matching distribution found for triton

Any idea about this?
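
In case it helps anyone hitting the same messages: the first two lines look like advice to pin a model and revision rather than a fatal error, and the Triton lines appear to come from xformers' optional optimizations (my guess, not a confirmed diagnosis). A minimal sketch that follows the warning's advice by pinning the checkpoint and revision it mentions:

from transformers import pipeline

# pin the model and revision instead of relying on the default checkpoint
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    revision="af0f99b",
)
print(classifier("I love using Hugging Face models!"))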

Hello, in chapter 1 under the ‘The original architecture’ header, it says

During training, the encoder receives inputs (sentences) in a certain language, while the decoder receives the same sentences in the desired target language.

I was wondering whether it is true that 'the decoder receives the same sentences in the desired target language'. I thought that the decoder receives the features of the input sentence (as converted by the encoder) and then produces the same sentence in the desired target language. Is my understanding correct, or is the quoted sentence correct?
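
For context, here is a minimal sketch (assuming a recent transformers version and a translation checkpoint such as Helsinki-NLP/opus-mt-en-fr) of what a seq2seq model is given during training: the source sentence goes through the encoder, while the target sentence is passed as labels, which the model shifts right and feeds to the decoder.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-fr")

src = tokenizer("I love NLP.", return_tensors="pt")
tgt = tokenizer(text_target="J'adore le NLP.", return_tensors="pt")

# the encoder sees the source ids; the decoder is trained on the (shifted) target ids
# while also attending to the encoder's output features via cross-attention
outputs = model(input_ids=src["input_ids"],
                attention_mask=src["attention_mask"],
                labels=tgt["input_ids"])
print(outputs.loss)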

Hi!
I just started the NLP course. Most of the links in the first chapter lead to 404 errors, which is “a little bit” annoying. Example: links at the bottom of this page.
It would be great if someone could fix them. I guess the reason is some reorganization of the Hugging Face site.

Do you have ChatGPT 4 on your site? I want you to act as if you were the site's operator.

How do I post a comment in this forum? Which chapter covers question answering using NLP?

Hey fellows, I would like to be considered for translating the chapters into my native language (Yoruba).

I have a question about the transformer architecture. I think there are multiple decoder layers and multiple encoder layers in the architecture. Does every decoder layer receive information from the encoder? And if so, do they all receive the same information from the last encoder layer? Or does perhaps only the first decoder layer receive information from the last encoder layer?
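
For what it's worth, PyTorch's reference implementation can serve as a concrete illustration: every decoder layer cross-attends to the same memory tensor, which is the output of the final encoder layer. A minimal sketch with random tensors and toy sizes:

import torch
import torch.nn as nn

d_model = 16
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(d_model, nhead=4), num_layers=6)
decoder = nn.TransformerDecoder(nn.TransformerDecoderLayer(d_model, nhead=4), num_layers=6)

src = torch.randn(10, 1, d_model)   # source sequence (seq_len, batch, d_model)
tgt = torch.randn(7, 1, d_model)    # target sequence generated so far

memory = encoder(src)               # output of the *last* encoder layer
out = decoder(tgt, memory)          # each of the 6 decoder layers attends to the same memory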

Not sure what I am doing wrong, but when I replace 'fill-mask' with the "bert-base-cased" model as suggested in chapter 1, I get the following error:

KeyError: “Unknown task bert-base-cased, available tasks are [‘audio-classification’, ‘automatic-speech-recognition’, ‘conversational’, ‘depth-estimation’, ‘document-question-answering’, ‘feature-extraction’, ‘fill-mask’, ‘image-classification’, ‘image-segmentation’, ‘image-to-text’, ‘mask-generation’, ‘ner’, ‘object-detection’, ‘question-answering’, ‘sentiment-analysis’, ‘summarization’, ‘table-question-answering’, ‘text-classification’, ‘text-generation’, ‘text2text-generation’, ‘token-classification’, ‘translation’, ‘video-classification’, ‘visual-question-answering’, ‘vqa’, ‘zero-shot-audio-classification’, ‘zero-shot-classification’, ‘zero-shot-image-classification’, ‘zero-shot-object-detection’, ‘translation_XX_to_YY’]”
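
If I read the traceback right, the checkpoint name ended up in the task slot. The task string stays "fill-mask" and the checkpoint goes into the model argument, as in this minimal sketch (the same call appears later in this thread):

from transformers import pipeline

filler = pipeline("fill-mask", model="bert-base-cased")  # task first, checkpoint via model=
print(filler("This [MASK] has been waiting for you."))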

Hi. I have a quick question regarding sequence-to-sequence models. At the end of the video, it shows that these models can be constructed by combining encoder models (e.g., BERT) and decoder models (e.g., GPT).
I was wondering, how can RoBERTa (encoder-only model) be used both as an encoder and decoder?
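
For reference, transformers has an EncoderDecoderModel class for exactly this kind of warm-starting; as far as I understand, when an encoder-only checkpoint like RoBERTa is placed on the decoder side, it is loaded with causal masking and newly initialised cross-attention layers. A minimal sketch:

from transformers import EncoderDecoderModel

# warm-start both sides from RoBERTa; the decoder side gets randomly
# initialised cross-attention layers and must then be fine-tuned
model = EncoderDecoderModel.from_encoder_decoder_pretrained("roberta-base", "roberta-base")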

In the encoder-decoder architecture, the decoder looks only backwards, i.e., at the preceding tokens, just as in the decoder-only architecture. The encoder-decoder architecture seems to be more powerful simply because there is a whole additional component (the encoder). If this is true, then why not use the encoder-decoder architecture for everything that is currently done by decoder-only architectures (e.g., text generation)?

I noticed today that the error described in the link below still hasn’t been corrected. bhagerty is correct; “perspires” can only mean “sweats,” so the word should be “persists” instead. Chapter 1 questions - #14 by bhagerty

Similar to the given use case of NLP, I want to build a model which can query an Excel file with multiple rows and columns, containing both text and numbers, and give answers from any part of it.
For example, the Excel file could hold an HR database and I could ask: tell me the assigned location value for vivek.kishore. What steps should I take to build this?
A brief outline of the steps would be helpful. Should I use a vector DB and Llama 2?
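
One possible starting point (a sketch only, not a verdict on the vector-DB + Llama 2 route): transformers ships a table-question-answering pipeline that answers questions over a table loaded with pandas (an Excel sheet could be read with pd.read_excel). The table contents below are made up purely for illustration:

import pandas as pd
from transformers import pipeline

# hypothetical HR table; TAPAS-style models expect every cell as a string
table = pd.DataFrame({
    "employee": ["vivek.kishore", "jane.doe"],
    "assigned_location": ["Bangalore", "London"],
})

tqa = pipeline("table-question-answering", model="google/tapas-base-finetuned-wtq")
print(tqa(table=table, query="What is the assigned location for vivek.kishore?"))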

I am having trouble understanding how the sentence below is classified as negative.

In masking, why do some models use [mask] while other models use <mask>? What is the difference between the square brackets and the angle brackets?

Here is a square bracket example,

from transformers import pipeline

filler = pipeline("fill-mask", model="bert-base-cased")
result = filler("This [MASK] has been waiting for you.")

Here is an angle bracket example,

unmasker = pipeline("fill-mask", model="distilroberta-base")
unmasker("This course will teach you all about <mask> models.")

Thanks so much!

Hello, I am using the Hugging Face classroom in teaching MS Data Analytics courses. I require all students to take the course and try to develop my lectures accordingly. May I ask whether the slides used in the open NLP course are available? I would be glad to receive them if so. Thank you.

I am trying to run the code in the first chapter under text generation. Specifically I ran these lines:

from transformers import pipeline

generator = pipeline("text-generation")
generator("In this course, we will teach you how to")

The output I got was:

No module named 'keras.saving.hdf5_format'

Keras is installed on my machine. What am I doing wrong?
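
That error usually points to the pipeline loading the TensorFlow/Keras backend with an incompatible Keras version installed (my guess from the module name, not a verified diagnosis). Since PyTorch-backed pipelines do not touch Keras at all, one workaround to sketch is forcing the PyTorch framework, assuming torch is installed:

from transformers import pipeline

# force the PyTorch backend so the pipeline does not try to load TF/Keras weights
generator = pipeline("text-generation", framework="pt")
print(generator("In this course, we will teach you how to"))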