Use this topic for any question about Chapter 1 of the course.
From the widget in roberta-large-mnli · Hugging Face, I see the classification is between “CONTRADICTION”, “NEUTRAL”, “ENTAILMENT”
good catch! will push a fix this afternoon
At the end of the colab notebook, you need to install sentencepiece to complete the translation task.
!pip install sentencepiece
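As a small aside, here is one way to check whether the package is already available before running the translation cell. This is only a sketch: the helper name `has_module` is made up for illustration, and it relies solely on the standard library.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# Hypothetical usage: warn before running the translation pipeline.
if not has_module("sentencepiece"):
    print("sentencepiece is missing; run: pip install sentencepiece")
```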
thanks for reporting this! i’ll push a fix this afternoon
Hi. I’d like to understand more deeply the difference between encoder and encoder-decoder models for question answering. The Model Hub currently has both BERT and T5 models for question answering. The encoder lecture mentions that BERT is good for QA tasks. Could someone please point me to more reading material to understand the nuances/differences between the two?
Hi there! Encoder-decoder models will be very good at “generative question answering”, which means generating the answer to the question given the context. That’s different from what an encoder can do (“extractive question answering”, which just says the answer to this question runs from word xxx to word yyy), in the sense that the model answers the question in its own words instead of trying to extract a part of the context.
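To make the “from word xxx to word yyy” idea concrete, here is a toy sketch of the span-selection step in extractive question answering. The scores are made up for illustration (a real encoder such as BERT would produce a start score and an end score for every token), and `best_span` is a hypothetical helper, not a library function.

```python
def best_span(start_scores, end_scores, max_len=15):
    """Pick the (start, end) token indices with the highest combined score.

    Only spans with start <= end and at most `max_len` tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for start, s_score in enumerate(start_scores):
        last = min(start + max_len, len(end_scores))
        for end in range(start, last):
            score = s_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Made-up context and scores, standing in for real model outputs.
tokens = ["The", "course", "was", "written", "by", "the", "Hugging", "Face", "team"]
start_scores = [0.1, 0.0, 0.0, 0.0, 0.2, 0.3, 2.5, 0.4, 0.1]
end_scores = [0.0, 0.1, 0.0, 0.0, 0.0, 0.1, 0.3, 1.0, 2.8]

start, end = best_span(start_scores, end_scores)
answer = " ".join(tokens[start : end + 1])
print(answer)  # Hugging Face team
```

The answer is always a slice of the context, which is why an extractive model cannot rephrase it. A generative (encoder-decoder) model instead produces the answer token by token.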
Understood. Thanks for the explanation.
I replied on the topic!
Beginner query:
I’m working my way through the course to try to use transformers for classification, and potentially question answering and sentence forming. Is it possible to use, say, one of the BERT models to encode the text for zero-shot classification, and use the same model to interact with responses and formulate the stored data based on the interaction? I’m only up to Chapter 2, so no doubt (hopefully) it will come up. But I’m looking ahead to optimise so that I don’t use more models and encoders/decoders than necessary.
edit: as an example, because on re-read it might not be clear.
I have a bot that classifies context in a conversation chain, e.g. “we are going to use bootstrap”. I think zero-shot classification with my labels will handle that. I’d then like to interact through questions, e.g. “did you want to store this”, followed by creating a sentence to store the data. I’ll need more than one model, I think…
hi @LJay welcome to the forum!
your use case sounds quite tricky to solve end-to-end with a single model, so you’re right to think that you’ll need multiple components / models.
for instance, BERT models are not typically designed for text generation, so a decoder-only model like GPT-2 would be a better choice for sentence generation - although it’s worth noting that generating high-quality text is a form of art
my suggestion would be to first build a simple end-to-end pipeline based on rules (where possible) so you can get a feel for how the inputs are converted to outputs. you might find you don’t need machine learning for all the components (which is quite nice if you plan to deploy this in a production environment!)
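for what it’s worth, a rules-only first pass could look something like the sketch below. everything in it is an assumption made for illustration: the labels, the keyword lists, and the helper names `classify` and `storage_sentence` are hypothetical, and a later version could swap the keyword lookup for a zero-shot model.

```python
import re

# Hypothetical label -> keyword rules; replace with your own labels.
RULES = {
    "frontend framework": {"bootstrap", "react", "vue"},
    "database": {"postgres", "sqlite", "mysql"},
}


def classify(message):
    """Return the first label whose keywords appear in the message, else None."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    for label, keywords in RULES.items():
        if words & keywords:
            return label
    return None


def storage_sentence(label, message):
    """Build the sentence to store, using a fixed template instead of a model."""
    return f"Decision ({label}): {message}"


message = "we are going to use bootstrap"
label = classify(message)
if label is not None:
    print(storage_sentence(label, message))
```

once this runs end-to-end, it should be easier to see which steps actually need a model and which ones the rules already cover.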
There seems to be an editing error in the explanation for answer 1 to Question 11 in the Chapter 1 quiz. The explanation says (emphasis added):
Correct! When applying Transfer Learning, the bias in the pretrained model used perspires in the fine-tuned model.
I think “perspires” should be something like “persists.” I’m pretty sure bias cannot sweat. Unless “perspire” is being used in some metaphorical way I am not familiar with (which would surprise me, as a pretty educated native English speaker), this looks like a mistake.
I’m going through Chapter 1 using a Python virtual environment. The examples work until I get to the Text-Generation section, when I see the following:
Python 3.8.10 (default, Sep 15 2021, 10:14:58)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import pipeline
>>> generator = pipeline("text-generation")
No model was supplied, defaulted to gpt2 (https://huggingface.co/gpt2)
>>> generator("In this course, we will teach you how to")
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/pipelines/text_generation.py", line 175, in __call__
    return super().__call__(text_inputs, **kwargs)
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/pipelines/base.py", line 1026, in __call__
    return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/pipelines/base.py", line 1033, in run_single
    model_outputs = self.forward(model_inputs, **forward_params)
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/pipelines/base.py", line 943, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/pipelines/text_generation.py", line 213, in _forward
    generated_sequence = self.model.generate(input_ids=input_ids, **generate_kwargs)  # BS x SL
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/generation_utils.py", line 1310, in generate
    return self.sample(
  File "/home/user/.pyenv/versions/3.8.10/lib/python3.8/site-packages/transformers/generation_utils.py", line 1963, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: Inplace update to inference tensor outside InferenceMode is not allowed.You can make a clone to get a normal tensor before doing inplace update.See https://github.com/pytorch/rfcs/pull/17 for more details.
Any ideas what may be causing this?
Here’s some version info from pip freeze for my install:
$ grep -E "transform|torch" pip-freeze.txt
pytorch-lightning==1.4.2
-e git+https://github.com/CompVis/taming-transformers.git@24268930bf1dce879235a7fddd0b2355b84d7ea6#egg=taming_transformers
torch==1.9.0+cu111
torch-fidelity==0.3.0
torchaudio==0.9.0
torchmetrics==0.6.0
torchvision==0.10.0+cu111
transformers==4.19.2
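In case it helps anyone reproduce this, the same version info can also be read from inside the interpreter that raises the error, which rules out a mismatch between the shell’s pip and the active virtual environment. This is a minimal sketch using only the standard library (Python 3.8+); `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Reports the versions seen by the currently running interpreter.
print(installed_versions(["torch", "transformers"]))
```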