Use this topic for any question about Chapter 1 of the course.
From the widget on the roberta-large-mnli model page on Hugging Face, I see that the classification is between “CONTRADICTION”, “NEUTRAL”, and “ENTAILMENT”.
good catch! will push a fix this afternoon
At the end of the colab notebook, you need to install sentencepiece to complete the translation task.
!pip install sentencepiece
thanks for reporting this! i’ll push a fix this afternoon
Hi. I'd like to understand more deeply the difference between encoder and encoder-decoder models for question answering. The Model Hub currently has both BERT and T5 models for question answering. The encoder lecture mentions that BERT is good for QA tasks. Could someone please point me to more reading material to understand the nuances and differences between the two?
Hi there! Encoder-decoder models are very good at “generative question answering”, which means generating the answer to the question given the context. That differs from what an encoder can do (“extractive question answering”, which just says the answer to the question runs from word xxx to word yyy), in the sense that the model answers the question in its own words instead of trying to extract a part of the context.
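To make the "from word xxx to word yyy" idea concrete, here is a minimal sketch of the span-selection step in extractive question answering. It is not the library's implementation: the context, the scores, and the helper name `best_span` are all made up for illustration. In a real encoder model such as BERT, the model would produce one start score and one end score per context token.

```python
# Toy illustration of extractive QA: choose the answer span from the context.
# All scores below are invented; a real encoder would compute them.

def best_span(start_scores, end_scores, max_len=10):
    """Return (start, end) maximizing start_scores[s] + end_scores[e], with s <= e."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider spans that end at or after the start token.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best

# Hypothetical question: "Who installed the library?"
context = "The library was installed on Monday by the data team".split()
start_scores = [0.1, 0.2, 0.0, 0.1, 0.3, 0.6, 0.2, 2.0, 1.5, 0.3]
end_scores = [0.0, 0.1, 0.0, 0.2, 0.1, 0.9, 0.0, 0.1, 0.4, 2.5]

start, end = best_span(start_scores, end_scores)
answer = " ".join(context[start:end + 1])
print(answer)  # the data team
```

The answer is always a slice of the context, which is the key difference from a generative model that writes the answer token by token in its own words.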
Understood. Thanks for the explanation.
I replied on the topic!
Beginner query:
I’m working my way through the course to try to use transformers for classification, and potentially question answering and sentence forming. Is it possible to use, say, one of the BERT models to encode the text for zero-shot classification, and use the same model to interact with responses and formulate the stored data based on the interaction? I’m only up to chapter 2, so no doubt (hopefully) it will come up. But I’m looking ahead to optimise so I don’t use more models and encoders/decoders than necessary.
edit: an example, because on re-reading it might not be clear.
I have a bot that classifies context in a conversation chain, e.g. “we are going to use bootstrap”. From this, I think zero-shot classification with my labels will classify it. I would then like to interact with questions, e.g. “did you want to store this”, followed by creating a sentence to store the data. I’ll need more than one model, I think…
hi @LJay welcome to the forum!
your use case sounds quite tricky to solve end-to-end with a single model, so you’re right to think that you’ll need multiple components / models.
for instance, BERT models are not typically designed for text generation, so a decoder-only model like GPT-2 would be a better choice for sentence generation - although it’s worth noting that generating high-quality text is a form of art
my suggestion would be to first build a simple end-to-end pipeline based on rules (where possible) so you can get a feel for how the inputs are converted to outputs. you might find you don’t need machine learning for all the components (which is quite nice if you plan to deploy this in a production environment!)
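As a rough idea of what such a rule-based first version could look like, here is a small sketch for the "we are going to use bootstrap" example. Everything in it is an assumption made for illustration: the labels, the keyword lists, and the function names `classify`, `confirmation_question` and `sentence_to_store` are hypothetical, not part of any library.

```python
# Hypothetical rule-based pipeline: classify an utterance, ask whether to
# store it, then build the sentence to store. Labels and keywords are made up.

LABEL_KEYWORDS = {
    "frontend": ["bootstrap", "css", "react"],
    "database": ["postgres", "mysql", "sqlite"],
}

def classify(utterance):
    """Return the first label with a keyword in the utterance, else None."""
    text = utterance.lower()
    for label, keywords in LABEL_KEYWORDS.items():
        if any(word in text for word in keywords):
            return label
    return None

def confirmation_question(label):
    """Build the follow-up question shown to the user."""
    return f"Did you want to store this as a {label} decision?"

def sentence_to_store(utterance, label):
    """Template-based 'sentence forming': no text generation model needed."""
    return f"[{label}] {utterance.strip().rstrip('.')}."

utterance = "we are going to use bootstrap"
label = classify(utterance)
if label is not None:
    print(confirmation_question(label))
    print(sentence_to_store(utterance, label))
```

Once this runs end to end, each function can be swapped for a model one at a time (for example, replacing `classify` with zero-shot classification) only where the rules turn out to be too brittle.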