Hi! I would like to understand a bit better how BART handles multiple sentences. I got that I can encode two sentences with tokenizer(sent_a, sent_b). My first sentence contains a <mask> symbol that is to be filled. However, I noticed that the second sentence, unlike the first, isn’t part of the output (which is okay in my case, but I wonder why). In addition, the second sentence doesn’t seem to have any real influence on how the <mask> token is replaced, so it isn’t really treated as context, and it even seems to confuse the model. Can I actually input two sentences if I’m aiming for the mask-filling task? Would it make sense to finetune for it?
input_ids = tokenizer.encode(sent_a, sent_b, return_tensors="pt").to('cuda:0')
tokenizer.batch_decode(model.generate(input_ids))
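For reference, here is a fuller, self-contained version of the snippet above; the facebook/bart-large checkpoint and the example sentences are just assumptions to make it runnable, swap in whatever you actually use:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Assumed checkpoint and placeholder sentences, only to make the sketch self-contained.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

sent_a = "The picnic was cancelled because of the <mask>."
sent_b = "It had been raining all morning."

# Encoding the pair yields one sequence of the form <s> sent_a </s></s> sent_b </s>
input_ids = tokenizer.encode(sent_a, sent_b, return_tensors="pt")

# Generate a filled-in sequence and decode it, keeping the special tokens visible.
output_ids = model.generate(input_ids, num_beams=5, max_length=64)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=False))
```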
I was wondering something similar. In my case, when I send in two sentences as something like:
<s> sentence_A </s><s> sentence_B </s>
the return is actually reversed:
</s><s> sentence_B </s><s> sentence_A </s>
The masks in the returned sentences have been infilled appropriately, so there isn’t necessarily a problem per se, but I wasn’t expecting the return to be formatted this way (it took me a while to realize what was going on).
If this is a consistent/intended pattern then I’m happy to just reverse the sentences when I get them back, but I wanted to verify that there isn’t something going wrong in my model that needs to be addressed.
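One way to narrow down where the swap happens (just a sketch, reusing whatever input_ids, tokenizer, and model are already loaded) is to decode the exact encoder input and compare it with the generation side by side:

```python
# Decode the exact encoder input, then the generated output, to see whether the
# reversal is introduced by the preprocessing or by the model itself.
print("model sees:   ", tokenizer.batch_decode(input_ids, skip_special_tokens=False))
print("model returns:", tokenizer.batch_decode(model.generate(input_ids),
                                               skip_special_tokens=False))
```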
Okay, this is a bit embarrassing… I took the forum examples of how to format “noisy” Bart data too literally and wound up swapping all of my input sentences. So actually Bart was just doing its job!
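In case it saves someone else the same confusion, here is roughly the formatting I should have been using (variable names are made up): the noising, e.g. swapped sentence order, belongs on the training inputs only, while the labels keep the original order, so if the swapped version also ends up in your inference inputs, the model will simply swap it back:

```python
# Hypothetical illustration of building one denoising training pair for BART:
# the *input* is the noised text (sentences in swapped order here), while the
# *labels* are the original text the model should reconstruct.
sent_a = "The report is due on <mask>."
sent_b = "Please send it to the whole team."

original = f"{sent_a} {sent_b}"   # target the model should produce
noised = f"{sent_b} {sent_a}"     # permuted input the model is trained to fix

inputs = tokenizer(noised, return_tensors="pt")
labels = tokenizer(original, return_tensors="pt").input_ids
# model(**inputs, labels=labels) would then give the denoising loss for fine-tuning.
```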