Creating a custom loss function in BART for input-token appearance in the output


I am trying to add a custom loss to my BART model (used for conditional generation) that checks whether words from the input appear in the output sequence. Say the input words are [x, y, z]; I want the model to converge toward phrases that contain as many tokens from that set as possible. I have some questions:

1- When creating the custom loss function, is there a way to access the input words and compare them against the whole generated phrase? As I understand it, the model's forward pass returns logits for one step at a time (teacher forcing), not the full set of tokens that the model's `generate` function would return. I'd be glad if this could be explained clearly.
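To illustrate what I mean, here is a minimal sketch of the kind of loss I have in mind. It assumes the training forward pass already gives logits for every target position at once, and it rewards probability mass placed on input tokens instead of doing a hard (non-differentiable) token comparison. The name `coverage_loss` is my own, not anything from the transformers API:

```python
import torch

def coverage_loss(logits, input_ids):
    """Hypothetical sketch: encourage input tokens to appear in the output.

    logits:    (batch, tgt_len, vocab) from one teacher-forced forward pass
    input_ids: (batch, src_len) token ids of the input words

    For each unique input token we take the maximum probability the model
    assigns to it at any output position, then penalize low coverage.
    """
    probs = torch.softmax(logits, dim=-1)  # (batch, tgt_len, vocab)
    losses = []
    for b in range(probs.size(0)):
        ids = torch.unique(input_ids[b])
        # (tgt_len, num_ids) -> best position for each input token
        per_token = probs[b, :, ids].max(dim=0).values
        losses.append(1.0 - per_token.mean())
    return torch.stack(losses).mean()
```

With uniform logits over a vocabulary of 10, each token gets probability 0.1, so the loss is 0.9; boosting the logits of the input tokens at some positions lowers it. This is only a differentiable proxy for "the token appears", not a check on actual `generate` output.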

2- Do you have any idea how to compute this kind of loss when a word does not map to a single token id? A name, for example, usually maps to several tokens: 'Mariano' with the BART tokenizer is segmented into M, arian, o.
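For the multi-token case, the only idea I have so far is to treat the word as a sub-token id sequence and check whether that sequence appears contiguously in the output ids. A small sketch (the tokenizer call in the comment is how I imagine using it; `contains_subsequence` is my own helper):

```python
def contains_subsequence(seq, sub):
    """Return True if `sub` appears as a contiguous run inside `seq`.

    seq, sub: lists of token ids. A multi-token word like 'Mariano'
    would be matched by its full sub-token id sequence.
    """
    n, m = len(seq), len(sub)
    return any(seq[i:i + m] == sub for i in range(n - m + 1))

# Imagined usage with a tokenizer (not verified):
# word_ids = tokenizer("Mariano", add_special_tokens=False)["input_ids"]
# present = contains_subsequence(output_ids, word_ids)
```

This gives a hard yes/no per word, so it could only be used on decoded `generate` output or as a reward, not directly as a differentiable loss; that limitation is part of what I am asking about.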

Thanks in advance