Can I run inference with an Encoder-Decoder model without specifying "decoder_input_ids"?

I’m using an Encoder-Decoder model to train a translation task, but part of the data is unlabeled.

For labeled data, I can use the following code to do inference and compute the loss:

# model is composed of EncoderDecoder architecture
# source_data and target_data are processed by tokenizer beforehand
batch = {
    "inputs_idx": source_data["inputs_idx"],
    "attention_mask": source_data["attention_mask"],
    "decoder_input_ids": target_data["inputs_idx"],
    "decoder_attention_mask": target_data["attention_mask"],
    "labels": target_data["inputs_idx"].clone(),
}

output = model(**batch)
supervised_loss = output["loss"]

Besides the supervised loss, I also want to compute an unsupervised loss over the predicted logits of unlabeled source data, for example:

batch = {
    "inputs_idx": source_data["inputs_idx"],
    "attention_mask": source_data["attention_mask"],
}

output = model(**batch)

unsupervised_loss = some_loss_func(output["logits"])
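As one concrete option for `some_loss_func` (which is a placeholder above), entropy minimization over the predicted distributions is a common unsupervised loss in semi-supervised learning. Below is a minimal plain-Python sketch; `softmax` and `entropy_loss` are illustrative helpers, and a real implementation would operate on torch tensors instead of lists:

```python
import math

def softmax(logits):
    # numerically stable softmax over one vocabulary distribution
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy_loss(batch_logits):
    # Mean per-token entropy of the predicted distributions.
    # batch_logits: a list of token-level logit vectors, i.e. the
    # flattened output["logits"]. Lower entropy means the model is
    # more confident about its predictions on the unlabeled data.
    total = 0.0
    for logits in batch_logits:
        probs = softmax(logits)
        total += -sum(p * math.log(p) for p in probs if p > 0)
    return total / len(batch_logits)
```

Minimizing this loss pushes the model toward confident predictions on unlabeled inputs, and because it is computed from the logits of an ordinary forward pass, gradients flow through it normally.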

However, I cannot run the forward pass without specifying "decoder_input_ids"; the decoder raises an error:

You have to specify either input_ids or inputs_embeds

So far, I assign source_data["idx"] to decoder_input_ids to avoid the issue, but I feel this is incorrect because it introduces an inconsistency in inference between labeled and unlabeled data. So, I am wondering how I should do inference for unlabeled data correctly.

During inference, use output = model.generate(**batch) instead of output = model(**batch).

Also, during training:

decoder_input_ids != target_data["inputs_idx"]
labels = target_data["inputs_idx"]
decoder_input_ids = shift_to_right(target_data["inputs_idx"])

This shift is performed automatically in the library code, so you can simply omit the decoder_input_ids argument.
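For reference, the right-shift can be sketched in plain Python. This mirrors the logic of transformers' shift_tokens_right helper; the real version operates on tensors and reads decoder_start_token_id from the model config:

```python
def shift_tokens_right(input_ids, pad_token_id, decoder_start_token_id):
    # Sketch of the right-shift the library applies to labels when
    # decoder_input_ids is omitted: prepend the decoder start token,
    # drop the last token, and replace the ignored-label value (-100)
    # with the pad token so the decoder never sees -100 as an input.
    shifted = []
    for row in input_ids:
        new_row = [decoder_start_token_id] + row[:-1]
        new_row = [pad_token_id if t == -100 else t for t in new_row]
        shifted.append(new_row)
    return shifted
```

So for labels [5, 6, 7, -100] with pad_token_id=1 and decoder_start_token_id=2, the decoder inputs become [2, 5, 6, 7]: at every position the decoder is fed the previous target token and trained to predict the current one.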

@yurii , thanks for the reply.

I think I confused others by using the term “inference.” What I am doing here is a forward pass of the model without decoder_input_ids and labels, because I’d like to compute an unsupervised loss on unlabeled data. Also, I don’t want to break the autograd graph, so I think model.generate() is not a good choice for my case?

Could you point me to the code snippet or documentation about this automatic shift_to_right behavior? I couldn’t find it myself. Thanks a lot.

In the case of training a conditional model (e.g. BartForConditionalGeneration), when decoder_input_ids is absent it is created automatically by right-shifting labels.

In the case of training a bare model (e.g. BartModel), when decoder_input_ids is absent it is created automatically by right-shifting input_ids.

By the way, try

"input_ids": source_data["input_ids"]

instead of

"inputs_idx": source_data["inputs_idx"]

since input_ids is the key the model actually expects — which is also why it complains that input_ids is missing.


I see. I used EncoderDecoderModel before, not BartModel, so that feature isn't there. Besides, their behaviors seem different. Now I'll try Bart to see how it goes. I personally think EncoderDecoderModel should also be able to run a forward pass without "decoder_input_ids" and "labels". Anyway, thank you for sharing that.