I would like to instantiate two language models trained on different datasets and combine their predictions during beam search. At each decoding step, I want to get the logits from both models, merge them with a custom operation, and select the next beam candidates based on the merged scores.
Is there a way to do this easily with the Transformers library?
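
To make the idea concrete, here is a minimal, library-agnostic sketch of what I mean. It does not use the Transformers API at all. The names `combined_beam_search`, `average_logits`, `toy_model_a`, and `toy_model_b` are hypothetical, the two "models" are stand-in functions over a 4-token vocabulary, and averaging is just one possible merge operation. It also assumes both models share the same vocabulary and skips length normalization and batching.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Assumption: a "model" is any function mapping a token prefix to one logit per vocab item.
Model = Callable[[Sequence[int]], List[float]]
Merge = Callable[[List[float], List[float]], List[float]]


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a plain list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def average_logits(a: List[float], b: List[float]) -> List[float]:
    """Example custom merge operation: element-wise mean of the two logit vectors."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def combined_beam_search(
    model_a: Model,
    model_b: Model,
    merge: Merge,
    start: Sequence[int],
    num_beams: int,
    max_new_tokens: int,
    eos_id: int,
) -> List[Tuple[List[int], float]]:
    """Beam search whose per-step scores come from the merged logits of two models."""
    beams: List[Tuple[List[int], float]] = [(list(start), 0.0)]
    for _ in range(max_new_tokens):
        candidates: List[Tuple[List[int], float]] = []
        for tokens, score in beams:
            # Finished hypotheses are carried over unchanged.
            if tokens and tokens[-1] == eos_id:
                candidates.append((tokens, score))
                continue
            # Query both models on the same prefix, then merge their logits.
            merged = merge(model_a(tokens), model_b(tokens))
            for tok, lp in enumerate(log_softmax(merged)):
                candidates.append((tokens + [tok], score + lp))
        # Keep the num_beams hypotheses with the highest cumulative log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
        if all(t and t[-1] == eos_id for t, _ in beams):
            break
    return beams


VOCAB_SIZE = 4  # toy vocabulary, purely for illustration
EOS_ID = 3


def toy_model_a(tokens: Sequence[int]) -> List[float]:
    """Hypothetical model that prefers the token following the last one."""
    logits = [0.0] * VOCAB_SIZE
    last = tokens[-1] if tokens else 0
    logits[(last + 1) % VOCAB_SIZE] = 2.0
    return logits


def toy_model_b(tokens: Sequence[int]) -> List[float]:
    """Hypothetical model that mildly prefers ending the sequence."""
    logits = [0.0] * VOCAB_SIZE
    logits[EOS_ID] = 1.0
    return logits


beams = combined_beam_search(
    toy_model_a, toy_model_b, average_logits,
    start=[0], num_beams=2, max_new_tokens=5, eos_id=EOS_ID,
)
for tokens, score in beams:
    print(tokens, round(score, 3))
```

In a real setup, the two callables would wrap forward passes of the two trained models, and the merge function would be whatever custom operation is needed. What I am looking for is the equivalent of this loop using the library's own generation machinery.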