Ensembling Hugging Face models

Hello everyone,
I am a beginner and I am working on an NLP task. I am trying to ensemble two Hugging Face models but am unable to find a standard solution to the problem. For example, I load BERT as model1 and RoBERTa as model2 on the same dataset. How can I ensemble these two models for the task? I have tried the following:

import torch


class FCEnsemble(torch.nn.Module):
    def __init__(self, num_labels, model1, model2):
        super(FCEnsemble, self).__init__()
        self.model1 = model1
        self.model2 = model2
        self.num_labels = num_labels
        # Linear layer that mixes the per-class logits of both models
        self.fc = torch.nn.Linear(2 * num_labels, num_labels)
        self.loss_func = torch.nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, labels):
        # Run both backbones on the same batch and take their logits
        outputs_model1 = self.model1(input_ids=input_ids, attention_mask=attention_mask).logits
        outputs_model2 = self.model2(input_ids=input_ids, attention_mask=attention_mask).logits
        # Concatenate the two sets of logits and project back to num_labels
        combined_outputs = torch.cat((outputs_model1, outputs_model2), dim=-1)
        combined_logits = self.fc(combined_outputs)
        loss = self.loss_func(combined_logits.view(-1, self.num_labels), labels.view(-1))
        return loss

but it is giving the following error:

invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number

Am I taking the right approach, or is there a better way to solve this problem? Please help me.

Would you mind posting the full stack trace including which line is causing the error?

Sorry for the late reply.
In short, I am trying to achieve ensemble training with the Trainer API. Suppose I have two BERT models for an NER task: how should I ensemble them so that I can train them together, load the ensemble model after training for predictions, or fine-tune the model again with new data?
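
To make this concrete, this is roughly how I am trying to wire the ensemble into the Trainer (the checkpoints, num_labels, and dataset variables below are placeholders for my actual setup):

from transformers import AutoModelForTokenClassification, Trainer, TrainingArguments

# Placeholder checkpoints; in my real setup these are two different
# fine-tuned BERT checkpoints that share the same tokenizer.
model1 = AutoModelForTokenClassification.from_pretrained("bert-base-cased", num_labels=9)
model2 = AutoModelForTokenClassification.from_pretrained("bert-base-cased", num_labels=9)

# Wrap both backbones in the ensemble module defined above
ensemble = FCEnsemble(num_labels=9, model1=model1, model2=model2)

training_args = TrainingArguments(output_dir="ensemble-ner", num_train_epochs=3)
trainer = Trainer(
    model=ensemble,
    args=training_args,
    train_dataset=tokenized_train,  # placeholder: my tokenized NER train split
    eval_dataset=tokenized_eval,    # placeholder: my tokenized NER eval split
)
trainer.train()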