Hello everyone,
I am a beginner working on an NLP task. I am trying to ensemble two Hugging Face models but have been unable to find a standard solution. For example, I load BERT as model1 and RoBERTa as model2 on the same dataset. How can I ensemble these two models for the task? I have tried the following:
class FCEnsemble(torch.nn.Module):
    def __init__(self, num_labels, model1, model2):
        super(FCEnsemble, self).__init__()
        self.model1 = model1
        self.model2 = model2
        self.num_labels = num_labels
        self.fc = torch.nn.Linear(2 * num_labels, num_labels)
        self.loss_func = torch.nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, labels):
        outputs_model1 = self.model1(input_ids=input_ids, attention_mask=attention_mask).logits
        outputs_model2 = self.model2(input_ids=input_ids, attention_mask=attention_mask).logits
        combined_outputs = torch.cat((outputs_model1, outputs_model2), dim=-1)
        combined_logits = self.fc(combined_outputs)
        loss = self.loss_func(combined_logits.view(-1, self.num_labels), labels.view(-1))
        return loss
but it gives the following error:
invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number
Is this the right approach, or is there a better way to solve this problem? Please help me.
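For reference, here is a minimal sketch of one possible variant. It rests on an assumption the post does not confirm: that the error comes from calling code which indexes the value returned by forward (for example `outputs[0]` or `outputs["loss"]` in a training loop), which fails when forward returns a bare 0-dim loss tensor. The sketch returns a dict holding both the loss and the logits, and makes labels optional. `DummyClassifier` is a hypothetical stand-in for the real BERT and RoBERTa classifiers, used only so the example runs without downloading anything. Note also that BERT and RoBERTa use different tokenizers and vocabularies, so in practice each model would likely need its own input_ids and attention_mask; the sketch keeps the single-input signature from the code above for simplicity.

```python
from types import SimpleNamespace

import torch


class FCEnsemble(torch.nn.Module):
    """Concatenate the logits of two classifiers and map them to num_labels."""

    def __init__(self, num_labels, model1, model2):
        super().__init__()
        self.model1 = model1
        self.model2 = model2
        self.num_labels = num_labels
        self.fc = torch.nn.Linear(2 * num_labels, num_labels)
        self.loss_func = torch.nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, labels=None):
        logits1 = self.model1(input_ids=input_ids, attention_mask=attention_mask).logits
        logits2 = self.model2(input_ids=input_ids, attention_mask=attention_mask).logits
        combined_logits = self.fc(torch.cat((logits1, logits2), dim=-1))
        if labels is None:
            # Inference: no loss can be computed without labels.
            return {"logits": combined_logits}
        loss = self.loss_func(combined_logits.view(-1, self.num_labels), labels.view(-1))
        # Return a dict instead of a bare scalar so callers can look up "loss".
        return {"loss": loss, "logits": combined_logits}


class DummyClassifier(torch.nn.Module):
    """Hypothetical stand-in for a sequence classifier exposing `.logits`."""

    def __init__(self, vocab_size, num_labels):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, 8)
        self.head = torch.nn.Linear(8, num_labels)

    def forward(self, input_ids, attention_mask):
        # Mean-pool the token embeddings over non-padded positions.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (self.embed(input_ids) * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        return SimpleNamespace(logits=self.head(pooled))


torch.manual_seed(0)
ensemble = FCEnsemble(3, DummyClassifier(100, 3), DummyClassifier(100, 3))
input_ids = torch.randint(0, 100, (4, 6))            # batch of 4, sequence length 6
attention_mask = torch.ones(4, 6, dtype=torch.long)
labels = torch.tensor([0, 2, 1, 0])

out = ensemble(input_ids, attention_mask, labels)
print(out["loss"].shape, out["logits"].shape)
```

If the error actually originates somewhere else (for instance in a metric or logging step that indexes the loss), the same idea applies there: a 0-dim tensor has to be read with `.item()` rather than indexed.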