I want to fine-tune RoBERTa on the MLM task with my own data.
However, for each word, I also have an additional vector with 10 elements.
So whenever I predict a masked token, I want the loss to be:
a * MLM loss + b * vector_prediction_loss.
How can I do it? I didn't find any example or tutorial.
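One way to read the weighted loss described above, as a minimal sketch rather than an established recipe: take the usual MLM cross-entropy over the masked positions and add a regression loss between a predicted and a target 10-element vector at those same positions. The function name `combined_mlm_vector_loss`, the choice of mean squared error for the vector term, and the tensor shapes are all assumptions made for illustration. The sketch also assumes unmasked positions carry the label -100 and that each batch has at least one masked token.

```python
import torch
import torch.nn.functional as F


def combined_mlm_vector_loss(mlm_logits, mlm_labels, vector_preds, vector_targets, a=1.0, b=1.0):
    """Hypothetical helper: a * MLM loss + b * vector-prediction loss.

    mlm_logits:     (batch, seq_len, vocab_size)
    mlm_labels:     (batch, seq_len), -100 on positions that were not masked
    vector_preds:   (batch, seq_len, 10), output of an extra (assumed) regression head
    vector_targets: (batch, seq_len, 10), the per-token vectors from the dataset
    """
    # Standard MLM cross-entropy; positions labelled -100 are ignored.
    mlm_loss = F.cross_entropy(
        mlm_logits.reshape(-1, mlm_logits.size(-1)),
        mlm_labels.reshape(-1),
        ignore_index=-100,
    )
    # Score the vector head only where a token was actually masked.
    masked = mlm_labels != -100
    vector_loss = F.mse_loss(vector_preds[masked], vector_targets[masked])
    return a * mlm_loss + b * vector_loss
```

The weights `a` and `b` are plain floats here; they would need tuning, since the two terms are on different scales.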
@nielsr Let's say my labels have two parts: label1 and label2.
What would be the best way to pass label1 and label2 to compute_loss?
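A possible way to get both label parts into compute_loss, sketched under several assumptions: the second part is stored in a batch key called `labels2`, the model is a custom one whose output contains both the MLM `loss` and a hypothetical `vector_preds` tensor, and the collator pads and stacks `labels2` into a tensor. The class name `TwoLabelTrainer` and the weights `a` and `b` are made up for this example. Note that `Trainer` drops dataset columns the model's `forward` does not accept unless `remove_unused_columns=False` is set in `TrainingArguments`, so an extra column such as `labels2` would otherwise not reach the collator.

```python
import torch
import torch.nn.functional as F
from transformers import Trainer


class TwoLabelTrainer(Trainer):
    """Hypothetical Trainer subclass: `labels` drives the MLM loss, `labels2` the vector loss."""

    a = 1.0  # weight of the MLM loss (assumed hyperparameter)
    b = 1.0  # weight of the vector-prediction loss (assumed hyperparameter)

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        inputs = dict(inputs)            # shallow copy so the caller's batch is not mutated
        labels2 = inputs.pop("labels2")  # (batch, seq_len, 10); never forwarded to the model
        outputs = model(**inputs)        # assumed to return "loss" and "vector_preds"
        masked = inputs["labels"] != -100  # positions that were actually masked
        vector_loss = F.mse_loss(outputs["vector_preds"][masked], labels2[masked])
        loss = self.a * outputs["loss"] + self.b * vector_loss
        return (loss, outputs) if return_outputs else loss
```

The `**kwargs` in the signature is there because newer versions of the library pass additional keyword arguments to `compute_loss`.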
I added another column to the dataset, so my columns are now:
I know this is a very late answer, but the problem is in the data collator. The collator probably drops labels2.
In my case, I wanted to calculate the per-example loss in the compute_loss function and save it for later use. Therefore, I needed to pass a "qid" feature.
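For illustration, a per-example loss keyed by qid could look roughly like the sketch below. It assumes a classification-style loss and that `qids` arrives as a plain list of strings alongside the tensors; the helper name `per_example_losses` is hypothetical and not part of any library.

```python
import torch
import torch.nn.functional as F


def per_example_losses(logits, labels, qids):
    """Hypothetical helper: map each qid to the loss of its own example.

    logits: (batch, num_classes), labels: (batch,), qids: list of str
    """
    # reduction="none" keeps one loss value per example instead of the batch mean.
    losses = F.cross_entropy(logits, labels, reduction="none")
    return dict(zip(qids, losses.detach().tolist()))
```

Inside compute_loss, the batch mean of these values would still be returned as the training loss, while the dictionary is stored for later inspection.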
Adding the following lines to the default collator fixed my issue:
    elif k == "qid":
        batch[k] = [f[k] for f in features]
The full collator function:
    from collections.abc import Mapping

    import numpy as np


    def torch_default_data_collator(features):
        import torch

        if not isinstance(features[0], Mapping):
            features = [vars(f) for f in features]
        first = features[0]
        batch = {}

        # Special handling for labels.
        # Ensure that tensor is created with the correct type
        # (it should be automatically the case, but let's make sure of it.)
        if "label" in first and first["label"] is not None:
            label = first["label"].item() if isinstance(first["label"], torch.Tensor) else first["label"]
            dtype = torch.long if isinstance(label, int) else torch.float
            batch["labels"] = torch.tensor([f["label"] for f in features], dtype=dtype)
        elif "label_ids" in first and first["label_ids"] is not None:
            if isinstance(first["label_ids"], torch.Tensor):
                batch["labels"] = torch.stack([f["label_ids"] for f in features])
            else:
                dtype = torch.long if isinstance(first["label_ids"][0], int) else torch.float
                batch["labels"] = torch.tensor([f["label_ids"] for f in features], dtype=dtype)

        # Handling of all other possible keys.
        # Again, we will use the first element to figure out which key/values are not None for this model.
        for k, v in first.items():
            if k not in ("label", "label_ids") and v is not None and not isinstance(v, str):
                if isinstance(v, torch.Tensor):
                    batch[k] = torch.stack([f[k] for f in features])
                elif isinstance(v, np.ndarray):
                    batch[k] = torch.tensor(np.stack([f[k] for f in features]))
                else:
                    batch[k] = torch.tensor([f[k] for f in features])
            elif k == "qid":
                batch[k] = [f[k] for f in features]

        return batch
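An alternative to copying the whole collator, sketched here as one option among several: wrap the library's `default_data_collator`, strip the pass-through keys before calling it, and reattach them afterwards as plain Python lists. The wrapper name `collate_keeping` and its `passthrough_keys` parameter are invented for this example.

```python
from transformers import default_data_collator


def collate_keeping(features, passthrough_keys=("qid",)):
    """Hypothetical wrapper: collate as usual, but keep some keys as plain lists."""
    # Collect the pass-through values untouched (e.g. string ids).
    extras = {k: [f[k] for f in features] for k in passthrough_keys if k in features[0]}
    # Let the default collator handle everything else.
    stripped = [{k: v for k, v in f.items() if k not in passthrough_keys} for f in features]
    batch = default_data_collator(stripped)
    batch.update(extras)
    return batch
```

With either approach, the extra key has to be popped from the inputs in compute_loss before the batch is passed to the model, and `remove_unused_columns=False` has to be set in `TrainingArguments` so the column reaches the collator at all.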