Fine-tuning a classification model with new labels


I am trying to fine-tune a language identification model using SpeechBrain. I followed the notebook [tutorial][1] on fine-tuning an ASR model. However, I am having trouble adding a new label to the model. I would like to add a new language, so I edited label_encoder.txt and added the line ‘yk: Yakut’ => 107, but during training I get the following error:

return torch._C._nn.nll_loss_nd(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 107 is out of bound
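If I understand the error correctly, the pretrained head only outputs 107 log-probabilities, so valid target indices are 0..106 and my new index 107 falls outside that range. A minimal plain-PyTorch repro (the shapes here are assumed from the error message, not taken from SpeechBrain):

```python
import torch
import torch.nn.functional as F

# the pretrained classifier produces 107 scores per utterance,
# so valid targets are the indices 0..106
log_probs = torch.log_softmax(torch.randn(4, 107), dim=-1)

# targets within range: the loss is computed fine
F.nll_loss(log_probs, torch.tensor([0, 5, 106, 106]))

# target 107 is out of range for a 107-class output
try:
    F.nll_loss(log_probs, torch.tensor([0, 5, 106, 107]))
except IndexError as e:
    print(type(e).__name__, e)
```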

My language brain class looks like this:

class LanguageBrain(speechbrain.core.Brain):
    def on_stage_start(self, stage, epoch):
        # enable grad for all modules we want to fine-tune
        if stage == speechbrain.Stage.TRAIN:
            for module in [self.modules.compute_features, self.modules.mean_var_norm, 
                           self.modules.embedding_model, self.modules.classifier]:
                for p in module.parameters():
                    p.requires_grad = True
    def compute_forward(self, batch, stage):
        """Computation pipeline based on an encoder + language classifier.
        Data augmentation and environmental corruption are applied to the
        input speech.
        """
        batch = batch.to(self.device)
        wavs, lens = batch.sig
        feats = self.modules.compute_features(wavs)
        feats = self.modules.mean_var_norm(feats, lens)

        # Embeddings + speaker classifier
        embeddings = self.modules.embedding_model(feats, lens)
        outputs = self.modules.classifier(embeddings)

        return outputs, lens
    def compute_objectives(self, predictions, batch, stage):
        """Computes the loss using language-id as label."""
        predictions, lens = predictions
        uttid = batch.id
        langid = batch.lang_id_encoded

        if stage == speechbrain.Stage.TRAIN:
            langid = torch.cat([langid], dim=0)
        loss = self.hparams.compute_cost(predictions, langid.unsqueeze(1), lens)

        return loss
    def on_stage_end(self, stage, stage_loss, epoch=None):
        """Gets called at the end of an epoch."""
        stage_stats = {"loss": stage_loss}
        if stage == speechbrain.Stage.VALID:
            self.checkpointer.save_and_keep_only(
                meta={"loss": stage_stats["loss"]},
                min_keys=["loss"],
            )

I guess I need to do something with the classification layer. I deleted it and replaced it with a layer that has the number of output features I need:

import torch.nn as nn

class Identity(nn.Module):
    def forward(self, x):
        return x

# the nn.Linear assignment below immediately overwrites the Identity,
# so only the new 108-output linear layer takes effect
language_id.mods.classifier.out.w = Identity()
language_id.mods.classifier.out.w = nn.Linear(512, 108)

However, in that case I got another error:

RuntimeError: Error(s) in loading state_dict for Classifier:
    size mismatch for out.w.weight: copying a param with shape torch.Size([107, 512]) from checkpoint, the shape in current model is torch.Size([108, 512]).
    size mismatch for out.w.bias: copying a param with shape torch.Size([107]) from checkpoint, the shape in current model is torch.Size([108]).

How can I train the model to predict the new label class?
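For what it's worth, here is what I think the weight-copying step would look like, with plain nn.Linear layers standing in for classifier.out.w (a sketch; the layer shapes and the 512-dim embedding size are taken from the error message). The idea is to replace the layer only after the pretrained state_dict has been loaded, copying the 107 pretrained rows into a new 108-row layer so only the last row starts from scratch:

```python
import torch
import torch.nn as nn

# stand-ins for the pretrained head (107 languages)
# and the enlarged head (107 + 1 new language)
old_head = nn.Linear(512, 107)
new_head = nn.Linear(512, 108)

# copy the pretrained weights into the first 107 rows; row 107 keeps
# its random initialisation and is learned for the new language
with torch.no_grad():
    new_head.weight[:107] = old_head.weight
    new_head.bias[:107] = old_head.bias
```

Replacing the layer after loading should also avoid the size-mismatch error, since the checkpoint is then restored into a model whose shapes still match it.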


Hello everyone!
I have a similar question.
I am training a SpeechBrain speaker identification (SpeakerID) model. Since I have only a few audio recordings of my own, I am fine-tuning a model pre-trained on VoxCeleb (transfer learning). Is it correct to use VoxCeleb for this speaker identification task? I ask because there are other datasets such as AMI, VOICES, etc.

Thank you very much!

Hello everyone!

And how would I go about calculating a performance metric for this SpeakerID task?

I’m trying to change this metric →
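To clarify what I mean, the simplest metric I can think of for SpeakerID is classification accuracy over the predicted speaker indices. A plain-PyTorch sketch (the logits and labels here are made up for illustration):

```python
import torch

# made-up model outputs: 5 utterances scored against 3 enrolled speakers
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.1, 0.2, 3.0],
                       [1.2, 0.3, 0.4],
                       [0.1, 2.2, 0.3]])
labels = torch.tensor([0, 1, 2, 1, 1])

preds = logits.argmax(dim=-1)  # predicted speaker index per utterance
accuracy = (preds == labels).sum().item() / len(labels)
print(accuracy)  # 0.8 (4 of 5 utterances correct)
```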

Thank you very much!