In Chapter 7, Token classification, the compute_metrics() function uses argmax() instead of softmax(); I have attached it below. I'm wondering why this does not cause problems with gradient back-propagation. argmax() is non-differentiable, so I would expect issues, but the example clearly works, so I must be missing something here (a small sketch of what I mean by non-differentiable is at the end of this post).
link:
code:
import numpy as np


def compute_metrics(eval_preds):
    logits, labels = eval_preds
    # Predicted label id for each token (label_names and the seqeval metric
    # object are defined earlier in the chapter)
    predictions = np.argmax(logits, axis=-1)

    # Remove ignored index (special tokens) and convert to labels
    true_labels = [[label_names[l] for l in label if l != -100] for label in labels]
    true_predictions = [
        [label_names[p] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]
    all_metrics = metric.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": all_metrics["overall_precision"],
        "recall": all_metrics["overall_recall"],
        "f1": all_metrics["overall_f1"],
        "accuracy": all_metrics["overall_accuracy"],
    }
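
For reference, here is a minimal sketch of what I mean by argmax() being non-differentiable (assuming PyTorch, which the chapter uses; the shapes and label values below are made up for illustration):

import torch

# Random logits for a 3-token, 5-label toy example
logits = torch.randn(3, 5, requires_grad=True)
labels = torch.tensor([1, 0, 4])

# Cross-entropy on the raw logits is differentiable, so backward() works
loss = torch.nn.functional.cross_entropy(logits, labels)
loss.backward()
print(logits.grad is not None)  # True: gradients flow through the loss

# argmax() returns integer indices with no grad_fn, so nothing can
# back-propagate through it
predictions = logits.argmax(dim=-1)
print(predictions.requires_grad)  # False: the graph stops here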