SageMaker Debugger Hugging Face Training Report

Hi Team,

Good day!!

We are using the Hugging Face estimator to train a BERT model on SageMaker. We would like to build a training report similar to the SageMaker Debugger XGBoost Training Report and analyse graphs such as Loss vs Step, Loss vs Epochs, and Accuracy vs Epochs.

Thanks @philschmid for sharing the "training with custom metrics" example. With it we can see timestamp vs metric graphs, but we are interested in the graphs listed above.

Could you please help?

Thanks & Regards,
Vinayak

Hey @Vinayaks117,
If your training script logs the metrics you want to capture, you can add a regex pattern for each of them to your estimator's metric definitions. SageMaker will then capture those metrics from the logs and you can visualize them.
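As a rough sketch of what such patterns could look like: the list below follows the `Name`/`Regex` shape that the estimator's `metric_definitions` argument expects. The metric names, the regexes, and the sample log line are assumptions for illustration; they assume the training script prints its logs as Python dicts, and the patterns only handle plain decimal values. The helper `extract_metrics` is hypothetical and only checks the patterns locally with Python's `re`, which approximates but does not guarantee how SageMaker applies them.

```python
import re

# Hypothetical metric definitions. The regexes assume log lines such as
# {'loss': 0.4213, 'learning_rate': 3e-05, 'epoch': 1.0}
# and capture plain decimal numbers only (no scientific notation).
metric_definitions = [
    {"Name": "loss", "Regex": r"'loss': ([0-9.]+)"},
    {"Name": "eval_loss", "Regex": r"'eval_loss': ([0-9.]+)"},
    {"Name": "eval_accuracy", "Regex": r"'eval_accuracy': ([0-9.]+)"},
    {"Name": "epoch", "Regex": r"'epoch': ([0-9.]+)"},
]


def extract_metrics(log_line, definitions=metric_definitions):
    """Local check only: return the metrics each regex finds in one log line."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], log_line)
        if match:
            found[definition["Name"]] = float(match.group(1))
    return found


# Assumed sample line, not taken from a real training job.
sample = "{'loss': 0.4213, 'learning_rate': 3e-05, 'epoch': 1.0}"
print(extract_metrics(sample))
```

If the patterns match your real log lines, the same list could be passed as `metric_definitions=metric_definitions` when constructing the estimator.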

Hello @philschmid

Yes, we followed your "training with custom metrics" example and were able to plot timestamp vs metric graphs, but we want to create the following graphs:

  1. Loss vs Step graph
  2. Loss vs Epochs graph
  3. Accuracy vs Epochs graph

According to the AWS docs, SageMaker Debugger does not support Hugging Face.

I believe we could create the above graphs, like those in the SageMaker Debugger XGBoost Training Report, if SageMaker Debugger supported Hugging Face.

Thanks

SageMaker Debugger should support HF as well, since we are building on top of PyTorch and TensorFlow.
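Independently of Debugger, the three graphs requested earlier in the thread could probably be built from the training logs themselves. The sketch below assumes the script uses the transformers `Trainer`, whose `trainer.state.log_history` is a list of dicts that carry `step` and `epoch` alongside the logged values. The helper `series` and the sample entries are hypothetical, and the `eval_accuracy` key assumes a `compute_metrics` function that returns an `accuracy` value.

```python
def series(log_history, x_key, y_key):
    """Return (xs, ys) from the log entries that contain both keys."""
    points = [
        (entry[x_key], entry[y_key])
        for entry in log_history
        if x_key in entry and y_key in entry
    ]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return xs, ys


# Hypothetical entries shaped like trainer.state.log_history;
# the values are made up for illustration.
log_history = [
    {"loss": 0.69, "step": 100, "epoch": 0.5},
    {"loss": 0.52, "step": 200, "epoch": 1.0},
    {"eval_loss": 0.48, "eval_accuracy": 0.81, "step": 200, "epoch": 1.0},
    {"loss": 0.40, "step": 300, "epoch": 1.5},
    {"loss": 0.33, "step": 400, "epoch": 2.0},
    {"eval_loss": 0.41, "eval_accuracy": 0.86, "step": 400, "epoch": 2.0},
]

loss_vs_step = series(log_history, "step", "loss")
loss_vs_epoch = series(log_history, "epoch", "loss")
accuracy_vs_epoch = series(log_history, "epoch", "eval_accuracy")
```

Each result is a pair of lists that can be handed to a plotting library, for example `plt.plot(*loss_vs_step)` with matplotlib. To use this outside the training job, the script would need to save `log_history` with the other job outputs.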