Difference between language modeling scripts


I would like to fine-tune a language model (TensorFlow-based) and see that you have two scripts pertaining to that:

  1. https://github.com/huggingface/notebooks/blob/master/examples/language_modeling-tf.ipynb

  2. https://github.com/huggingface/transformers/blob/master/examples/tensorflow/language-modeling/run_mlm.py

I followed the first script and am quite satisfied with the result. However, I am curious whether the second one could also be an option for me.

Could you please outline the differences between these two scripts and the circumstances under which each one should be used? Alternatively, could you point me to documentation that answers this question, if such documentation exists?

I apologize in advance if my question sounds ignorant; I am quite a newbie in machine learning and am learning it now with HuggingFace ^^

Thank you!

Hello Lenn,

The first one is an end-to-end tutorial on how to train a masked language model (MLM). It walks you through everything you need to do (preprocessing etc.) to train an MLM, and helps you understand what's going on in the background, the problem itself, and so on.
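For instance, one preprocessing step that kind of tutorial typically walks through is concatenating the tokenized texts and chunking them into fixed-size blocks before MLM training. Here is a minimal, self-contained sketch of that idea in plain Python; the function name `group_texts` and the toy token lists are illustrative, not the notebook's exact code:

```python
# Illustrative sketch of the block-grouping step commonly done before
# masked-language-model training: concatenate all tokenized examples,
# then split the stream into fixed-size blocks, dropping the leftover
# tail that doesn't fill a full block.

def group_texts(token_lists, block_size):
    """Concatenate tokenized examples and split into blocks of block_size."""
    concatenated = [tok for tokens in token_lists for tok in tokens]
    total_length = (len(concatenated) // block_size) * block_size
    return [
        concatenated[i : i + block_size]
        for i in range(0, total_length, block_size)
    ]

# Toy "tokenized" examples; real code would use a tokenizer's input_ids.
examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
blocks = group_texts(examples, block_size=4)
print(blocks)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

The tutorial notebook explains why this is done (efficient batching at a fixed sequence length), which is exactly the kind of background the standalone script skips over.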
The second one is a more production-style script with no EDA. In my former job I used scripts like run_glue.py to train or fine-tune a language model directly, like this:

!python run_glue.py \
  --model_name_or_path "model_checkpoint" \
  --task_name cola \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_gpu_train_batch_size 32 \
  --per_gpu_eval_batch_size 32 \
  --learning_rate 2e-5 \
  --save_steps 2000 \
  --num_train_epochs 50 \
  --output_dir 'output_dir_here'
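The run_mlm.py script you linked works the same way, just with MLM-specific arguments instead of GLUE task ones. As a rough sketch (the dataset name and hyperparameter values here are placeholders you would swap for your own):

```shell
# Hedged example invocation of run_mlm.py; dataset and hyperparameters
# are placeholders, not recommended values.
python run_mlm.py \
  --model_name_or_path distilroberta-base \
  --dataset_name wikitext \
  --dataset_config_name wikitext-2-raw-v1 \
  --do_train \
  --do_eval \
  --output_dir 'output_dir_here'
```

Run `python run_mlm.py --help` to see the full list of supported arguments for the version of the script you have.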