I followed the first script and am quite satisfied with the result. However, I am curious whether the second one could also be an option for me.
Could you please outline the differences between these two scripts and the circumstances under which each one should be used? Alternatively, could you please provide a link to documentation, if any exists, that answers my question?
I apologize in advance if my question sounds ignorant. I am quite a newbie in machine learning and am learning it now with HuggingFace ^^
The first one is an end-to-end tutorial on how to train a masked language model (MLM). It shows what you need to do (preprocessing, etc.) if you want to train an MLM, and it helps you understand what’s going on in the background, the problem itself, and so on.
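To make the preprocessing step a bit more concrete, here is a minimal sketch of the kind of token masking an MLM is trained on. It assumes the BERT-style rule (about 15% of tokens are selected; of those, 80% become `[MASK]`, 10% become a random token, and 10% stay unchanged). The function `mask_tokens` and its parameters are hypothetical names for illustration and are not taken from either script.

```python
import random

MASK = "[MASK]"  # placeholder mask token; real tokenizers define their own


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels).

    labels[i] holds the original token where a prediction is required,
    and None everywhere else.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover this token
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK)               # 80%: replace with [MASK]
            elif roll < 0.9:
                masked.append(rng.choice(vocab))  # 10%: random token
            else:
                masked.append(tok)                # 10%: keep unchanged
        else:
            labels.append(None)  # not selected, no loss on this position
            masked.append(tok)
    return masked, labels


if __name__ == "__main__":
    words = "the cat sat on the mat".split()
    print(mask_tokens(words, vocab=words, mask_prob=0.5, seed=0))
```

In practice you would not write this by hand: in `transformers`, `DataCollatorForLanguageModeling` (with its `mlm_probability` argument) applies this kind of masking on tokenized batches.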
The second one is a more production-y script with no EDA (exploratory data analysis). In my former job I used scripts like that, e.g. run_glue.py, to train/fine-tune language models directly, like this: