Hey everyone, I’m new to the forums but have been following Hugging Face and Transformers for the last few years.
I’m starting my MSc thesis, where I’ll be fine-tuning transformer models for text classification. However, I’m not sure of the best way to structure my code.
I plan to run multiple experiments that vary different parameters, such as the dataset preprocessing, the language-model head structure, and so on. I’m wondering what the most efficient way to implement the fine-tuning experiments is: a single script with multiple command-line arguments? Or wrapping the fine-tuning code in a Python class and creating different instances with different parameters? For the first option, I have something like the sketch below in mind.
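Just to make that concrete, here’s a rough sketch of the "single script with arguments" idea (the model name, dataset, and hyperparameters are only placeholders for my actual setup, not a finished implementation):

```python
# run_experiment.py -- sketch of one fine-tuning run driven entirely by CLI arguments.
import argparse

from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", default="bert-base-uncased")
    parser.add_argument("--dataset_name", default="imdb")
    parser.add_argument("--max_length", type=int, default=256)
    parser.add_argument("--learning_rate", type=float, default=2e-5)
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--output_dir", default="./results")
    args = parser.parse_args()

    tokenizer = AutoTokenizer.from_pretrained(args.model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        args.model_name, num_labels=2
    )

    # Dataset preprocessing would be another experiment "knob" exposed as arguments.
    dataset = load_dataset(args.dataset_name)
    dataset = dataset.map(
        lambda batch: tokenizer(
            batch["text"], truncation=True, max_length=args.max_length
        ),
        batched=True,
    )

    training_args = TrainingArguments(
        output_dir=args.output_dir,
        learning_rate=args.learning_rate,
        num_train_epochs=args.epochs,
        per_device_train_batch_size=16,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=dataset["train"],
        eval_dataset=dataset["test"],
        tokenizer=tokenizer,
    )
    trainer.train()


if __name__ == "__main__":
    main()
```

Each experiment would then just be a different invocation, e.g. `python run_experiment.py --model_name roberta-base --learning_rate 1e-5`, and I could log the arguments alongside the results. The class-based option would essentially move these arguments into constructor parameters instead.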
Then there’s the question of where to run this code. I don’t have physical access to a large machine with a GPU; at best I’ll be getting Google Cloud credits to rent a machine instance with a GPU/TPU. In that case, how should I package my code for quick access? Upload it to a GitHub repo and pull it for every experiment? Create a Docker image that encapsulates everything?
These may be silly questions, but I don’t have any close contacts with extensive experience in state-of-the-art NLP, and many papers don’t go deep into how their code is structured. Any help will be greatly appreciated!