@sgugger Progress Update Aug 4 -> Aug 19

Following @sshleifer's example, here is what I worked on over the past few weeks and what I plan to work on in the near future.

Model outputs:

Finished cleaning up all model outputs and made sure PyTorch and TensorFlow have the same API.

Documentation:

  • Worked on cleaning up and updating the docstrings and documentation of the main classes: config, tokenizer, models and pipelines.
  • Set up automatic conversion of the tutorials into notebooks and added the “open in colab” button.

Trainer:

  • A bit of clean-up to make sure Trainer and TFTrainer have the same API.
  • Exposed the customization points for users who want to subclass Trainer and override its behavior.
  • Initial work to add hyperparameter search (see #6576).
  • Initial work to have an easy bridge between :hugs: nlp and Trainer (see #6449).

Repository consistency:

People love the fact that each model file is self-contained and the code is not refactored, since they can then experiment quickly, but this can be hard to maintain! Added a script that checks that all models are tested (by the common tests) and documented.

Funnel Transformer:

Paper - Initial work to understand the implementation and port it to :hugs: Transformers. PyTorch version is almost done.

Plans:

  • Continue the work on Trainer with hyperparameter search and :hugs: nlp interface.
  • Work on refactoring all examples to use Trainer and :hugs: nlp.
  • Finish porting Funnel Transformer.

Very excited about Optuna integration and Funnel Transformer! Also absolutely love your code reviews and suggestions!

+1 to that. Your code reviews have been super helpful and appreciated.

Optuna integration would be ground-breaking, at least for my use case. Until now I have been writing my own config class that acts as an iterator over all possible hyperparameter combinations and just loops over them. It is tedious to maintain. Being able to rely on a well-tested, well-supported package would be a load off my mind!
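
For anyone curious, the hand-rolled approach described above looks roughly like the minimal sketch below. The class name `GridConfig` and the hyperparameter names are hypothetical and only for illustration; this is a plain exhaustive grid built on the standard library, not how Optuna or Trainer work.

```python
from itertools import product


class GridConfig:
    """Hypothetical config class that iterates over every hyperparameter combination."""

    def __init__(self, **search_space):
        # Each keyword maps a hyperparameter name to the list of values to try.
        self.search_space = search_space

    def __len__(self):
        # Total number of combinations is the product of the list lengths.
        total = 1
        for values in self.search_space.values():
            total *= len(values)
        return total

    def __iter__(self):
        # Yield one dict per combination, e.g. {"learning_rate": 1e-5, "batch_size": 16, ...}.
        names = list(self.search_space)
        for combo in product(*(self.search_space[name] for name in names)):
            yield dict(zip(names, combo))


# Illustrative values only.
grid = GridConfig(learning_rate=[1e-5, 3e-5], batch_size=[16, 32], seed=[0])
for params in grid:
    print(params)  # a real loop would launch one training run per combination
```

The grid grows multiplicatively with every hyperparameter added, which is a big part of why this gets tedious to maintain by hand.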

Looking forward to Optuna integration.

Great to see that you’re working on the Funnel Transformer implementation :heart:

I’ve already trained a model from scratch and can’t wait to properly evaluate it with Transformers :hugs:
