OsOne
May 26, 2021, 8:12am
Hi HuggingFace Team
We are at the beginning of a new DL project in which we work with transformers in TF2.
Before we start, we'd love to know your plans for a TF2 training framework.
While searching for answers, we came across a GitHub issue mentioning that TFTrainer will be changed or removed.
Issue opened 04 May 2021, 10:38PM UTC; closed 10 Jun 2021, 09:18PM UTC.
## Environment info
- `transformers` version: 4.5.1
- Platform: ubuntu 18.04
- Python version: 3.6.9
- PyTorch version (GPU?):
- Tensorflow version (GPU?): tensorflow-gpu==2.4.1
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
### Who can help
@patil-suraj @Rocketknight1
## Information
I'm using TFT5ForConditionalGeneration for a masked language modelling task. During training, GPU utilisation stays above 95%, but as soon as evaluation starts it drops to 0% and evaluation is slow. Even though the [evaluate function is inside `strategy.scope()`](https://github.com/huggingface/transformers/blob/c065025c4720a87783ca504a9018454893d00649/src/transformers/trainer_tf.py#L580), it does not use the GPU.
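For context, the pattern involved can be sketched as follows. This is a hedged illustration, not the library's actual internals: `eval_step` and the stand-in model below are made up for the example. The point is that `strategy.scope()` mainly controls where variables are created; the evaluation computation itself still has to be traced (e.g. as a `tf.function`) and dispatched through the strategy to land on the GPU.

```python
import tensorflow as tf

# Default (no-op) strategy on a single device; on multi-GPU this would
# typically be a tf.distribute.MirroredStrategy().
strategy = tf.distribute.get_strategy()

with strategy.scope():
    # Stand-in for a transformers TF model; variables created here are
    # placed according to the strategy.
    model = tf.keras.Sequential([tf.keras.layers.Dense(2)])

@tf.function
def eval_step(features):
    # Inside a tf.function, ops run in graph mode and are placed on the
    # devices the strategy manages (GPU when one is available).
    return model(features, training=False)

batch = tf.random.normal((4, 3))
logits = strategy.run(eval_step, args=(batch,))
```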
The problem arises when using:
* [x] the official example scripts: (give details below)
I'm using the official TFTrainer example script, with `run_tf_glue.py` modified slightly for custom data input.
The task I am working on is:
* [x] my own task or dataset: (give details below)
The final train_dataset and eval_dataset (the inputs to TFTrainer) have the form `({"input_ids": , "attention_mask": , "decoder_attention_mask": }, labels)`.
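As a hedged illustration of that input format, here is one way such a dataset can be built with `tf.data`; the toy tensors and the `make_dataset` helper are invented for the example and are not part of the issue's actual code:

```python
import tensorflow as tf

def make_dataset(input_ids, attention_mask, decoder_attention_mask, labels, batch_size=2):
    # Features go in a dict keyed by the model's input names; labels are
    # the second element of the (features, labels) tuple.
    features = {
        "input_ids": tf.constant(input_ids),
        "attention_mask": tf.constant(attention_mask),
        "decoder_attention_mask": tf.constant(decoder_attention_mask),
    }
    return tf.data.Dataset.from_tensor_slices(
        (features, tf.constant(labels))
    ).batch(batch_size)

# Toy example: 4 sequences of length 5.
ids = [[1, 2, 3, 4, 5]] * 4
mask = [[1, 1, 1, 1, 0]] * 4
dec_mask = [[1, 1, 1, 0, 0]] * 4
labels = [[2, 3, 4, 5, 0]] * 4

ds = make_dataset(ids, mask, dec_mask, labels)
features, y = next(iter(ds))
```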
## To reproduce
Steps to reproduce the behavior:
I tried reproducing the error using `run_tf_squad.py` and `run_tf_glue.py`, but both scripts raised errors because their inputs were not compatible with the trainer. Only the MRPC task worked, but with only 400 evaluation examples it was hard to tell; the rest simply failed.
If possible, I would like to contribute to TFTrainer, both to make evaluation run on the GPU and to process the SQuAD and GLUE datasets so their dimensions match TFTrainer's inputs. Guidance is really appreciated.
## Expected behavior
We were hoping you could shed more light on your plans for integrating the transformers library with TF2. More concretely:
Do you intend to release a TF Trainer?
Will it be using Keras?
Any date expectations?
Thanks,
Hi there!
Yes, the TFTrainer will be deprecated and removed in v5; we will focus on better integration with Keras (through Keras callbacks when we need to add functionality). Check out the new classification example for a sense of where we are going.
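A minimal sketch of that Keras-native direction, using a tiny stand-in `tf.keras` model rather than an actual transformers model (with transformers you would plug in a TF model class such as `TFAutoModelForSequenceClassification` instead): training becomes `compile()` + `fit()`, and functionality TFTrainer used to provide moves into Keras callbacks.

```python
import tensorflow as tf

# Tiny stand-in model; a transformers TF model is also a tf.keras.Model,
# so it would slot into the same compile/fit workflow.
inputs = tf.keras.Input(shape=(4,))
hidden = tf.keras.layers.Dense(8, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(2)(hidden)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(5e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Extra behaviour (logging, checkpoints, ...) is added via callbacks
# instead of a custom trainer; this hypothetical one just records the loss.
class LossLogger(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        self.last_loss = logs["loss"]

logger = LossLogger()
x = tf.random.normal((16, 4))
y = tf.random.uniform((16,), maxval=2, dtype=tf.int32)
history = model.fit(x, y, epochs=1, batch_size=8, verbose=0, callbacks=[logger])
```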