Jointly train two-stage models using Trainer

Hi all,

I want to train a two-stage model containing model1 and model2, in which model2 takes model1’s output as input. For now, I can train model2 on its own through the Hugging Face Seq2SeqTrainer, but I have no clue how to jointly train model1 and model2 through the Hugging Face Trainer. Could someone give me some advice? Thank you very much.

I don’t think this is possible. For such a use case, you should definitely check out Accelerate and write your own custom training loop.
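For example, here is a minimal sketch of such a loop, assuming both stages are plain PyTorch modules. The Stage1/Stage2 classes and the random dataset are toy placeholders for your actual models and data:

```python
# Minimal sketch of jointly training two stages with Accelerate.
# Stage1/Stage2 and the dataset are toy placeholders for your real setup.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

class Stage1(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(16, 32)

    def forward(self, x):
        return torch.relu(self.proj(x))

class Stage2(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(32, 2)

    def forward(self, hidden, labels):
        logits = self.head(hidden)
        return nn.functional.cross_entropy(logits, labels)

accelerator = Accelerator()
model1, model2 = Stage1(), Stage2()

# One optimizer over the parameters of both stages, so they are updated jointly.
optimizer = torch.optim.AdamW(
    list(model1.parameters()) + list(model2.parameters()), lr=5e-5
)

dataset = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
loader = DataLoader(dataset, batch_size=8, shuffle=True)

model1, model2, optimizer, loader = accelerator.prepare(
    model1, model2, optimizer, loader
)

model1.train()
model2.train()
for epoch in range(3):
    for inputs, labels in loader:
        hidden = model1(inputs)        # stage 1
        loss = model2(hidden, labels)  # stage 2 consumes stage 1's output
        accelerator.backward(loss)     # gradients flow back through both stages
        optimizer.step()
        optimizer.zero_grad()
```

The key points are a single optimizer over both parameter sets and one backward pass through the composed graph, so both models are updated jointly.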

Shouldn’t a custom model work with the Trainer? A simple model that inherits from PreTrainedModel and contains the two models with a custom forward method should do it.
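Something like this minimal sketch, for example. TwoStageConfig, TwoStageModel, and the classification head are illustrative placeholders I made up, not an existing transformers API:

```python
# One PreTrainedModel wrapping two pretrained models, with a custom forward
# that returns a "loss" key so Trainer can train both stages jointly.
import torch
from transformers import AutoModel, PretrainedConfig, PreTrainedModel

class TwoStageConfig(PretrainedConfig):
    model_type = "two_stage"

    def __init__(self, stage1_name="bert-base-uncased",
                 stage2_name="bert-base-uncased", **kwargs):
        self.stage1_name = stage1_name
        self.stage2_name = stage2_name
        super().__init__(**kwargs)

class TwoStageModel(PreTrainedModel):
    config_class = TwoStageConfig

    def __init__(self, config):
        super().__init__(config)
        # Both stages live inside one module, so Trainer sees a single model.
        self.model1 = AutoModel.from_pretrained(config.stage1_name)
        self.model2 = AutoModel.from_pretrained(config.stage2_name)
        self.classifier = torch.nn.Linear(self.model2.config.hidden_size,
                                          config.num_labels)

    def forward(self, input_ids, attention_mask=None, labels=None):
        # Stage 1 encodes the input; stage 2 consumes its hidden states.
        hidden = self.model1(input_ids=input_ids,
                             attention_mask=attention_mask).last_hidden_state
        outputs = self.model2(inputs_embeds=hidden,
                              attention_mask=attention_mask)
        logits = self.classifier(outputs.last_hidden_state[:, 0])  # placeholder pooling
        loss = None
        if labels is not None:
            loss = torch.nn.functional.cross_entropy(logits, labels)
        # Trainer reads the loss from the "loss" key of the returned dict.
        return {"loss": loss, "logits": logits}

model = TwoStageModel(TwoStageConfig())  # then pass `model` to Trainer as usual
```

Since the returned dict has a "loss" key, the Trainer’s standard loop backpropagates through both stages at once.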

I’m new here and researching the possibility of building custom models within the Hugging Face ecosystem. Am I wasting my time?

Example:
https://stackoverflow.com/questions/70814490/uploading-models-with-custom-forward-functions-to-the-huggingface-model-hub

I have a similar case as well: a model with two towers. Basically, two models initialized with the from_pretrained method.

I can confirm that the Trainer fails with this kind of setup when using multiple GPUs.

If we want to do that, we need to follow the CLIP implementation, which amounts to re-inventing the wheel.

@sgugger So, do you suggest doing this with Accelerate instead?