Problem with a new Trainer in version 4.2.0

I’m trying to instantiate a trainer like I did before in version 3.0.2:

trainer = MyTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(3, 0.5)]
)

where MyTrainer is:

class MyTrainer(Trainer):

    def __init__(
            self,
            model: PreTrainedModel,
            args: TrainingArguments,
            data_collator: Optional[DataCollator] = None,
            train_dataset: Optional[Dataset] = None,
            eval_dataset: Optional[Dataset] = None,
            compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None,
            prediction_loss_only=False,
            tb_writer: Optional["SummaryWriter"] = None,
            callbacks: Optional[List[TrainerCallback]] = None,
            optimizers: Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None)
    ):
        super().__init__(model, args, data_collator, train_dataset, eval_dataset, compute_metrics, prediction_loss_only,
                         tb_writer, callbacks, optimizers)

When I try to train the model:

train_result = trainer.train(
        model_path=model_args.model_name_or_path if os.path.isdir(model_args.model_name_or_path) else None
)

I get the following error:

TypeError: False is not a callable object

The model_path is None, just as it was in version 3.0.2, but here it doesn’t work anymore.
The value of model_args.model_name_or_path is bert-base-uncased, specified in run.sh.

How can I solve this problem?
Thanks! 🙂

Hi there!

prediction_loss_only was deprecated in v3.x and has been removed in v4. In its place there is now a model_init argument, which is why you get this error: since your super().__init__ call passes everything positionally, the False meant for prediction_loss_only ends up as model_init, which the Trainer then tries to call. You should change your signature to match:

def __init__(
        self,
        model: Union[PreTrainedModel, torch.nn.Module] = None,
        args: TrainingArguments = None,
        data_collator: Optional[DataCollator] = None,
        train_dataset: Optional[Dataset] = None,
        eval_dataset: Optional[Dataset] = None,
        tokenizer: Optional["PreTrainedTokenizerBase"] = None,
        model_init: Callable[[], PreTrainedModel] = None,
        compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None,
        callbacks: Optional[List[TrainerCallback]] = None,
        optimizers: Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None),
    ):

In general, when passing arguments from one method to another, you should always use the keyword syntax name=value (so here data_collator=data_collator, train_dataset=train_dataset, …), because functions can gain new keyword arguments from one version to the next. If you rely only on the order of the arguments, as here, you can get mismatches that create errors down the line.
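
For example, here is a minimal sketch of a corrected MyTrainer that just forwards everything by keyword (assuming transformers v4.2; add your own custom behaviour back in as needed):

from typing import Callable, Dict, List, Optional, Tuple, Union

import torch
from torch.utils.data import Dataset
from transformers import (
    DataCollator,
    EvalPrediction,
    PreTrainedModel,
    PreTrainedTokenizerBase,
    Trainer,
    TrainerCallback,
    TrainingArguments,
)


class MyTrainer(Trainer):
    def __init__(
        self,
        model: Union[PreTrainedModel, torch.nn.Module] = None,
        args: TrainingArguments = None,
        data_collator: Optional[DataCollator] = None,
        train_dataset: Optional[Dataset] = None,
        eval_dataset: Optional[Dataset] = None,
        tokenizer: Optional[PreTrainedTokenizerBase] = None,
        model_init: Callable[[], PreTrainedModel] = None,
        compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None,
        callbacks: Optional[List[TrainerCallback]] = None,
        optimizers: Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None),
    ):
        # Forward everything by keyword so nothing shifts position if the
        # parent signature changes again in a future version.
        super().__init__(
            model=model,
            args=args,
            data_collator=data_collator,
            train_dataset=train_dataset,
            eval_dataset=eval_dataset,
            tokenizer=tokenizer,
            model_init=model_init,
            compute_metrics=compute_metrics,
            callbacks=callbacks,
            optimizers=optimizers,
        )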


Hi @sgugger ,

thanks a lot, I managed to solve almost everything. The only thing that is still not clear to me is why the tokenizer’s model_max_len value is so big:

PreTrainedTokenizer(name_or_path='PreTrainedTokenizerBase', vocab_size=32005, model_max_len=1000000000000000019884624838656)

Do you have any idea?

I also use --max_seq_length 200.

The max_seq_length does not change the tokenizer’s model max length; that value is inherent to the tokenizer config. It’s set to a very large number when the model is technically capable of handling a sentence of any length.
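
If you want to convince yourself, here is a quick check you could run (a minimal sketch, not taken from the example script; bert-base-uncased is just the checkpoint from your run.sh):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# model_max_length comes from the tokenizer config; tokenizers with no
# configured limit report a huge sentinel value instead.
print(tokenizer.model_max_length)

# The length actually enforced is whatever you request at encoding time,
# which is roughly what --max_seq_length controls in the example scripts.
encoding = tokenizer("some long text " * 500, truncation=True, max_length=200)
print(len(encoding["input_ids"]))  # at most 200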


Ok, thanks a lot, then I won’t worry about it!