AttributeError                            Traceback (most recent call last)
in <cell line: 0>()
      1 # Initialize RewardTrainer
----> 2 trainer = RewardTrainer(
      3     model=model,
      4     args=training_args,
      5     tokenizer=tokenizer,

1 frames
/usr/local/lib/python3.11/dist-packages/trl/trainer/reward_trainer.py in __init__(self, model, args, data_collator, train_dataset, eval_dataset, processing_class, model_init, compute_metrics, callbacks, optimizers, preprocess_logits_for_metrics, peft_config)
    167
    168         # Disable dropout in the model
--> 169         if args.disable_dropout:
    170             disable_dropout_in_model(model)
    171

AttributeError: 'TrainingArguments' object has no attribute 'disable_dropout'
Instead of TrainingArguments, use RewardConfig from trl. RewardTrainer expects its own config class, which is what defines the disable_dropout field the traceback complains about:
from trl import RewardConfig

training_args = RewardConfig(
    output_dir="./results",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    eval_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    logging_steps=100,
    learning_rate=5e-5,
    weight_decay=0.01,
    num_train_epochs=3,
    disable_dropout=True,  # defined on RewardConfig, not on plain TrainingArguments
)

trainer = RewardTrainer(
    model=model,
    args=training_args,
    processing_class=tokenizer,  # recent trl versions use processing_class instead of tokenizer
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
If you still want to use TrainingArguments, you can manually disable dropout in your model before passing it to RewardTrainer. Note that recent trl versions still read args.disable_dropout inside RewardTrainer, so this alone may not avoid the error above:
from trl.trainer.utils import disable_dropout_in_model
disable_dropout_in_model(model) # Manually disable dropout
However, using RewardConfig is the recommended approach.
Thank you, but what happened here:
Actually, when I run the given code from Hugging Face, it throws an error. I just copy-pasted it.
Is it possible that you have an old version of trl?
pip install -U trl transformers peft accelerate huggingface_hub
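If it helps, you can confirm which versions actually ended up installed (a quick check, nothing TRL-specific):

import trl, transformers
print(trl.__version__, transformers.__version__)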
Hello, use this snippet:

from trl import RewardConfig, RewardTrainer

training_args = RewardConfig(output_dir="Qwen2.5-0.5B-Reward", per_device_train_batch_size=2)

Reference: https://github.com/huggingface/trl
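For context, here is a fuller sketch built around that config, loosely following the reward-modeling example in the TRL repository; the Qwen/Qwen2.5-0.5B-Instruct model and trl-lib/ultrafeedback_binarized dataset are just the example's placeholders, so swap in your own:

from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from trl import RewardConfig, RewardTrainer

# Placeholder model name; replace with the base model you want to turn into a reward model
model_name = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)
model.config.pad_token_id = tokenizer.pad_token_id

# A preference dataset with "chosen"/"rejected" columns
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

training_args = RewardConfig(output_dir="Qwen2.5-0.5B-Reward", per_device_train_batch_size=2)
trainer = RewardTrainer(
    model=model,
    args=training_args,
    processing_class=tokenizer,
    train_dataset=train_dataset,
)
trainer.train()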