How to fine-tune T5-base model?

I want to fine-tune the T5 model, but there is an issue when running this script.

My code:

!python /content/transformers/examples/language-modeling/run_clm.py \
    --model_name_or_path t5-base \
    --train_file /content/train.txt \
    --do_train \
    --learning_rate=1e-4 \
    --per_device_train_batch_size=4 \
    --output_dir /tmp/test-clm

This code is not working. How can I fine-tune the t5-base model?


Please use a preformatted text block for your code; you can create this block automatically by pressing Ctrl+E:

!python /content/transformers/examples/language-modeling/run_clm.py \
    --model_name_or_path t5-base \
    --train_file /content/train.txt \
    --do_train \
    --learning_rate=1e-4 \
    --per_device_train_batch_size=4 \
    --output_dir /tmp/test-clm

Can you share any kind of error or unexpected output with us?

!python '/content/gdrive/MyDrive/Colab Notebooks/T5_generation/transformers/examples/pytorch/language-modeling/run_clm.py' \
    --model_name_or_path 't5-base' \
    --train_file /content/train.txt \
    --do_train \
    --learning_rate=1e-4 \
    --per_device_train_batch_size=4 \
    --output_dir /tmp/test-clm

Traceback (most recent call last):
  File "/content/gdrive/MyDrive/Colab Notebooks/T5_generation/transformers/examples/pytorch/language-modeling/run_clm.py", line 31, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'

I got this error when I ran this code.

You need to install the datasets package too.

pip install datasets

Thank you.

Traceback (most recent call last):
  File "/content/gdrive/MyDrive/Colab Notebooks/T5_generation/transformers/examples/pytorch/language-modeling/run_clm.py", line 31, in <module>
    import datasets
  File "/usr/local/lib/python3.7/dist-packages/datasets/__init__.py", line 33, in <module>
    from .arrow_dataset import Dataset, concatenate_datasets
  File "/usr/local/lib/python3.7/dist-packages/datasets/arrow_dataset.py", line 42, in <module>
    from datasets.tasks.text_classification import TextClassification
  File "/usr/local/lib/python3.7/dist-packages/datasets/tasks/__init__.py", line 3, in <module>
    from ..utils.logging import get_logger
  File "/usr/local/lib/python3.7/dist-packages/datasets/utils/__init__.py", line 21, in <module>
    from .download_manager import DownloadManager, GenerateMode
  File "/usr/local/lib/python3.7/dist-packages/datasets/utils/download_manager.py", line 26, in <module>
    from .file_utils import (
  File "/usr/local/lib/python3.7/dist-packages/datasets/utils/file_utils.py", line 27, in <module>
    from tqdm.contrib.concurrent import thread_map
ModuleNotFoundError: No module named 'tqdm.contrib.concurrent'

I got this error and don’t know how to fix it.

The last line shows the exact error; the other lines are just the traceback leading up to it. Whenever you face an error saying

ModuleNotFoundError: No module named 'xyz'

it means the code you’re trying to run uses a module named xyz that cannot be found, and you can solve it by installing that module.
For this particular case, installing tqdm should solve your problem.

pip install tqdm

In this case it’s because you need a newer version of tqdm, so you might need to do

pip install tqdm --upgrade

@SMMousavi @sgugger Thank you so much. I solved that issue with your help.


I think the T5 model can’t be fine-tuned using “language-modeling/run_clm.py”.

cmd = '''
python '/content/gdrive/MyDrive/Colab Notebooks/T5_generation/transformers/examples/pytorch/language-modeling/run_clm.py' \
    --model_name_or_path t5-base \
    --train_file {0} \
    --do_train \
    --num_train_epochs 3 \
    --overwrite_output_dir \
    --per_device_train_batch_size 2 \
    --output_dir {1}
'''.format(file_name, weights_dir)

!{cmd}

The script started running, but then it produced this error.

2021-07-22 07:43:57.168589: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
07/22/2021 07:43:59 - WARNING - __main__ - Process rank: -1, device: cpu, n_gpu: 0distributed training: False, 16-bits training: False
07/22/2021 07:43:59 - INFO - __main__ - Training/evaluation parameters TrainingArguments(
_n_gpu=0,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_find_unused_parameters=None,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_steps=None,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
gradient_accumulation_steps=1,
greater_is_better=None,
group_by_length=False,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=-1,
log_level_replica=-1,
log_on_each_node=True,
logging_dir=output/runs/Jul22_07-43-59_f9ae03178b49,
logging_first_step=False,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
output_dir=output,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=2,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=output,
push_to_hub_organization=None,
push_to_hub_token=None,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=output,
save_on_each_node=False,
save_steps=500,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
07/22/2021 07:43:59 - WARNING - datasets.builder - Using custom data configuration default-d468d4eee4ec0b5d
07/22/2021 07:43:59 - INFO - datasets.builder - Generating dataset text (/root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5)
Downloading and preparing dataset text/default (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5...
100% 1/1 [00:00<00:00, 2647.92it/s]
07/22/2021 07:43:59 - INFO - datasets.utils.download_manager - Downloading took 0.0 min
07/22/2021 07:43:59 - INFO - datasets.utils.download_manager - Checksum Computation took 0.0 min
100% 1/1 [00:00<00:00, 128.73it/s]
07/22/2021 07:43:59 - INFO - datasets.utils.info_utils - Unable to verify checksums.
07/22/2021 07:43:59 - INFO - datasets.builder - Generating split train
07/22/2021 07:43:59 - INFO - datasets.utils.info_utils - Unable to verify splits sizes.
Dataset text downloaded and prepared to /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5. Subsequent calls will reuse this data.
100% 1/1 [00:00<00:00, 714.78it/s]
07/22/2021 07:43:59 - WARNING - datasets.builder - Using custom data configuration default-d468d4eee4ec0b5d
07/22/2021 07:43:59 - INFO - datasets.builder - Overwrite dataset info from restored data version.
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
07/22/2021 07:43:59 - WARNING - datasets.builder - Reusing dataset text (/root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5)
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
07/22/2021 07:43:59 - WARNING - datasets.builder - Using custom data configuration default-d468d4eee4ec0b5d
07/22/2021 07:43:59 - INFO - datasets.builder - Overwrite dataset info from restored data version.
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
07/22/2021 07:43:59 - WARNING - datasets.builder - Reusing dataset text (/root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5)
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
[INFO|configuration_utils.py:545] 2021-07-22 07:43:59,790 >> loading configuration file https://huggingface.co/t5-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/91e9fe874e06c44883b535d6c950b8b89d6eaa3298d8e7fb3b2c78039e9f8b7b.66b9637a52aa11e9285cdd6e668cc0df14b3bcf0b6674cf3ba5353c542649637
[INFO|configuration_utils.py:581] 2021-07-22 07:43:59,790 >> Model config T5Config {
  "architectures": [
    "T5WithLMHeadModel"
  ],
  "d_ff": 3072,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "gradient_checkpointing": false,
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "n_positions": 512,
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_num_buckets": 32,
  "task_specific_params": {
    "summarization": {
      "early_stopping": true,
      "length_penalty": 2.0,
      "max_length": 200,
      "min_length": 30,
      "no_repeat_ngram_size": 3,
      "num_beams": 4,
      "prefix": "summarize: "
    },
    "translation_en_to_de": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to German: "
    },
    "translation_en_to_fr": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to French: "
    },
    "translation_en_to_ro": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to Romanian: "
    }
  },
  "transformers_version": "4.9.0.dev0",
  "use_cache": true,
  "vocab_size": 32128
}

[INFO|tokenization_auto.py:432] 2021-07-22 07:43:59,816 >> Could not locate the tokenizer configuration file, will try to use the model config instead.
[INFO|configuration_utils.py:545] 2021-07-22 07:43:59,841 >> loading configuration file https://huggingface.co/t5-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/91e9fe874e06c44883b535d6c950b8b89d6eaa3298d8e7fb3b2c78039e9f8b7b.66b9637a52aa11e9285cdd6e668cc0df14b3bcf0b6674cf3ba5353c542649637
[INFO|configuration_utils.py:581] 2021-07-22 07:43:59,842 >> Model config T5Config {
  "architectures": [
    "T5WithLMHeadModel"
  ],
  "d_ff": 3072,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "gradient_checkpointing": false,
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "n_positions": 512,
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_num_buckets": 32,
  "task_specific_params": {
    "summarization": {
      "early_stopping": true,
      "length_penalty": 2.0,
      "max_length": 200,
      "min_length": 30,
      "no_repeat_ngram_size": 3,
      "num_beams": 4,
      "prefix": "summarize: "
    },
    "translation_en_to_de": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to German: "
    },
    "translation_en_to_fr": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to French: "
    },
    "translation_en_to_ro": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to Romanian: "
    }
  },
  "transformers_version": "4.9.0.dev0",
  "use_cache": true,
  "vocab_size": 32128
}

[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/spiece.model from cache at /root/.cache/huggingface/transformers/684a47ca6257e4ca71f0037771464c5b323e945fbc58697d2fad8a7dd1a2f8ba.3b69006860e7b5d0a63ffdddc01ddcd6b7c318a6f4fd793596552c741734c62d
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/tokenizer.json from cache at /root/.cache/huggingface/transformers/90de37880b5ff5ac7ab70ff0bd369f207e9b74133fa153c163d14c5bb0116207.8627f1bd5d270a9fd2e5a51c8bec3223896587cc3cfe13edeabb0992ab43c529
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/tokenizer_config.json from cache at None
[INFO|configuration_utils.py:545] 2021-07-22 07:44:00,043 >> loading configuration file https://huggingface.co/t5-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/91e9fe874e06c44883b535d6c950b8b89d6eaa3298d8e7fb3b2c78039e9f8b7b.66b9637a52aa11e9285cdd6e668cc0df14b3bcf0b6674cf3ba5353c542649637
[INFO|configuration_utils.py:581] 2021-07-22 07:44:00,044 >> Model config T5Config {
  "architectures": [
    "T5WithLMHeadModel"
  ],
  "d_ff": 3072,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "gradient_checkpointing": false,
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "n_positions": 512,
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_num_buckets": 32,
  "task_specific_params": {
    "summarization": {
      "early_stopping": true,
      "length_penalty": 2.0,
      "max_length": 200,
      "min_length": 30,
      "no_repeat_ngram_size": 3,
      "num_beams": 4,
      "prefix": "summarize: "
    },
    "translation_en_to_de": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to German: "
    },
    "translation_en_to_fr": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to French: "
    },
    "translation_en_to_ro": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to Romanian: "
    }
  },
  "transformers_version": "4.9.0.dev0",
  "use_cache": true,
  "vocab_size": 32128
}

Traceback (most recent call last):
  File "/content/gdrive/MyDrive/Colab Notebooks/T5_MWP_generation/transformers/examples/pytorch/language-modeling/run_clm.py", line 515, in <module>
    main()
  File "/content/gdrive/MyDrive/Colab Notebooks/T5_MWP_generation/transformers/examples/pytorch/language-modeling/run_clm.py", line 344, in main
    use_auth_token=True if model_args.use_auth_token else None,
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/auto/auto_factory.py", line 386, in from_pretrained
    f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of RoFormerConfig, BigBirdPegasusConfig, GPTNeoConfig, BigBirdConfig, CamembertConfig, XLMRobertaConfig, RobertaConfig, BertConfig, OpenAIGPTConfig, GPT2Config, TransfoXLConfig, XLNetConfig, XLMConfig, CTRLConfig, ReformerConfig, BertGenerationConfig, XLMProphetNetConfig, ProphetNetConfig, BartConfig, MBartConfig, PegasusConfig, MarianConfig, BlenderbotConfig, BlenderbotSmallConfig, MegatronBertConfig.

I think the T5-base model can’t be fine-tuned using “transformers/examples/pytorch/language-modeling/run_clm.py”.
Is there a way to fine-tune the T5-base model for text generation?

Hi,

T5 is an encoder-decoder model. The run_clm.py script is for decoder-only models like GPT-2, and run_mlm.py is for encoder-only models like BERT and RoBERTa, so neither of them works with T5.
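
For reference, a minimal sketch of the difference (using the transformers auto classes; t5-base is the checkpoint from this thread):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# T5 is an encoder-decoder (seq2seq) model, so it loads through the seq2seq auto class.
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

# run_clm.py loads models through AutoModelForCausalLM instead, which does not accept
# T5Config; that is exactly the ValueError shown in the traceback above.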

You can fine-tune T5 for text generation with the run_summarization.py script (for summarization) or the run_translation.py script (for translation).
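
As a rough sketch only (the script path mirrors the repository checkout used earlier in this thread, and the data file, column names, and output directory are placeholders you would replace), a run_summarization.py call in the same Colab style might look like the block below. Note that run_summarization.py expects --train_file to be a CSV or JSON file with an input text column and a target column, so a plain train.txt would need to be converted first.

cmd = '''
python '/content/gdrive/MyDrive/Colab Notebooks/T5_generation/transformers/examples/pytorch/summarization/run_summarization.py' \
    --model_name_or_path t5-base \
    --do_train \
    --train_file /content/train.csv \
    --text_column text \
    --summary_column summary \
    --source_prefix "summarize: " \
    --per_device_train_batch_size 2 \
    --num_train_epochs 3 \
    --overwrite_output_dir \
    --output_dir /content/t5-finetuned
'''

!{cmd}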


@nielsr Thank you very much.