I think the T5 model doesn’t support fine-tuning with “language-modeling/run_clm.py”.
cmd = '''
python '/content/gdrive/MyDrive/Colab Notebooks/T5_generation/transformers/examples/pytorch/language-modeling/run_clm.py' \
--model_name_or_path t5-base \
--train_file {0} \
--do_train \
--num_train_epochs 3 \
--overwrite_output_dir \
--per_device_train_batch_size 2 \
--output_dir {1}
'''.format(file_name, weights_dir)
!{cmd}
The script starts running, but then it fails with this error:
2021-07-22 07:43:57.168589: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
07/22/2021 07:43:59 - WARNING - __main__ - Process rank: -1, device: cpu, n_gpu: 0distributed training: False, 16-bits training: False
07/22/2021 07:43:59 - INFO - __main__ - Training/evaluation parameters TrainingArguments(
_n_gpu=0,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_find_unused_parameters=None,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_steps=None,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
gradient_accumulation_steps=1,
greater_is_better=None,
group_by_length=False,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=-1,
log_level_replica=-1,
log_on_each_node=True,
logging_dir=output/runs/Jul22_07-43-59_f9ae03178b49,
logging_first_step=False,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
output_dir=output,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=2,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=output,
push_to_hub_organization=None,
push_to_hub_token=None,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=output,
save_on_each_node=False,
save_steps=500,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
07/22/2021 07:43:59 - WARNING - datasets.builder - Using custom data configuration default-d468d4eee4ec0b5d
07/22/2021 07:43:59 - INFO - datasets.builder - Generating dataset text (/root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5)
Downloading and preparing dataset text/default (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5...
100% 1/1 [00:00<00:00, 2647.92it/s]
07/22/2021 07:43:59 - INFO - datasets.utils.download_manager - Downloading took 0.0 min
07/22/2021 07:43:59 - INFO - datasets.utils.download_manager - Checksum Computation took 0.0 min
100% 1/1 [00:00<00:00, 128.73it/s]
07/22/2021 07:43:59 - INFO - datasets.utils.info_utils - Unable to verify checksums.
07/22/2021 07:43:59 - INFO - datasets.builder - Generating split train
07/22/2021 07:43:59 - INFO - datasets.utils.info_utils - Unable to verify splits sizes.
Dataset text downloaded and prepared to /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5. Subsequent calls will reuse this data.
100% 1/1 [00:00<00:00, 714.78it/s]
07/22/2021 07:43:59 - WARNING - datasets.builder - Using custom data configuration default-d468d4eee4ec0b5d
07/22/2021 07:43:59 - INFO - datasets.builder - Overwrite dataset info from restored data version.
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
07/22/2021 07:43:59 - WARNING - datasets.builder - Reusing dataset text (/root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5)
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
07/22/2021 07:43:59 - WARNING - datasets.builder - Using custom data configuration default-d468d4eee4ec0b5d
07/22/2021 07:43:59 - INFO - datasets.builder - Overwrite dataset info from restored data version.
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
07/22/2021 07:43:59 - WARNING - datasets.builder - Reusing dataset text (/root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5)
07/22/2021 07:43:59 - INFO - datasets.info - Loading Dataset info from /root/.cache/huggingface/datasets/text/default-d468d4eee4ec0b5d/0.0.0/e16f44aa1b321ece1f87b07977cc5d70be93d69b20486d6dacd62e12cf25c9a5
[INFO|configuration_utils.py:545] 2021-07-22 07:43:59,790 >> loading configuration file https://huggingface.co/t5-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/91e9fe874e06c44883b535d6c950b8b89d6eaa3298d8e7fb3b2c78039e9f8b7b.66b9637a52aa11e9285cdd6e668cc0df14b3bcf0b6674cf3ba5353c542649637
[INFO|configuration_utils.py:581] 2021-07-22 07:43:59,790 >> Model config T5Config {
"architectures": [
"T5WithLMHeadModel"
],
"d_ff": 3072,
"d_kv": 64,
"d_model": 768,
"decoder_start_token_id": 0,
"dropout_rate": 0.1,
"eos_token_id": 1,
"feed_forward_proj": "relu",
"gradient_checkpointing": false,
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"n_positions": 512,
"num_decoder_layers": 12,
"num_heads": 12,
"num_layers": 12,
"output_past": true,
"pad_token_id": 0,
"relative_attention_num_buckets": 32,
"task_specific_params": {
"summarization": {
"early_stopping": true,
"length_penalty": 2.0,
"max_length": 200,
"min_length": 30,
"no_repeat_ngram_size": 3,
"num_beams": 4,
"prefix": "summarize: "
},
"translation_en_to_de": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to German: "
},
"translation_en_to_fr": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to French: "
},
"translation_en_to_ro": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to Romanian: "
}
},
"transformers_version": "4.9.0.dev0",
"use_cache": true,
"vocab_size": 32128
}
[INFO|tokenization_auto.py:432] 2021-07-22 07:43:59,816 >> Could not locate the tokenizer configuration file, will try to use the model config instead.
[INFO|configuration_utils.py:545] 2021-07-22 07:43:59,841 >> loading configuration file https://huggingface.co/t5-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/91e9fe874e06c44883b535d6c950b8b89d6eaa3298d8e7fb3b2c78039e9f8b7b.66b9637a52aa11e9285cdd6e668cc0df14b3bcf0b6674cf3ba5353c542649637
[INFO|configuration_utils.py:581] 2021-07-22 07:43:59,842 >> Model config T5Config {
"architectures": [
"T5WithLMHeadModel"
],
"d_ff": 3072,
"d_kv": 64,
"d_model": 768,
"decoder_start_token_id": 0,
"dropout_rate": 0.1,
"eos_token_id": 1,
"feed_forward_proj": "relu",
"gradient_checkpointing": false,
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"n_positions": 512,
"num_decoder_layers": 12,
"num_heads": 12,
"num_layers": 12,
"output_past": true,
"pad_token_id": 0,
"relative_attention_num_buckets": 32,
"task_specific_params": {
"summarization": {
"early_stopping": true,
"length_penalty": 2.0,
"max_length": 200,
"min_length": 30,
"no_repeat_ngram_size": 3,
"num_beams": 4,
"prefix": "summarize: "
},
"translation_en_to_de": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to German: "
},
"translation_en_to_fr": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to French: "
},
"translation_en_to_ro": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to Romanian: "
}
},
"transformers_version": "4.9.0.dev0",
"use_cache": true,
"vocab_size": 32128
}
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/spiece.model from cache at /root/.cache/huggingface/transformers/684a47ca6257e4ca71f0037771464c5b323e945fbc58697d2fad8a7dd1a2f8ba.3b69006860e7b5d0a63ffdddc01ddcd6b7c318a6f4fd793596552c741734c62d
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/tokenizer.json from cache at /root/.cache/huggingface/transformers/90de37880b5ff5ac7ab70ff0bd369f207e9b74133fa153c163d14c5bb0116207.8627f1bd5d270a9fd2e5a51c8bec3223896587cc3cfe13edeabb0992ab43c529
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:1730] 2021-07-22 07:44:00,015 >> loading file https://huggingface.co/t5-base/resolve/main/tokenizer_config.json from cache at None
[INFO|configuration_utils.py:545] 2021-07-22 07:44:00,043 >> loading configuration file https://huggingface.co/t5-base/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/91e9fe874e06c44883b535d6c950b8b89d6eaa3298d8e7fb3b2c78039e9f8b7b.66b9637a52aa11e9285cdd6e668cc0df14b3bcf0b6674cf3ba5353c542649637
[INFO|configuration_utils.py:581] 2021-07-22 07:44:00,044 >> Model config T5Config {
"architectures": [
"T5WithLMHeadModel"
],
"d_ff": 3072,
"d_kv": 64,
"d_model": 768,
"decoder_start_token_id": 0,
"dropout_rate": 0.1,
"eos_token_id": 1,
"feed_forward_proj": "relu",
"gradient_checkpointing": false,
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"n_positions": 512,
"num_decoder_layers": 12,
"num_heads": 12,
"num_layers": 12,
"output_past": true,
"pad_token_id": 0,
"relative_attention_num_buckets": 32,
"task_specific_params": {
"summarization": {
"early_stopping": true,
"length_penalty": 2.0,
"max_length": 200,
"min_length": 30,
"no_repeat_ngram_size": 3,
"num_beams": 4,
"prefix": "summarize: "
},
"translation_en_to_de": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to German: "
},
"translation_en_to_fr": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to French: "
},
"translation_en_to_ro": {
"early_stopping": true,
"max_length": 300,
"num_beams": 4,
"prefix": "translate English to Romanian: "
}
},
"transformers_version": "4.9.0.dev0",
"use_cache": true,
"vocab_size": 32128
}
Traceback (most recent call last):
File "/content/gdrive/MyDrive/Colab Notebooks/T5_MWP_generation/transformers/examples/pytorch/language-modeling/run_clm.py", line 515, in <module>
main()
File "/content/gdrive/MyDrive/Colab Notebooks/T5_MWP_generation/transformers/examples/pytorch/language-modeling/run_clm.py", line 344, in main
use_auth_token=True if model_args.use_auth_token else None,
File "/usr/local/lib/python3.7/dist-packages/transformers/models/auto/auto_factory.py", line 386, in from_pretrained
f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of RoFormerConfig, BigBirdPegasusConfig, GPTNeoConfig, BigBirdConfig, CamembertConfig, XLMRobertaConfig, RobertaConfig, BertConfig, OpenAIGPTConfig, GPT2Config, TransfoXLConfig, XLNetConfig, XLMConfig, CTRLConfig, ReformerConfig, BertGenerationConfig, XLMProphetNetConfig, ProphetNetConfig, BartConfig, MBartConfig, PegasusConfig, MarianConfig, BlenderbotConfig, BlenderbotSmallConfig, MegatronBertConfig.
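To double-check my reading of the error, here is a minimal sketch that only tests whether a config class name appears in the "Model type should be one of" list above. The list is copied from the error message; the helper name `is_supported` is my own and is not part of transformers.

```python
# Config class names copied from the "Model type should be one of ..." line
# of the ValueError raised by AutoModelForCausalLM above.
SUPPORTED_BY_CAUSAL_LM = [
    "RoFormerConfig", "BigBirdPegasusConfig", "GPTNeoConfig", "BigBirdConfig",
    "CamembertConfig", "XLMRobertaConfig", "RobertaConfig", "BertConfig",
    "OpenAIGPTConfig", "GPT2Config", "TransfoXLConfig", "XLNetConfig",
    "XLMConfig", "CTRLConfig", "ReformerConfig", "BertGenerationConfig",
    "XLMProphetNetConfig", "ProphetNetConfig", "BartConfig", "MBartConfig",
    "PegasusConfig", "MarianConfig", "BlenderbotConfig", "BlenderbotSmallConfig",
    "MegatronBertConfig",
]


def is_supported(config_class_name: str) -> bool:
    """Hypothetical helper: True if the name is in the list from the error."""
    return config_class_name in SUPPORTED_BY_CAUSAL_LM


print(is_supported("T5Config"))    # False: T5Config is not in the list
print(is_supported("GPT2Config"))  # True: GPT2Config is in the list
```

So, as far as I can tell, the failure comes from the model class that run_clm.py loads (AutoModelForCausalLM), not from my training file or arguments.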
I think the T5-base model can’t be fine-tuned using “transformers/examples/pytorch/language-modeling/run_clm.py”.
Is there a way to fine-tune the T5-base model for text generation?