Hi,
I have just finished a 4-day training run of a T5 model, and I would like to test how it performs. However, all the tests I have run so far have produced unsatisfactory results. Could someone please tell me whether the issue lies with my testing scripts or with the model itself? I trained the model following the tutorial available here (https://github.com/huggingface/transformers/blob/main/examples/flax/language-modeling/run_t5_mlm_flax.py), and I launched it with the following commands:
python3 prepare_tokenizer.py
python3 run_t5_mlm_flax.py \
  --output_dir="./result" \
  --model_type="t5" \
  --config_name="./t5_test" \
  --tokenizer_name="./t5_test" \
  --dataset_name="Lipa1919/wikidumps-oscar-pl" \
  --dataset_config_name="" \
  --max_seq_length="512" \
  --per_device_train_batch_size="32" \
  --per_device_eval_batch_size="32" \
  --adafactor \
  --learning_rate="0.005" \
  --weight_decay="0.001" \
  --warmup_steps="2000" \
  --overwrite_output_dir \
  --logging_steps="1000" \
  --save_steps="10000" \
  --eval_steps="100000" \
  --push_to_hub="True" \
  --hub_model_id="Lipa1919/PolishT5-wikioscar" \
  --hub_token="<mytoken>" \
  --do_train="True"
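Since run_t5_mlm_flax.py pre-trains with the span-corruption objective, I wonder whether the inputs at test time need to contain sentinel tokens rather than plain text. Here is a minimal probe along those lines (a sketch only; it assumes the tokenizer exposes the standard <extra_id_n> sentinels, and the Polish example sentence is made up):

from transformers import FlaxT5ForConditionalGeneration, T5TokenizerFast

model = FlaxT5ForConditionalGeneration.from_pretrained("Lipa1919/PolishT5-wikioscar")
tokenizer = T5TokenizerFast.from_pretrained("Lipa1919/PolishT5-wikioscar")

# Span-corruption-style input: the model is asked to fill in the masked span.
# "Stolica Polski to <extra_id_0>." -- "The capital of Poland is <extra_id_0>."
text = "Stolica Polski to <extra_id_0>."
inputs = tokenizer(text, return_tensors="jax")

outputs = model.generate(inputs.input_ids, max_length=20)
# Keep special tokens so the predicted sentinel structure stays visible.
print(tokenizer.decode(outputs.sequences[0], skip_special_tokens=False))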
And here are my testing scripts:
from transformers import FlaxT5ForConditionalGeneration, T5TokenizerFast

model = FlaxT5ForConditionalGeneration.from_pretrained("Lipa1919/PolishT5-wikidumps")
tokenizer = T5TokenizerFast.from_pretrained("Lipa1919/PolishT5-wikidumps")
print(tokenizer.__class__.__name__)

# "testing the model after training", "a sentence for the model"
texts = ["testowanie modelu po treningu", "zdanie dla modelu"]
inputs = tokenizer(texts, truncation=True, padding=True, return_tensors="jax")

# Greedy generation with the default settings.
outputs = model.generate(**inputs)
decoded_texts = [tokenizer.decode(sequence, skip_special_tokens=True) for sequence in outputs.sequences]

for text, decoded_text in zip(texts, decoded_texts):
    print(f"Tekst: {text}")  # "Text: ..."
    print(f"Przewidziany tekst: {decoded_text}")  # "Predicted text: ..."
    print()
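One thing I am not sure about in this first script: calling generate() with no generation arguments falls back to the defaults from the model config (greedy decoding with a fairly short max_length, if I understand the library correctly), so I also tried being explicit:

# Explicit generation settings, in case the config defaults cut the output short.
outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=64,
    num_beams=4,
)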
from transformers import FlaxT5ForConditionalGeneration, T5TokenizerFast

model = FlaxT5ForConditionalGeneration.from_pretrained("Lipa1919/PolishT5-wikioscar")
tokenizer = T5TokenizerFast.from_pretrained("Lipa1919/PolishT5-wikioscar")

sentence = "zdanie dla modelu"  # "a sentence for the model"
input_ids = tokenizer(sentence, return_tensors="jax").input_ids

# Beam search; the sequences attribute of the FlaxBeamSearchOutput holds the
# generated token ids of the best beam per input, so index 0 for a single sentence.
outputs = model.generate(input_ids, max_length=40, num_beams=4, early_stopping=True)
decoded = tokenizer.decode(outputs.sequences[0].tolist(), skip_special_tokens=True)
print(decoded)
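As an extra sanity check, I am also considering measuring the loss on a hand-made span-corruption pair; a model that learned anything should score the correct masked span reasonably well. A rough sketch of what I have in mind (my own, untested; it assumes optax is installed and uses a made-up Polish example):

import jax.numpy as jnp
import optax
from transformers import FlaxT5ForConditionalGeneration, T5TokenizerFast

model = FlaxT5ForConditionalGeneration.from_pretrained("Lipa1919/PolishT5-wikioscar")
tokenizer = T5TokenizerFast.from_pretrained("Lipa1919/PolishT5-wikioscar")

# Hand-made span-corruption pair: "Ala ma <extra_id_0> i psa." with the masked
# span "kota" as the target ("Ala has <extra_id_0> and a dog." / "a cat").
src = tokenizer("Ala ma <extra_id_0> i psa.", return_tensors="jax")
tgt = tokenizer("<extra_id_0> kota <extra_id_1>", return_tensors="jax")

# Build decoder inputs by shifting the targets right; T5 starts decoding
# from decoder_start_token_id (the pad token).
decoder_input_ids = jnp.concatenate(
    [jnp.full((1, 1), model.config.decoder_start_token_id, dtype=jnp.int32),
     tgt.input_ids[:, :-1]],
    axis=-1,
)

logits = model(
    input_ids=src.input_ids,
    attention_mask=src.attention_mask,
    decoder_input_ids=decoder_input_ids,
).logits

# Mean cross-entropy of the reference target tokens; a trained model should
# give a clearly lower loss here than on a mismatched target.
loss = optax.softmax_cross_entropy_with_integer_labels(logits, tgt.input_ids).mean()
print(float(loss))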
Thank you in advance for your assistance.
For anyone interested, the model and tokenizer are available in my public Hugging Face repositories under Lipa1919.