Hi,
Looking at the files in Ayham/roberta_gpt2_summarization_cnn_dailymail, it indeed looks like only the model weights (pytorch_model.bin) and configuration (config.json) were uploaded, but not the tokenizer files.
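If you want to double-check which files a repo contains, one way (just a small sketch, assuming a recent version of the huggingface_hub library) is:

from huggingface_hub import list_repo_files

# List the files currently stored in the model repo on the Hub
print(list_repo_files("Ayham/roberta_gpt2_summarization_cnn_dailymail"))
# expected to show only .gitattributes, config.json and pytorch_model.bin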
You can upload the tokenizer files programmatically using the huggingface_hub library. First, make sure you have installed git-lfs and are logged into your Hugging Face account. In Colab, this can be done as follows:
!sudo apt-get install git-lfs
!git config --global user.email "your email"
!git config --global user.name "your username"
!huggingface-cli login
Next, you can do the following:
from transformers import RobertaTokenizer
from huggingface_hub import Repository

repo_url = "https://huggingface.co/Ayham/roberta_gpt2_summarization_cnn_dailymail"

# Clone the existing model repo; note that this local directory must not exist already
repo = Repository(
    local_dir="tokenizer_files",
    clone_from=repo_url,
    git_user="Niels Rogge",
    git_email="niels.rogge1@gmail.com",
    use_auth_token=True,
)

# Load the base RoBERTa tokenizer and save its files into the cloned repo
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
tokenizer.save_pretrained("tokenizer_files")

# Commit and push the new files to the Hub
repo.push_to_hub(commit_message="Upload tokenizer files")
Note that the Trainer can actually push all files to the Hub for you automatically during/after training, as seen here.
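For reference, here's a minimal sketch of that approach (the output_dir and hub_model_id values are placeholders, and the Trainer setup itself is omitted):

from transformers import TrainingArguments

# push_to_hub=True makes the Trainer upload the model, config and tokenizer
# (if a tokenizer is passed to the Trainer) to the Hub during/after training.
training_args = TrainingArguments(
    output_dir="roberta_gpt2_summarization_cnn_dailymail",
    push_to_hub=True,
    hub_model_id="your-username/roberta_gpt2_summarization_cnn_dailymail",  # placeholder repo id
)

# ... build your Trainer as usual with model=..., args=training_args, tokenizer=tokenizer, ...
# trainer.train()
# trainer.push_to_hub()  # uploads the final model and tokenizer files to the Hub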