I’m building a PyTorch Lightning model that uses a tokenizer and model from T5Tokenizer/T5ForConditionalGeneration with from_pretrained('google/flan-t5-small'). When I ran my code for the first time, my OS raised an exception and I got the following message from Windows Security:
file: [some of my directories]\.cache\huggingface\hub\models--google--flan-t5-small\blobs\4a8c0174fa3ee6fe2ffd0f6e21992d4ca4ad1e9b12bd14155b57479e27f56292
I saw similar posts on this forum about trojans, and it seems that fragments of malware code can appear in datasets scraped from places like GitHub, but there they are stored in JSON files and only trained on, never run (see https://discuss.huggingface.co/t/trojan-in-common-voice-dataset/18155/). In this case, however, I think I’m only downloading pretrained model weights/structure and a tokenizer, so I don’t see why fragments of such data would be included.
The files in the blobs folder above don’t have any extension, and I’m not sure if it’s safe to open them with a text editor to look inside. I also couldn’t tell where they came from.
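One way to look at such a file without opening it in an editor is to read it as raw bytes, which does not execute anything. Below is a minimal sketch using only the Python standard library. The helper names (sha256_of_file, inspect_blob) are hypothetical, and the sketch rests on two assumptions that I have not confirmed for this specific file: that the Hugging Face cache names large blobs after the SHA-256 of their contents (the 64-character name above fits that pattern), and that a PyTorch checkpoint saved in the zip-based format starts with the zip signature.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large weight files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inspect_blob(path):
    """Read a cached blob as plain bytes and report basic facts about it."""
    path = Path(path)
    with open(path, "rb") as handle:
        head = handle.read(16)  # the first bytes usually identify the format
    return {
        "size_bytes": path.stat().st_size,
        "first_bytes_hex": head.hex(),
        # Assumption: zip-based checkpoints begin with the zip signature.
        "looks_like_zip": head.startswith(b"PK\x03\x04"),
        # Assumption: the blob file name is the SHA-256 of its contents.
        "name_matches_sha256": sha256_of_file(path) == path.name.lower(),
    }
```

Calling inspect_blob with the full path of the flagged file would show its size and leading bytes. If the name matches the hash under the assumption above, the SHA-256 could also be compared against the one listed for the file on the model's page on the Hub.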
Any help would be appreciated, thanks.