So every time I try to train my model (google/vit-base-patch16-224-in21k) on my dataset for image classification, it always gives a FileNotFound error (see below).
Is there a way to go through the entire ‘Photos’ column and delete/identify whichever rows have an image url that doesn’t exist? Is there a python/huggingface function for that?
You can simply skip the images that fail to load, as follows:
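import requests
from datasets import load_dataset
from PIL import Image

def to_pillow(examples):
    images = []
    for url in examples['Photo']:
        try:
            images.append(Image.open(requests.get(url, stream=True).raw))
        except Exception:  # missing or unreadable URL: keep the row, store None
            images.append(None)
    examples['image'] = images
    return examples

dataset = load_dataset("TheNoob3131/mosquito-data")
dataset = dataset.map(to_pillow, batched=True)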
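If you would rather identify or drop the bad rows up front, here is a rough sketch using only the standard library. The helper names `url_exists` and `find_missing` are made up for this example, and it assumes the image host answers HEAD requests (some servers don't, in which case you would need a normal GET instead).

```python
import urllib.request


def url_exists(url, timeout=10.0):
    """Return True if a HEAD request to `url` succeeds with a 2xx/3xx status."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except Exception:
        # Malformed URL, DNS failure, timeout, 404, etc. all count as "missing".
        return False


def find_missing(urls, exists=url_exists):
    """Return the indices of the URLs for which `exists` reports False."""
    return [index for index, url in enumerate(urls) if not exists(url)]


# Possible use with the dataset from the question (not run here):
# dataset = dataset.filter(lambda row: url_exists(row["Photo"]))
```

`find_missing` gives you the row indices to inspect, while the commented `filter` call would drop those rows instead. If you have already run `to_pillow`, filtering on `row["image"] is not None` should have the same effect without a second round of requests.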