Is there a way to increase Storage Quota to avoid Evicted?

I created a Space in which the dataset I convert to a pandas dataframe begins downloading as soon as the Space builds.

The log showed it apparently downloading multiple times… I believe that part is actually loading it into different cache or parquet files. Is that correct?

Next, when that step completed, it appeared to copy to an image. I'm thinking that is the Git LFS push/image copy of whatever is needed to start the Space. Is that correct?

Last, when it finished that step, it said Storage Exceeded - Space Evicted. That makes sense; maybe the dataset was too large.

What are the guidelines on my maximum storage per Space for loading a dataset? I know the 2B-row dataset will not work because it exceeds 2 GB, which I read is the Git LFS limit.

I am trying a smaller one now. Sorry if it's still too big. I will know in a few minutes…

If I would like to pay for additional storage for a single Space, is there a rate chart for what I can pay? Does that go up along with the other hardware/GPU choices when a new Space is created? I guess I should read that in detail.

If I want to demonstrate a large file for users at a Pro rate, what is roughly the maximum size I should attempt? I know the 200MB datasets I have work fine.

Below is my relatively uneducated first attempt at the source code:

import pandas as pd
import gradio as gr
from datasets import load_dataset

# dataset = load_dataset("laion/laion2B-en-joined")  # too big, Space evicted
dataset = load_dataset("laion/laion-coco")  # try a smaller one?
print(type(dataset))  # DatasetDict
df = dataset["train"].to_pandas()  # convert a single split to a pandas dataframe

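(Side note for anyone landing here with the same eviction: below is a minimal sketch of loading the same dataset in streaming mode so only a small sample ever touches the Space's disk. This is my own workaround idea, not something confirmed in this thread; the "train" split name and the 1,000-row sample are assumptions for illustration.)

import pandas as pd
from datasets import load_dataset

# streaming=True avoids materializing the whole dataset in the Space's storage;
# rows are pulled lazily as you iterate. take(1000) is an arbitrary sample size.
streamed = load_dataset("laion/laion-coco", split="train", streaming=True)
sample_df = pd.DataFrame(list(streamed.take(1000)))
print(sample_df.shape)
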
Thanks again! You rock; this is the platform best able to handle large datasets and models. Any answers will be passed on to the many curious students I teach on Thursdays… Much appreciated!

Check out the hugging_face.glb in this 2D-to-3D Space example. This is the only platform where you can do something like this. So cool: 🎨3DfromImg.GLB🎈 - a Hugging Face Space by awacke1

–Aaron

Here is what I missed: since dataset retrieval is cached, each time it failed it retried. 1.*GB was too big. Once I set “Full Restart” in settings, it tried again with the smaller dataset file, and now it is working!! Woot.
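
For anyone else who hits the same retry loop: a rough sketch of making the cache location explicit so failed partial downloads are easy to inspect or wipe. The /data/hf_cache path and the force_redownload choice are just assumptions for illustration, not something the Space requires.

from datasets import load_dataset

# cache_dir makes the download location explicit (example path only);
# force_redownload discards any partial or corrupt cached copy left by earlier failures.
dataset = load_dataset(
    "laion/laion-coco",
    cache_dir="/data/hf_cache",
    download_mode="force_redownload",
)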

It is coming up strong now :wink:

App link: LionImageSearch - a Hugging Face Space by awacke1

Thanks! – Aaron

Hi @awacke1, you should have about 50GB of storage available for the default hardware on Spaces. We experienced a bit of pressure on our infra leading to these evictions, which might be unrelated to your actual storage usage.
Additionally, yes, the available storage changes depending on the hardware you use, but the values are not documented yet; we still need to tailor the hardware specifications to actual usage.


Thanks. I’ve learned quite a bit about how to do what I need to, including being more dynamic with the dataset splits and testing/training locally with debugging.
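
By "more dynamic with the splits" I mean roughly the sketch below: slicing a split at load time so local debugging only works with a small piece. The dataset name and the 1% / 99% boundaries are arbitrary examples.

from datasets import load_dataset

# Slice the split at load time so the Dataset object only holds a fraction of
# the rows; the percentages are arbitrary and just for quick local debugging.
train_small = load_dataset("laion/laion-coco", split="train[:1%]")
test_small = load_dataset("laion/laion-coco", split="train[99%:]")
print(len(train_small), len(test_small))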

I’m having the exact same problem out of nowhere after a restart. I’ve been using about 5GB for months with no problem, but now it is giving me “Space evicted, storage limit exceeded (200M)”. Is it cached info? Restarting and a factory restart have no effect.

I am encountering the exact same error.
