With the upcoming changes to private storage billing, I’m trying to free up the space taken by older, no-longer-needed revisions of my datasets.
The docs recommend using super squash to free up the LFS objects.
But super squash has had no effect on the storage reported in the settings for my datasets.
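For reference, this is roughly what I ran, using `HfApi.super_squash_history` from `huggingface_hub` (the repo id is a placeholder for my actual dataset):

```python
from huggingface_hub import HfApi

api = HfApi()

# Squash the full history of main down to a single commit.
# "your-username/your-dataset" is a placeholder for my actual repo.
api.super_squash_history(
    repo_id="your-username/your-dataset",
    repo_type="dataset",
    branch="main",
)
```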
My current suspicion is that the refs/convert/parquet branch is still holding references to those LFS objects even though I’ve squashed the history on main. From what I’ve read, the parquet bot doesn’t actually do any conversion if the dataset is already in Parquet; it just takes a reference to the existing files instead.
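This is roughly how I checked (same placeholder repo id); the convert refs are still there and the parquet ref still lists the original files, which is what makes me think it’s pinning the old LFS objects:

```python
from huggingface_hub import HfApi

api = HfApi()
repo_id = "your-username/your-dataset"  # placeholder

# The convert refs show up alongside the normal branches.
refs = api.list_repo_refs(repo_id, repo_type="dataset")
print([r.ref for r in refs.converts])  # e.g. refs/convert/parquet, refs/convert/duckdb

# List the files that the parquet ref currently points at.
files = api.list_repo_files(
    repo_id, repo_type="dataset", revision="refs/convert/parquet"
)
print(files)
```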
Unfortunately, it doesn’t look like super squash works on refs/convert/parquet (or refs/convert/duckdb for that matter).
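For completeness, this is the kind of call I tried against the convert ref (placeholder repo id again), and it doesn’t seem to go through:

```python
from huggingface_hub import HfApi

api = HfApi()

# Attempting to squash the convert ref directly; this is the call that
# doesn't appear to work for refs/convert/* refs in my case.
api.super_squash_history(
    repo_id="your-username/your-dataset",
    repo_type="dataset",
    branch="refs/convert/parquet",
)
```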
Can someone from the Hugging Face team confirm whether my suspicion is correct, and what options I have for cleaning up these LFS files that don’t involve manually deleting each LFS file in the UI?