Loading div2k from super-image into PyTorch

Hi @lhoestq,

Yes, the features are “lr” and “hr”. augmented_dataset comes in as a datasets.Dataset object once I load it. This is what augmented_dataset looks like:

Dataset({
    features: ['hr', 'lr'],
    num_rows: 4000
})

If I try using numbers instead,

augmented_dataset.set_format(type='torch', columns= [0,1])

I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-2a3051e19cd6> in <module>
----> 1 augmented_dataset.set_format(type='torch', columns= [0,1])

~\anaconda3\lib\site-packages\datasets\fingerprint.py in wrapper(*args, **kwargs)
    395             # Call actual function
    396 
--> 397             out = func(self, *args, **kwargs)
    398 
    399             # Update fingerprint of in-place transforms + update in-place history of transforms

~\anaconda3\lib\site-packages\datasets\arrow_dataset.py in set_format(self, type, columns, output_all_columns, **format_kwargs)
   1349             columns = [columns]
   1350         if columns is not None and any(col not in self._data.column_names for col in columns):
-> 1351             raise ValueError(
   1352                 "Columns {} not in the dataset. Current columns in the dataset: {}".format(
   1353                     list(filter(lambda col: col not in self._data.column_names, columns)), self._data.column_names

ValueError: Columns [0, 1] not in the dataset. Current columns in the dataset: ['hr', 'lr']

So I’ve reached the point where I apparently have to set columns=['hr', 'lr'], but I also can’t use the string type to represent them. This seems contradictory.
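
For reference, here is how I read the check in the traceback. It looks like a plain membership test of each requested column against the dataset's column names, so integer positions can never match. Below is a minimal stand-alone sketch of that logic. The helper name find_missing_columns is my own and is not part of datasets, and the str handling is my guess at what precedes the `columns = [columns]` line shown in the traceback.

```python
# Stand-alone sketch of the column check seen in the traceback.
# find_missing_columns is a hypothetical helper, not a datasets API.
def find_missing_columns(columns, column_names):
    """Return the requested columns that are not dataset column names."""
    # Assumption: a single column name passed as a str is wrapped in a list.
    if isinstance(columns, str):
        columns = [columns]
    # Same membership test as in the traceback: compare against names, not positions.
    return [col for col in columns if col not in column_names]


column_names = ["hr", "lr"]  # the features reported for augmented_dataset

# Integer positions are never equal to the string names, so both are "missing".
missing_by_index = find_missing_columns([0, 1], column_names)

# The actual column names pass the check.
missing_by_name = find_missing_columns(["hr", "lr"], column_names)

print(missing_by_index)
print(missing_by_name)
```

So at least for this check, set_format only accepts the column names themselves, which is why columns=[0, 1] fails before any conversion to torch is attempted.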