How to add a new column to a dataset

How can I add a new column to my dataset?

I am working on the Cosmos QA dataset and need to add a new column with the following feature type:
Value(dtype='string', id=None)

The current dataset has the following features:
Dataset({
    features: ['id', 'context', 'question', 'answer0', 'answer1', 'answer2', 'answer3', 'label'],
    num_rows: 25262
})

Thank you in advance!

Hi! You can use the add_column method:

from datasets import load_dataset

ds = load_dataset("cosmos_qa", split="train")

# add_column expects one value per row, so the list must have len(ds) items
new_column = ["foo"] * len(ds)
ds = ds.add_column("new_column", new_column)

This gives you a dataset with the new column:

Dataset({
    features: ['id', 'context', 'question', 'answer0', 'answer1', 'answer2', 'answer3', 'label', 'new_column'],
    num_rows: 25262
})

Thank you for your reply!

Unfortunately, when I ran the above code, I got the following error:
AttributeError: 'Dataset' object has no attribute 'add_column'

Just FYI, I am using datasets version 1.1.2.

Thank you!

add_column is not available in version 1.1.2. You need to update your datasets library to the latest version (for example, with pip install -U datasets) to use it.
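
If upgrading is not an option, a column can also be added with map, which merges the dictionary returned by the function into each example. Below is a minimal sketch. Note that add_constant_column is a hypothetical helper name used for illustration, not part of the library:

```python
def add_constant_column(ds, name, value):
    """Return a dataset with an extra column `name` filled with `value`.

    Hypothetical helper: uses add_column when the installed datasets
    version provides it, and falls back to map otherwise.
    """
    if hasattr(ds, "add_column"):
        # Newer releases: pass one value per row.
        return ds.add_column(name, [value] * len(ds))
    # Older releases: map merges the returned dict into every example.
    return ds.map(lambda example: {name: value})
```

Calling ds = add_constant_column(ds, "new_column", "foo") should then produce the same list of features as shown earlier in this thread, whichever branch is taken.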
