Percent slicing and rounding + Stratify

I want to use 4-fold cross-validation. How do I ensure that the train and validation splits are stratified?

from datasets import load_dataset

# The first 75% of the dataset

train_75_25pct_ds = load_dataset('dataset', split='train[:75%]')

# The last 25% of the dataset

validation_75_25pct_ds = load_dataset('dataset', split='train[-25%:]')

You can do this as follows:

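train_ds = load_dataset('dataset', split='train')
# Keep 75% for training and 25% for validation, preserving class proportions.
# stratify_by_column expects the column to be a ClassLabel feature.
ddict = train_ds.train_test_split(test_size=0.25, stratify_by_column="col_name")
train_ds, validation_ds = ddict["train"], ddict["test"]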
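
That gives you a single stratified 75/25 split. For the full 4-fold setup, one option is to compute stratified fold indices yourself and pass them to `Dataset.select`. Below is a minimal sketch in plain Python; `stratified_kfold_indices` is a hypothetical helper I made up for illustration (it is not part of `datasets`), and the toy `labels` list stands in for your label column.

```python
import random
from collections import defaultdict


def stratified_kfold_indices(labels, n_splits=4, seed=0):
    """Yield (train_idx, val_idx) pairs with roughly equal class proportions per fold.

    Hypothetical helper for illustration; not a `datasets` API.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    # Deal each class's shuffled indices round-robin across the folds.
    # The running offset keeps overall fold sizes balanced.
    folds = [[] for _ in range(n_splits)]
    offset = 0
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        for j, idx in enumerate(idxs):
            folds[(offset + j) % n_splits].append(idx)
        offset += len(idxs)

    # Each fold serves once as validation; the rest form the training set.
    for k in range(n_splits):
        val_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train_idx, val_idx


# Toy labels standing in for the label column: 8 rows of class 0, 4 of class 1.
labels = [0] * 8 + [1] * 4
folds_list = list(stratified_kfold_indices(labels, n_splits=4))

for train_idx, val_idx in folds_list:
    # With a real dataset you would do something like:
    #   fold_train = train_ds.select(train_idx)
    #   fold_val = train_ds.select(val_idx)
    print(len(train_idx), len(val_idx))
```

With your dataset, you would build `labels` from the label column (for example `list(train_ds["col_name"])`) and then call `train_ds.select(...)` on each pair of index lists. If you already have scikit-learn installed, its `StratifiedKFold` produces the same kind of index pairs and is probably the more robust choice.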