Alternatively, this also works. It draws 800 training and 200 test examples from SQuAD's train split:
from datasets import load_dataset
squad = load_dataset('squad')['train'].train_test_split(train_size=800, test_size=200)