ValueError: The test_size = 43 should be greater or equal to the number of classes = 47

Hi, I'm trying to train a text classification model and I am getting the error above.

Initially, I thought that I might not have enough examples in some classes, since the logs indicate that the error is thrown by train_test_split() during preprocessing. I tried removing classes with too few examples: first the classes with only one example and then, when the error persisted, the classes with two examples (which produced the error message shown above). Neither helped.

EDIT: I understand the error now: the size of the test set itself (not the number of classes in the test set) is less than the number of classes. I have 213 rows of data and 43 rows in the test set, implying a split of 0.2. What is the name of the parameter that would let me raise this fraction a bit?

As an aside to the above, the link to find params does not work and the widget for help doesn’t display anything: “No help available for this element.”

Check your “Batch size” parameter and add more rows to your dataset. I had the same problem when my data had 10 rows and my “Batch size” was 8: it took 8 rows to train, which left 2 rows, so I got an error like “test_size = 2 should be greater or equal to the number of classes = 3”. I added rows until my dataset had 16 (a multiple of 8), and it worked.

Hope this helps!