Dataset viewer shows the wrong number of rows

Hi,
I’m experiencing a strange problem. I have a dataset of type DatasetDict. Here is its basic info:

DatasetDict({
    train: Dataset({
        features: ['subject', 'grade', 'skill', 'pic_choice', 'pic_prob', 'problem', 'problem_pic', 'choices', 'choices_pic', 'answer_idx'],
        num_rows: 1000
    })
    valid: Dataset({
        features: ['subject', 'grade', 'skill', 'pic_choice', 'pic_prob', 'problem', 'problem_pic', 'choices', 'choices_pic', 'answer_idx'],
        num_rows: 1000
    })
    test: Dataset({
        features: ['subject', 'grade', 'skill', 'pic_choice', 'pic_prob', 'problem', 'problem_pic', 'choices', 'choices_pic', 'answer_idx'],
        num_rows: 1000
    })
})

I use the following code to upload it to the Hub:

dataset_small.push_to_hub(
    hub_path, private=False, commit_message="Upload example dataset."
)

After it is converted to Parquet and uploaded to the Hub, the dataset viewer shows the wrong number of rows.

But if I use the following code to download the dataset:

from datasets import load_dataset
dataset_demo = load_dataset(hub_path)
print(dataset_demo)

I get the correct info (1k rows in each split, 3k in total).

Is there something wrong with how I’m using datasets and the Hugging Face Hub?

Update:
I also got the wrong number when I queried the dataset size through the datasets-server API, following the "Get the number of rows and the size in bytes" guide. So the problem seems to be a bug in the datasets-server backend rather than in the viewer page itself.
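
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the check could be scripted. The endpoint URL and the response layout (a "size" object holding a "splits" list with "split" and "num_rows" fields) are my assumptions about the API and should be checked against the datasets-server documentation. The helper names and the example response are made up for illustration, and the sketch assumes a single configuration so that split names are unique.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the datasets-server documentation.
SIZE_ENDPOINT = "https://datasets-server.huggingface.co/size"


def fetch_size(dataset_id: str) -> dict:
    """Fetch the size report for a public dataset (performs a network call)."""
    query = urllib.parse.urlencode({"dataset": dataset_id})
    with urllib.request.urlopen(f"{SIZE_ENDPOINT}?{query}", timeout=30) as response:
        return json.load(response)


def rows_per_split(size_response: dict) -> dict:
    """Extract {split_name: num_rows} from a size report.

    Assumes the layout {"size": {"splits": [{"split": ..., "num_rows": ...}]}}.
    """
    return {s["split"]: s["num_rows"] for s in size_response["size"]["splits"]}


def find_mismatches(local_rows: dict, remote_rows: dict) -> dict:
    """Return {split: (local, remote)} for every split whose counts differ."""
    splits = set(local_rows) | set(remote_rows)
    return {
        name: (local_rows.get(name), remote_rows.get(name))
        for name in sorted(splits)
        if local_rows.get(name) != remote_rows.get(name)
    }


# Hypothetical response for illustration only, not real output from the API.
example_response = {
    "size": {
        "splits": [
            {"split": "train", "num_rows": 1000},
            {"split": "valid", "num_rows": 1000},
            {"split": "test", "num_rows": 500},
        ]
    }
}
local = {"train": 1000, "valid": 1000, "test": 1000}
print(find_mismatches(local, rows_per_split(example_response)))
```

With a real dataset, the local counts could come from dataset_demo.num_rows (which maps each split name to its row count for a DatasetDict) and the remote counts from rows_per_split(fetch_size(hub_path)). An empty result means the two sources agree.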

Looking forward to any suggestions and thanks a lot in advance!

Well spotted, and thanks for opening the issue "The API returns the wrong row number" (Issue #2581 in huggingface/datasets-server on GitHub). We’re on it.

Fixed. Thanks again for the investigation!

Thank you for the reply and the fix!