Not able to use where in dataset filter

Hi,
I'm trying to filter rows in a dataset, but when I run

curl "https://datasets-server.huggingface.co/filter?dataset=aborruso/open_cup_complessivo&config=default&split=train&where=CUP=B76G16005640001&length=2" \
-X GET -H "Authorization: Bearer ${API_TOKEN}"

I get {"error":"Parameter 'where' is invalid"}.

It's strange because here 'where' is the main parameter.

What am I doing wrong?

Thank you

Hi! It might seem counter-intuitive, but the string has to be quoted:

https://datasets-server.huggingface.co/filter?dataset=aborruso/open_cup_complessivo&config=default&split=train&where=CUP='B76G16005640001'&length=2

We even highlighted it in the docs :blush:
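
For anyone building this URL from a script rather than by hand, here is a minimal Python sketch of the same idea. It only uses the standard library; the helper name build_filter_url is made up for illustration, and the parameter names are the ones from the URL above. It wraps the value in single quotes and lets urlencode percent-encode the whole where clause, which also avoids shell quoting trouble.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL in this thread.
FILTER_ENDPOINT = "https://datasets-server.huggingface.co/filter"


def build_filter_url(dataset, column, value, config="default", split="train", length=2):
    """Return a /filter URL whose `where` clause compares `column` to a quoted string.

    Hypothetical helper for illustration; it is not part of any official client.
    """
    if "'" in value:
        # Escaping rules for embedded quotes are not covered in this thread,
        # so this sketch refuses such values instead of guessing.
        raise ValueError("value must not contain a single quote")
    params = {
        "dataset": dataset,
        "config": config,
        "split": split,
        # The string literal must be wrapped in single quotes.
        "where": f"{column}='{value}'",
        "length": length,
    }
    # urlencode percent-encodes '=' and the quotes inside the where clause.
    return f"{FILTER_ENDPOINT}?{urlencode(params)}"


url = build_filter_url("aborruso/open_cup_complessivo", "CUP", "B76G16005640001")
print(url)
```

The printed URL can be passed to curl as-is, with the same Authorization header as in the original command.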

It's not counter-intuitive, I am the one who is stupid :slight_smile:

Thank you very much

@severo, let me ask you a related question here.

In the data source I have 1586 records in which COMUNE='VILLAROSA'.

If I apply the same filter with the filter rows endpoint, I get "num_rows_total": 177.

I suspect this is because the dataset is big and there is a limit, so the search is probably done only on the first xxx records.
Is that the case? Where can I find something in the documentation to help me understand?

Thank you very much

Exactly, that is the reason. See the field partial: true

Screenshot 2024-03-08 at 11.20.57

As mentioned at the end of the /filter doc page, we run the filter on the first 5GB of data…
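
If you want a script to catch this case, a small check on the response could look like the sketch below. It assumes only the two fields mentioned in this thread, num_rows_total and partial; the sample payload is a hypothetical, trimmed response (a real one carries more fields), and the function name is made up for illustration.

```python
import json


def describe_filter_result(payload):
    """Summarize a /filter response, flagging counts that cover only part of the data."""
    total = payload["num_rows_total"]
    # `partial` set to true means the filter ran on only part of the dataset,
    # so the total can be lower than the count in the original source.
    if payload.get("partial", False):
        return f"{total} matching rows found, but only part of the dataset was scanned"
    return f"{total} matching rows found in the full dataset"


# Hypothetical, trimmed response body using the numbers from this thread.
sample = json.loads('{"num_rows_total": 177, "partial": true}')
message = describe_filter_result(sample)
print(message)
```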
