I am deploying models using Inference Endpoints and can successfully get responses from them. In the testing playground you can set a timeout and a threshold, and the ObjectDetectionPipeline likewise has threshold and timeout arguments. However, the predictions appear to be identical no matter what threshold I set.
I also could not find any documentation on how to pass a timeout or threshold in the cURL request:
curl "https://Something.us-east-1.aws.endpoints.huggingface.cloud" \
-X POST \
--data-binary '@cats.jpg' \
-H "Accept: application/json" \
-H "Authorization: Bearer hf_XXXXX" \
-H "Content-Type: image/jpeg" \
How do I control the prediction threshold via such a cURL request? And if it is always fixed, why does it only seem to return predictions with a score above 0.9?
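For reference, this is roughly what I would have expected to work: base64-encode the image and send it in a JSON body alongside a "parameters" object. I could not confirm from the documentation that the default object-detection handler actually accepts this payload shape, so treat it as a guess rather than a known-good request (the endpoint URL, token, and the 0.5 threshold are just placeholders). I also suspect that when no threshold is passed, the handler simply falls back to the pipeline's default of 0.9, which would explain what I am seeing.

# Hypothetical request: JSON body with a base64-encoded image and a parameters object
# (base64 -w0 is GNU coreutils; on macOS use: base64 -i cats.jpg)
curl "https://Something.us-east-1.aws.endpoints.huggingface.cloud" \
-X POST \
-H "Accept: application/json" \
-H "Authorization: Bearer hf_XXXXX" \
-H "Content-Type: application/json" \
-d "{\"inputs\": \"$(base64 -w0 cats.jpg)\", \"parameters\": {\"threshold\": 0.5}}"

Is something along these lines supported, or does the threshold have to be baked into a custom handler?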