Inference Endpoint for batch jobs

Hello,

I am just getting started with Inference Endpoints. So far, all the examples I have seen cover what I know as real-time endpoints, but my need is mostly for batch jobs.

More specifically, I would like to deploy a sequence classification model and use it to run predictions on larger datasets, somewhere between 1k and 1 million records.

On Azure ML, I would choose a batch endpoint: I would split my dataset into d chunks, put those d files on the Azure ML workspace's blob storage, and make an asynchronous POST request with the input set to the URI of the storage location. The batch endpoint then processes all the files and stores the outputs back on the workspace blob storage.
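
For context, here is roughly the client-side pattern I have in mind if only real-time endpoints are available: chunk the records locally and POST each chunk to the endpoint. The endpoint URL, token, and payload shape below are placeholders/assumptions on my part, not something I have verified against the Inference Endpoints API:

```python
import requests

# Placeholders: my own endpoint URL and HF token would go here.
ENDPOINT_URL = "https://<my-endpoint>.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

HEADERS = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

def chunked(records, size):
    """Yield successive chunks of `size` records."""
    for i in range(0, len(records), size):
        yield records[i : i + size]

def classify_batch(texts):
    """POST one chunk of texts to the endpoint and return predictions.

    Assumes the endpoint accepts {"inputs": [...]} for sequence
    classification; I have not confirmed this payload shape.
    """
    response = requests.post(ENDPOINT_URL, headers=HEADERS, json={"inputs": texts})
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    records = [f"example sentence {i}" for i in range(1000)]  # stand-in data
    predictions = []
    for chunk in chunked(records, 32):
        predictions.extend(classify_batch(chunk))
    print(f"Got {len(predictions)} predictions")
```

This works, but it keeps the orchestration on my side rather than in the endpoint, which is exactly what the Azure ML batch endpoint handles for me.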

Do Hugging Face Inference Endpoints cover this well? If so, can somebody point me to the right resources/examples for a head start?

Many thanks,
Paul