Given that the SageMaker Hugging Face Inference Toolkit builds on top of the pipeline feature, I took a look at the pipeline documentation for ASR, and it seems to me that parameters like chunk_length_s and stride_length_s are specified when creating the pipeline, not at every inference request. I don’t have enough experience with ASR to say whether that makes sense, but that’s what it looks like to me.
Now, how can you fix your problem with that information? Again, I have very little experience with ASR workloads, but at the very least I would think you could write a custom inference script, create and use the ASR pipeline in that script, and pass the parameters to the endpoint when creating it with the deploy() method via the env dictionary. It should be something along the lines of
model.deploy(..., env={'chunk_length_s': '30', 'stride_length_s': '5'}, ...)
and in the inference script:
import os
from transformers import pipeline
def model_fn(model_dir):
    pipe = pipeline("automatic-speech-recognition", chunk_length_s=float(os.environ['chunk_length_s']), ...
I haven’t tested this, but maybe give it a try? Or, at the very least, I hope it sparks some other ideas for how to go about this.
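To spell the idea out a bit more, here is a minimal sketch of what the inference script could look like. It has not been run against a real endpoint. The helper names to_env and read_chunking_kwargs are hypothetical (my own, not part of SageMaker or transformers). It assumes that environment variables always arrive as strings and have to be cast back to numbers, and that the stride is kept well below the chunk length. Also, in the SageMaker Python SDK versions I know, env is an argument of the model constructor (e.g. HuggingFaceModel(..., env=...)), so it may need to go there rather than in deploy().

```python
import os


def to_env(**settings):
    """Hypothetical helper: build a string-valued dict suitable for env."""
    return {name: str(value) for name, value in settings.items()}


def read_chunking_kwargs(environ=os.environ):
    """Hypothetical helper: read the chunking settings back as floats."""
    kwargs = {}
    for name in ("chunk_length_s", "stride_length_s"):
        raw = environ.get(name)
        if raw:  # skip variables that are unset or empty
            kwargs[name] = float(raw)
    return kwargs


def model_fn(model_dir):
    # Imported here so the helpers above can be tried without transformers.
    from transformers import pipeline

    # The chunking parameters are fixed once, when the pipeline is created.
    return pipeline(
        "automatic-speech-recognition",
        model=model_dir,
        **read_chunking_kwargs(),
    )
```

On the deployment side you would then pass something like env=to_env(chunk_length_s=30, stride_length_s=5). Settings that are not set are simply left out, so the pipeline falls back to its own defaults.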
Cheers
Heiko