Thanks for the input @marshmellow77 ! I managed to get a few steps further in deploying the model, using the examples you linked. For those interested, here’s my current deployment code snippet:
from sagemaker.huggingface import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data=s3_location,  # path to your model and script
    role=role,  # IAM role with permissions to create an Endpoint
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',  # Python version used
    env={
        'HF_MODEL_ID': 'openai/whisper-large',
        'HF_TASK': 'automatic-speech-recognition'
    }
)

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.p2.xlarge"
)
Here, the s3_location variable contains the location of the model archive:
# sess is the SageMaker session object created earlier in the notebook
repository = "openai/whisper-large"
model_id = repository.split("/")[-1]
s3_location = f"s3://{sess.default_bucket()}/custom_inference/{model_id}/model.tar.gz"
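To make the resulting path concrete, here is a minimal sketch of the same string construction with a placeholder bucket. The bucket name `my-example-bucket` is hypothetical and only stands in for whatever `sess.default_bucket()` returns in your account.

```python
# Hypothetical bucket name; in the notebook this comes from sess.default_bucket()
bucket = "my-example-bucket"

repository = "openai/whisper-large"
# Keep only the part after the organisation prefix
model_id = repository.split("/")[-1]
s3_location = f"s3://{bucket}/custom_inference/{model_id}/model.tar.gz"

print(model_id)
print(s3_location)
```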
This model archive is created by the following shell commands (executed from a SageMaker notebook):
!git lfs install
!git clone https://huggingface.co/$repository
!cp -r code/ $model_id/code/
%cd $model_id
!tar zcvf model.tar.gz *
!aws s3 cp model.tar.gz $s3_location
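Since the tar command above is run from inside the model folder, the `code` directory should end up at the top level of the archive. A small sketch for checking that before uploading, using only the standard library; the helper name `top_level_entries` is my own and not part of any SageMaker tooling.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted top-level names stored in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
    # Take the first path component of every member, skipping a bare "."
    return sorted({PurePosixPath(n).parts[0] for n in names if PurePosixPath(n).parts})


# Example usage (assumes model.tar.gz exists in the current directory):
# print(top_level_entries("model.tar.gz"))
```

If `code` does not appear in the returned list, the archive was probably created from the wrong directory.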
Note that you need to add a requirements.txt file with transformers==4.23.1 in the code folder after you execute the git clone command for this to work.
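The requirements file can also be written from the notebook instead of by hand. A minimal sketch, assuming the file belongs in the `code` folder inside the cloned model directory; the helper name `write_requirements` is hypothetical.

```python
from pathlib import Path


def write_requirements(model_dir, pins):
    """Write code/requirements.txt under model_dir, one pinned package per line."""
    code_dir = Path(model_dir) / "code"
    code_dir.mkdir(parents=True, exist_ok=True)
    req_file = code_dir / "requirements.txt"
    req_file.write_text("\n".join(pins) + "\n")
    return req_file


# Example usage, run before creating the archive:
# write_requirements("whisper-large", ["transformers==4.23.1"])
```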
The endpoint is now able to load the Whisper model, which is of course a big step forward, but so far I have not been able to call the endpoint successfully. For example, if I load a small audio file and request a prediction as follows,
from transformers.pipelines.automatic_speech_recognition import ffmpeg_read

SAMPLING_RATE = 1000

with open("some_audio_file.mp3", "rb") as file:
    audio_file = file.read()

audio_nparray = ffmpeg_read(audio_file, SAMPLING_RATE)

predictor.predict({
    "raw": audio_nparray,
    "sampling_rate": SAMPLING_RATE
})
the following error is raised:
{
  "code": 400,
  "type": "InternalServerException",
  "message": "expected np.ndarray (got list)"
}
This happens even though audio_nparray is actually a NumPy array:
type(audio_nparray)
numpy.ndarray
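A likely explanation, stated here as an assumption rather than something verified against this endpoint: the request body is sent as JSON, JSON has no array type, and so the NumPy array is converted to a plain list on the way out and arrives as a list on the server. The sketch below reproduces that round trip locally with only `json` and `numpy`. The function names `to_json_payload` and `restore_audio` are hypothetical, and `restore_audio` only illustrates the kind of conversion a custom inference script would need to do before handing the data to the pipeline.

```python
import json

import numpy as np


def to_json_payload(raw, sampling_rate):
    """Mimic a JSON request body: the array has to become a plain list first."""
    return json.dumps({"raw": np.asarray(raw).tolist(), "sampling_rate": sampling_rate})


def restore_audio(body):
    """Hypothetical server-side step: rebuild the ndarray from the decoded JSON."""
    data = json.loads(body)
    data["raw"] = np.asarray(data["raw"], dtype=np.float32)
    return data


audio = np.zeros(4, dtype=np.float32)
body = to_json_payload(audio, 1000)

decoded = json.loads(body)      # what the server sees after plain JSON decoding
restored = restore_audio(body)  # the same data after converting back

print(type(decoded["raw"]).__name__)
print(type(restored["raw"]).__name__)
```

If this is indeed the cause, the conversion would have to happen on the endpoint side, since the client-side type is lost as soon as the payload is serialized.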