Serverless memory problem when deploying Wav2Vec2 with custom inference code

I’m providing the model via S3:

from sagemaker.huggingface.model import HuggingFaceModel
from sagemaker.serializers import DataSerializer

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    model_data='s3://sagemaker-us-east-2-094463604469/model.tar.gz',
    role=role,
)