Serverless memory problem when deploying Wav2Vec2 with custom inference code

No, I mean using a language model to boost Wav2Vec2 decoding, as described by @patrickvonplaten in How to create Wav2Vec2 With Language model, but on Amazon SageMaker (serverless). In that topic @philschmid suggested using a custom inference script, but I’m running into the problems mentioned above.
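For context, a minimal sketch of the kind of custom inference script I mean is below. It assumes the model directory was saved with a `Wav2Vec2ProcessorWithLM` (i.e. it contains the `language_model/` folder built with pyctcdecode/KenLM), and uses the `model_fn`/`predict_fn` overrides that the SageMaker Hugging Face Inference Toolkit picks up from `code/inference.py`; the input shape (`data["inputs"]` as a raw 16 kHz audio array) is an assumption, not a fixed contract:

```python
# Hypothetical code/inference.py sketch for SageMaker, assuming the
# model artifact bundles a Wav2Vec2ProcessorWithLM (KenLM via pyctcdecode).
import torch
from transformers import AutoModelForCTC, Wav2Vec2ProcessorWithLM


def model_fn(model_dir):
    # Load the acoustic model and the LM-aware processor once per worker.
    processor = Wav2Vec2ProcessorWithLM.from_pretrained(model_dir)
    model = AutoModelForCTC.from_pretrained(model_dir)
    return model, processor


def predict_fn(data, model_and_processor):
    model, processor = model_and_processor
    # Assumption: "inputs" is a raw audio array sampled at 16 kHz.
    inputs = processor(data["inputs"], sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # batch_decode on Wav2Vec2ProcessorWithLM runs beam-search decoding
    # with the language model instead of plain argmax CTC decoding.
    transcription = processor.batch_decode(logits.numpy()).text
    return {"text": transcription}
```

Loading both the CTC model and the LM decoder in `model_fn` is what drives up memory on serverless endpoints, which is where I hit the limits described above.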

Is there another option for using a language model without a custom inference script?