Serverless memory problem when deploying Wav2Vec2 with custom inference code

@diegoseto is there a particular reason why you are creating an inference.py script? You can directly provide your HF_API_TOKEN in the hub configuration next to your model ID and task. See HF_API_TOKEN
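
As a rough sketch of what that looks like (not taken from this thread), the token goes into the same `env`/hub dictionary as the model ID and task, so no custom inference script is needed. The model ID, framework versions, memory size, and concurrency below are placeholder assumptions; swap in your own values and IAM role.

```python
# Minimal sketch: serverless Hugging Face endpoint configured via the hub dict,
# with HF_API_TOKEN passed alongside HF_MODEL_ID and HF_TASK (values are examples).
import sagemaker
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.serverless import ServerlessInferenceConfig

role = sagemaker.get_execution_role()  # assumes this runs inside SageMaker

hub = {
    "HF_MODEL_ID": "facebook/wav2vec2-base-960h",      # example model ID (assumption)
    "HF_TASK": "automatic-speech-recognition",
    "HF_API_TOKEN": "hf_xxx",                           # your Hugging Face token
}

huggingface_model = HuggingFaceModel(
    env=hub,                      # hub configuration: model ID, task, token
    role=role,
    transformers_version="4.26",  # example versions (assumptions)
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    serverless_inference_config=ServerlessInferenceConfig(
        memory_size_in_mb=6144,   # serverless memory limit (assumption)
        max_concurrency=10,
    ),
)
```

With this setup the toolkit pulls the model from the Hub at startup using the token, which is usually simpler than packaging your own inference.py unless you need custom pre/post-processing.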