Has anyone here deployed a transformers model on Google Cloud using AI Platform?


I have a fine-tuned distilgpt2 model that I want to deploy using GCP AI Platform.

I’ve followed all the documentation for deploying a custom prediction routine on GCP, but when creating the model version I get this error:

Create Version failed. Bad model detected with error: Model requires more memory than allowed. Please try to decrease the model size and re-deploy.

Here is my setup.py file:

from setuptools import setup


I then create a model version using:

gcloud beta ai-platform versions create v1 --model my_model \
 --origin=gs://my_bucket/model/ \
 --python-version=3.7 \
 --runtime-version=2.3 \
 --package-uris=gs://my_bucket/packages/gpt2-0.1.tar.gz,gs://cloud-ai-pytorch/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl \

I have tried every suggested fix and can’t get this to work; I still get the above error. I’m using the smallest GPT-2 model and should be well within memory.
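
For anyone hitting the same error, here is a minimal sketch for putting numbers on "well within memory". It sums the on-disk size of an exported model directory and estimates the fp32 weight footprint from a parameter count. The helper names, the local path in the usage note, and any limit value you pass in are hypothetical. The real quota depends on the machine type, so take it from the AI Platform documentation rather than from this sketch. On-disk size is typically only a lower bound on what the serving process needs once PyTorch and the model are loaded.

```python
import os


def dir_size_bytes(path):
    """Return the total size, in bytes, of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def fp32_weight_bytes(num_params):
    """Estimate the size of the weights alone, assuming 4 bytes per parameter."""
    return num_params * 4


def fits(size_bytes, limit_bytes):
    """True if `size_bytes` does not exceed `limit_bytes`."""
    return size_bytes <= limit_bytes
```

Usage would look like `fits(dir_size_bytes("./model"), limit_bytes)`, where `./model` is a local copy of what gets uploaded to the `--origin` bucket and `limit_bytes` is whatever limit the docs give for your machine type. `fp32_weight_bytes(82_000_000)` gives a rough figure for distilgpt2, assuming its parameter count is about 82 million.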

Can anyone who has successfully deployed to GCP please give some insight here?

Thank you


Hi @farazk86,

Any updates on this? Did you manage to use AI Platform to serve your model’s predictions?

Unfortunately, I could not. There were too many issues and I eventually gave up on the project.