Hi,
I have a fine-tuned distilgpt2 model that I want to deploy on GCP AI Platform. I've followed the documentation for deploying a custom prediction routine, but when creating the model version I get the error:
Create Version failed. Bad model detected with error: Model requires more memory than allowed. Please try to decrease the model size and re-deploy.
Here is my setup.py file:
from setuptools import setup

setup(
    name="generator_package",
    version="0.2",
    include_package_data=True,
    scripts=["generator_class.py"],
    install_requires=["transformers==2.8.0"]
)
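The package is built and staged with commands along these lines (the exact archive name depends on the setup.py metadata; bucket path assumed to match the one used below):

python setup.py sdist --formats=gztar
gsutil cp dist/*.tar.gz gs://my_bucket/packages/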
I then create a model version using:
gcloud beta ai-platform versions create v1 --model my_model \
--origin=gs://my_bucket/model/ \
--python-version=3.7 \
--runtime-version=2.3 \
--package-uris=gs://my_bucket/packages/gpt2-0.1.tar.gz,gs://cloud-ai-pytorch/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl \
--prediction-class=model_prediction.CustomModelPrediction
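For context, model_prediction.CustomModelPrediction implements the standard Predictor interface from the custom prediction routine docs (a from_path class method plus a predict method). A simplified sketch of that shape, with my actual generation settings omitted:

from transformers import GPT2LMHeadModel, GPT2Tokenizer


class CustomModelPrediction(object):
    def __init__(self, model, tokenizer):
        self._model = model
        self._tokenizer = tokenizer

    def predict(self, instances, **kwargs):
        # instances is the list of input strings sent to the prediction service
        results = []
        for text in instances:
            input_ids = self._tokenizer.encode(text, return_tensors="pt")
            output_ids = self._model.generate(input_ids, max_length=50)
            results.append(
                self._tokenizer.decode(output_ids[0], skip_special_tokens=True))
        return results

    @classmethod
    def from_path(cls, model_dir):
        # model_dir is the local copy of the --origin GCS directory
        model = GPT2LMHeadModel.from_pretrained(model_dir)
        model.eval()
        tokenizer = GPT2Tokenizer.from_pretrained(model_dir)
        return cls(model, tokenizer)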
I have tried every suggested approach and still can't get past the error above. I'm using the smallest GPT-2 variant (distilgpt2), so the model should be well within the memory limit.
Can anyone who has successfully deployed to GCP AI Platform please give some insight here?
Thank you