Predict function ignores parameters

I’m trying to deploy a Hugging Face model (GPT-Neo) on a SageMaker endpoint. I followed the official example and this forum, but it seems that the generate function is completely ignoring my parameters (it generates just one word despite setting the min length to 10000!). Any idea what is wrong?

My code:

```python
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub Model configuration.
hub = {
    'HF_MODEL_ID': 'EleutherAI/gpt-neo-1.3B',  # GPT-Neo checkpoint (exact ID not shown in the original post)
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version='4.6',  # versions assumed from the official example
    pytorch_version='1.7',
    py_version='py36'
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,      # number of instances
    instance_type='ml.g4dn.xlarge' # ec2 instance type
)

prompt = "Some prompt"

gen_tex = predictor.predict({
    "inputs": prompt,
    "parameters": {"min_length": 10,}
})
```


Could you please update to the latest version, which would be transformers_version="4.12" and pytorch_version="1.9", and test it again?
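For reference, a minimal sketch of that version bump (py_version='py38' is an assumption matching those library versions; the suggestion above only names the transformers and pytorch versions):

```python
# Suggested version bump for the model definition.
# py_version='py38' is an assumption, not stated in the thread.
model_versions = {
    "transformers_version": "4.12",
    "pytorch_version": "1.9",
    "py_version": "py38",
}
# Pass these through to the existing model definition:
# huggingface_model = HuggingFaceModel(env=hub, role=role, **model_versions)
print(model_versions["transformers_version"])
```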

What is the output of gen_tex[0]['generated_text']? Is the , in the “parameters” on purpose?

Updated to the latest version, but it still ignores the parameter:

```python
{
    'inputs': "Can you please let us know more details about your",
    'parameters': {"min_length": 1000}
}
```

```python
[{'generated_text': 'Can you please let us know more details about your account?\n\nHello,\nI am interested in the above-mentioned company and I have read some very interesting articles about it. I am interested in starting the work. Please let me know if'}]
```

I guess the issue is only with min_length, since it works fine for max_length=3:

```python
[{'generated_text': 'Can you please let us know more details about your research'}]
```

And when you run model.generate with your parameters in a Colab or another environment with the same Transformers version, does it work?

Yes, it works fine when I use it locally with a from_pretrained model.

Hi Ali, what happens if you set the min_length and the max_length parameters explicitly? I’m asking because I believe the text generator uses the max_length parameter from the model configuration if you don’t set it explicitly. And if the max_length parameter in the config file is smaller than your min_length, the output will be truncated. So, just wondering if setting both (e.g. min_length=1000 and max_length=2000) helps?
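In other words, the request payload would look something like this (a sketch; the length values are illustrative):

```python
# Set both min_length and max_length explicitly so the model config's
# default max_length cannot cap the output below min_length.
payload = {
    "inputs": "Can you please let us know more details about your",
    "parameters": {"min_length": 1000, "max_length": 2000},
}
# gen_tex = predictor.predict(payload)  # requires the live SageMaker endpoint
print(payload["parameters"])
```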


Thank you both for your help. That solved the problem @marshmellow77.

Awesome, glad it worked. It’d be great if you could mark this thread as Answered/Solved - it would make it easier and quicker in the future for other users with the same problem to find the solution :slight_smile:

