Error in generating model output: InferenceClient.chat_completion() got an unexpected keyword argument 'last_input_token_count'

from smolagents import CodeAgent, InferenceClientModel

# model is an OpenAI model defined earlier (not shown in this snippet)
agent = CodeAgent(tools=[], model=model)

alfred_agent = agent.from_hub('sergiopaniego/AlfredAgent', token=hf_token, trust_remote_code=True)

alfred_agent.run("Give me best playlist for a party at the Wayne's mansion. The party idea is a 'villain masquerade' theme")

In the above code I am using an LLM from OpenAI in the agent instead of the Qwen model, but in alfred_agent I am using the default model provided by Hugging Face. When running the code I get the errors "AgentGenerationError: Error in generating model output: InferenceClient.chat_completion() got an unexpected keyword argument 'last_input_token_count'" and "TypeError: InferenceClient.chat_completion() got an unexpected keyword argument 'last_input_token_count'".

What can I do to use a GPT model for the LLM part (I am hitting the usage limit with the model provided by Hugging Face)? Can I use the agent provided by Hugging Face without any issue, or is there another solution?


For example, if you want to use an OpenAI LLM, I think you need to use OpenAIServerModel instead of InferenceClientModel.

Maybe like this.

import os

from smolagents import CodeAgent, OpenAIServerModel

# Route the agent's LLM calls to OpenAI's API instead of the Hugging Face Inference API
model = OpenAIServerModel(
    model_id="gpt-4o",
    api_base="https://api.openai.com/v1",
    api_key=os.getenv("OPENAI_API_KEY", None),
)
agent = CodeAgent(tools=[], model=model)
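
And if you also want the agent loaded from the Hub to use the GPT model instead of the default Hugging Face one, one option is to replace its model after loading. This is just a sketch, assuming the loaded agent exposes the model it uses as a plain model attribute:

import os

from smolagents import CodeAgent, OpenAIServerModel

# Assumes OPENAI_API_KEY is set in the environment and hf_token holds a valid Hugging Face token
model = OpenAIServerModel(
    model_id="gpt-4o",
    api_base="https://api.openai.com/v1",
    api_key=os.getenv("OPENAI_API_KEY", None),
)

alfred_agent = CodeAgent.from_hub('sergiopaniego/AlfredAgent', token=hf_token, trust_remote_code=True)

# Swap the saved InferenceClientModel for the OpenAI model before running
alfred_agent.model = model

alfred_agent.run("Give me best playlist for a party at the Wayne's mansion. The party idea is a 'villain masquerade' theme")

Since the saved model configuration is then no longer used at run time, this should also avoid the 'last_input_token_count' keyword being forwarded to InferenceClient.chat_completion(). If the error persists, upgrading smolagents and huggingface_hub to matching recent versions may be worth trying too.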