Problem arose after one week: Bad request: "Task not found for this model"

Hello, sorry for the unusual request. I'm working on an assignment where we need to fine-tune a model using Unsloth and then publish a UI on Hugging Face. I created my model, "davnas/Italian_Cuisine_1.2", using the Unsloth Colab, and I successfully uploaded it to Hugging Face.

Last week, everything was working fine. However, yesterday, after switching from my friend’s model back to mine, I started encountering the following error:

Bad request: Task not found for this model

Even after attempting to revert to the original setup, the problem persists.

Could someone please help me troubleshoot this issue? Apologies for any mistakes in the description—I’m still learning and just getting started. Thank you!

I usually perform inference with

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # Choose any! We auto support RoPE scaling internally!
dtype = None  # None for auto detection. Float16 for Tesla T4/V100, bfloat16 for Ampere+

model_name_or_path = "davnas/Italian_Cousine_1.2"

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name_or_path,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=True,  # load the weights in 4-bit to save memory
    # token = "hf_...",  # if our model is not public
    # Use one if using gated models like meta-llama/Llama-2-7b-hf
)

and then run generation with:


from unsloth import FastLanguageModel
from transformers import TextStreamer

FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

messages = [
    {"role": "user", "content": "How can I cook a smoothie?"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # Must add for generation
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer, skip_prompt=True)  # stream tokens as they are generated, skipping the prompt

model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=128,
               use_cache=True, temperature=1.5, min_p=0.1)

The code used for training the model can be found here.

Lastly, the full error is:

===== Application Startup at 2024-12-07 09:20:41 =====

/usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:228: UserWarning: The 'tuples' format for chatbot messages is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
  warnings.warn(
* Running on local URL:  http://0.0.0.0:7860, with SSR ⚡

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 406, in hf_raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/davnas/Italian_Cousine_1.2/v1/chat/completions

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 622, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2016, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1581, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
    return await anext(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 796, in asyncgen_wrapper
    response = await iterator.__anext__()
  File "/usr/local/lib/python3.10/site-packages/gradio/chat_interface.py", line 667, in _stream_fn
    first_response = await async_iteration(generator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
    return await anext(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
    return next(iterator)
  File "/home/user/app/app.py", line 30, in respond
    for message in client.chat_completion(
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/inference/_client.py", line 842, in chat_completion
    data = self.post(model=model_url, json=payload, stream=stream)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/inference/_client.py", line 305, in post
    hf_raise_for_status(response)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 460, in hf_raise_for_status
    raise _format(BadRequestError, message, response) from e
huggingface_hub.errors.BadRequestError: (Request ID: BwTVVHXLmrcuAlpoxR2bH)

Bad request:
Task not found for this model
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 406, in hf_raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/davnas/Italian_Cousine_1.2/v1/chat/completions

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 622, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2016, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1581, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
    return await anext(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 796, in asyncgen_wrapper
    response = await iterator.__anext__()
  File "/usr/local/lib/python3.10/site-packages/gradio/chat_interface.py", line 667, in _stream_fn
    first_response = await async_iteration(generator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
    return await anext(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
    return next(iterator)
  File "/home/user/app/app.py", line 30, in respond
    for message in client.chat_completion(
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/inference/_client.py", line 842, in chat_completion
    data = self.post(model=model_url, json=payload, stream=stream)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/inference/_client.py", line 305, in post
    hf_raise_for_status(response)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 460, in hf_raise_for_status
    raise _format(BadRequestError, message, response) from e
huggingface_hub.errors.BadRequestError: (Request ID: GueVsTq3CZ9CNO2c2jVEg)

Bad request:
Task not found for this model


The Serverless Inference API is currently degraded, and serving has been turned off for all but the most popular models. Since Unsloth models are relatively popular, yours may still work in some cases…
This is probably the cause.
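If you want to check whether that is what is happening, you can ask the serverless API for the model's deployment status. A minimal sketch, assuming a recent huggingface_hub (get_model_status is a method of InferenceClient; the repo id is the one from the traceback above):

from huggingface_hub import InferenceClient

client = InferenceClient()  # pass token="hf_..." if the repo is private
# Returns a ModelStatus with fields such as loaded, state and framework;
# if the model is not deployed on the serverless API, this call may error instead.
status = client.get_model_status("davnas/Italian_Cousine_1.2")
print(status)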

There has also been a long-standing problem where the API does not work properly if the model repo lacks a proper README.md. This is the configuration problem mentioned below.
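For reference, "Task not found for this model" usually means the API cannot work out which task (pipeline) the repo should be served with, and that task is read from the YAML metadata at the top of the model card (README.md). A minimal sketch of setting it from Python, assuming the missing piece is the pipeline_tag / library_name metadata (metadata_update is part of huggingface_hub):

from huggingface_hub import metadata_update

# Write pipeline_tag / library_name into the model card's YAML metadata
# so the Inference API knows to serve the repo as a text-generation model.
metadata_update(
    "davnas/Italian_Cousine_1.2",
    {"pipeline_tag": "text-generation", "library_name": "transformers"},
    overwrite=True,
    token="hf_...",  # a write token for the repo owner
)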

So, what are our options then? I’m also very new and got here through a tutorial. I’m not sure what the alternative is to the Inference API, or if I would have to migrate my project somewhere else.


To be honest, there is no substitute for the Inference API. There are plenty of paid services online, but we’ll exclude them this time.

What you can do for free on HF is use the Inference API from Gradio, and this is still relatively easy even for new user-created models, as long as the model size is within 10GB (there is a limit here).
For more details, please see below.
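As a concrete illustration, a Space can wrap the hosted model directly with gr.load, which builds a ready-made demo around the Inference API. A minimal sketch, assuming the model is still being served for free (the "models/" prefix tells Gradio to load it from the Hub):

import gradio as gr

# Build a demo around the hosted model; pass a token if the repo is private
# (the argument is named token on Gradio 5, hf_token on older versions).
demo = gr.load("models/davnas/Italian_Cousine_1.2")
demo.launch()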

Also, if you want high-speed inference, the Pro subscription ($9 a month) comes with 10 units of ZeroGPU Space quota, which is convenient. But it's quite tricky to use, and there is also a 25-minute GPU usage limit per day, so it's not as easy as the Inference API. It can do a lot of things, though, and is extremely powerful…
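The tricky part of ZeroGPU is mostly that the GPU is only attached while a decorated function is running. A rough sketch of the pattern, assuming the repo contains merged transformers weights and the Space has the spaces package installed:

import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "davnas/Italian_Cousine_1.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
model.to("cuda")  # on ZeroGPU this is deferred until a GPU is actually attached

@spaces.GPU  # a GPU is allocated only while this function executes
def generate(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)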

Thanks for all the help. I'm kind of stuck on Gradio, though, because I'm getting this error. I'm not sure if there's something wrong with how I made the original repo.


Never mind, I got it working. Apparently, in app.py the hf_token parameter needs to be replaced with token.
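In case it helps anyone else, a minimal sketch of what that change looks like, assuming app.py wraps the model with gr.load (Gradio 5 renamed the hf_token argument to token):

import os
import gradio as gr

# Old: gr.load("models/davnas/Italian_Cousine_1.2", hf_token=os.environ["HF_TOKEN"])
demo = gr.load("models/davnas/Italian_Cousine_1.2", token=os.environ["HF_TOKEN"])
demo.launch()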


Is there any point at which we can expect the Inference API to be back up? I tried getting it to work with the Gradio Space, but it didn't seem to help much.


I don’t think it will be restored, because the problem is caused by a shortage of shared resources.

Well… darn. I guess I'll just have to either figure out how to modify my code to work with the Gradio Space or figure out how to work with the Pro features. Thanks for the help either way.
