Hi,
I have a model that was fine-tuned using Unsloth. I’m trying to host it on an inference endpoint, but I get the following error when the model is initialised on a T4 GPU with 64 GB RAM:
8:50.284772Z","level":"INFO","fields":{"message":"Waiting for shard to gracefully shutdown"},"target":"text_generation_launcher","span":{"rank":1,"name":"shard-manager"},"spans":[{"rank":1,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:50.286334Z","level":"INFO","fields":{"message":"Terminating shard"},"target":"text_generation_launcher","span":{"rank":2,"name":"shard-manager"},"spans":[{"rank":2,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:50.286333Z","level":"INFO","fields":{"message":"Terminating shard"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:50.286883Z","level":"INFO","fields":{"message":"Waiting for shard to gracefully shutdown"},"target":"text_generation_launcher","span":{"rank":2,"name":"shard-manager"},"spans":[{"rank":2,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:50.286915Z","level":"INFO","fields":{"message":"Waiting for shard to gracefully shutdown"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:50.688669Z","level":"INFO","fields":{"message":"shard terminated"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:50.879857Z","level":"INFO","fields":{"message":"shard terminated"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:50.886106Z","level":"INFO","fields":{"message":"shard terminated"},"target":"text_generation_launcher","span":{"rank":1,"name":"shard-manager"},"spans":[{"rank":1,"name":"shard-manager"}]}
{"timestamp":"2024-05-20T05:28:51.088917Z","level":"INFO","fields":{"message":"shard terminated"},"target":"text_generation_launcher","span":{"rank":2,"name":"shard-manager"},"spans":[{"rank":2,"name":"shard-manager"}]}
Error: WebserverFailed
Here is a link to the repository.
Is it due to erroneous settings in the inference endpoint, or is the uploaded repository missing something? Any help would be appreciated.