ShardCannotStart Error when launching a dedicated endpoint

I was trying to launch my fine-tuned Mixtral 8x7B model as a dedicated endpoint and got the error below. How do I fix it? The model is the base model with a trained LoRA adapter merged into it.




Server message: Endpoint failed to start.

ver/models/flash_mistral.py", line 335, in __init__
    model = model_cls(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mixtral_modeling.py", line 812, in __init__
    self.model = MixtralModel(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mixtral_modeling.py", line 749, in __init__
    [
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mixtral_modeling.py", line 750, in <listcomp>
    MixtralLayer(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mixtral_modeling.py", line 684, in __init__
    self.self_attn = MixtralAttention(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mixtral_modeling.py", line 224, in __init__
    self.query_key_value = load_attention(config, prefix, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mixtral_modeling.py", line 117, in load_attention
    return _load_gqa(config, prefix, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mixtral_modeling.py", line 144, in _load_gqa
    assert list(weight.shape) == [
AssertionError: [3145728, 1] != [3072, 4096]
"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
{"timestamp":"2024-03-13T02:22:11.292493Z","level":"INFO","fields":{"message":"Shard terminated"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
{"timestamp":"2024-03-13T02:22:11.431820Z","level":"INFO","fields":{"message":"Shard terminated"},"target":"text_generation_launcher","span":{"rank":2,"name":"shard-manager"},"spans":[{"rank":2,"name":"shard-manager"}]}
Error: ShardCannotStart
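
For what it's worth, here is a quick sanity check on the two shapes in the AssertionError, as a minimal sketch. The shapes are copied from the log; the helper names (`element_count`, `compare_shapes`) are hypothetical and not part of text-generation-inference. It only compares element counts, to see whether the loaded tensor could be a reshaped version of the expected one.

```python
from math import prod

# Shapes copied verbatim from the AssertionError in the log.
loaded_shape = [3145728, 1]
expected_shape = [3072, 4096]


def element_count(shape):
    """Total number of elements in a tensor with the given shape."""
    return prod(shape)


def compare_shapes(loaded, expected):
    """Hypothetical helper: compare element counts of two shapes."""
    loaded_n = element_count(loaded)
    expected_n = element_count(expected)
    return {
        "loaded_elements": loaded_n,
        "expected_elements": expected_n,
        "same_element_count": loaded_n == expected_n,
        "expected_over_loaded": expected_n / loaded_n,
    }


report = compare_shapes(loaded_shape, expected_shape)
for key, value in report.items():
    print(f"{key}: {value}")
```

The loaded tensor is flat and holds a quarter of the expected elements, so this does not look like a simple reshape problem. My guess (an assumption, nothing in the log confirms it) is that the merged weights were saved in some packed or quantized layout instead of full precision.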