I am trying to deploy a Hugging Face model to SageMaker but am getting the error below:
AttributeError: 'TensorParallelColumnLinear' object has no attribute 'base_layer'
Hi, I’m also trying to do the same. I’m following the instructions here to define my LORA_ADAPTERS variable:
# First I define my adapters
LORA_ADAPTERS=myadapter=/some/path/to/adapter,myadapter2=/another/path/to/adapter
# Then I run the following command
docker run --gpus '"device=0,1"' \
    --shm-size 1g -p 8000:80 \
    -v /opt/:/data ghcr.io/huggingface/text-generation-inference:3.0.0 \
    --model-id /path/to/Mixtral-8x7B-v0.1 \
    --lora-adapters $LORA_ADAPTERS \
    --num-shard 2 \
    --max-input-length 24500 \
    --max-total-tokens 32000 \
    --max-batch-prefill-tokens 32000 &
I am getting the same error. Curious if you tried a different docker run command?
I would include the error trace here, but TGI's logging is so verbose that I cannot retrieve it. All I see is a stream of floats from tensors printed to the console.
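One possible way to dig the trace out of the noise, as a rough sketch only: redirect the container's output to a file and search it for the Python traceback header. The function name extract_traceback and the file name tgi.log are hypothetical. This assumes the failure surfaces as a standard Python traceback somewhere in the log.

```shell
# Hypothetical helper: print every Python traceback header found in a log
# file, plus the N lines that follow it (default 40), with line numbers.
extract_traceback() {
  log_file="$1"
  context="${2:-40}"
  # -F treats the pattern as a fixed string, so the parentheses are literal.
  grep -n -F -A "$context" 'Traceback (most recent call last)' "$log_file"
}

# Usage (assumed workflow): append "> tgi.log 2>&1" to the docker run
# command so stdout and stderr both land in tgi.log, then run:
#   extract_traceback tgi.log
```

If nothing matches, searching the same file for AttributeError directly might be the next thing to try.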
I wonder if there’s an outdated package that I need to update or something trivial like that.