I have a Sentence Transformer model that generates embeddings for the user query. The ST model works fine locally, but when I build a Docker image and run it from Docker, it does not generate the embeddings.
I don’t see any error in the Docker logs either; it simply fails without any error. I have shared the code below.
from transformers import AutoTokenizer, AutoModel

model_name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
# tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
# model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

# sentences is the list of input strings to embed
inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True).to("cpu")
# with torch.no_grad():
embeddings = model(**inputs).last_hidden_state.mean(dim=1)  # Average pooling over tokens
I have tried giving the container privileged access and allocating maximum resources, and it still fails to run in Docker. I am completely at a loss as to why this is not working.
Can anyone please help me with this?
Thanks
Here are some tips to help with your debugging.
1. Verify Docker Configuration
- Python Environment: Ensure the Python version inside the container matches the one used locally (a version-check snippet follows this list).
- Installed Dependencies: Confirm that all required dependencies (transformers, torch, etc.) are installed in the Docker environment. Use a requirements.txt file to match your local setup.
- Device Compatibility: If running on a GPU, ensure the CUDA drivers and nvidia-container-runtime are configured properly. For CPU-only, confirm no GPU-specific configurations are interfering.
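A quick way to verify the first two points is to run the same version check locally and inside the container and compare the output (nothing here is specific to your setup beyond the libraries already mentioned):

import sys
import torch
import transformers

# Run this in both environments; any mismatch is a candidate culprit.
print("Python:", sys.version)
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())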
2. Code Adjustments
- Device Specification: In Docker, if you’re using a CPU-only environment, explicitly set the device to cpu. Example:
device = "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).to(device)
inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True).to(device)
embeddings = model(**inputs).last_hidden_state.mean(dim=1)
- Model Loading: If the model is not loading properly, try forcing the download of the pre-trained weights into an explicit cache directory:
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="/tmp/model_cache")
model = AutoModel.from_pretrained(model_name, cache_dir="/tmp/model_cache")
Adding cache_dir prevents issues with missing files inside the container. You can also pre-download the weights while building the image; see the sketch below.
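One way to do that (my own pattern, not something from the original post) is a small script run during docker build, so the weights are baked into the image and the container never needs to reach the Hugging Face Hub at runtime:

# prefetch_model.py -- invoked from the Dockerfile, e.g. RUN python prefetch_model.py
from transformers import AutoTokenizer, AutoModel

model_name = "sentence-transformers/all-MiniLM-L6-v2"
cache_dir = "/tmp/model_cache"  # same cache_dir as above; any persistent path works

# Downloading at build time bakes the files into an image layer;
# the runtime from_pretrained() calls then read purely from this local cache.
AutoTokenizer.from_pretrained(model_name, cache_dir=cache_dir)
AutoModel.from_pretrained(model_name, cache_dir=cache_dir)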
3. Check for Silent Failures
- Wrap your embedding code in a try-except block to catch potential silent errors:
try:
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True).to("cpu")
    embeddings = model(**inputs).last_hidden_state.mean(dim=1)
    print(embeddings)
except Exception as e:
    print(f"Error: {e}")
4. Ensure Docker Permissions
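One concrete thing to check here (my suggestion, assuming the /tmp/model_cache path from step 2): the cache directory must exist and be readable and writable by whatever user the container runs as. A quick in-container check:

import os

cache_dir = "/tmp/model_cache"  # assumed path from the earlier example
print("exists:  ", os.path.isdir(cache_dir))
print("readable:", os.access(cache_dir, os.R_OK))
print("writable:", os.access(cache_dir, os.W_OK))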
Hope this helps!