I am unable to `docker run` the text-embeddings-inference images (I have tried several) in my local Docker environment.
- Running H100 GPUs,
- Ubuntu host,
- recent Docker Engine install, CUDA 12.4,
- verified `LD_LIBRARY_PATH` exists and contains the proper directory,
- verified the directory is on `PATH`,
- checked that the symbolic link `libcuda.so.1` was created and points to the current version `libcuda.so.550.127.05`,
- symbolic link and file confirmed to exist in the proper directory (`/usr/lib/x86_64-linux-gnu`),
- able to access the file through the symbolic link with a non-elevated account,
- permissions on `libcuda.so.550.127.05` are 644,
- used `text-embeddings-inference:89-1.2` and `1.5` (and others).
 
Error received: `error while loading shared libraries: libcuda.so.1: cannot open shared object file`
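In case it helps anyone hitting the same symptom: `libcuda.so.1` is normally mounted into the container by the NVIDIA Container Toolkit at run time rather than shipped in the image, so one quick check is whether any CUDA container sees the driver at all. A rough sketch (the image tags are examples, and this assumes the toolkit is installed on the host):

```shell
# 1. Confirm the toolkit can inject the driver into a plain CUDA image:
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# 2. Check whether libcuda.so.1 is visible inside the failing TEI image
#    (override the entrypoint so the server itself never starts):
docker run --rm --gpus all --entrypoint ldconfig \
    ghcr.io/huggingface/text-embeddings-inference:1.5 -p | grep libcuda
```

If step 1 fails too, the problem is the host-side toolkit/runtime setup rather than the TEI image.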
This is an error that typically appears when the CUDA toolkit is not installed or the library path is not set up, but it sounds like it is installed…
Maybe the path reference is not resolving properly for a specific library.
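One way to see exactly which library fails to resolve is to run `ldd` on the server binary inside the container and look for "not found" entries. A sketch (the binary name `text-embeddings-router` and the image tag are assumptions on my part):

```shell
# List the dynamic dependencies of the server binary inside the image
# and surface any unresolved or CUDA-related entries:
docker run --rm --gpus all --entrypoint /bin/sh \
    ghcr.io/huggingface/text-embeddings-inference:1.5 \
    -c 'ldd "$(which text-embeddings-router)" | grep -iE "not found|cuda"'
```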
opened 04:58PM - 09 Feb 24 UTC
    # Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- [x] I carefully followed the [README.md](https://github.com/abetlen/llama-cpp-python/blob/main/README.md).
- [x] I [searched using keywords relevant to my issue](https://docs.github.com/en/issues/tracking-your-work-with-issues/filtering-and-searching-issues-and-pull-requests) to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the [Discussions](https://github.com/abetlen/llama-cpp-python/discussions), and have a new bug or useful enhancement to share.
# Expected Behavior
Expected: Probably loading all necessary files as requested
# Current Behavior
```
File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_cpp/llama_cpp.py", line 76, in _load_shared_library
2024-02-09 17:32:34     raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
2024-02-09 17:32:34 RuntimeError: Failed to load shared library '/home/worker/app/.venv/lib/python3.11/site-packages/llama_cpp/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory
```
# Environment and Context
I'm trying to set up privategpt in a Docker environment. In the Dockerfile, I specifically reinstalled the "newest" llama-cpp-python version, along with the necessary CUDA libraries, to enable GPU support. As this appears to be specifically a llama-cpp-python issue, I'm posting it here (too).
* Physical (or virtual) hardware you are using, e.g. for Linux:
`$ lscpu`
```
# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  24
  On-line CPU(s) list:   0-23
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 9 5900X 12-Core Processor
    CPU family:          25
    Model:               33
    Thread(s) per core:  2
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    BogoMIPS:            7386.18
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_g
                         ood nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy 
                         svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflu
                         shopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsav
                         e_vmload umip vaes vpclmulqdq rdpid
Virtualization features: 
  Virtualization:        AMD-V
  Hypervisor vendor:     Microsoft
  Virtualization type:   full
Caches (sum of all):     
  L1d:                   384 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    6 MiB (12 instances)
  L3:                    32 MiB (1 instance)
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Mitigation; safe RET
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
```
* Operating System, e.g. for Linux:
`$ uname -a` => Linux 1de939a0a313 5.15.133.1-microsoft-standard-WSL2
* SDK version, e.g. for Linux:
```
$ python3 --version => 3.11.6
$ make --version => 4.3
$ g++ --version => 12.2.0
``` 
Thank you for your response. I verified the CUDA toolkit is installed and the path works for a non-root account. I have two AI servers from Lambda Labs. The first server is running CUDA compilation tools 12.2.140 and the text-embeddings-inference container starts and runs fine. The second server is running 12.4.131 and fails to start with the error mentioned in the post above. Lambda Labs is also investigating.
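Since the same image works on the 12.2 server and fails on the 12.4 one, comparing the NVIDIA Container Toolkit setup on the two hosts may narrow it down. A sketch, assuming a standard toolkit install on both machines:

```shell
# Versions of the pieces responsible for mounting libcuda.so.1 into containers:
nvidia-smi --query-gpu=driver_version --format=csv,noheader
nvidia-ctk --version
docker info | grep -i runtime

# If the nvidia runtime is missing from `docker info`, (re)register it:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```

If the toolkit versions differ between the two servers, that would be the first thing I'd align.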

GTez
March 4, 2025, 12:29am

I was able to resolve this by using this in my docker compose:

```yaml
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids:
                - "0"
              capabilities:
                - gpu
```

I believe this explicitly tells Docker to use the NVIDIA driver and reserve GPU 0 for the container.
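For anyone launching with plain `docker run` instead of Compose, the equivalent flag should be `--gpus` with a device selector (the model id below is just an example):

```shell
# Reserve GPU 0 via the nvidia driver, mirroring the compose stanza above.
# Note the nested quoting: Docker requires the device=... form to be quoted.
docker run --rm --gpus '"device=0"' \
    ghcr.io/huggingface/text-embeddings-inference:1.5 \
    --model-id BAAI/bge-large-en-v1.5
```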