Hello,
I am trying to use SegFormer, specifically the TF version. The PyTorch model works with the sample code given here, but in the TF code, the following line:
from transformers import TFSegformerForSemanticSegmentation
model = TFSegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b4-finetuned-ade-512-512")
gives the following errors:
2022-08-02 22:06:38.581314: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-02 22:06:38.946033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13589 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
2022-08-02 22:06:40.230678: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8100
2022-08-02 22:06:40.870359: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "--version"'
2022-08-02 22:06:40.870862: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "--version"'
2022-08-02 22:06:40.871431: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas.exe --version
2022-08-02 22:06:40.875190: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d1063d44" "-o" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d10649e0" "-arch=sm_86" "--warn-on-spills"'
2022-08-02 22:06:40.875526: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
2022-08-02 22:06:41.366578: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2022-08-02 22:06:41.425569: I tensorflow/compiler/xla/service/service.cc:170] XLA service 0x27b18d09a10 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-08-02 22:06:41.425819: I tensorflow/compiler/xla/service/service.cc:178] StreamExecutor device (0): NVIDIA GeForce RTX 3080 Ti Laptop GPU, Compute Capability 8.6
2022-08-02 22:06:41.449017: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "--version"'
2022-08-02 22:06:41.449438: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "--version"'
2022-08-02 22:06:41.450899: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas.exe --version
2022-08-02 22:06:41.454757: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d10f140c" "-o" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d10f21d1" "-arch=sm_86" "--warn-on-spills"'
2022-08-02 22:06:41.475889: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "--version"'
2022-08-02 22:06:41.476139: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas.exe --version
2022-08-02 22:06:41.478855: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d10f772f" "-o" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d10f7fd2" "-arch=sm_86" "--warn-on-spills"'
2022-08-02 22:06:41.479058: W tensorflow/compiler/xla/service/gpu/buffer_comparator.cc:640] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda or modifying $PATH can be used to set the location of ptxas
This message will only be logged once.
2022-08-02 22:06:41.681546: W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2
/usr/local/cuda
.
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-08-02 22:06:41.704345: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "--version"'
2022-08-02 22:06:41.704644: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas.exe --version
2022-08-02 22:06:41.708784: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2, command: '"ptxas.exe" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d112f3e6" "-o" "C:\Users\xxx\AppData\Local\Temp\/tempfile-LAPTOP-H8L6H592-37bc-3780-5e546d1130200" "-arch=sm_86" "--warn-on-spills"'
2022-08-02 22:06:41.709103: F tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:456] ptxas returned an error during compilation of ptx to sass: 'INTERNAL: Failed to launch ptxas' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
after which the script fails.
I have verified that TensorFlow detects the GPU, and CUDA works without problems with other TF-based code samples. I have plenty of experience with TensorFlow but have not seen this behaviour before. I am using transformers=4.21.0, cudatoolkit=11.2.2, cudnn=8.1.0.77 and tensorflow=2.9.1.
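Since the log points at $PATH and XLA_FLAGS, a quick diagnostic might look like the sketch below. It only checks whether a ptxas executable is reachable and builds the flag string quoted in the log. The helper names find_ptxas and xla_flag_for are hypothetical, and any CUDA directory passed in is an assumption about the local install, not something confirmed by the log.

```python
import os
import shutil


def find_ptxas(extra_dirs=()):
    """Return the path to a ptxas executable, or None if it is not found.

    Looks on the current PATH first, then in each directory of extra_dirs
    (hypothetical candidates such as a CUDA toolkit's bin folder).
    """
    found = shutil.which("ptxas")
    if found:
        return found
    for directory in extra_dirs:
        found = shutil.which("ptxas", path=os.fspath(directory))
        if found:
            return found
    return None


def xla_flag_for(cuda_dir):
    """Build the XLA_FLAGS value suggested in the log for a given CUDA directory."""
    return "--xla_gpu_cuda_data_dir=" + os.fspath(cuda_dir)


if __name__ == "__main__":
    # Report what the current environment exposes, without changing anything.
    print("ptxas:", find_ptxas())
    print("XLA_FLAGS:", os.environ.get("XLA_FLAGS"))
```

If find_ptxas() returns None, that would match the "Call to CreateProcess failed. Error code: 2" lines above, which report that ptxas.exe could not be launched.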
Any tips or suggestions? Cheers!