I am trying to run FP8 training with accelerate.
I ran accelerate config, selected fp8, and left every other option at its default.
Then I ran accelerate test and got:
  line 87, in setup_fp8_env
    values = value.strip("()").split(",")
             ^^^^^^^^^^^
AttributeError: 'bool' object has no attribute 'strip'
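As far as I can tell, the error comes down to a bool reaching code that expects a string. A minimal reproduction outside accelerate, assuming the config passes False for fp8_override_linear_precision (I have not confirmed which value it actually passes):

```python
# Assumption: the config hands over a plain bool instead of a "(x, y, z)" string.
value = False

try:
    # Same expression as the failing line in setup_fp8_env.
    values = value.strip("()").split(",")
except AttributeError as exc:
    message = str(exc)
    print(message)
```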
This does not happen when using fp16.
I am using a 4090; any help is appreciated.
NVIDIA-SMI 555.58.02 Driver Version: 555.58.02 CUDA Version: 12.5
nvcc -V: 12.8
pip freeze:
accelerate==1.4.0
annotated-types==0.7.0
bitsandbytes==0.45.3
certifi==2025.1.31
charset-normalizer==3.4.1
diffusers==0.32.2
filelock==3.16.1
fsspec==2024.10.0
huggingface-hub==0.29.2
idna==3.10
importlib_metadata==8.6.1
Jinja2==3.1.4
MarkupSafe==2.1.5
mpmath==1.3.0
networkx==3.4.2
numpy==2.1.2
nvidia-cublas-cu12==12.8.3.14
nvidia-cuda-cupti-cu12==12.8.57
nvidia-cuda-nvrtc-cu12==12.8.61
nvidia-cuda-runtime-cu12==12.8.57
nvidia-cudnn-cu12==9.7.1.26
nvidia-cufft-cu12==11.3.3.41
nvidia-cufile-cu12==1.13.0.11
nvidia-curand-cu12==10.3.9.55
nvidia-cusolver-cu12==11.7.2.55
nvidia-cusparse-cu12==12.5.7.53
nvidia-cusparselt-cu12==0.6.3
nvidia-nccl-cu12==2.25.1
nvidia-nvjitlink-cu12==12.8.61
nvidia-nvtx-cu12==12.8.55
packaging==24.2
pillow==11.0.0
psutil==7.0.0
pydantic==2.10.6
pydantic_core==2.27.2
pytorch-triton==3.2.0+git4b3bb1f8
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.5.3
sympy==1.13.3
tokenizers==0.21.0
torch==2.7.0.dev20250310+cu128
torchaudio==2.6.0.dev20250310+cu128
torchvision==0.22.0.dev20250310+cu128
tqdm==4.67.1
transformer_engine==1.13.0
transformer_engine_cu12==1.13.0
transformer_engine_torch==1.13.0
transformers==4.49.0
typing_extensions==4.12.2
urllib3==2.3.0
zipp==3.21.0
EDIT: "solved"
I modified setup_fp8_env in accelerate/utils/launch.py to:
    if arg == "fp8_override_linear_precision":
        # The value can arrive as a single bool rather than a list,
        # so apply a lone value to all three passes (fprop, dgrad, wgrad).
        if not isinstance(value, list):
            value = [value, value, value]
        values = value
        current_env[prefix + "FP8_OVERRIDE_FPROP"] = str(values[0])
        current_env[prefix + "FP8_OVERRIDE_DGRAD"] = str(values[1])
        current_env[prefix + "FP8_OVERRIDE_WGRAD"] = str(values[2])
No idea if this is how it's supposed to be done.
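One gap in the workaround above: a string such as "(True, False, False)" is also not a list, so it would be copied three times instead of split. Here is a slightly more defensive sketch. The helper names normalize_override and set_override_env are hypothetical (they are not part of accelerate), and I am assuming the value can arrive as a bool, a "(x, y, z)" string, or a list/tuple. Applying a lone value to all three passes is my guess at the intended meaning, not documented behavior.

```python
def normalize_override(value):
    """Return three strings (fprop, dgrad, wgrad) from a bool, string, or sequence."""
    if isinstance(value, str):
        # Original code path: a string such as "(True, False, False)".
        parts = [p.strip() for p in value.strip("()[]").split(",")]
    elif isinstance(value, (list, tuple)):
        parts = [str(v) for v in value]
    else:
        # A lone bool (the case that raised AttributeError).
        parts = [str(value)]

    if len(parts) == 1:
        # Assumption: a single value applies to all three passes.
        parts = parts * 3
    if len(parts) != 3:
        raise ValueError(f"expected 1 or 3 values, got {len(parts)}: {value!r}")
    return parts


def set_override_env(value, current_env, prefix="ACCELERATE_"):
    """Write the three override flags into current_env and return it."""
    fprop, dgrad, wgrad = normalize_override(value)
    current_env[prefix + "FP8_OVERRIDE_FPROP"] = fprop
    current_env[prefix + "FP8_OVERRIDE_DGRAD"] = dgrad
    current_env[prefix + "FP8_OVERRIDE_WGRAD"] = wgrad
    return current_env


env = set_override_env(False, {})
print(env)
```

This keeps the string handling from the original line 87 and only adds the bool and list cases, so a config that already passes a string should behave as before.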