Download DeepSeek R1 685B locally for future fine-tuning

Hello,

I want to download deepseek-ai/DeepSeek-R1 for fine-tuning. Here is my code and the error I get:

import os

import torch
from transformers import AutoModel, AutoTokenizer

# Define the model and tokenizer
model_name = "deepseek-ai/DeepSeek-R1"
save_directory = "/Volumes/Flyte3/DeepSeek_V3_R1"

# Create the directory if it doesn't exist
os.makedirs(save_directory, exist_ok=True)


# Download and save the model with quantization config
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
model.save_pretrained(save_directory)

# Download and save the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.save_pretrained(save_directory)

print(f"Model and tokenizer saved to {save_directory}")

I get this error:

    model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/anaconda3/envs/hug/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/anaconda3/envs/hug/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3640, in from_pretrained
    config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/anaconda3/envs/hug/lib/python3.11/site-packages/transformers/quantizers/auto.py", line 181, in merge_quantization_configs
    quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/anaconda3/envs/hug/lib/python3.11/site-packages/transformers/quantizers/auto.py", line 105, in from_dict
    raise ValueError(
ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq']

Why is this error occurring? I don't want to quantize.


This is because DeepSeek's weights are stored in fp8, a quantization type that your installed version of transformers does not recognize. This sometimes happens with very large models. from_pretrained and save_pretrained are convenient, but they are not intended for simply downloading a model's files, since they also try to load the weights. It is safer to download using the following function.
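Here is a minimal sketch using huggingface_hub's snapshot_download, which copies the repository files to disk without ever instantiating the model, so the fp8 quantization config is never parsed. The save path is taken from your script; adjust it to your setup.

```python
# Download the raw repo files (config, tokenizer, safetensors shards)
# without loading the model into memory.
from huggingface_hub import snapshot_download

model_name = "deepseek-ai/DeepSeek-R1"
save_directory = "/Volumes/Flyte3/DeepSeek_V3_R1"

# snapshot_download resumes interrupted transfers, which matters for a
# multi-hundred-GB repository like this one.
local_path = snapshot_download(
    repo_id=model_name,
    local_dir=save_directory,
)
print(f"Model files saved to {local_path}")
```

For fine-tuning later, you can then point from_pretrained at the local directory (with a transformers version that supports fp8, or after dequantizing the weights).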

