Can someone help guide me on how to fine-tune a DeBERTa V3 model?

clasyc · April 7, 2024, 8:24am

I am interested in fine-tuning the following model: mdeberta-v3-base-squad2 on Hugging Face.

As I am a complete beginner on this topic, I would greatly appreciate any guidance.

I attempted to follow the documentation available at Custom Datasets for Question Answering with SQuAD 2.0 on Hugging Face, but encountered difficulties in loading the tokenizer. Below is the code snippet I used:

from transformers import DebertaV2Tokenizer
import torch
from datasets import load_dataset
from torch.utils.data import Dataset

squad_v2 = load_dataset("squad_v2")
tokenizer = DebertaV2Tokenizer.from_pretrained('timpal0l/mdeberta-v3-base-squad2')

However, I encountered the following error, suggesting an issue with locating the vocabulary file:

  File "C:\<...>venv\lib\site-packages\transformers\models\deberta_v2\tokenization_deberta_v2.py", line 130, in __init__
    if not os.path.isfile(vocab_file):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\genericpath.py", line 30, in isfile
    st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

I am unsure if this issue arises from attempting to use a DeBERTa V3 model with a DeBERTa V2 architecture. The repository’s configuration file mentions “DebertaV2ForQuestionAnswering,” leading me to believe it should be compatible. Could the problem be unrelated to the version discrepancy, or is there another potential issue at play?

Krillinkills · August 25, 2024, 9:12am

You can try:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('timpal0l/mdeberta-v3-base-squad2')
This worked for me.
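
For anyone landing here later, a minimal sketch of the AutoTokenizer suggestion above, extended to encode one question/context pair. The traceback in the question shows vocab_file arriving as None, which suggests the slow DebertaV2Tokenizer could not find a SentencePiece vocabulary file for this repository. AutoTokenizer picks the tokenizer class from whatever files the repository ships, which is probably why it loads. That explanation is an inference from the error, not something confirmed in this thread. The helper names load_qa_tokenizer and encode_example and the max_length of 384 are illustrative assumptions, not part of any library.

```python
from typing import Any


def load_qa_tokenizer(model_id: str = "timpal0l/mdeberta-v3-base-squad2") -> Any:
    """Load the tokenizer through AutoTokenizer instead of DebertaV2Tokenizer."""
    # Imported lazily so the helpers can be imported without triggering
    # a model download (hypothetical helper, not a transformers API).
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id)


def encode_example(tokenizer: Any, question: str, context: str,
                   max_length: int = 384) -> Any:
    """Encode a question/context pair for extractive question answering.

    max_length=384 is an assumed value, not taken from the thread.
    """
    # truncation="only_second" truncates the context (second sequence) and
    # leaves the question intact when the pair exceeds max_length.
    return tokenizer(
        question,
        context,
        truncation="only_second",
        max_length=max_length,
    )
```

To use it, call tokenizer = load_qa_tokenizer() and then encode_example(tokenizer, question, context) on each SQuAD v2 example. The first call downloads the tokenizer files, so it needs network access.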
