TypeError: '>' not supported between instances of 'NoneType' and 'int' - error while training DistilBERT

Hi,
I ran into an error while fine-tuning a DistilBERT model; a screenshot of the traceback is attached.

The code (everything except the data preprocessing) is below:

import pandas as pd
import numpy as np
import seaborn as sns
import transformers

from transformers import AutoTokenizer, TFBertModel, TFDistilBertModel, DistilBertConfig

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
d_bert = TFDistilBertModel.from_pretrained('distilbert-base-uncased')
bert = TFBertModel.from_pretrained('bert-base-uncased')


# Strip non-alphanumeric characters, then convert to plain Python lists
df_train = X_train.replace("[^0-9a-zA-Z]", " ", regex=True)
df_test = X_test.replace("[^0-9a-zA-Z]", " ", regex=True)
X_train_list = list(df_train['Message'])
X_test_list = list(df_test['Message'])
Y_train_list = list(Y_train)
Y_test_list = list(Y_test)


# Switch to the fast tokenizer that matches the DistilBERT checkpoint
from transformers import DistilBertTokenizerFast
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')


train_encodings = tokenizer(X_train_list, truncation=True, padding=True)
test_encodings = tokenizer(X_test_list, truncation=True, padding=True)


import tensorflow as tf

train_dataset_sl = tf.data.Dataset.from_tensor_slices((dict(train_encodings), Y_train_list))
test_dataset_sl = tf.data.Dataset.from_tensor_slices((dict(test_encodings), Y_test_list))

print(train_dataset_sl)


from transformers import TFDistilBertForSequenceClassification, TFTrainer, TFTrainingArguments

training_args = TFTrainingArguments(
    output_dir='./results',
    num_train_epochs=2,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10)


with training_args.strategy.scope():
    model = TFDistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=6)

trainer = TFTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset_sl,
    eval_dataset=test_dataset_sl)
trainer.train()

The dataset has 6 labels (0 to 5).
Can anybody help me resolve this issue?
Thanks in advance.


I came across the same problem; it seems to be a known issue, according to this Stack Overflow answer: deep learning - HUGGINGFACE TypeError: '>' not supported between instances of 'NoneType' and 'int' - Stack Overflow

Hey, I'm having the same problem. Did you solve it? I tried changing the transformers version, but that was not the solution.
I need your help, and I'm grateful for any collaboration.

I found a solution: you must add "eval_steps = 10" as an argument to TFTrainingArguments.
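
For anyone following along, here is a minimal sketch of that fix applied to the arguments from the original post (the value 10 is just an example; any positive integer should do, the point being that eval_steps is not left at its default of None):

from transformers import TFTrainingArguments

training_args = TFTrainingArguments(
    output_dir='./results',
    num_train_epochs=2,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
    eval_steps=10)  # without this, some versions of TFTrainer end up comparing None > int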

Was anybody able to find a solution to this problem? Adding "eval_steps = 10" as an argument to TFTrainingArguments does not work for me; the error originates from the optimizer.
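
In case it helps anyone stuck at the optimizer: a possible workaround (a sketch only, assuming the datasets and list variables from the original post) is to skip TFTrainer entirely and train with plain Keras, building the optimizer yourself with explicit integer step counts so nothing is left as None:

import tensorflow as tf
from transformers import TFDistilBertForSequenceClassification, create_optimizer

model = TFDistilBertForSequenceClassification.from_pretrained(
    'distilbert-base-uncased', num_labels=6)

batch_size = 8
num_epochs = 2
# Explicit integers for every step count the schedule needs
steps_per_epoch = len(X_train_list) // batch_size

# create_optimizer returns (optimizer, lr_schedule)
optimizer, _ = create_optimizer(
    init_lr=5e-5,
    num_train_steps=steps_per_epoch * num_epochs,
    num_warmup_steps=500)

loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
model.fit(train_dataset_sl.shuffle(1000).batch(batch_size),
          validation_data=test_dataset_sl.batch(16),
          epochs=num_epochs)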


Actually, this worked!! Thanks a lot!
