Map with batch=True gives ArrowInvalid error for mismatch in a column's expected length

I am tokenizing my dataset with a customized tokenize_function that tokenizes 2 different texts and then appends them together. This is the code:

# Load the datasets
data_files = {
    "train": "train_pair.csv",
    "test": "test_pair.csv",
    "val": "val_pair.csv"
datasets = load_dataset('csv', data_files=data_files)

# tokenize the dataset
def tokenize_function(batch):
    # Get the maximum length from the model configuration
    max_length = 512

    # Tokenize each text separately and truncate to half the maximum length
    tokenized_text1 = tokenizer(batch['text1'], truncation=True, max_length=int(max_length/2), add_special_tokens=True)
    tokenized_text2 = tokenizer(batch['text2'], truncation=True, max_length=int(max_length/2), add_special_tokens=True)

    # Merge the results
    tokenized_inputs = {
        'input_ids': tokenized_text1['input_ids'] + tokenized_text2['input_ids'][1:],  # exclude the [CLS] token from the second sequence
        'attention_mask': tokenized_text1['attention_mask'] + tokenized_text2['attention_mask'][1:]
    }
    return tokenized_inputs

# Tokenize the datasets
tokenized_datasets = datasets.map(tokenize_function, batched=True)

This code is generating this error:

ArrowInvalid: Column 3 named input_ids expected length 1000 but got length 1999

The error is misleading: it suggests that the input_ids column has length 1999, while it is impossible for the maximum length of this column to exceed 512. If I set batched=False, there is no error.
I also tried different batch sizes such as 8 and 25 (since the number of samples is divisible by 25), but it did not work.
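The error message is consistent with the `+` and `[1:]` in tokenize_function operating at the batch level rather than the token level: with batched=True each tokenizer output is a list of per-example token lists, so `list1 + list2[1:]` drops one *example* and appends the rest, turning a batch of 1000 rows into 1999 rows. A minimal sketch of the difference, using made-up token ids (101 standing in for [CLS], 102 for [SEP]):

```python
# With batched=True, each tokenizer output column is a list of lists:
# one inner list of token ids per example in the batch.
batch_ids_1 = [[101, 7, 102], [101, 8, 102], [101, 9, 102]]
batch_ids_2 = [[101, 4, 102], [101, 5, 102], [101, 6, 102]]

# Batch-level concatenation: `[1:]` drops the first *example* of the
# second column, and `+` appends the remaining examples, so a batch of
# N inputs produces 2N - 1 output rows (1000 -> 1999 in the error).
wrong = batch_ids_1 + batch_ids_2[1:]
assert len(wrong) == 5  # 3 + (3 - 1), not 3

# Per-sample merge: zip the two columns and concatenate the token lists
# of matching examples, dropping the leading [CLS] of the second one.
merged = [ids1 + ids2[1:] for ids1, ids2 in zip(batch_ids_1, batch_ids_2)]
assert len(merged) == 3
assert merged[0] == [101, 7, 102, 4, 102]
```

If this is the cause, rewriting the merge in tokenize_function as a per-sample zip (as in `merged` above, applied to both input_ids and attention_mask) should make the output column lengths match the batch size again.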

@SMMousavi did you solve this issue? If so, could you detail it below?