Trouble batch mapping dataset to tokenizer

I am working on the WMT14 de-en dataset and have been trying to tokenize it using batched mapping, but I seem to be doing something wrong, or I do not understand how the map function works with batching.

The following is my code:

from datasets import load_dataset
from transformers import GPT2TokenizerFast

dataset_de_en = load_dataset("wmt14", "de-en")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def tokenize(trans_sample):
    src_tokenized = tokenizer(trans_sample['en'])
    trg_tokenized = tokenizer(trans_sample['de'])
    return {'en': src_tokenized,
            'de': trg_tokenized}

tokenized = dataset_de_en['train']['translation'].map(tokenize, batched=True, batch_size=512)

Error: essentially, trans_sample is a list, and I would need a for loop to iterate through it to tokenize, which to my understanding would mean I am not using map batching properly here. Could someone please point me in the right direction on how to do this properly?

AttributeError: 'list' object has no attribute 'map'

You can find the answer by inspecting the processing function from the official translation notebook: transformers/ at 41a8fa4e14ae14405a69efc65bfc21c9daa71c1a · huggingface/transformers · GitHub
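For reference, here is a minimal sketch of what a batched processing function for this dataset could look like. It is not copied from the notebook; the names tokenize_batch, tokenize_wmt14_train, en_input_ids and de_input_ids are made up for illustration. The two points it tries to show: map is a method of the Dataset (dataset_de_en['train']), not of the column you get by indexing with ['translation'], and with batched=True the function receives a dict of columns, where 'translation' is a list of {'en': ..., 'de': ...} dicts. The function should return a flat dict of equal-length lists.

```python
def tokenize_batch(batch, tokenizer):
    # With batched=True, `batch` maps column names to lists of values.
    # For WMT14, batch["translation"] is a list of {"en": ..., "de": ...} dicts.
    en_texts = [pair["en"] for pair in batch["translation"]]
    de_texts = [pair["de"] for pair in batch["translation"]]

    # The tokenizer accepts a list of strings, so no explicit per-example loop
    # over tokenizer calls is needed.
    en_tokenized = tokenizer(en_texts)
    de_tokenized = tokenizer(de_texts)

    # Return a flat dict of lists; each key becomes a new column.
    # (Column names here are illustrative, not a library convention.)
    return {
        "en_input_ids": en_tokenized["input_ids"],
        "de_input_ids": de_tokenized["input_ids"],
    }


def tokenize_wmt14_train(batch_size=512):
    # Hypothetical helper: downloads the dataset and tokenizer when called.
    from datasets import load_dataset
    from transformers import GPT2TokenizerFast

    dataset_de_en = load_dataset("wmt14", "de-en")
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

    # Call map on the Dataset itself, not on dataset_de_en["train"]["translation"].
    return dataset_de_en["train"].map(
        tokenize_batch,
        batched=True,
        batch_size=batch_size,
        fn_kwargs={"tokenizer": tokenizer},
        remove_columns=["translation"],
    )
```

Calling tokenize_wmt14_train() should give back a dataset with the two new columns in place of 'translation'. Passing the tokenizer through fn_kwargs is just one option; a module-level tokenizer, as in the question, should work as well.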