[NEWBIE] Creating a custom dataset to fine-tune an existing model

Hi everyone,

I am trying to build my own dataset from my raw JSONL data (Lucapro/tx-data · Datasets at Hugging Face) as a first step toward fine-tuning a Helsinki-NLP model and getting a working PoC.

What I would like to accomplish is to train the model to translate the first column of my dataset into the second column.

I am struggling to create the dataset and to reference it in the run_translation script. I get an error when the dataset is loaded (you can see my loaded dataset here: Lucapro/tx-data-to-decode · Datasets at Hugging Face).
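As far as I understand, for custom files the run_translation example script expects JSON Lines input where each line is a dictionary with a "translation" key, whose value is another dictionary keyed by the two language codes. Below is a minimal sketch of the conversion I think is needed, using only the standard library. The column names `source` and `target`, the language codes `src` and `tgt`, and the function names are placeholders I made up for illustration; they are not the real names in my dataset.

```python
import json


def to_translation_record(row, source_col, target_col, source_lang, target_lang):
    """Wrap one two-column row in the {"translation": {...}} layout."""
    return {
        "translation": {
            source_lang: row[source_col],
            target_lang: row[target_col],
        }
    }


def convert_jsonl(in_path, out_path, source_col, target_col, source_lang, target_lang):
    """Rewrite a two-column JSONL file into translation-style JSONL.

    Returns the number of records written. Blank lines are skipped.
    All column names and language codes are supplied by the caller;
    the ones used in the usage note are hypothetical.
    """
    count = 0
    with open(in_path, encoding="utf-8") as fin, \
            open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            row = json.loads(line)
            record = to_translation_record(
                row, source_col, target_col, source_lang, target_lang
            )
            # ensure_ascii=False keeps non-ASCII text readable in the output
            fout.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

With this, something like `convert_jsonl("raw.jsonl", "train.jsonl", "source", "target", "src", "tgt")` would produce a file that, if my understanding is right, could be passed to the script with `--train_file train.jsonl --source_lang src --target_lang tgt`, where the language codes have to match the keys inside "translation".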

I am surely missing something and I am a bit stuck. Can anyone point me in the right direction so I can move forward with my PoC?