Programmatic Way to Tokenize Custom Text Columns

I’m having trouble tokenizing in a more programmatic way. I can’t figure out how to pass multiple arguments to the map function, and I can’t run map on just a particular column of the dataset. I also can’t use Python’s built-in map function because I can’t figure out how to make it return a Dataset object (although that could be my own ignorance). Any thoughts?
