Converting Input String to List (or Sequence) of Strings


I am working on a Named Entity Recognition project. The data I am working with is the Named Entity Recognition (NER) Corpus from Kaggle.

When I try to map the tokenize_and_align_labels function over the dataset, I get the following error: ArrowInvalid: Could not convert ‘[’ with type str: tried to convert to int64. I am pretty sure this is because every column has a dtype of string.

That is fine for the sentence column, but the two tag columns (POS and tag) should each hold a list of strings (or maybe a sequence of strings) rather than a single string.

How do I convert just those two columns to lists (or sequences) of strings?
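
To make the question concrete, here is a minimal sketch of the kind of conversion I have in mind. It assumes the tag columns hold stringified Python lists such as "['NNP', 'VBZ']" (which the ‘[’ in the error suggests). The column names, the sample row, and the helper parse_tag_columns are made up for illustration and may not match the real data.

```python
import ast

# Hypothetical row mimicking the CSV: every column is read in as a plain string.
rows = [
    {
        "Sentence": "John lives in London",
        "POS": "['NNP', 'VBZ', 'IN', 'NNP']",
        "Tag": "['B-per', 'O', 'O', 'B-geo']",
    },
]

def parse_tag_columns(row, columns=("POS", "Tag")):
    """Return a copy of the row with the stringified lists parsed into real lists."""
    parsed = dict(row)
    for col in columns:
        # literal_eval only evaluates Python literals, so it is safer than eval
        parsed[col] = ast.literal_eval(row[col])
    return parsed

converted = [parse_tag_columns(r) for r in rows]
print(converted[0]["POS"])
```

If the data lives in a Hugging Face Dataset, I assume the same function could be passed to dataset.map(parse_tag_columns), but I have not confirmed that this is the right approach.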



P.S. If you need any additional code to answer, let me know. This is my first post here!