JSON dump format for load_dataset

You can format your data as JSON Lines. As you said:

  • one record per line (they can be created via json.dumps from the standard lib for example)
  • no square brackets at the beginning/end

Moreover:

  • escape newlines inside string values as "\n" (json.dumps does this automatically), so that each record stays on one single line
  • nested fields are supported
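
To make the points above concrete, here is a minimal sketch of writing such a file with the standard library. The record fields (id, text, meta) and the file name data.jsonl are made up for illustration, and the helper names write_jsonl and read_jsonl are hypothetical, not part of any library.

```python
import json
import os
import tempfile

# Hypothetical example records; the field names are assumptions for illustration.
records = [
    {"id": 1, "text": "first line\nsecond line", "meta": {"lang": "en", "tags": ["a", "b"]}},
    {"id": 2, "text": "single line", "meta": {"lang": "fr", "tags": []}},
]


def write_jsonl(path, rows):
    """Write one JSON object per line, with no enclosing square brackets."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            # json.dumps escapes newlines inside strings as \n,
            # so each record stays on a single physical line.
            f.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Read the file back, parsing each non-empty line as its own JSON document."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Write to a temporary directory so the sketch does not touch the working directory.
path = os.path.join(tempfile.mkdtemp(), "data.jsonl")
write_jsonl(path, records)
```

A file written this way should then be loadable with something like load_dataset("json", data_files="data.jsonl"), with the nested meta field kept as a nested feature.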