How did you go from `df` to `summarized_data`? That bit could help explain.
Also, do you get the same error if you just convert the `df` directly, i.e. with `Dataset.from_pandas(df)`? If so, that may point to an issue with your input data file or its encoding.
I suspect you need to fix the unencoded chars in your input .csv file. Here is a bit more detail on a possible approach
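To make the encoding check concrete, here is a minimal sketch using only the Python standard library. It lists the lines of a file that do not decode as UTF-8, then rewrites the file as UTF-8. The helper names `find_bad_lines` and `rewrite_as_utf8` are made up for illustration, and `cp1252` is only a guess at the original encoding (a common one for CSVs exported from Excel on Windows), so adjust it to whatever your file actually uses.

```python
from pathlib import Path


def find_bad_lines(path, encoding="utf-8"):
    """Return (line_number, raw_bytes) for every line that fails to decode."""
    bad = []
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError:
                bad.append((lineno, raw))
    return bad


def rewrite_as_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode src as UTF-8 at dst.

    source_encoding is an assumption about the input file; bytes that are
    not valid in it are replaced with U+FFFD rather than raising.
    """
    text = Path(src).read_text(encoding=source_encoding, errors="replace")
    Path(dst).write_text(text, encoding="utf-8")
```

If `find_bad_lines("your_file.csv")` returns anything, those are the rows to inspect. After `rewrite_as_utf8`, you can load the cleaned file into the `df` again and retry the conversion.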