Can't create dataset with encoding

How did you get from df to summarized_data? Seeing that step could help explain the error.

Also, do you get the same error if you convert the df directly, i.e. with Dataset.from_pandas(df)? If so, that may point to an issue with your input data file or its encoding.

I suspect you need to fix the incorrectly encoded characters in your input .csv file. Here is a bit more detail on a possible approach
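As a rough sketch of one way to do this (not necessarily the only or best approach), you could first find the lines that fail to decode as UTF-8, then rewrite the file as clean UTF-8. The helper names `find_bad_lines` and `reencode_csv` are made up for illustration, and `cp1252` as the source encoding is only a guess; swap in whatever encoding your file actually uses.

```python
def find_bad_lines(path, encoding="utf-8"):
    """Return (line_number, raw_bytes) for each line that fails to decode."""
    bad = []
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError:
                bad.append((lineno, raw))
    return bad


def reencode_csv(src, dst, source_encoding="cp1252", target_encoding="utf-8"):
    """Rewrite src as dst in target_encoding.

    source_encoding is an assumption about the input file. Bytes that are
    invalid in that encoding become U+FFFD instead of raising an error.
    """
    with open(src, "rb") as f:
        text = f.read().decode(source_encoding, errors="replace")
    # newline="" keeps the original line endings untouched
    with open(dst, "w", encoding=target_encoding, newline="") as f:
        f.write(text)
```

If `find_bad_lines("your_file.csv")` returns an empty list, the file already decodes as UTF-8 and the problem is probably elsewhere. Otherwise, run `reencode_csv`, load the cleaned file into pandas, and try Dataset.from_pandas(df) again.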