why it decided to encode it as float instead
I think it probably made that decision based on the first few lines…
It seems that external libraries (Pandas and PyArrow) are used for parsing CSV and JSON, so the type inference probably comes from them. Options like on_bad_lines="skip" also seem to be passed straight through to them.
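
To illustrate the kind of behavior I mean, here is a minimal sketch using Pandas directly. It only shows what Pandas does by itself, and I am assuming (not confirming) that the loader ends up behaving the same way. The CSV contents and column names are made up for the example.

```python
import io

import pandas as pd

# Hypothetical data: "count" holds integers, but one value is missing.
csv_text = "id,count\n1,10\n2,\n3,30\n"
df = pd.read_csv(io.StringIO(csv_text))

# A missing value becomes NaN, which forces the whole column to float64.
inferred = {name: str(dtype) for name, dtype in df.dtypes.items()}
print(inferred)  # {'id': 'int64', 'count': 'float64'}

# Hypothetical data with one malformed row (three fields instead of two).
bad_csv = "a,b\n1,2\n3,4,5\n6,7\n"

# on_bad_lines="skip" is a pandas.read_csv option (pandas 1.3+);
# the row with too many fields is dropped instead of raising an error.
df_skipped = pd.read_csv(io.StringIO(bad_csv), on_bad_lines="skip")
n_rows = len(df_skipped)
print(n_rows)  # 2
```

If the column really should be an integer, passing an explicit dtype (for example the nullable "Int64") avoids relying on inference at all.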