Parser Error, ERROR: Exception in ASGI application

Hello, when I start the Autotrainer (using the UI), I get an error (ERROR: Exception in ASGI application). The last few lines of the log file read as follows:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 7: invalid start byte
File "parsers.pyx", line 2053, in pandas._libs.parsers.raise_parser_error
File "parsers.pyx", line 891, in pandas._libs.parsers.TextReader._check_tokenize_status
File "parsers.pyx", line 874, in pandas._libs.parsers.TextReader._tokenize_rows
File "parsers.pyx", line 663, in pandas._libs.parsers.TextReader._get_header
File "parsers.pyx", line 574, in pandas._libs.parsers.TextReader.cinit
self._reader = parsers.TextReader(src, **kwds

My dataset is basically a collection of C++ code stored in .parquet format. I am using only one column as the text input and running the trainer in LLM SFT mode. The dataset is local, and the code contains some backslashes. What could be causing this? Thank you!


pandas is throwing a character-encoding error, which looks troublesome.
If you're lucky, an update might fix it. If not, I think the only option is to pre-process the problematic parts of the dataset yourself.

pip install -U pandas
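
Before pre-processing anything, it may help to check what the file actually contains. Every frame in the traceback is in pandas._libs.parsers.TextReader, which is pandas' text (CSV) parser, so one possibility (an assumption, not something the log confirms) is that the binary Parquet file is being parsed as text. The sketch below uses only the standard library. The function names are made up for illustration, and the only format fact it relies on is that an unencrypted Parquet file starts and ends with the 4-byte magic PAR1.

```python
from pathlib import Path

# Magic bytes that open and close an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    if Path(path).stat().st_size < 8:
        return False  # too small to hold both magic markers
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, 2)  # 2 = seek relative to the end of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def first_invalid_utf8_offset(path):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    data = Path(path).read_bytes()  # reads the whole file; fine for a quick check
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None
```

If looks_like_parquet returns True and first_invalid_utf8_offset returns a small number, the file is probably fine as Parquet and the problem is more likely in how it is being read than in the C++ code or its backslashes.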

Sadly, that didn't work either. I'm going to try converting the file to CSV. Maybe that will work.
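
For anyone trying the same thing, here is a minimal sketch of that conversion with pandas. The file names and the column name "text" are placeholders rather than values from my setup, and pd.read_parquet needs a Parquet engine such as pyarrow or fastparquet installed. Writing with pandas' default CSV quoting keeps backslashes as literal characters and wraps multi-line snippets in quotes.

```python
import pandas as pd


def write_text_column_csv(df, column, csv_path):
    """Write a single column of a DataFrame to a UTF-8 CSV file."""
    # Default quoting wraps fields containing commas, quotes, or newlines;
    # backslashes are written unchanged.
    df[[column]].to_csv(csv_path, index=False, encoding="utf-8")


def parquet_column_to_csv(parquet_path, column, csv_path):
    """Read one column from a Parquet file and save it as UTF-8 CSV."""
    # Requires a Parquet engine (pyarrow or fastparquet) to be installed.
    df = pd.read_parquet(parquet_path, columns=[column])
    write_text_column_csv(df, column, csv_path)


if __name__ == "__main__":
    # Placeholder paths and column name; adjust to the real dataset.
    parquet_column_to_csv("train.parquet", "text", "train.csv")
```

Reading the result back with pd.read_csv("train.csv", encoding="utf-8") is a quick way to confirm the text survived the round trip before pointing the trainer at it.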
