Tokenizer.batch_encode_plus uses all my RAM

Are you sure it’s actually the encoding step that eats the RAM, and not some other part of your code? Could you post the full traceback?
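
If it helps, here is a minimal sketch of one way to check how much memory a single step allocates, using the standard-library `tracemalloc` module. `measure_peak` and `fake_encode` are hypothetical names I made up for illustration; `fake_encode` is only a stand-in, so you would swap in your real encoding call. One caveat: `tracemalloc` only sees allocations made through Python's memory allocator, so memory held by native extensions may not show up here.

```python
import tracemalloc


def measure_peak(fn, *args, **kwargs):
    """Run fn and return (result, peak bytes traced by tracemalloc during the call)."""
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def fake_encode(texts):
    """Hypothetical stand-in for the real encoding call: maps each string to code points."""
    return [[ord(c) for c in t] for t in texts]


texts = ["hello world"] * 10_000
encoded, peak = measure_peak(fake_encode, texts)
print(f"peak during encoding: {peak / 1e6:.1f} MB")
```

If the peak reported for the encoding step is small compared to what the whole process uses, the problem is probably somewhere else, for example in how the data is loaded or in how the results are kept around afterwards.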