Character-level attention with Longformer for sequence classification

Hey guys, I am trying to figure out how to use Longformer at the character level, which the paper also mentions. I looked at the docs but I can’t find how to set this up.

Can I just adjust my pre-processing so that, instead of tokenising:

“Hello, I like cake!”

the input to be tokenised is something like:

“H” “e” “l” “l” “o” “,” “I” “l” “i” “k” “e” “c” “a” “k” “e” “!”

and then the tokeniser will assign ids to every character?
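
To make the idea concrete, here is a rough sketch in plain Python of the pre-processing I have in mind. It does not use the Longformer tokeniser at all: the helper names `build_char_vocab` and `encode_chars` and the special tokens are made up for illustration, and spaces are kept as characters of their own.

```python
from typing import Dict, List

# Hypothetical special tokens, chosen only for this sketch.
PAD, UNK, CLS, SEP = "<pad>", "<unk>", "<s>", "</s>"


def build_char_vocab(texts: List[str]) -> Dict[str, int]:
    """Give every distinct character its own id, after the special tokens."""
    vocab = {tok: i for i, tok in enumerate([PAD, UNK, CLS, SEP])}
    for ch in sorted(set("".join(texts))):
        vocab[ch] = len(vocab)
    return vocab


def encode_chars(text: str, vocab: Dict[str, int], max_len: int) -> Dict[str, List[int]]:
    """Split text into characters, map them to ids, then truncate and pad."""
    chars = list(text)[: max_len - 2]  # leave room for CLS and SEP
    ids = [vocab[CLS]] + [vocab.get(c, vocab[UNK]) for c in chars] + [vocab[SEP]]
    mask = [1] * len(ids)  # 1 = real token, 0 = padding
    pad = max_len - len(ids)
    return {
        "input_ids": ids + [vocab[PAD]] * pad,
        "attention_mask": mask + [0] * pad,
    }


vocab = build_char_vocab(["Hello, I like cake!"])
encoded = encode_chars("Hello, I like cake!", vocab, max_len=32)
```

As far as I understand, the pretrained Longformer checkpoints use the byte-level BPE vocabulary inherited from RoBERTa, so ids from a custom vocabulary like this one would not line up with the pretrained embeddings. I assume the embedding layer would have to be sized to `len(vocab)` and trained for this to work.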

Thanks.