Hey guys, I am trying to figure out how to use the Longformer at the character level. The paper mentions character-level use, but I looked at the docs and can’t find how to set it up.
Can I just adjust my pre-processing so that, instead of tokenising:
“Hello, I like cake!”
the input to be tokenised is something like:
“H” “e” “l” “l” “o” “,” “I” “l” “i” “k” “e” “c” “a” “k” “e” “!”
so that the tokeniser assigns an id to every character?
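To make the question concrete, here is a rough sketch in plain Python of the pre-processing I have in mind. The names `build_char_vocab` and `encode_chars`, the `<pad>`/`<unk>` entries and the id values are all made up for illustration and are not part of any library. Unlike my example above, this version keeps spaces as characters. Whether the Longformer tokeniser can be used like this is exactly what I am unsure about.

```python
def build_char_vocab(texts):
    """Map every distinct character seen in `texts` to an integer id."""
    chars = sorted({ch for text in texts for ch in text})
    # Reserve 0 for padding and 1 for unknown characters (an arbitrary choice).
    vocab = {"<pad>": 0, "<unk>": 1}
    for ch in chars:
        vocab[ch] = len(vocab)
    return vocab


def encode_chars(text, vocab):
    """Turn a string into one id per character, falling back to <unk>."""
    return [vocab.get(ch, vocab["<unk>"]) for ch in text]


sentence = "Hello, I like cake!"
char_vocab = build_char_vocab([sentence])
ids = encode_chars(sentence, char_vocab)

print(list(sentence))  # the individual characters, spaces included
print(ids)             # one id per character
```

A model fed these ids would need an embedding table sized to this character vocabulary, so I assume the pretrained weights could not be reused as they are.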
Thanks.