Hey guys, I am trying to figure out how to use the Longformer at the character level, which is also mentioned in the paper. I looked at the docs but I can't find what I am looking for.
Can I just adjust my pre-processing so instead of tokenising:
"Hello, I like cake!"
the input to be tokenised is something like:
"H" "e" "l" "l" "o" "," "I" "l" "i" "k" "e" "c" "a" "k" "e" "!"
and then the tokeniser will assign ids to every character?
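To make the idea concrete, here is a minimal sketch of the pre-processing I have in mind. It is plain Python and does not touch the Longformer tokeniser at all: `build_char_vocab`, `encode_chars`, `PAD_ID` and `UNK_ID` are hypothetical names I made up, and the way ids are assigned is only my assumption of what a character-level vocabulary could look like.

```python
# Hypothetical character-level pre-processing sketch (not the Longformer API).
PAD_ID = 0  # assumed id reserved for padding
UNK_ID = 1  # assumed id for characters missing from the vocabulary


def build_char_vocab(texts):
    """Give every distinct character its own integer id, starting at 2."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: idx for idx, ch in enumerate(chars, start=2)}


def encode_chars(text, vocab):
    """Turn a string into one id per character (spaces are kept as characters)."""
    return [vocab.get(ch, UNK_ID) for ch in text]


text = "Hello, I like cake!"
vocab = build_char_vocab([text])
ids = encode_chars(text, vocab)

print(list(text))  # the individual characters
print(ids)         # one id per character
```

In this sketch the spaces get ids too, which differs slightly from my example above where I left them out. I am not sure which of the two is the right thing to do.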
Thanks.