Thanks @lewtun and @yusukemori for your help!
I tried the method you mentioned in Jay Alammar’s post. It did work, but performance was weak (it was beaten by my “benchmark” of tf-idf + logistic regression), so I will try fine-tuning with Trainer instead. I may downsample the majority class, at least at first, to make it run faster.
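For the downsampling, here is a rough sketch of what I have in mind. It assumes the data is a list of dicts with a hypothetical `label` key; the function name and the toy data are made up for illustration, and this is plain Python rather than anything from the Trainer API:

```python
import random
from collections import defaultdict


def downsample_majority(examples, label_key="label", seed=0):
    """Randomly drop rows so every class has as many rows as the smallest class."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group the examples by their class label.
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex[label_key]].append(ex)

    # Size of the smallest class; larger classes are cut down to this.
    target = min(len(rows) for rows in by_label.values())

    balanced = []
    for rows in by_label.values():
        balanced.extend(rng.sample(rows, target))

    rng.shuffle(balanced)  # avoid having the classes in contiguous blocks
    return balanced


# Toy data (made up): 8 examples of class 0, 2 examples of class 1.
data = [{"text": f"neg {i}", "label": 0} for i in range(8)]
data += [{"text": f"pos {i}", "label": 1} for i in range(2)]

balanced = downsample_majority(data)
```

Balancing all the way down to the minority class size is probably the most aggressive option. A softer ratio might keep more signal, but this should be enough for a quick first run.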
Thanks again!