Train BERT on time-series data

@clems I am currently running a similar experiment and have posted my thoughts on it here. The main difference from your setup is that I have labels, so I don’t need the self-supervised pre-training that BERT used. Since it seems you are running the self-supervised training, were you able to obtain any results with your initial suggestion? I’d be interested to hear your findings.