Masking task with BERT on time series

LeoBigelli · October 21, 2024, 12:16pm

Hi everyone,
I started pre-training BERT on a masking task in the time series domain. I use a custom tokenization (not a usual model) and mask some samples with a special token. But during training the loss stays almost constant (around 6.7; I'm using SparseCategoricalCrossentropy).
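One sanity check on that number: a model that guesses every token with equal probability gets a cross-entropy of ln(vocab_size), so a loss stuck near 6.7 may just mean the predictions are close to uniform over roughly 800 tokens. Below is a minimal sketch of that check in plain Python. It assumes the loss is reported in nats and averaged only over masked positions; the helper names and the -100 "ignore" label are illustrative assumptions, not taken from my actual training code.

```python
import math

def uniform_baseline_loss(vocab_size):
    """Cross-entropy (in nats) of a model that assigns equal probability to every token."""
    return math.log(vocab_size)

def implied_vocab_size(loss):
    """Vocabulary size for which a uniform guess would produce the given loss."""
    return math.exp(loss)

def masked_sparse_cross_entropy(probs, labels, ignore_label=-100):
    """Mean negative log-likelihood over masked positions only.

    probs:  one probability distribution (list of floats) per position
    labels: integer target per position; ignore_label marks unmasked positions
            (hypothetical convention, adjust to your own pipeline)
    """
    total, count = 0.0, 0
    for dist, label in zip(probs, labels):
        if label == ignore_label:
            continue  # unmasked position: does not contribute to the loss
        total += -math.log(dist[label])
        count += 1
    return total / count if count else 0.0

# Which vocabulary size would make a uniform guess score about 6.7?
print(round(implied_vocab_size(6.7)))

# Toy example: uniform predictions over 4 tokens, middle position not masked.
uniform = [0.25, 0.25, 0.25, 0.25]
print(masked_sparse_cross_entropy([uniform] * 3, [1, -100, 3]))
```

If ln(vocab_size) for the custom vocabulary is close to 6.7, the model is probably not learning anything beyond a uniform guess, which would point at the learning rate, the label/mask alignment, or the logits-vs-probabilities setting of the loss rather than at the data itself.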
Could anyone help me?

Thanks, guys.
