Fairly new to ML and very new to transformers. Want to make sure I’m doing the right thing … I’m trying to do text classification with a small data set and thought this would be a good option (is it?).
Here are the basics of my code:
    texts = ["random text string...", ...]
    labels = [1, 0, ...]

    tokenized_sents = []
    attention_masks = []

    # Tokenize each text into a list of token ids
    for sentence in texts:
        tokenized_sents.append(tokenizer.encode(sentence, add_special_tokens=True, ...))

    # Pad all sequences to the same length
    input_ids = pad_sequences(tokenized_sents)

    # Mask is 1 for real tokens, 0 for padding
    for sentence in input_ids:
        att_mask = [int(token_id > 0) for token_id in sentence]
        attention_masks.append(att_mask)

    dataset = tf.data.Dataset.from_tensor_slices(input_ids, attention_masks, labels)

    # copied from https://huggingface.co/transformers/training.html
    optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5)
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    model.compile(optimizer=optimizer, loss=loss)
    model.fit(dataset, epochs=2, steps_per_epoch=115)
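For anyone following along, here is a minimal pure-Python sketch of what the padding and attention-mask step above produces. It assumes the pad id is 0 and that padding goes on the left, which is the default for Keras `pad_sequences`. The helper name `pad_and_mask` and the token ids are made up for illustration and are not part of any library.

```python
def pad_and_mask(sequences, pad_id=0):
    """Left-pad each token-id list to the longest length and build 0/1 masks.

    Hypothetical helper for illustration only; assumes at least one sequence.
    """
    max_len = max(len(seq) for seq in sequences)
    # Prepend pad ids so every row has max_len entries
    padded = [[pad_id] * (max_len - len(seq)) + list(seq) for seq in sequences]
    # 1 marks a real token, 0 marks padding
    masks = [[int(token_id != pad_id) for token_id in seq] for seq in padded]
    return padded, masks


# Illustrative token ids, not real tokenizer output
tokenized = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
input_ids, attention_masks = pad_and_mask(tokenized)
```

Every row of `input_ids` and `attention_masks` ends up with the same length, which is what lets them be stacked into one 2-D array.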
I was pretty confident this all worked, but then when I did the following test:
    sent = ["I like to watch movies"]
    sent = tokenizer.encode(sent[0], add_special_tokens=True, ...)
    att_mask = [int(token_id > 0) for token_id in sent]
    ds = tf.data.Dataset.from_tensor_slices(sent, att_mask)
    model.predict(ds)
I got a super long array. But the labels can only be 1 or 0 and there’s only one sample, so I was expecting a 1 by 2 array. Any idea why this doesn’t work?
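One possible explanation, offered as a guess rather than a confirmed diagnosis: `tf.data.Dataset.from_tensor_slices` slices its input along the first axis, so a flat list of N token ids becomes N separate one-token elements rather than one sample of length N. The pure-Python sketch below only mimics that slicing. `slice_first_axis` is a hypothetical stand-in, not a TensorFlow function, and the token ids are made up.

```python
def slice_first_axis(data):
    """Hypothetical stand-in for first-axis slicing: one element per entry along axis 0."""
    return [item for item in data]


# Illustrative token ids for a single sentence
token_ids = [101, 1045, 2066, 2000, 3422, 5691, 102]

# Flat list: every token id becomes its own element
flat_elements = slice_first_axis(token_ids)

# Wrapped in an outer list (a leading batch dimension): one element holding the whole sentence
wrapped_elements = slice_first_axis([token_ids])

print(len(flat_elements), len(wrapped_elements))
```

If that is what is happening, wrapping the encoded sentence in an outer list and batching the dataset before calling `model.predict` would be the thing to try.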
Also, what’s the best way to save this model and use it for predictions later?