T5 multilabel classification using tf

sreraku · March 28, 2023, 4:00am

I am trying to do multilabel classification with T5 on a corpus of labeled data.
After adding the task prefix to the front of each row, the data looks like this:

print(texts)
0 multilabel classification: how time changes th…
1 multilabel classification: hawaii has been in …
2 multilabel classification: not all alaskans ar…
3 multilabel classification: you should read rap…
4 multilabel classification: giving stupid kids …

I want to tokenize the rows above, and since there are multiple rows I am guessing I have to go in a loop. What I am trying to understand is how to get the input_ids and attention_mask: should I tokenize each row in a loop, or the entire text at once?
Also, am I doing it right by adding the prefix "multilabel classification:" to each row? I am confused about whether my assumption is wrong or whether this is the way to do it.

My code is:

src_tokenized = TOKENIZER.encode_plus(
    texts[0],
    max_length=SRC_MAX_LENGTH,
    pad_to_max_length=True,
    truncation=True,
    return_attention_mask=True,
    return_token_type_ids=False,
    return_tensors='tf'
)
src_input_ids = src_tokenized['input_ids']
src_attention_mask = src_tokenized['attention_mask']

t5_summary_ids = t5_model.generate(src_input_ids)

I feel I am doing it wrong by running it row by row, but I am not sure. I searched for examples, and every multilabel classification example I found uses PyTorch, not TensorFlow.

Appreciate all the help. TIA
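
On the loop question: a Hugging Face tokenizer can usually be called once on a list of strings, which returns input_ids and attention_mask for every row together, so a per-row loop should not be needed. Below is a minimal sketch of that idea, not a definitive implementation. The helper names add_prefix and tokenize_batch, the default max_length of 128, and the "t5-small" checkpoint in the commented usage are assumptions for illustration and do not come from the post. The rows are converted to a plain list because the tokenizer expects a str or a list of str rather than a pandas Series.

```python
from typing import Any, Iterable, List

TASK_PREFIX = "multilabel classification: "  # prefix used in the post


def add_prefix(rows: Iterable[str], prefix: str = TASK_PREFIX) -> List[str]:
    """Prepend the task prefix to every row, leaving rows that already have it."""
    return [r if r.startswith(prefix) else prefix + r for r in rows]


def tokenize_batch(tokenizer: Any, rows: Iterable[str], max_length: int = 128):
    """Tokenize all rows in a single call instead of looping row by row.

    `tokenizer` is assumed to be a Hugging Face tokenizer object; it is
    called directly, which handles a whole batch of strings at once.
    """
    return tokenizer(
        list(rows),               # plain list of str, not a pandas Series
        max_length=max_length,
        padding="max_length",     # newer spelling of pad_to_max_length=True
        truncation=True,
        return_attention_mask=True,
        return_tensors="tf",
    )


# Usage sketch (needs `transformers` and TensorFlow, so it is left commented out;
# the checkpoint name and the two example rows are hypothetical):
# from transformers import AutoTokenizer, TFT5ForConditionalGeneration
# tokenizer = AutoTokenizer.from_pretrained("t5-small")
# model = TFT5ForConditionalGeneration.from_pretrained("t5-small")
# batch = tokenize_batch(tokenizer, add_prefix(["first example row", "second example row"]))
# batch["input_ids"].shape  # (number_of_rows, max_length)
# outputs = model.generate(batch["input_ids"], attention_mask=batch["attention_mask"])
```

With a pandas Series like the texts shown in the post, something like tokenize_batch(TOKENIZER, texts.tolist(), SRC_MAX_LENGTH) would be the equivalent call. Passing the attention mask to generate lets the model ignore the padding positions.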
