I am fine-tuning codellama with PEFT, but I'm not sure how to use the task_type parameter of LoraConfig. Should it be CAUSAL_LM, SEQ_2_SEQ_LM, or something else? Does it have any effect?
The goal of my model is to split a sentence into its independent clauses. For example, it would insert a delimiter into this sentence: "the tea was on the stove and was at high temperature", separating the independent clause from the subordinate clause. My training data is all in a single column, and each row looks like the example below (the → and the delimiter are custom tokens I add to the tokenizer vocab, and each row ends with the EOS token):
“the tea was on the stove and was at high temperature → the tea was on the stove and was at high temperature ”
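For reference, a minimal sketch of how one such row could be assembled. The `ARROW` and `EOS` names and the `</s>` value are illustrative stand-ins, not the exact tokens from my data:

```python
# Stand-in token values (assumptions): the arrow joins source and target,
# and "</s>" is used here as a placeholder EOS string.
ARROW = "→"
EOS = "</s>"

def make_row(source: str, segmented: str) -> str:
    """Build one training row: raw sentence, arrow token, segmented
    sentence, then EOS so the model learns where generation stops."""
    return f"{source} {ARROW} {segmented} {EOS}"

row = make_row(
    "the tea was on the stove and was at high temperature",
    "the tea was on the stove and was at high temperature",
)
print(row)
```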
Overview of the supported task types:
- SEQ_CLS: Text classification.
- SEQ_2_SEQ_LM: Sequence-to-sequence language modeling.
- CAUSAL_LM: Causal language modeling.
- TOKEN_CLS: Token classification.
- QUESTION_ANS: Question answering.
- FEATURE_EXTRACTION: Feature extraction. Provides the hidden states, which can be used as embeddings or features for downstream tasks.