Bloom-560m pipeline task parameter invalid

Hi all,

I’m currently trying to run the “bigscience/bloom-560m” model through a transformers.pipeline for inference. I can’t seem to find the right “task” parameter to get the pipeline running; I always get errors along the lines of “The model ‘BloomModel’ is not supported for […]”.

I’ve tried several different task values with no luck. Does anyone know where to find which tasks a given model supports?

Incidentally, the model is tagged as “Text Generation” on the Hub, but that task doesn’t work either.
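One way to check, I think, is to look at the checkpoint’s config (it lists the head class the weights were exported with) and at the pipeline_tag on the Hub. A rough sketch, assuming huggingface_hub is installed:

from transformers import AutoConfig
from huggingface_hub import HfApi

# the config lists the architecture(s) the checkpoint was exported with,
# e.g. ['BloomForCausalLM'] -- that is the head class a pipeline expects
config = AutoConfig.from_pretrained("bigscience/bloom-560m")
print(config.architectures)

# the task shown on the model card ("Text Generation") is exposed as pipeline_tag
info = HfApi().model_info("bigscience/bloom-560m")
print(info.pipeline_tag)  # should print 'text-generation'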

from transformers import BloomModel, BloomTokenizerFast, pipeline

class bloom_model:
    def __init__(self):
        self.tokenizer = BloomTokenizerFast.from_pretrained(
            "bigscience/bloom-560m"
            , low_cpu_mem_usage=True
        )
        
        self.model = BloomModel.from_pretrained("bigscience/bloom-560m")
        
        self.generator = pipeline(
            "text-generation",
            model = self.model,
            tokenizer = self.tokenizer
        )

Error:

> The model 'BloomModel' is not supported for text-generation. Supported models are [...]

Interestingly, the following pipeline definition works:

self.generator = pipeline(
    "text-generation",
    model="bigscience/bloom-560m",
    tokenizer=self.tokenizer
)

Even the pipeline tag in the model’s README (line 52) says “text-generation”, so this might be a bug?!
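As far as I can tell, this works because pipeline(), when given a repo id string, loads the checkpoint through the task’s auto class, so the working call above should be roughly equivalent to:

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# AutoModelForCausalLM resolves to BloomForCausalLM for this checkpoint,
# i.e. the variant with the language-modelling head that text-generation needs
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)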

For future reference, I was able to work around the problem by using the ‘BloomForCausalLM’ class from the transformers library instead of BloomModel (which is the bare transformer without a language-modelling head, which is presumably why the text-generation pipeline rejects it), resulting in:

from transformers import BloomForCausalLM, BloomTokenizerFast, TextGenerationPipeline

class bloom_model:
    def __init__(self):
        # from_pretrained takes the repo id as its first positional argument,
        # not a "model" keyword
        self.tokenizer = BloomTokenizerFast.from_pretrained("bigscience/bloom-560m")

        # low_cpu_mem_usage is a model kwarg (it needs accelerate installed),
        # so it belongs here rather than on the tokenizer
        self.model = BloomForCausalLM.from_pretrained(
            "bigscience/bloom-560m",
            low_cpu_mem_usage=True,
        )

        self.generator = TextGenerationPipeline(
            task="text-generation",
            model=self.model,
            tokenizer=self.tokenizer,
        )

Then, simply add a predict method to use the pipeline for inference, e.g.:

    def predict(self, content: str) -> str:
        # the text-generation pipeline returns a list of dicts,
        # one per generated sequence
        answer = self.generator(content, max_new_tokens=100)
        return answer[0]["generated_text"]
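
To try it out (the prompt string below is just a made-up example):

bm = bloom_model()
print(bm.predict("The weather today is"))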

Does anyone know how to fine-tune this model?