How to make `pipeline` automatically scale?

Hi, I have trained a model for text classification. When I load it with the transformers pipeline, it works well. The problem comes when I pass it a very large list of sentences as input: I get a CUDA out-of-memory error. When I process each example one by one in a for loop, I don't get this error.

Is there an option to pass when instantiating the pipeline() object that makes it run predictions on a very large set of inputs automatically (for example, by setting a batch size and iterating over the batches)? Or do I have to code this myself?

@sgugger
Thanks.

No, you will have to code this yourself; the pipeline API is not designed to handle a large number of inputs automatically.
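Something like this minimal sketch would do (the model name and batch size here are just placeholders, not anything from this thread):

from transformers import pipeline

# Hypothetical checkpoint name; substitute your own fine-tuned model.
classifier = pipeline("text-classification", model="my-finetuned-model", device=0)

def predict_in_batches(texts, batch_size=32):
    """Run the pipeline over texts in fixed-size chunks to bound GPU memory."""
    results = []
    for i in range(0, len(texts), batch_size):
        results.extend(classifier(texts[i:i + batch_size]))
    return results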

I see, thanks. I think what I need are optimizations like ONNX Runtime, quantization, etc.
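For example, dynamic quantization in PyTorch would look roughly like this (the checkpoint name is a placeholder, and the quantized model runs on CPU):

import torch
from transformers import AutoModelForSequenceClassification

# Hypothetical checkpoint; substitute your fine-tuned model.
model = AutoModelForSequenceClassification.from_pretrained("my-finetuned-model")
model.eval()

# Replace the Linear layers with int8 dynamically quantized versions;
# activations are quantized on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)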

The only problem I have is that the HF ONNX converter can’t convert multi-label sequence classification models yet, AFAIK. Is it planned for a future release?

Here is a simple but fairly awful approach I'm using. My issue is that I can't figure out how to estimate the model's memory usage in advance, so I can't automatically determine an appropriate batch size. Any thoughts?

# taken from somewhere online
def chunk_list(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]


class ZeroShotPipelineMiniBatch:
    """Wraps a zero-shot pipeline and runs inputs through it in mini-batches."""

    def __init__(self, zs_pipeline, max_input_size=40):
        # Rough budget: total (sequence, label) pairs per forward pass.
        self.max_input_size = max_input_size
        self.pipeline = zs_pipeline

    def __call__(self, inputs, label_list, *args, **kwargs):
        # The zero-shot pipeline scores every sequence against every label,
        # so divide the budget by the label count (at least one per batch).
        batch_size = max(1, self.max_input_size // len(label_list))

        ret = []
        for chunk in chunk_list(inputs, batch_size):
            r = self.pipeline(chunk, label_list, *args, **kwargs)
            # A single input returns a dict, a batch returns a list of
            # dicts; normalize everything to one flat list.
            if isinstance(r, list):
                ret += r
            else:
                ret.append(r)
        return ret
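
For reference, usage would look something like this (the texts and labels are made up):

from transformers import pipeline

zs = pipeline("zero-shot-classification", device=0)
batched = ZeroShotPipelineMiniBatch(zs, max_input_size=40)

texts = ["The meeting moved to Friday.", "Stocks fell sharply today."]
labels = ["business", "scheduling", "sports"]
results = batched(texts, labels)  # batch_size = max(1, 40 // 3) = 13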