I see, so the pipeline is the suspect. That's about the only suspicious part here. Batching does increase the memory required, and it's also simply more prone to bugs.
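The point about batching and memory can be made concrete with a small sketch (plain Python, independent of `transformers`, with a hypothetical `iter_batches` helper): a pipeline running with `batch_size=B` materializes B inputs per forward pass, so peak activation memory on the device grows roughly linearly with B.

```python
def iter_batches(items, batch_size):
    """Yield successive slices of at most batch_size items.

    With batch_size=1 the model sees one input at a time; with a larger
    batch_size, every item in a slice is resident on the device at once,
    which is where the extra GPU memory goes.
    """
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 10 inputs with batch_size=4 -> batches of sizes 4, 4, 2
batches = list(iter_batches(list(range(10)), batch_size=4))
```

This is only a sketch of the chunking behavior, not the actual `transformers` implementation.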
opened 05:00PM - 08 Nov 21 UTC
closed 03:01PM - 18 Dec 21 UTC
I'm using a pipeline with feature extraction, and I'm guessing (based on the fact that it runs fine on the CPU but dies with out-of-memory on the GPU) that the `batch_size` parameter I pass in is ignored. Can a pipeline be used with a batch size, and what's the right parameter to use for that?
This is how I use the feature extraction:
```
# use pipelines and feature extraction
from transformers import pipeline

feature_extractor = pipeline(
    task="feature-extraction",
    model=model_args.model_name_or_path,
    config=config,
    tokenizer=tokenizer,
    framework="pt",
    batch_size=data_args.batch_size,
    truncation=True,
)
....
outputs = feature_extractor(inputs=predict_inputs, truncation=True)
```
@Narsil has been really helpful with pipelines, perhaps he knows the answer?