OwlV2 significantly slower than OwlVit

It appears that OwlV2 is much slower than OwlVit:

OwlVit:

pipeline = transformers.pipeline(model="google/owlvit-base-patch32", task="zero-shot-object-detection", device=device)
print("Total tiles:", total_tiles)

def yield_inputs():
    for tile in tiles:
        yield {
            "image": Image.fromarray(tile.tile).convert("RGB"),
            "candidate_labels": text_queries
        }

outputs = pipeline(yield_inputs())
print(len(list(outputs)))

Results in:

Total tiles: 440
CPU times: user 1min 32s, sys: 560 ms, total: 1min 33s
Wall time: 25.6 s

While OwlV2:

pipeline = transformers.pipeline(model="google/owlv2-base-patch16-ensemble", task="zero-shot-object-detection", device=device)
print("Total tiles:", total_tiles)

def yield_inputs():
    for tile in tiles:
        yield {
            "image": Image.fromarray(tile.tile).convert("RGB"),
            "candidate_labels": text_queries
        }

outputs = pipeline(yield_inputs())
print(len(list(outputs)))

Gives:

Total tiles: 440
CPU times: user 4min 11s, sys: 230 ms, total: 4min 11s
Wall time: 3min 3s

It seems like the biggest difference between these two pre-trained checkpoints is the patch size. If I understand correctly, a larger patch size means the image is split into fewer patches, so there are fewer tokens to run through the model and inference is faster. Is that correct? And is there a reason OwlV2 doesn't have a 32x32 patch model?
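
For context, here's the back-of-the-envelope math behind my assumption. The 768 and 960 input sizes are what I believe the two image processors resize to by default, so treat those numbers as my assumption rather than something I verified in the configs:

# Rough patch-count comparison. The input sizes (768 and 960) are assumed
# processor defaults, not verified against the checkpoint configs.
checkpoints = {
    "owlvit-base-patch32": {"image_size": 768, "patch_size": 32},
    "owlv2-base-patch16-ensemble": {"image_size": 960, "patch_size": 16},
}

for name, cfg in checkpoints.items():
    per_side = cfg["image_size"] // cfg["patch_size"]
    print(f"{name}: {per_side}x{per_side} = {per_side ** 2} patches")

# owlvit-base-patch32: 24x24 = 576 patches
# owlv2-base-patch16-ensemble: 60x60 = 3600 patches

If that's right, OwlV2 pushes roughly 6x as many patches per tile through the ViT (and self-attention cost grows faster than linearly with patch count), which seems consistent with the roughly 7x wall-time gap I'm seeing above.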