Finding Serverless Inference APIs that support attention outputs (output_attentions = true)

blancaster83 · March 19, 2024, 7:14pm

I’m trying to figure out which Serverless Inference APIs can return attention matrix data. I know this is controlled by the “output_attentions = true” setting, but I can’t tell whether it is enabled for a given API or model.

Is there a way to search or filter for models that have attention outputs pre-configured? Or can you recommend some Serverless Inference APIs that do support attention outputs?
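For context, here is a rough sketch of the kind of check I have in mind. It assumes a model's config.json may carry an "output_attentions" key and that a missing key means the default (off) applies. The helper name and the sample configs are made up for illustration, and this only reads the config default; it says nothing about whether a hosted endpoint would actually return attention data.

```python
import json


def attention_default(config_text: str) -> bool:
    """Return True if a config.json payload sets output_attentions to true."""
    config = json.loads(config_text)
    # Assumption: a missing key means the default (False) applies.
    return bool(config.get("output_attentions", False))


# Hypothetical config.json contents, invented for this example.
sample_enabled = '{"model_type": "bert", "output_attentions": true}'
sample_default = '{"model_type": "bert"}'

print(attention_default(sample_enabled))
print(attention_default(sample_default))
```

Doing this by hand for every model is not practical, which is why I'm hoping for a filter or a list.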

Thanks!
