I am currently using the Falcon model (falcon-7b-instruct), and its performance is quite satisfactory. My question is: can this model somehow be used to create embeddings of a text document, the way sentence-transformers or OpenAI's text-embedding-ada do?
Or is this model purely for text generation, meaning it cannot be used for text embedding at all?
Same issue here. I think it is a function/interface issue. I am new to the library, so it is possible I'm making a trivial mistake, but embedding extraction does not seem to work using a "vanilla" approach.
I am a bit unsure here, but the issue may be either pad/eos confusion in the Falcon tokenizer or, worse, an incompatibility with the feature-extraction pipeline. As far as I know, Falcon does not output embeddings directly, nor was it trained as a sentence transformer. A workaround I am trying now is to follow the sequence-classification approach: take the hidden state at the eos token instead of passing it on to the dense classifier layers.
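For what it's worth, the last-token pooling step can be sketched independently of the model itself. The function and dummy tensors below are my own illustration, not anything from the Falcon or transformers API; in practice you would feed it the last layer from `output_hidden_states=True` and the tokenizer's attention mask:

```python
import torch

def last_token_embedding(hidden_states: torch.Tensor,
                         attention_mask: torch.Tensor) -> torch.Tensor:
    """Pick the hidden state of each sequence's final non-padding token.

    hidden_states:  (batch, seq_len, hidden_dim) — e.g. the last layer
                    of a causal LM run with output_hidden_states=True.
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding.
    """
    # Index of the last real token in each sequence.
    lengths = attention_mask.sum(dim=1) - 1
    batch = torch.arange(hidden_states.size(0))
    return hidden_states[batch, lengths]        # (batch, hidden_dim)

# Demo on dummy tensors shaped like a transformer's final layer output.
hidden = torch.randn(2, 5, 8)                   # (batch, seq_len, dim)
mask = torch.tensor([[1, 1, 1, 0, 0],           # sequence 1: 3 real tokens
                     [1, 1, 1, 1, 1]])          # sequence 2: 5 real tokens
emb = last_token_embedding(hidden, mask)
print(emb.shape)  # torch.Size([2, 8])
```

Whether the last-token vector is a *good* sentence embedding is a separate question — models not trained with a contrastive or sentence-level objective often give mediocre vectors this way, so it's worth benchmarking against a dedicated embedding model.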