Internal link recommendation and anchor text selection with bert-large-uncased, cosine similarity

dejanseo · July 10, 2023, 4:34am

I’ve used bert-large-uncased embeddings with cosine similarity to build a similarity matrix and produce internal link recommendations for 5000 pages that don’t already link to each other (source → target). My next step is to select an existing word or short phrase on the “source” page to serve as anchor text for the “target” page.
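For concreteness, here is a minimal sketch of the recommendation step. It assumes each page has already been reduced to a single embedding vector. The tiny three-dimensional vectors, the page paths, and the `recommend_links` helper are made up for illustration and are not the actual pipeline, and existing links are treated as directed (source, target) pairs.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend_links(embeddings, existing_links, top_k=1):
    """For each source page, return the top_k most similar pages it does not already link to."""
    recs = {}
    for src, src_vec in embeddings.items():
        scored = []
        for tgt, tgt_vec in embeddings.items():
            # Skip self-links and source -> target pairs that already exist.
            if tgt == src or (src, tgt) in existing_links:
                continue
            scored.append((cosine(src_vec, tgt_vec), tgt))
        # Highest similarity first; ties broken by page path for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        recs[src] = [tgt for _, tgt in scored[:top_k]]
    return recs

# Toy stand-ins for page embeddings (hypothetical values).
pages = {
    "/a": [1.0, 0.0, 0.2],
    "/b": [0.9, 0.1, 0.3],
    "/c": [0.0, 1.0, 0.0],
}
links = {("/a", "/b")}  # "/a" already links to "/b"
recommendations = recommend_links(pages, links)
```

With 5000 pages the pairwise loop is about 25 million comparisons, so in practice a vectorized matrix product over normalized embeddings would be the more sensible choice.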

I’ve tried both gpt-4 and text-bison (PaLM 2) with various parameter changes, including temperature, top-k, top-p, and reduced output tokens, and no matter what I do, the recommended anchor text is often made up, i.e. it does not appear on the source page. I also tried simulating the API call in the PaLM 2 playground and reasoning with ChatGPT (GPT-4), and both got confused as well.
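A made-up anchor can at least be detected mechanically. The sketch below checks whether a proposed phrase occurs verbatim on the source page, ignoring case and whitespace differences. `find_anchor_span` and the sample text are hypothetical, and the model call itself is deliberately left out.

```python
import re

def find_anchor_span(source_text, anchor):
    """Return (start, end) of the anchor phrase in source_text, or None if it is absent.

    Matching ignores case and collapses runs of whitespace, and the phrase
    must not begin or end in the middle of a longer word.
    """
    words = anchor.split()
    if not words:
        return None
    body = r"\s+".join(re.escape(word) for word in words)
    pattern = r"(?<!\w)" + body + r"(?!\w)"
    match = re.search(pattern, source_text, flags=re.IGNORECASE)
    return match.span() if match else None

source = "Our guide covers internal linking strategy for large sites."
found = find_anchor_span(source, "Internal  linking strategy")  # present on the page
missing = find_anchor_span(source, "link building tips")        # not on the page
```

A suggestion that returns `None` can be rejected or retried, so only phrases that really exist on the source page get through.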

I then attempted to use bert-ner (which failed) and then some basic methods such as n-gram analysis and TF-IDF, but the recommendations, while often accurate at the keyword level, made no sense in the context of the full text of both pages.
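To illustrate the kind of n-gram baseline I mean, here is a rough sketch, not the exact code I ran. It takes every 1- to 3-gram from the source page and scores it by how often its terms occur on the target page. The function names, the stopword list, and the sample sentences are all assumptions for the example. Candidates are rebuilt from lowercased tokens, so punctuation inside a phrase is lost.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngram_candidates(tokens, max_n=3):
    """Yield every n-gram (as a tuple) of length 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])

def rank_anchors(source_text, target_text, stopwords, max_n=3, top_k=3):
    """Rank source n-grams by the summed target-page frequency of their terms."""
    target_counts = Counter(t for t in tokenize(target_text) if t not in stopwords)
    scored = {}
    for gram in ngram_candidates(tokenize(source_text), max_n):
        # Anchors that start or end on a stopword read badly, so skip them.
        if gram[0] in stopwords or gram[-1] in stopwords:
            continue
        score = sum(target_counts[t] for t in gram)  # Counter gives 0 for unseen terms
        if score > 0:
            scored[" ".join(gram)] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda g: (-scored[g], g))[:top_k]

stop = {"a", "to", "for", "we", "also", "the", "that", "how"}
source_page = "We also publish a guide to anchor text for beginners."
target_page = "Anchor text best practices: how to write anchor text that helps users."
ranked = rank_anchors(source_page, target_page, stop)
```

The score depends only on term counts, which is exactly why the output is right at the keyword level but blind to what the surrounding sentence on the source page is actually saying.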

Am I blind to an obvious solution here?
