How to embed relational information in a Transformer?

I am using a Transformer model for machine translation.

However, my input data has relational information: semantic structure expressed as an AMR graph (Abstract Meaning Representation). Example below:

This relational structure needs to be embedded in the Transformer's input data.

I want to use a Transformer, but the challenge is how to embed structural information in it. Is there any open-source artefact for a relational Transformer that I can use out of the box?
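To make the question concrete, here is roughly the kind of mechanism I mean by "embedding structural information": relation-aware self-attention, where each token pair gets a learned embedding for the AMR relation (or no relation) between them, and that embedding biases the attention scores. This is only a sketch of the idea, not code from an existing library; the class and variable names are illustrative.

```python
# Sketch (PyTorch): single-head self-attention with a learned bias per relation
# label, assuming a precomputed [seq_len, seq_len] matrix of relation ids built
# from the AMR graph (e.g. 0 = no relation, 1 = :ARG0, 2 = :ARG1, ...).
import math
import torch
import torch.nn as nn

class RelationAwareSelfAttention(nn.Module):
    def __init__(self, d_model: int, num_relations: int):
        super().__init__()
        self.d_model = d_model
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One learned vector per relation label; it biases the key side.
        self.rel_k = nn.Embedding(num_relations, d_model)

    def forward(self, x, rel_ids):
        # x:       [batch, seq_len, d_model]  token representations
        # rel_ids: [batch, seq_len, seq_len]  relation label between tokens i and j
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Content-based scores, as in vanilla self-attention.
        scores = torch.matmul(q, k.transpose(-1, -2))            # [B, L, L]
        # Structure-based scores: q_i . r_ij for every token pair.
        r = self.rel_k(rel_ids)                                  # [B, L, L, D]
        scores = scores + torch.einsum("bid,bijd->bij", q, r)
        attn = torch.softmax(scores / math.sqrt(self.d_model), dim=-1)
        return torch.matmul(attn, v)                             # [B, L, D]

layer = RelationAwareSelfAttention(d_model=64, num_relations=8)
x = torch.randn(2, 5, 64)                          # 2 sentences, 5 tokens each
rel_ids = torch.zeros(2, 5, 5, dtype=torch.long)   # fill from the AMR graph
out = layer(x, rel_ids)                            # [2, 5, 64]
```

A full encoder would use something like this inside every layer and with multiple heads; the point is that the graph enters through `rel_ids` rather than through the token string.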


Let me start by saying I’m still very much in the process of learning the architecture. I am going to take a stab at explaining my existing understanding, and if someone comes along and thinks I’m wrong, well, I’m certain they will say so, because this is the internet. The fastest route to a more precise answer is sharing your existing understanding and being perceived as ‘wrong’.

During training, the Transformer ‘stores’ information from the data set in its weights. This is similar to long-term memory in a brain; it encodes information permanently (unless overwritten later). Any data that can be inferred from the data set will in some way be captured in the relationships between those parameters. You’ll need to represent it pretty well; information you think is most important to your use case should probably be over-represented, but I’m not certain about that yet.

Transformers also use a sliding window of context, which works more like working memory or short-term potentiation. Again, any information contained in that window can be drawn on as the model reasons its way forward iteratively.

If you are specifically trying to teach it about a certain concept, you’ll need to represent that concept either in the training data set or in the sliding window of context.

Semantic information can and will be inferred from the data set, assuming you have represented that information there. For instance, the network learns a probability distribution for any given word token, and from that distribution it can be inferred, with a high degree of certainty, how similar that word is to other words (roughly, how likely one word is to stand in for another).
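For example, you can see this kind of learned similarity directly in the input embeddings of any pretrained model. A quick sketch using the `transformers` library; the checkpoint is only an example:

```python
# Sketch: word similarity from a pretrained model's learned input embeddings.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
emb = model.get_input_embeddings().weight          # [vocab_size, hidden_size]

def word_vector(word: str) -> torch.Tensor:
    ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    return emb[ids].mean(dim=0)                    # average over subword pieces

cos = torch.nn.functional.cosine_similarity
v_dog, v_cat, v_car = word_vector("dog"), word_vector("cat"), word_vector("car")
print(cos(v_dog, v_cat, dim=0).item())             # typically higher than...
print(cos(v_dog, v_car, dim=0).item())             # ...this pair
```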

So, basically, you should be able to describe the information you want it to know in plain English, add it to the context, and then ask it to infer things about that data. If it was taught a poor understanding of any of the concepts your explanation draws on, it will have a larger margin of error when working with the new information.
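Concretely, that could be as simple as verbalizing the graph's relations and prepending them to the source sentence. The triples and the separator format below are made up for illustration:

```python
# Sketch: put the relational information into the context as plain English.
triples = [("want-01", ":ARG0", "boy"), ("want-01", ":ARG1", "go-01")]
source = "The boy wants to go."

description = "; ".join(f"{r[1:]} of {head} is {dep}" for head, r, dep in triples)
augmented = f"relations: {description} | sentence: {source}"
print(augmented)
# relations: ARG0 of want-01 is boy; ARG1 of want-01 is go-01 | sentence: The boy wants to go.
```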

@Randolph the question is how to embed relational information in a Transformer.

The Transformer already has a notion of positional embedding, which takes into consideration where a certain word appears in the sentence.

But as I mentioned, that's not enough for my use case, and I want to embed more information.
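For example, one way to "embed more information" would be to add a third learned embedding per token, analogous to BERT's segment embeddings, carrying a structural label derived from the AMR graph. A rough sketch; the class name and label scheme are illustrative, not from a library:

```python
# Sketch: per-token "structural label" embedding on top of the usual
# token + positional embeddings. The label ids are assumed to come from
# your own AMR preprocessing (e.g. the role each token fills in the graph).
import torch
import torch.nn as nn

class StructuralEmbedding(nn.Module):
    def __init__(self, vocab_size, num_struct_labels, d_model, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.struct = nn.Embedding(num_struct_labels, d_model)

    def forward(self, token_ids, struct_ids):
        # token_ids, struct_ids: [batch, seq_len]
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.tok(token_ids) + self.pos(positions) + self.struct(struct_ids)
```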

Do you know if there is a Transformer from Hugging Face that can achieve this out of the box?
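One workaround that does use standard Hugging Face models is to linearize the AMR graph into its PENMAN string and treat it as ordinary text, registering the role labels as extra tokens and then fine-tuning a seq2seq model on (linearized AMR, sentence) pairs. A rough sketch; the checkpoint and the token list are only examples:

```python
# Sketch: feed a linearized AMR (PENMAN string) to a standard seq2seq model.
# Without fine-tuning on graph/sentence pairs the output will be meaningless;
# this only shows the plumbing.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Register AMR role labels so they are not split into meaningless subwords.
tokenizer.add_tokens([":ARG0", ":ARG1", ":mod", ":polarity"])
model.resize_token_embeddings(len(tokenizer))

amr = "( w / want-01 :ARG0 ( b / boy ) :ARG1 ( g / go-01 :ARG0 b ) )"
inputs = tokenizer(amr, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```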