Predict next embedding given sequence of embeddings

I’m planning a network that will take a sequence of embedding vectors as input, and produce several vectors as output.
Given a sequence of text embeddings produced by BERT, I want to predict the next embedding.
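To make the setup concrete, here is a minimal sketch of how one reading history could be turned into supervised (context, target) pairs. The function name `make_training_pairs` and the `min_context` parameter are hypothetical, and the 3-dimensional vectors are toy stand-ins for real sentence embeddings.

```python
from typing import List, Tuple

Vector = List[float]


def make_training_pairs(
    history: List[Vector], min_context: int = 1
) -> List[Tuple[List[Vector], Vector]]:
    """Split one reading history into (context, next-embedding) pairs.

    Hypothetical helper: each prefix of the history is a model input,
    and the embedding that follows the prefix is the prediction target.
    """
    pairs = []
    for i in range(min_context, len(history)):
        context = history[:i]   # embeddings read so far
        target = history[i]     # the embedding that came next
        pairs.append((context, target))
    return pairs


# Toy history: three texts read in order (values are made up).
history = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
pairs = make_training_pairs(history)
```

Each pair could then be fed to whatever sequence model is chosen; this only illustrates the data framing, not a specific architecture.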
Example input/output:

# The user first read a text with embedding a,
# and then read a text with embedding b.
input: [[a1, a2, a3], 
        [b1, b2, b3], ...]

# The network predicts that they will want to read something similar to x or y
# Once x and y are produced
# I will search a database of [text <> embedding] pairs to find relevant text.
output: {[x1, x2, x3], 
         [y1, y2, y3], ...}
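
The lookup step I have in mind could be sketched roughly as follows, assuming cosine similarity as the ranking measure (my assumption, other metrics may work too). The names `cosine_similarity` and `nearest_texts`, the toy `database`, and all vector values are hypothetical; a real setup would likely use a vector index rather than a full sort.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_texts(
    predicted: Sequence[float],
    database: List[Tuple[str, Sequence[float]]],
    k: int = 1,
) -> List[str]:
    """Return the k texts whose embeddings are most similar to `predicted`."""
    ranked = sorted(
        database,
        key=lambda pair: cosine_similarity(predicted, pair[1]),
        reverse=True,  # highest similarity first
    )
    return [text for text, _ in ranked[:k]]


# Toy [text <> embedding] database (values are made up).
database = [
    ("cooking pasta", [1.0, 0.0, 0.0]),
    ("training neural networks", [0.0, 1.0, 0.0]),
    ("fine-tuning transformers", [0.1, 0.9, 0.1]),
]

predicted_x = [0.0, 0.8, 0.2]  # stand-in for one predicted output vector
results = nearest_texts(predicted_x, database, k=2)
```

This would be run once per predicted vector (x, y, ...), and the retrieved texts merged afterwards.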

The input vectors are produced by Sentence-BERT.
Is HF a natural fit for this task? If so, where should I start? :slight_smile: