Hi. I am using a script much like this example. Some of my attributes are strings that should be tokenized and padded, but each sample also has a graph, which I convert into a PyG Data object, and those objects need to be batched as well.
Is there a way to override the batching process in Seq2SeqTrainer so that I can use the PyTorch Geometric (PyG) DataLoader?
If not, I could pass the node features and the adjacency matrix as tensors, but the adjacency matrix has a different size for each sample. How could I batch it in the preprocess_function?
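
To make the second option concrete, here is a rough sketch of what I have in mind. It zero-pads every adjacency matrix to a fixed size and keeps a mask of the real nodes. The names `pad_adjacency`, `node_mask`, and `MAX_NODES` are my own placeholders, not part of transformers or PyG, and I am assuming the adjacency matrices arrive as plain nested lists:

```python
from typing import List


def pad_adjacency(adj: List[List[int]], max_nodes: int) -> List[List[int]]:
    """Zero-pad (or truncate) a square adjacency matrix to max_nodes x max_nodes."""
    n = min(len(adj), max_nodes)
    padded = [[0] * max_nodes for _ in range(max_nodes)]
    for i in range(n):
        for j in range(n):
            padded[i][j] = adj[i][j]
    return padded


def node_mask(num_nodes: int, max_nodes: int) -> List[int]:
    """Return 1 for real nodes and 0 for padding positions."""
    n = min(num_nodes, max_nodes)
    return [1] * n + [0] * (max_nodes - n)


# Two toy graphs with different numbers of nodes (made-up data).
adj_a = [[0, 1], [1, 0]]
adj_b = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

MAX_NODES = 4  # hypothetical upper bound on nodes per graph

batch = [pad_adjacency(a, MAX_NODES) for a in (adj_a, adj_b)]
masks = [node_mask(len(a), MAX_NODES) for a in (adj_a, adj_b)]
```

After this, every matrix has the same shape, so I assume the default collation could stack them, but it wastes memory on large or sparse graphs and the model would have to respect the mask. That is why I would prefer the PyG batching if it is possible.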