How to process a dataset for bi-encoder models

I am trying to train a bi-encoder model to rank content by its relevance to a query. A bi-encoder passes the query and the content through the same model, one after the other. Unlike a traditional modeling pipeline, it has to process two streams of text, so the dataset has twice as many columns (input_ids, attention_mask, token_type_ids for each stream).

As far as I can tell, the Hugging Face Trainer and datasets library are written to work with a single input_ids column. Features such as dynamic padding and group by length only look at the input_ids column and do not consider the case where there is more than one input_ids column.

Is there an existing way to support this? If not, I am happy to contribute this feature.

These are the columns in my processed dataset: labels, click_input_ids, click_token_type_ids, click_attention_mask, cand_input_ids, cand_token_type_ids, cand_attention_mask.
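
To make the request concrete, here is a minimal sketch of the per-stream dynamic padding I have in mind. It is not an existing library feature: collate_bi_encoder and pad_to are hypothetical names, the pad token id of 0 is an assumption, and the sketch returns plain Python lists (a real collator would convert them to tensors). Each stream is padded to its own longest sequence in the batch.

```python
from typing import Dict, List

# Column prefixes and fields taken from the column list above.
PREFIXES = ("click", "cand")
FIELDS = ("input_ids", "token_type_ids", "attention_mask")


def pad_to(seq: List[int], length: int, pad_value: int) -> List[int]:
    """Right-pad seq with pad_value up to the given length (hypothetical helper)."""
    return list(seq) + [pad_value] * (length - len(seq))


def collate_bi_encoder(features: List[Dict], pad_token_id: int = 0) -> Dict[str, list]:
    """Pad each text stream independently to its longest sequence in the batch.

    pad_token_id=0 is an assumption; use the tokenizer's actual pad token id.
    """
    batch = {"labels": [f["labels"] for f in features]}
    for prefix in PREFIXES:
        # The batch maximum is computed per stream, not across both streams.
        max_len = max(len(f[f"{prefix}_input_ids"]) for f in features)
        for field in FIELDS:
            key = f"{prefix}_{field}"
            # Only input_ids use the pad token; masks and type ids pad with 0.
            pad_value = pad_token_id if field == "input_ids" else 0
            batch[key] = [pad_to(f[key], max_len, pad_value) for f in features]
    return batch


# Toy batch with made-up token ids, only to show the resulting shapes.
features = [
    {"labels": 1,
     "click_input_ids": [101, 7, 102], "click_token_type_ids": [0, 0, 0],
     "click_attention_mask": [1, 1, 1],
     "cand_input_ids": [101, 102], "cand_token_type_ids": [0, 0],
     "cand_attention_mask": [1, 1]},
    {"labels": 0,
     "click_input_ids": [101, 102], "click_token_type_ids": [0, 0],
     "click_attention_mask": [1, 1],
     "cand_input_ids": [101, 5, 6, 102], "cand_token_type_ids": [0, 0, 0, 0],
     "cand_attention_mask": [1, 1, 1, 1]},
]
batch = collate_bi_encoder(features)
print(len(batch["click_input_ids"][0]), len(batch["cand_input_ids"][0]))
```

A function like this could presumably be passed to Trainer through its data_collator argument once it returns tensors, but it does not address group by length, which would still only consider one column.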