How to represent paginated documents as a single training data instance

Note: after doing some research, I posted a better-formulated version of this question here: How to represent paginated documents as a single instance of training data for whole document classification?