Below is the approach as I understand it. Is it correct?
Four columns/features from the DiscoFuse dataset will be the input to the encoder. The encoder will take these four columns as input and encode them into a sequence of hidden states. The decoder will then take these hidden states as input and decode them into a new sentence that fuses the two original sentences together.
The discourse_type, connective_string, has_coref_type_pronoun, and has_coref_type_nominal columns will not be used as input to the encoder or decoder. These columns provide additional information about each example, but they are not necessary for the task of sentence fusion.
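To make the data flow concrete, here is a minimal sketch of how I would turn one dataset row into a (source, target) text pair for an encoder-decoder model. It rests on assumptions that I have not confirmed: that the sentence columns are named `incoherent_first_sentence`, `incoherent_second_sentence`, `coherent_first_sentence`, and `coherent_second_sentence`, that the two unfused sentences form the encoder input, and that the fused text is the decoder target. The example row is made up for illustration, and the helper names are hypothetical.

```python
from typing import Dict, Tuple

# Assumed column names; check them against the actual dataset schema.
SOURCE_COLUMNS = ("incoherent_first_sentence", "incoherent_second_sentence")
TARGET_COLUMNS = ("coherent_first_sentence", "coherent_second_sentence")


def join_sentences(row: Dict[str, object], columns: Tuple[str, ...]) -> str:
    """Concatenate the non-empty values of the given columns with a space."""
    parts = [str(row.get(col) or "").strip() for col in columns]
    return " ".join(part for part in parts if part)


def to_seq2seq_pair(row: Dict[str, object]) -> Tuple[str, str]:
    """Build (encoder input text, decoder target text) from one row.

    The metadata columns (discourse_type, connective_string, ...) are
    deliberately ignored here.
    """
    return join_sentences(row, SOURCE_COLUMNS), join_sentences(row, TARGET_COLUMNS)


# Made-up row, only to show the shape of the data.
example_row = {
    "incoherent_first_sentence": "The storm was severe.",
    "incoherent_second_sentence": "The flight was cancelled.",
    "coherent_first_sentence": "The storm was severe, so the flight was cancelled.",
    "coherent_second_sentence": "",
    "discourse_type": "PAIR_CONN",
    "connective_string": "so",
}

source_text, target_text = to_seq2seq_pair(example_row)
print("source:", source_text)
print("target:", target_text)
```

The resulting pairs could then be tokenized and passed to any sequence-to-sequence model, with `source_text` as the input and `target_text` as the label.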
Please correct me if I am wrong. If this understanding is right, how should I implement this task in practice?