I am new to this community and in the early stages of my career as a data scientist. I am currently a data engineer and want to make the transition, but I have gaps to fill, so any help would be appreciated.
My problem is to answer some questions about scientific papers, but I don’t know what high-level approach to follow. The papers have more than 5000 words, so they don’t fit in a BERT-like model. I was wondering whether there is a way to extract the sentences from the full text that most likely contain the answer, and then pass them to a BERT model trained on the BoolQ dataset.
This approach may not be the optimal one, so any ideas or guidelines are more than welcome.
*So far, I have scraped the data from the papers and loaded it into a data frame (as a data engineer).
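
To make the extraction idea concrete, here is a rough sketch of the sentence-selection step only. It assumes a plain TF-IDF cosine score between the question and each sentence, which is my own placeholder and may well not be the best scoring method. The function names (`split_sentences`, `tokenize`, `top_sentences`) and the parameter `k` are hypothetical, and the sentence splitter is deliberately naive.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercase alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_sentences(question, paper_text, k=3):
    """Return the k sentences most similar to the question, in document order."""
    sentences = split_sentences(paper_text)
    docs = [tokenize(s) for s in sentences]
    n = len(docs)

    # Document frequency: in how many sentences each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency.
    idf = {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

    def vectorize(tokens):
        # Tokens never seen in the paper are dropped from the vector.
        tf = Counter(tokens)
        return {tok: tf[tok] * idf[tok] for tok in tf if tok in idf}

    q_vec = vectorize(tokenize(question))
    scored = [(cosine(q_vec, vectorize(doc)), i) for i, doc in enumerate(docs)]

    # Keep the k best-scoring sentences, then restore their original order.
    best = sorted(scored, key=lambda pair: -pair[0])[:k]
    return [sentences[i] for i in sorted(i for _, i in best)]
```

The selected sentences could then be joined into one short passage and given, together with the question, to the BoolQ-trained model. That second step is not shown here.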