Type of model for PubMed article processing

Hi everyone. I am new to NLP but not to machine learning, and HuggingFace seemed like a good place to start on my latest project. For graduate school, I am building a system that takes a PubMed article as input and outputs the research questions the authors asked and the steps they took to answer them.

I have been thinking about this for a while, and my best idea so far is a summarization-style transformer that, instead of summarizing the article, is trained to output the research tasks and steps. I think this fits because, as with summarization, the answer is not written verbatim in the article; it has to be inferred and recombined from the text.

If a summarization model isn’t the best fit, can anyone suggest what type of NLP system to use instead, or some first steps for approaching the problem? I have no NLP experience, so I mainly need advice on where to start.

For example:

Mignone, John L., et al. “Neural Stem and Progenitor Cells in Nestin‐Gfp Transgenic Mice.” Wiley Online Library, John Wiley & Sons, Ltd, 12 Jan. 2004, onlinelibrary.wiley.com/doi/10.1002/cne.10964.

"Neural stem cells generate a wide spectrum of cell types in developing and adult nervous systems. These cells are marked by expression of the intermediate filament nestin. We used the regulatory elements of the nestin gene to generate transgenic mice in which neural stem cells of the embryonic and adult brain are marked by the expression of green fluorescent protein (GFP). We used these animals as a reporter line for studying neural stem and progenitor cells in the developing and adult nervous systems. In these nestin-GFP animals, we found that GFP-positive cells reflect the distribution of nestin-positive cells and accurately mark the neurogenic areas of the adult brain. Nestin-GFP cells can be isolated with high purity by using fluorescent-activated cell sorting and can generate multipotential neurospheres. In the adult brain, nestin-GFP cells are ∼1,400-fold more efficient in generating neurospheres than are GFP-negative cells and, despite their small number, give rise to 70 times more neurospheres than does the GFP-negative population. We characterized the expression of a panel of differentiation markers in GFP-positive cells in the nestin-GFP transgenics and found that these cells can be divided into two groups based on the strength of their GFP signal: GFP-bright cells express glial fibrillary acidic protein (GFAP) but not βIII-tubulin, whereas GFP-dim cells express βIII-tubulin but not GFAP. These two classes of cells represent distinct classes of neuronal precursors in the adult mammalian brain, and may reflect different stages of neuronal differentiation. We also found unusual features of nestin-GFP–positive cells in the subgranular cell layer of the dentate gyrus. Together, our results indicate that GFP-positive cells in our transgenic animals accurately represent neural stem and progenitor cells and suggest that these nestin-GFP–expressing cells encompass the majority of the neural stem cells in the adult brain. J. Comp. Neurol. 469:311–324, 2004. 
© 2004 Wiley-Liss, Inc.

Generation and analysis of transgenic mice

Fragments of the nestin gene (gift from Drs. R. McKay and L. Zimmerman; Zimmerman et al., 1994; Josephson et al., 1998; Yaworsky and Kappen, 1999) were subcloned into the pBSM13+ vector. The 5.8-kb fragment of the promoter region and the 1.8-kb fragment containing the second intron were combined with the cDNA of the enhanced version of GFP (EGFP; Clontech, Palo Alto, CA) and polyadenylation sequences from simian virus 40 and cloned into the pBSM13+ vector, generating nestin-GFP plasmid. In the final construct, EGFP cDNA was placed between the promoter and the intron sequences of the nestin gene, thus matching the arrangement of the regulatory sequences in the nestin gene. The plasmid was isolated and purified through centrifugation in cesium chloride and digested with the SmaI restriction enzyme; this removed the entire vector backbone, leaving the nestin-GFP sequences intact. An 8.7-kb fragment was purified by electrophoresis through the agarose gel and used for the pronuclear injections of the fertilized oocytes from C57BL/6 × Balb/cBy hybrid mice. Use of animals in the present experiments was reviewed and approved by the Cold Spring Harbor Laboratory Animal Use and Care Committee."

Would output:
Research Task: “Mark the intermediate filament nestin with GFP in mice.”
Steps:
“Subclone fragments of the nestin gene into the pBSM13+ vector.”
“Combine the second intron and the promoter region with the cDNA of GFP.”
“Isolate and purify the plasmid through centrifugation in cesium chloride.”
“Digest the plasmid with the SmaI restriction enzyme.”
“Use pronuclear injections of the fertilized oocytes from C57BL/6 × Balb/cBy hybrid mice.”
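
If this were framed as a text-to-text problem, along the lines of the summarization idea above, each article would need to be paired with a single target string that encodes the research task and its steps, and the model's output would need to be parsed back into that structure. Below is a minimal sketch of that data formatting in plain Python. The names `Extraction`, `to_target`, `from_target`, `make_example`, and `PROMPT_PREFIX` are hypothetical, and the numbered-step layout is only one possible convention, not something any library requires.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical task prefix prepended to the article text; any fixed string works.
PROMPT_PREFIX = "extract research task and steps: "
TASK_TAG = "Research Task:"
STEPS_TAG = "Steps:"


@dataclass
class Extraction:
    """Structured output for one article: one research task and its ordered steps."""
    research_task: str
    steps: List[str]


def to_target(item: Extraction) -> str:
    """Serialize the structure into one string a seq2seq model could be trained to emit."""
    lines = [f"{TASK_TAG} {item.research_task}", STEPS_TAG]
    lines += [f"{i}. {step}" for i, step in enumerate(item.steps, start=1)]
    return "\n".join(lines)


def from_target(text: str) -> Extraction:
    """Parse a generated string back into the structure, skipping lines it does not recognize."""
    task = ""
    steps: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(TASK_TAG):
            task = line[len(TASK_TAG):].strip()
            continue
        match = re.match(r"\d+\.\s+(.*)", line)
        if match:
            steps.append(match.group(1).strip())
    return Extraction(task, steps)


def make_example(article_text: str, item: Extraction) -> Dict[str, str]:
    """Build one (input, target) training pair as a plain dictionary."""
    return {"input": PROMPT_PREFIX + article_text, "target": to_target(item)}


if __name__ == "__main__":
    item = Extraction(
        "Mark the intermediate filament nestin with GFP in mice.",
        [
            "Subclone fragments of the nestin gene into the pBSM13+ vector.",
            "Digest the plasmid with the SmaI restriction enzyme.",
        ],
    )
    pair = make_example("Fragments of the nestin gene were subcloned ...", item)
    print(pair["target"])
```

Pairs built this way could then be tokenized and used to fine-tune a pretrained sequence-to-sequence model. Full articles may exceed a model's input length, so working from the abstract and methods sections, as in the example above, is one way to keep inputs short.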