Funcom Dataset for summarization

Hi airnicco8,

I’m not an expert, but that looks a bit tricky. What do you intend to do with the funcom data? Are you trying to build a seq2seq model that translates Java code into a comment string?
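To make that framing concrete, here is a rough sketch of what a single code-to-comment training pair might look like. The sample method, the comment, the regex tokenizer, and the `<s>` / `</s>` markers are all my own assumptions for illustration; none of this is taken from the funcom dataset or from any particular library.

```python
import re
from typing import List, Tuple

# Assumed tokenizer: identifiers/keywords, integer literals,
# or any single non-space symbol. Real Java lexing is more involved.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize_code(code: str) -> List[str]:
    """Split a Java snippet into coarse tokens."""
    return TOKEN_RE.findall(code)


def make_pair(code: str, comment: str) -> Tuple[List[str], List[str]]:
    """Build one (source, target) example for a seq2seq model."""
    source = tokenize_code(code)
    # Hypothetical start/end markers around the lowercased comment words.
    target = ["<s>"] + comment.lower().split() + ["</s>"]
    return source, target


# Made-up example, not a record from funcom.
java_method = "public int add(int a, int b) { return a + b; }"
comment = "Returns the sum of two integers"

source, target = make_pair(java_method, comment)
print(source)
print(target)
```

The model would read `source` and be trained to emit `target`, which is the same setup as machine translation with Java on the input side.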

If you are supposed to be doing NLP, then Java code might not be appropriate, as Java is not a natural language.

A big advantage of the Hugging Face Transformers library is that it includes many pre-trained models that you can fine-tune on your own data. I don’t think there are any models pre-trained on Java code. See this page for the list of models available in Hugging Face: https://huggingface.co/transformers/pretrained_models.html

I suggest you start with something simpler.