Hey @Endre, cool to hear that you’re interested in this project!
Given that it might be time-consuming to translate all of SQuAD into Hungarian or Romanian, it might make sense to start by training a model on an existing dataset in one of those languages.
For example, there is the mqa dataset, which covers a different type of question answering called “community question answering”. It has subsets in both of your languages, so you can get a model trained and a Space up and running faster than if you created a dataset from scratch.
Community QA is more of a retrieval-based approach, and you can find an example of what it involves here with the haystack library (based on transformers).
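To give a rough feel for what "retrieval-based" means here, below is a toy sketch in plain Python. It is not haystack's API and not how mqa is meant to be used; it just matches a new question against a small list of stored question/answer pairs with bag-of-words cosine similarity and returns the answer of the closest match. The function names and the FAQ pairs are made up for illustration, and a real system would most likely swap the word-count vectors for BM25 or sentence embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_answer(query, faq_pairs):
    """Return the (question, answer) pair whose question is most similar to the query."""
    query_vec = Counter(tokenize(query))
    return max(
        faq_pairs,
        key=lambda pair: cosine(query_vec, Counter(tokenize(pair[0]))),
    )


# Hypothetical FAQ pairs, invented for this example (not taken from mqa).
FAQ_PAIRS = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("What are the opening hours?", "We are open from 9 to 17 on weekdays."),
    ("How can I cancel my order?", "Go to 'My orders' and click 'Cancel'."),
]

question, answer = retrieve_answer("I forgot my password, how do I reset it?", FAQ_PAIRS)
print(question)
print(answer)
```

The point is only that nothing is extracted from a passage like in SQuAD: the "model" ranks existing questions and reuses their answers, which is why an existing FAQ-style dataset in your language gets you going quickly.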
Of course you’re welcome to create your own SQuAD dataset, but I thought I should provide an alternative just in case.