Hello,
I’m Delse and I would like some help and advice with a personal project that I’ve been thinking about for a long time now.
I’m a total beginner in artificial intelligence and programming; I’m a literature and politics major. Nevertheless, I’m aiming to train a language model on literary texts that I’ve selected myself.
This idea came to me when I was looking for a particular poem by a Persian author from the ninth century AD. I tried to use AI tools, particularly ChatGPT, to help me find his poems; however, the AI was unable to give me original, authentic texts. It could only give me poems that it had composed itself in imitation of the author’s style, even if that meant lying to me about the sources (the link was wrong every time). I then tried giving it entire books by the author (which I knew to be authentic) and asking it to do the search in the same prompt, but its answers were very approximate, if not outright wrong. I ended up doing the work by hand with a word search, which left me with a sense of unfinished business.
For several days now, I’ve been trying to learn as much as I can in order to develop this tool, but my computer skills are so poor that it takes me hours to understand a single concept. That’s why I’m sending out this request like a message in a bottle.
To clarify, I’d like a helping hand in finding a method that strikes the best balance between cost, the capabilities of my computer and my own abilities, in order to train a language model on selected literary data (I do have the skills to organise a corpus). The aim is for the tool to be able to:
- Make connections between the texts in the corpus
- Give perfectly authentic quotations from the texts in the corpus
- Imitate styles from the texts in the corpus (more precisely than is possible at present)
- Let anyone create a digital “phantom” of their knowledge
- And, why not, even more
Thank you very much for taking the time to read this, and thanks in advance to everyone who takes the time to reply. I wish you all success in your projects.
Delse