Hello, I am trying to improve generated horror stories, creepypastas to be exact.
I was following a tutorial where someone fine-tuned Falcon-7B to create better Midjourney prompts, and I figured that use case was close enough to mine that the same approach would work.
Well, it works, kind of.
The model usually gets stuck on one sentence and repeats it over and over.
Even when I penalize repetition heavily, it produces a passage that is OK, but after some tokens it gets stuck on one sentence again.
Also, the stories don’t seem to end.
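To make the two symptoms above concrete, here is a minimal sketch of the two places where they could be addressed: the training text (ending each example with an end-of-sequence token) and the sampling settings (limiting repetition). The prompt wording, the helper name `format_example`, and all numeric values are assumptions for illustration, not what the tutorial or the notebook uses. The keys in `GENERATION_KWARGS` are existing arguments of `generate()` in Hugging Face transformers.

```python
# Sketch only: the prompt layout and the numeric values below are assumptions,
# not the settings from the tutorial or the notebook.

# Falcon's tokenizer uses "<|endoftext|>" as its end-of-sequence token; in a
# real script, read it from tokenizer.eos_token instead of hard-coding it.
EOS_TOKEN = "<|endoftext|>"


def format_example(story: str, reading_time_minutes: int, eos_token: str = EOS_TOKEN) -> str:
    """Build one training string that ends with the end-of-sequence token."""
    prompt = (
        f"Write a creepypasta with a reading time of about "
        f"{reading_time_minutes} minutes.\n\n"
    )
    return prompt + story.strip() + eos_token


# Keyword arguments that could be passed to model.generate(**inputs, **GENERATION_KWARGS).
GENERATION_KWARGS = {
    "max_new_tokens": 512,
    "do_sample": True,
    "temperature": 0.8,
    "top_p": 0.95,
    "repetition_penalty": 1.15,  # assumed starting value, to be tuned
    "no_repeat_ngram_size": 4,   # blocks verbatim repeats of any 4-token sequence
}

example = format_example("  The lights went out.  ", 3)
print(example)
```

If the training strings never contain the end-of-sequence token, the model may have no signal for when a story is finished, which would be one possible explanation for stories that never end.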
My problem is that I don’t know where things went wrong, since I believe there are many possible points of failure.
I think my dataset is good: I found it online, and it contains 3200 creepypasta stories, each with a name, rating, category, estimated reading time and tags.
I removed all features except the story and the estimated reading time.
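For reference, the filtering step described above could look roughly like this in plain Python. The column names `story` and `estimated_reading_time`, the helper `keep_story_columns`, and the sample rows are hypothetical; the real dataset may name its fields differently. Dropping rows with an empty story is an extra assumption, not something from the original preprocessing.

```python
# Sketch only: the column names are hypothetical; the real dataset may use others.
KEEP_COLUMNS = ("story", "estimated_reading_time")


def keep_story_columns(rows):
    """Keep only the story and reading-time fields and drop rows with an empty story."""
    cleaned = []
    for row in rows:
        story = (row.get("story") or "").strip()
        if not story:
            continue  # skip rows that contain no story text
        record = {column: row.get(column) for column in KEEP_COLUMNS}
        record["story"] = story  # store the whitespace-trimmed text
        cleaned.append(record)
    return cleaned


# Made-up rows, only to show the shape of the data.
rows = [
    {"name": "A", "rating": 4.5, "category": "Ghosts",
     "estimated_reading_time": 5, "tags": "haunting", "story": "It began at night."},
    {"name": "B", "rating": 3.0, "category": "Ghosts",
     "estimated_reading_time": 2, "tags": "", "story": "   "},
]
cleaned = keep_story_columns(rows)
print(cleaned)
```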
This is the Colab notebook that I adapted for my use case; I did not change anything except the dataset and the training prompt.
I would appreciate a push in the right direction: whether my use case is even possible, and what my next step could be.