Can't Replicate GPT-2 Output Detector Demo Results

We’re doing preliminary research on detecting GPT-2-generated text, and we are not able to replicate the results Hugging Face is achieving in the demo.

I’m wondering which exact dataset was used to train the model running at the GPT-2 Output Detector demo.

(When we try it, we get drastically different results from those of the Hugging Face demo.)

  • Eric Lancheres