AI-Generated Text Detection: Is There a Feature in transformers?

Degnel · January 22, 2025, 12:51pm

A common method in the literature for detecting whether a text was generated by AI is to take the output probabilities of a large language model (LLM) and check how well they match the given text. Naturally, the result depends on the LLM, since each model assigns different token probabilities in a given context. Given how broadly applicable and significant this approach is, I was curious whether the transformers library offers such a feature. I looked in the pipeline section, as it seemed the most appropriate place, but I didn’t find anything relevant.
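
To make the idea concrete, here is a minimal sketch of how such a check is often written by hand. It is not an existing transformers feature. The helper names (`token_log_probs`, `perplexity`, `looks_generated`), the choice of `gpt2` as the scoring model, and the threshold of 20.0 are all assumptions made for illustration; a real detector would need a threshold calibrated on labeled data.

```python
import math


def token_log_probs(text, model_name="gpt2"):
    """Log-probability the scoring model assigns to each token of `text`
    (all tokens except the first, which has no left context)."""
    # Imported lazily so the scoring helpers below work without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    input_ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

    # Logits at position i predict token i + 1, so shift by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = input_ids[0, 1:]
    picked = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return picked.tolist()


def perplexity(log_probs):
    """Perplexity is exp of the negative mean token log-probability."""
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(log_probs) / len(log_probs))


def looks_generated(log_probs, threshold=20.0):
    """Flag text whose perplexity is below a hypothetical threshold.
    The default of 20.0 is a placeholder, not a calibrated value."""
    return perplexity(log_probs) < threshold
```

Usage would be along the lines of `looks_generated(token_log_probs("some text"))`. Low perplexity means the scoring model finds the text predictable, which is the signal this family of methods relies on. Note that this sketch does not handle texts longer than the scoring model's context window.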

I would like to gather your insights to understand:

Whether this feature is of interest to other members of the community.
Whether it has already been implemented, or whether there are any ongoing efforts in this direction.

Thank you for your feedback!

John6666 · January 22, 2025, 1:00pm

I’m not very knowledgeable about this, but I don’t think it’s a built-in feature of the library.
I sometimes see people doing it with BERT or derived models.

Degnel · January 22, 2025, 1:18pm

Hi John,

Thank you for the pointer! It is precisely because so many people reimplement it for different models that I think it would be an interesting feature.

Best,
