About the models
This app uses a deep-learning model called DistilBERT (publicly available on the
Hugging Face Hub), which is a smaller version of the
BERT large language model introduced by Google in 2018. Along with the model behind ChatGPT, DistilBERT and BERT belong to a class of neural networks called
transformers, and were pre-trained to perform text prediction on a very large corpus of documents consisting of about 3.3 billion words. When building the AI detector, I fine-tuned a number of large language models on the smaller collection of texts I compiled, with the goal of distinguishing between human-written and AI-generated text.
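For a sense of what that fine-tuning looks like, here is a minimal sketch using the Hugging Face transformers and datasets libraries. The file names, hyperparameters and output directory are illustrative stand-ins, not the values actually used:

    from datasets import load_dataset
    from transformers import (
        AutoTokenizer,
        AutoModelForSequenceClassification,
        Trainer,
        TrainingArguments,
    )

    # Hypothetical CSV files with columns "text" and "label" (0 = human, 1 = AI)
    dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    def tokenize(batch):
        # Truncate/pad each text to DistilBERT's fixed input length
        return tokenizer(batch["text"], truncation=True, padding="max_length")

    dataset = dataset.map(tokenize, batched=True)

    # Two output labels: human-written vs AI-generated
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="detector", num_train_epochs=3),
        train_dataset=dataset["train"],
        eval_dataset=dataset["test"],
    )
    trainer.train()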
My dataset consisted of equal numbers of human-written and AI-generated texts, 24,180 of which were used for training and 4,772 for testing. The human-written texts came from three sources: 1) the IMDB review database, 2) the introductory sections of Wikipedia articles, and 3) the r/AITA and r/relationship_advice subreddits. For each human-written text, I prompted GPT-3.5 Turbo to write a similar text.
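That pairing step could be automated along these lines, assuming the openai Python package; the prompt wording here is only a guess at the kind of instruction used, not the exact one:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def generate_counterpart(human_text: str) -> str:
        # Ask GPT-3.5 Turbo for a text similar in topic, length and style
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{
                "role": "user",
                "content": (
                    "Write a text on the same topic and of similar length "
                    "and style to the following:\n\n" + human_text
                ),
            }],
        )
        return response.choices[0].message.content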
The model performed extremely well during testing: only 23 of the 4,772 test texts were misclassified, an accuracy of about 99.5%. It is of course expected to perform less well on texts that differ significantly in style from the training data (e.g. poetry, film scripts, or legal documents), or on texts generated by a large language model other than GPT-3.5 Turbo. The hope is that the knowledge the model acquired from the training data is transferable to written language from other sources, but it remains to be seen how well the model generalises beyond its training data.
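Once trained, the detector can be queried in a couple of lines. This sketch assumes the fine-tuned checkpoint was saved to a local directory called detector (a hypothetical name) and uses the default LABEL_0/LABEL_1 label names:

    from transformers import pipeline

    # Load the fine-tuned checkpoint from a local directory (name assumed)
    classifier = pipeline("text-classification", model="detector")

    # Returns the predicted label and a confidence score, e.g.
    # [{'label': 'LABEL_1', 'score': 0.99}]
    print(classifier("Paste the text you want to check here."))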