To estimate whether a text was AI-generated, AI detectors analyze several characteristics:
Sentence structure and predictability: AI-generated text often follows consistent patterns, whereas human writing tends to be more varied and unpredictable.
Repetition and uniformity: AI models often reuse the same phrases and sentence constructions, while human writers naturally introduce more variation.
Metadata traces: Some AI tools embed hidden markers in their output, which detection models can identify.
Comparison with known AI outputs: Some detectors compare text to a database of known AI-generated content. This method is less effective for advanced AI models, because their highly varied outputs rarely match anything already in the database.
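To make the repetition signal concrete, here is a minimal sketch that measures how much of a text is made up of repeated word sequences. The function name `repeated_ngram_ratio` and the choice of three-word sequences are assumptions for illustration, not how any particular detector works.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Return the share of n-word sequences that occur more than once.

    Hypothetical helper for illustration only; real detectors combine
    many signals and do not rely on a single ratio like this.
    """
    words = text.lower().split()
    # Slide a window of n words across the text.
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Count every occurrence of an n-gram that appears at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


sample = "the cat sat on the mat the cat sat on the rug"
print(repeated_ngram_ratio(sample))
```

A higher ratio means more reused phrasing. On its own this says little about authorship, since human writers also repeat themselves, so treat it as one weak signal among several.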
Perplexity and burstiness
Ever notice how AI writing can feel a little too perfect? That’s because it often has low “perplexity” and lacks “burstiness.”
Perplexity measures how surprised a language model is by a sequence of words: the more predictable the text, the lower the perplexity. AI-generated text tends to have lower perplexity because it usually follows common linguistic patterns. Human writing, on the other hand, often includes unexpected word choices, which raises its perplexity.
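Perplexity is commonly computed as the exponential of the average negative log-probability a model assigns to each word. The sketch below shows that calculation with a toy word-frequency model. The helpers `train_unigram` and `perplexity` are hypothetical names, and the add-one smoothing is an assumption made so that unseen words do not get zero probability. Real detectors would typically score text with a large neural language model instead.

```python
import math
from collections import Counter


def train_unigram(corpus: str):
    """Build a toy word-probability function from a reference corpus."""
    words = corpus.lower().split()
    counts = Counter(words)
    total = len(words)
    vocab = len(counts) + 1  # one extra slot reserved for unseen words

    def prob(word: str) -> float:
        # Add-one smoothing (an assumption of this sketch).
        return (counts.get(word, 0) + 1) / (total + vocab)

    return prob


def perplexity(text: str, prob) -> float:
    """exp of the average negative log-probability per word."""
    words = text.lower().split()
    if not words:
        raise ValueError("text must contain at least one word")
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))


prob = train_unigram("the cat sat on the mat")
print(perplexity("the cat sat", prob))       # familiar words: lower score
print(perplexity("zebras juggle fog", prob))  # unseen words: higher score
```

Text built from words the model has seen often scores lower than text full of words it has never seen, which is the same intuition detectors apply at a much larger scale.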
Burstiness refers to variations in sentence length and structure. Human writing naturally includes a mix of short and long sentences, which creates a dynamic rhythm. AI-generated text tends to be more uniform.
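There is no single agreed formula for burstiness. One simple proxy, assumed here purely for illustration, is the spread of sentence lengths relative to their average (the coefficient of variation). The function names and the naive punctuation-based sentence splitting below are simplifications, not a detector's actual method.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on ., ! and ? (a naive assumption) and count words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Standard deviation of sentence length divided by the mean length."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = "One two three. Four five six. Seven eight nine."
varied = "Stop. This sentence is quite a bit longer than the first one."
print(burstiness(uniform))
print(burstiness(varied))
```

Uniform sentence lengths give a score of zero, while a mix of short and long sentences pushes the score up. Abbreviations such as "e.g." would confuse this splitter, so a real pipeline would need a proper sentence tokenizer.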