Why Fiction Is the Hardest Detection Case
AI detection works by identifying statistical patterns that differ between AI-generated and human-generated text. Academic writing and professional documents are good detection targets because both humans and AI write them in a fairly consistent register: formal vocabulary, predictable structure, hedged claims.
Fiction actively resists this consistency. A skilled novelist deploys voice shifts, POV changes, dialect, stream of consciousness, intentional sentence fragments, and emotional texture that varies paragraph by paragraph. The statistical fingerprint of good literary fiction is highly variable, which is precisely what makes it literary. Detection tools treat that variability as a marker of human authorship, so a capable model that reproduces it makes AI fiction look more human, not less.
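The kind of variability at issue can be illustrated with a toy "burstiness" measure: the spread of sentence lengths in a passage. This is a deliberately crude proxy; real detectors model far richer statistics, and the function and examples below are purely illustrative, not any tool's method:

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Toy proxy for stylistic variability: the standard deviation of
    sentence lengths in words. Purely illustrative; real detectors use
    much richer statistical models than this."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

# Flat, uniform prose vs. deliberately varied literary prose.
flat = "He walked in. He sat down. He looked up. He said nothing."
varied = ("The door gave under his hand. Silence. He crossed the room in "
          "four long strides, dropped into the chair his father had died "
          "in, and stared at the ceiling for a long time. Nothing.")

# The varied passage has much higher sentence-length variance.
print(burstiness(flat) < burstiness(varied))
```

By this measure the minimalist passage looks statistically "flatter," which is exactly why terse literary styles trip up pattern-based detectors, as discussed below under detection tool performance.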
Three factors make fiction particularly difficult:
- Style variation is intentional. A thriller chapter written in terse, punchy sentences does not indicate AI; it indicates good craft. A romance chapter with longer, lyrical prose does not indicate AI either. The registers that pattern-matching tools flag as AI-like are often legitimate stylistic choices.
- AI fiction training data is now massive. Every published novel, short story, and fanfic repository was scraped. Models trained on this data produce fiction that statistically resembles the corpus they were trained on, which is the same corpus that human novelists learned from. The distributional gap is narrower than in academic writing.
- Good AI fiction is actively improving. Early AI fiction (2022-2023 vintage) had identifiable tells: overly smooth prose, generic emotional descriptors, flat character interiority. 2025-2026 models produce fiction that passes casual reading. The markers have shifted from obvious to subtle.
What AI Fiction Still Gets Wrong
Despite improvements, AI-generated fiction has persistent weaknesses that trained readers notice:
Shallow Interiority
AI describes what characters feel in the same vocabulary that characters in published fiction feel things: "a knot of anxiety tightened in her chest," "he felt a surge of hope," "grief washed over her." These are the phrases that appear most in training data. Human fiction writers who have done the work of understanding their characters produce more specific, unexpected emotional language: not "grief" but what grief feels like for this particular person given their specific history and how they process loss.
The interiority test: pick a paragraph of character internal experience and ask whether it could appear, unchanged, in a different story about a different character with a different background. If yes, it is generic. If no, it is specific. AI tends toward the generic because specificity requires actual understanding of the character, not just pattern-matching to "how characters feel in fiction."
Consequence Avoidance
AI fiction tends to describe events without fully committing to their weight. A character's death is mentioned but not felt. A betrayal is named but not lived in. AI narration tells you what happened and even how the characters reacted, but does not dwell in the emotional aftermath long enough to make it matter. This is partly a pacing issue (AI generates at a consistent rate rather than knowing when to slow down) and partly an understanding issue (emotional weight requires caring about something, which AI does not do).
Dialogue That Explains Too Much
AI dialogue tends toward the expository: characters say things primarily to convey information to the reader rather than to achieve goals, maintain relationships, or reveal character. Real people in conversation do not explain their motivations clearly or announce their emotional states. Real dialogue is oblique, transactional, and often at cross-purposes. AI dialogue is cooperative and helpful in ways real speech never is.
Absence of Specific Physical Detail
Human fiction writers ground their work in specific, observed physical reality: the particular smell of a childhood home, the sound a specific floorboard makes, the texture of a character's grandmother's cardigan. AI generates plausible generic sensory detail ("the smell of fresh coffee," "the soft creak of the door") but cannot generate specific observed detail because it has not observed anything. The physical world in AI fiction is always slightly underfurnished.
Detection Tool Performance on Fiction
Running fiction through a standard AI detection tool requires adjusting expectations significantly:
- Terse, stylized prose will score high on many tools. Minimalist literary fiction (short sentences, sparse description, flat affect) shares statistical properties with AI-generated text. A Cormac McCarthy pastiche would score 80%+ on most detectors. This is a false positive problem, not evidence of AI.
- Dense, lyrical prose may score lower. Long, complex sentences with varied subordinate clauses, unusual vocabulary, and unconventional syntax score as more human-like even when produced by AI, because current AI fiction defaults to more accessible prose.
- Score thresholds need adjustment. Use 85%+ as the threshold for investigation on fiction text, not the 65-70% thresholds appropriate for academic or professional writing. Below 85%, the false positive rate on literary prose is too high to act on detection alone.
- Minimum length is higher. Use at least 500 words for fiction detection; ideally a full chapter or complete scene. Short story excerpts under 300 words produce unreliable results.
| Score Range | Interpretation for Fiction |
|---|---|
| 85%+ | Worth investigating; check for shallow interiority and generic sensory detail |
| 70-85% | Ambiguous; stylistic choices may inflate score; qualitative reading required |
| Below 70% | Low signal on fiction; do not act on detection alone |
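The minimum-length guidance and the score bands above amount to a simple triage rule. A minimal sketch follows; the function name, return messages, and interface are illustrative assumptions, not any real detector's API:

```python
def fiction_triage(score: float, word_count: int) -> str:
    """Apply fiction-specific detection thresholds: require a 500+ word
    sample, investigate only at 85%+, treat 70-85% as ambiguous, and
    ignore anything below 70%. Names and messages are illustrative."""
    if word_count < 500:
        return "insufficient sample: use a full scene or chapter excerpt"
    if score >= 85:
        return "investigate: check interiority depth and sensory specificity"
    if score >= 70:
        return "ambiguous: stylistic choices may inflate score; close reading required"
    return "low signal: do not act on detection alone"

print(fiction_triage(92, 1200))
print(fiction_triage(92, 300))   # high score, but the sample is too short
```

Note that even the "investigate" branch routes to qualitative checks rather than a verdict, matching the table's framing of detection as one signal among several.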
Literary Contests: How the Policy Landscape Is Evolving
Literary contests have had the most visible public debates about AI-generated submissions. The responses vary significantly:
Blanket Prohibitions
Many high-prestige contests (including several major short story prizes and chapbook competitions) have added explicit AI prohibition language to submission guidelines. Submitting AI-generated or substantially AI-generated work is grounds for disqualification and may result in a ban from future submissions.
The challenge for contest organizers is enforcement. Judges reading blind submissions cannot reliably identify AI-generated fiction from the text alone. Detection tools produce too many false positives on stylized prose to use as automatic filters. The prohibition is primarily a deterrent and a standard-setter, not a technically enforceable rule.
Disclosure Requirements
A growing number of smaller contests have moved to disclosure models rather than prohibitions: "Please indicate whether AI tools were used in any part of the writing process." These allow judges to make contextual decisions rather than applying a binary rule. A poem where the human author used AI to brainstorm rhyme schemes and then rewrote everything is different from a short story generated by a single prompt.
The Authenticity Standard
Some contest guidelines have moved away from AI-specific language toward an authenticity standard: "Submitted work must be the original creative work of the author, representing their own voice and vision." This standard predates AI and can cover both AI generation and other forms of ghostwriting or plagiarism without singling out AI specifically.
Publisher Screening: Manuscript-Level Considerations
Traditional publishers and literary agents are not currently running systematic AI detection on every submission. The volume of unsolicited manuscripts is too high and the tool reliability too low for that to be practical. But several factors are making AI screening more common in specific contexts:
- Simultaneous submission floods: Agents and small presses that have seen a dramatic increase in submission volume since 2023 are more likely to run detection on outlier volumes. An author submitting 200 stories to 200 venues simultaneously (visible through tracking tools) may get their work flagged.
- Series and ghost-writing arrangements: Publishers commissioning work-for-hire or managing ghostwriting arrangements are increasingly building AI use clauses into contracts, requiring that AI tools are disclosed and that the contracted author has been substantially involved in the work.
- Quality outliers: Manuscripts that are significantly better (or worse) than the author's previous work are sometimes flagged for closer review. This is not detection per se, but it catches cases where someone has outsourced a submission to AI after publishing prior work themselves.
Ghostwriting Clients: What to Know and Ask
Ghostwriting is a legitimate, long-established practice. Using a ghostwriter to produce a memoir, business book, or novel is not ethically or legally problematic in most contexts. Using a ghostwriter to produce work for submission to contests that prohibit collaboration is a different matter.
For clients hiring fiction ghostwriters in 2026, the relevant questions:
- "What is your process for using AI tools in client work?" Some ghostwriters use AI for first drafts they then heavily revise; others write entirely by hand; others use AI for research and outlining only. Understanding the process is reasonable and should not be controversial to ask.
- "Can I see a sample chapter before full payment?" Running a sample through a detection tool gives you a baseline expectation for the finished work. A sample that scores 85%+ consistently suggests more AI involvement than most clients want from a ghostwriting relationship.
- "Does the contract include a clause about AI use?" Well-established ghostwriters should be willing to specify their AI tool policy in contract terms. Vague language ("I use various tools to assist my process") is less useful than specific disclosure.
Most professional ghostwriters who use AI substantially in their process are not hiding it. It is a legitimate efficiency tool for experienced writers. The concern is not AI use per se but undisclosed AI use, and AI use that delivers less craft and originality than the client paid for.
The Self-Publishing Context
Self-published authors face a different landscape. Amazon KDP and other platforms have not implemented AI detection at scale and are unlikely to do so given volume. The market consequence for AI-generated self-published fiction is one of reader trust: readers who feel deceived by AI-generated books they paid for do leave reviews, and the flood of low-quality AI content has made readers more discerning about author reputation and review history.
Authors publishing on Amazon who have built an audience face brand risk from AI-generated content that falls below the quality standard their readers expect, independent of any platform policy.
Bottom Line
AI detection on fiction is less reliable than on academic or professional text, and the false positive rate is meaningfully higher on stylized literary prose. Use detection tools at the 85%+ threshold, on 500+ word samples, and as one signal among several qualitative checks (interiority depth, dialogue authenticity, physical specificity). For contest organizers and publishers, the combination of detection scoring and careful close reading is more effective than either alone. For clients and readers, asking direct questions about process and setting clear contractual expectations is more reliable than post-hoc detection.
Analyze Fiction Text with Airno
Paste 500+ words of narrative prose (a complete scene or chapter excerpt) into Airno. Use 85%+ as the investigation threshold for fiction. Weight the neural sub-score over the pattern sub-score on literary prose.
Try Airno Free