Why AI Reviews Are a Growing Problem
Generative AI has made it trivially cheap to produce large volumes of reviews that read as plausible and human. A seller who previously needed to recruit real reviewers or pay for incentivized reviews can now generate hundreds of five-star reviews for under a dollar in API costs. The reviews are grammatically correct, appropriately enthusiastic, and vague enough to avoid product-specific fact-checking.
A 2025 study by the Consumer Policy Research Centre found that an estimated 15 to 30 percent of reviews on major e-commerce platforms showed statistical patterns consistent with AI generation, up from roughly 4 percent in 2022. The increase tracks directly with the public availability of capable generative models.
For consumers, the stakes are direct: purchasing decisions based on fabricated social proof. For honest sellers, the issue is competitive disadvantage against sellers gaming the rating system. For platforms, it is regulatory exposure under consumer protection laws that treat fake reviews as a form of deceptive advertising.
The Detection Challenge: Short Text and Genre Conventions
Reviews are one of the hardest content types for AI detection tools. The reasons are structural:
- Length: Most reviews are 50 to 200 words. Reliable AI detection generally requires 150 words minimum; below that, false positive rates climb significantly. A 75-word review may produce a statistically ambiguous result even if it is clearly AI-generated.
- Genre conventions: The review format is already formulaic. Human reviews follow predictable patterns ("I bought this for X, used it for Y, the Z was impressive"). AI follows the same pattern because it was trained on human reviews. The genre conventions that make reviews useful also make them hard to distinguish statistically.
- Platform homogenization: Review platforms and product pages train users to write in a constrained style. Authentic reviews on Amazon or Google often sound like AI output because the platform itself has shaped writing norms toward brevity, bullet points, and star-rating language.
This does not mean detection is useless on review text. It means that single-review detection has higher uncertainty, and that detection works better as a batch signal across many reviews rather than a per-review verdict.
Statistical Patterns That Betray AI Reviews
Perplexity and Burstiness
AI text tends to have low perplexity (predictable word choices, each word following logically from the last) and low burstiness (sentences of similar length and complexity). Human text has higher perplexity (idiosyncratic word choices, domain slang, personal phrasing) and higher burstiness (one short sentence, then a long complex one, then a fragment).
In reviews, this manifests as AI-written text being unusually smooth. Every sentence connects cleanly. There are no half-formed thoughts, abrupt topic changes, or tangential personal anecdotes. Human reviews meander; AI reviews stay on task.
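Burstiness can be approximated without a language model as the variation in sentence length. A minimal sketch; the metric (coefficient of variation of sentence lengths) and the example texts are illustrative, not calibrated:

```python
import re
from statistics import mean, pstdev

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths (in words).
    Low values suggest a uniform, AI-like sentence rhythm."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths) / mean(lengths)

# Illustrative samples: uniform "smooth" text vs. varied human-style text.
smooth = "The product is great. The quality is high. The shipping was fast."
human = ("Loved it. Honestly I wasn't sure about the color at first, but after "
         "a week of daily use it grew on me. Would buy again.")
```

On short review text this is a weak signal on its own; it is most useful combined with the vocabulary and specificity checks below.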
The Adjective Pattern
AI reviews cluster around a specific vocabulary of approval: "impressed," "exceeded expectations," "highly recommend," "would not hesitate," "well-made," "excellent quality." These phrases are not wrong; they are statistically dominant in training data (positive reviews that were selected as "helpful" and therefore overrepresented). Human reviewers still use them, but not as consistently or as the primary vocabulary.
A practical signal: if a review uses three or more of these stock phrases in under 100 words, the probability of AI generation is elevated.
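The stock-phrase heuristic above can be sketched as a simple counter. The phrase list here is drawn from the examples in this section and is illustrative, not an exhaustive lexicon:

```python
# Illustrative stock-phrase list from this section; real systems would
# use a larger, data-derived lexicon.
STOCK_PHRASES = [
    "impressed", "exceeded expectations", "highly recommend",
    "would not hesitate", "well-made", "excellent quality",
]

def stock_phrase_signal(review: str) -> bool:
    """True if three or more stock phrases appear in a review
    under 100 words -- the elevated-probability signal above."""
    text = review.lower()
    hits = sum(1 for phrase in STOCK_PHRASES if phrase in text)
    return hits >= 3 and len(review.split()) < 100

templated = "Very impressed. This jacket exceeded expectations and I would highly recommend it."
specific = ("The navy is closer to indigo in person and the inseam ran an "
            "inch long, but the fabric is holding its shape after five washes.")
```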
Absence of Specifics
Human reviewers who actually used a product mention specific things: the exact color they ordered, how the size compares to what the listing said, what happened when they contacted customer service, how the product performed after three months. AI-generated reviews describe the product category accurately but avoid specifics because the AI has no actual product experience to draw on.
"The material is high quality and the stitching is well done" is AI-likely. "The navy is closer to indigo in person and the inseam ran an inch long, but the fabric after five washes is still holding its shape" is human-likely.
Reviewer Consistency Signals (Batch Detection)
The most reliable detection strategy at platform scale is not analyzing individual reviews but analyzing reviewer behavioral patterns:
- Reviews posted in rapid succession (multiple reviews within minutes)
- All reviews from the same account scoring at a similar detection confidence level (human reviewers vary; AI campaigns tend to score in a narrow band)
- Cross-category reviewing (a reviewer whose history includes mattresses, coffee makers, software tools, and cosmetics at similar enthusiasm levels within a short period)
- New accounts with immediate reviewing activity and no browsing or purchase history
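The rapid-succession signal in the list above reduces to a sliding-window check over an account's review timestamps. A minimal sketch; the window and threshold values are illustrative, not platform-calibrated:

```python
from datetime import datetime, timedelta

def rapid_succession(timestamps: list[datetime],
                     window: timedelta = timedelta(minutes=10),
                     threshold: int = 3) -> bool:
    """Flag an account that posted `threshold` or more reviews
    inside a single `window`. Values are illustrative defaults."""
    ts = sorted(timestamps)
    for i in range(len(ts) - threshold + 1):
        # If the k-th review after this one falls inside the window,
        # the account posted threshold reviews in rapid succession.
        if ts[i + threshold - 1] - ts[i] <= window:
            return True
    return False

t0 = datetime(2025, 1, 1, 12, 0)
burst = [t0, t0 + timedelta(minutes=2), t0 + timedelta(minutes=5)]
spread = [t0, t0 + timedelta(hours=2), t0 + timedelta(hours=9)]
```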
How Platforms Approach This at Scale
Large e-commerce platforms do not primarily use third-party AI detection tools for reviews. They run proprietary behavioral and network analysis at a scale that external tools cannot match. But the techniques are similar in principle:
Velocity and Account Age Filters
The most effective early filters are not about the text at all. An account created yesterday that submits 40 reviews today is flagged before the text is analyzed. Platform-level fraud detection handles obvious volume attacks efficiently.
Linguistic Clustering
When multiple reviews across different products share high linguistic similarity, they likely came from the same prompt template. Platforms run pairwise similarity on review text within the same seller's product catalog to catch templated AI campaigns.
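Pairwise similarity can be sketched with Python's standard library. Platforms use richer features (embeddings, n-gram shingles, metadata); character-level `SequenceMatcher` is used here only to illustrate the idea, and the 0.7 threshold is an assumption:

```python
from difflib import SequenceMatcher
from itertools import combinations

def similar_pairs(reviews: list[str], threshold: float = 0.7) -> list[tuple[int, int]]:
    """Return index pairs of reviews whose character-level similarity
    exceeds `threshold` -- a rough proxy for shared prompt templates."""
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(reviews), 2):
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
            flagged.append((i, j))
    return flagged

# Illustrative catalog: two near-duplicate templated reviews, one organic.
reviews = [
    "Excellent quality, exceeded my expectations, highly recommend.",
    "Excellent quality, exceeded all expectations, highly recommend!",
    "The navy ran closer to indigo and the inseam was an inch long.",
]
```

Note the quadratic cost: at platform scale this pairwise loop would be replaced by locality-sensitive hashing or similar, but the principle is the same.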
Device Fingerprinting and IP Analysis
Many AI review campaigns run from the same device, VPN cluster, or AWS region. Behavioral signals that precede the review submission (no organic browsing, direct page navigation to the review form) are caught before linguistic analysis.
For Brands: Monitoring Competitor and Inbound Reviews
Brands face two distinct problems: their own products receiving fake negative reviews from competitors, and their competitors' products receiving fake positive reviews that affect comparative rankings.
Practical steps:
- Export and batch-analyze your recent negative review wave. If a product suddenly receives 20 negative reviews in 48 hours, paste the text from all 20 into a detection tool. High average AI score plus linguistic clustering is the evidence needed for a platform dispute submission.
- Compare review velocity to sales velocity. A product with 8 sales per month should not receive 30 new reviews per month. Mismatch between sales data and review volume is more diagnostic than text analysis alone.
- Document before disputing. Platforms require evidence for fake review removal requests. Screenshots, detection scores, and statistical anomaly documentation together are more persuasive than a subjective claim.
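The velocity comparison in the steps above can be sketched as a simple ratio test. The 1:1 cutoff is an assumption chosen for illustration; in practice most products receive far fewer reviews than sales, so even a lower ratio can be anomalous:

```python
def review_velocity_anomaly(reviews_per_month: float,
                            sales_per_month: float,
                            max_ratio: float = 1.0) -> bool:
    """Flag when monthly review volume outpaces monthly sales.
    A max_ratio of 1.0 is a deliberately conservative illustration."""
    if sales_per_month <= 0:
        # Any reviews on a product with no sales are anomalous.
        return reviews_per_month > 0
    return reviews_per_month / sales_per_month > max_ratio
```

The 8-sales, 30-reviews example from the list above trips this check; a product with occasional reviews against healthy sales does not.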
Using Detection Tools on Review Text: Practical Guidance
For individual reviews: paste into Airno and interpret with these adjustments:
| Score | Interpretation for Review Text |
|---|---|
| 85%+ | Strong AI signal; check for additional batch patterns from same reviewer |
| 65-85% | Moderate signal; short review text makes this ambiguous; look for specificity tells |
| 40-65% | Inconclusive; human or AI both plausible; do not act on this alone |
| Below 40% | Human-likely; genuine specifics may also be present |
For batch analysis: concatenate 5 to 10 reviews from the same suspected campaign before submitting. Longer text produces more reliable scores, and if the reviews came from the same AI campaign, the combined text will be more statistically consistent than any individual review.
The Legal Landscape
In the United States, the FTC has expanded its rule on fake reviews to explicitly cover AI-generated testimonials. A 2024 rule prohibits businesses from creating, buying, or disseminating fake consumer reviews including those generated by AI, with civil penalties of up to $51,744 per violation. The rule applies to sellers who pay for or use AI-generated reviews and to platforms that fail to reasonably prevent them.
In the EU, the Omnibus Directive (effective 2022, enforcement accelerating in 2025-2026) requires platforms to take reasonable steps to verify that reviews come from actual purchasers. AI-generated reviews from non-purchasers clearly violate this standard.
For brands reporting competitor review fraud, the FTC accepts consumer complaints and has been increasingly willing to investigate review manipulation as a competition law issue.
For Consumers: Reading Reviews More Critically
Without access to platform-level data, individual consumers face a harder problem. Practical heuristics:
- Filter for verified purchase. Not foolproof (AI campaign operators can purchase items), but eliminates a significant fraction of fabricated reviews.
- Sort by most recent, then look for a velocity spike. A sudden influx of recent five-star reviews on a product that was previously middling is a warning sign.
- Read the critical reviews. Fake review campaigns focus on five-star generation; they rarely generate fake one- and two-star reviews from competitors (too obvious). Legitimate negative reviews often contain the product-specific detail that positive reviews lack.
- Check the reviewer history. A reviewer who has reviewed 60 products across unrelated categories in six months is a red flag. A reviewer with a normal purchase history and occasional reviews is credible.
- Paste suspicious reviews into a detector. For high-stakes purchases, running a batch of recent five-star reviews through an AI detection tool is a five-minute due diligence step.
Bottom Line
AI detection on customer review text is less reliable per-review than on longer content, but still useful as a triage and batch signal. The most effective detection combines text analysis with behavioral signals: posting velocity, account age, reviewer history, and cross-product linguistic similarity. For brands dealing with suspected review manipulation campaigns, text detection plus velocity data plus platform dispute documentation is the right approach. For consumers, reading critically and checking reviewer history remain the most accessible defenses.
Analyze Suspicious Reviews with Airno
Paste a batch of reviews (5 to 10 concatenated) from the same product or reviewer into Airno for a confidence score and per-detector breakdown. Best used as one signal among several, not a standalone verdict.
Try Airno Free