The most reliable resource for identifying AI-generated text is Wikipedia

By: Bitget-RWA • 2025/11/20 18:45

Many of us have experienced that nagging feeling that a piece of text was generated by a language model, yet it’s surprisingly challenging to confirm. For a period last year, there was a widespread belief that certain words like “delve” or “underscore” were clear indicators of AI authorship, but there’s little solid evidence for that, and as these models have advanced, such obvious clues have become much less apparent.

Interestingly, Wikipedia editors have become quite adept at spotting writing produced by AI, and their publicly available “Signs of AI writing” guide is the most useful tool I’ve encountered for verifying those suspicions. (Thanks to poet Jameson Fitzpatrick for highlighting this resource on X.)

Since 2023, Wikipedia’s editors have been tackling the issue of AI-generated content through an initiative called Project AI Cleanup. With millions of daily edits to review, they have ample examples to study, and true to Wikipedia’s tradition, the team has assembled a comprehensive and evidence-based field guide.

The guide starts by reaffirming what many already suspect: automated detection tools are largely ineffective. Instead, it highlights certain writing patterns and expressions that are uncommon on Wikipedia but frequently found elsewhere online (and thus, prevalent in AI training data). The guide notes that AI-generated entries often go out of their way to stress the importance of a topic, typically using broad phrases like “a pivotal moment” or “a broader movement.” These models also tend to list minor media mentions to make a subject appear more significant—behavior more typical of a personal profile than an impartial source.

One notable pattern the guide points out is the use of trailing clauses with vague assertions of significance. AI models might claim that an event is “emphasizing the significance” of something or “reflecting the continued relevance” of a concept. (Grammar enthusiasts will recognize this as the use of present participles.) While this can be subtle, once you know to look for it, it becomes much easier to spot.
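As a rough illustration of what "knowing what to look for" can mean in practice, here is a minimal Python sketch that surfaces sentences containing the patterns quoted above. It is a reading aid, not a detector: the guide itself considers automated detection largely ineffective. The word lists and the `flag_sentences` helper are assumptions made up for this example, seeded only with the phrases mentioned in this article, and do not come from the Wikipedia guide.

```python
import re

# Hypothetical word lists, seeded only with the examples quoted in the article.
PARTICIPLES = ("emphasizing", "reflecting")
SIGNIFICANCE_PHRASES = ("a pivotal moment", "a broader movement")

# A comma followed by one of the participles, running to the end of the sentence.
TRAILING_CLAUSE = re.compile(
    r",\s+(?:" + "|".join(PARTICIPLES) + r")\b[^.!?]*[.!?]",
    re.IGNORECASE,
)


def flag_sentences(text):
    """Return (sentence, reasons) pairs for sentences worth a second look."""
    flagged = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        reasons = []
        if TRAILING_CLAUSE.search(sentence):
            reasons.append("trailing participle clause")
        lowered = sentence.lower()
        for phrase in SIGNIFICANCE_PHRASES:
            if phrase in lowered:
                reasons.append(f"stock phrase: {phrase}")
        if reasons:
            flagged.append((sentence, reasons))
    return flagged
```

A flagged sentence only means a human should read it more closely; plenty of human-written prose uses the same constructions.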

Another common feature is the use of generic, promotional language that’s widespread online. Descriptions are often overly positive—landscapes are always beautiful, views are always stunning, and everything is described as spotless and up-to-date. As the editors describe it, “it reads more like a script from a commercial.”

The entire guide is well worth reading, and I found it quite insightful. Previously, I would have argued that LLM-generated writing was evolving too rapidly to be reliably identified. However, the tendencies highlighted here are deeply rooted in how AI models are built and used. While these habits can be masked, eliminating them entirely will be difficult. If the public becomes more skilled at recognizing AI-generated text, it could lead to some fascinating changes.

Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.
