I found the page Wikipedia:Signs of AI Writing[1] very interesting and informative. It goes into a lot more detail than the typical "em-dashes" heuristic.
[1]: https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing
An interesting observation from that page:
"Thus the highly specific "inventor of the first train-coupling device" might become "a revolutionary titan of industry." It is like shouting louder and louder that a portrait shows a uniquely important person, while the portrait itself is fading from a sharp photograph into a blurry, generic sketch. The subject becomes simultaneously less specific and more exaggerated."
I think that's a general guideline to identify "propaganda", regardless of the source. I've seen people in person write such statements with their own hands/fingers, and I know many people who speak like that (shockingly, most of them are in management).
Lots of those points seem to get at the same idea, which seems like a good balance: it's the language itself that is problematic, not how the text came to be, so it makes sense to target the language of the text directly.
Hopefully those guidelines make all text on Wikipedia better, not just the LLM-produced text, because they seem like generally good guidelines even outside the context of LLMs.
Wikipedia already has very detailed guidelines on how text on Wikipedia should look, which address many of these problems.[1] For example, take a look at its advice on "puffery"[2]:
"Peacock example:
Bob Dylan is the defining figure of the 1960s counterculture and a brilliant songwriter.
Just the facts:
Dylan was included in Time's 100: The Most Important People of the Century, in which he was called "master poet, caustic social critic and intrepid, guiding spirit of the counterculture generation". By the mid-1970s, his songs had been covered by hundreds of other artists."
[1]: https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style
[2]: https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Word...
Right, but unless you have a specific page about "This is how to treat AI texts", people will (if they haven't already) bombard you with "This text is so obviously AI-written, do something", and by having a specific page to answer those complaints, you can just link it instead of the general "Here's how text on Wikipedia should be" guidelines. Being more specific sometimes helps people understand better :)
That sounds like Flanderization to me https://en.wikipedia.org/wiki/Flanderization
From my experience with LLMs, that's a great observation.
The funny thing about this is that this also appears in bad human writing. We would be better off if vague statements like this were eliminated altogether, or replaced with less fantastical but verifiable statements. If this means that nothing of the article is left then we have killed two birds with one stone.
That actually puts into words what I couldn't, but felt the same way about. Spectacular quote.
I'm thinking quite a bit about this at the moment in the context of foundation models and their inherent (?) regression to the mean.
Recently there has been a big push into geospatial foundation models (e.g. Google AlphaEarth, IBM Terramind, Clay).
These take in vast amounts of satellite data and, with the usual autoencoder architecture, try to build embedding spaces that contain meaningful semantic features.
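To make that setup concrete, here is a minimal PyTorch sketch of the pattern described above; this is not the actual architecture of AlphaEarth, Terramind or Clay, and the band count, patch size and embedding width are invented for illustration:

    import torch
    import torch.nn as nn

    class PatchAutoencoder(nn.Module):
        """Toy convolutional autoencoder: 12-band 64x64 patches -> 256-d embedding."""

        def __init__(self, bands=12, embed_dim=256):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(bands, 64, 3, stride=2, padding=1), nn.ReLU(),   # 64x64 -> 32x32
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),     # 32x32 -> 16x16
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(128, embed_dim),                                  # the embedding
            )
            self.decoder = nn.Sequential(
                nn.Linear(embed_dim, 128 * 16 * 16), nn.Unflatten(1, (128, 16, 16)),
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, bands, 4, stride=2, padding=1),      # back to 64x64
            )

        def forward(self, x):
            z = self.encoder(x)              # what downstream tasks would reuse
            return self.decoder(z), z

    model = PatchAutoencoder()
    patch = torch.randn(8, 12, 64, 64)            # fake batch of multispectral patches
    recon, embedding = model(patch)
    loss = nn.functional.mse_loss(recon, patch)   # plain reconstruction objective

The point is just that the training signal rewards reconstructing whatever dominates the data, which is where the regression-to-the-mean worry comes from.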
The issue at the moment is that in the benchmark suites (https://github.com/VMarsocci/pangaea-bench), only a few of these foundation models have recently started to surpass the basic U-Net in some of the tasks.
There's also an observation by one of the authors of the Major-TOM model (which also provides satellite input data for training models) that the scaling law does not seem to hold for geospatial foundation models, in that more data does not seem to result in better models.
My (completely unsupported) theory on why that is: unlike writing or coding, with satellite data you are often looking for the needle in the haystack. You do not want what has been done thousands of times before and is proven to work. Segmenting out forests and water? Sure, easy. These models have seen millions of examples of forests and water. But most often we are interested in things that are much, much rarer: flooding, wildfires, earthquakes, landslides, destroyed buildings, new airstrips in the Amazon, and so on. As I see it, the currently used frameworks do not support that very well.
But I'd be curious how others see this, who might be more knowledgeable in the area.
Outstanding. Praise Wikipedia; despite any shortcomings, isn't it such a breath of fresh air in the world of 2026?
This is very detailed, and everyone who is sick of reading generated text should read it.
I had a bad experience at a shitty airport, went to Google Maps to leave a bad review, and found that its rating was 4.7 from many thousands of people. Knowing that the airport is run by a corrupt government, I started reading those super-positive reviews and the reviewers' other, older reviews. People who could barely manage a few coherent sentences of English are now writing multiple paragraphs about the history and vital importance of that airport to the region.
Reading the first section, "Undue emphasis on significance", those fake reviews are all I can think of.
Ironically this is a goldmine for AI labs and AI writer startups to do RL and fine-tuning.
In the case of those big 'foundation models': Fine-tune for whom and how? I doubt it is possible to fine-tune things like this in a way that satisfies all audiences and training set instances. Much of this is probably due to the training set itself containing a lot of propaganda (advertising) or just bad style.
I'm pretty sure Mistral is doing fine tuning for their enterprise clients. OpenAI and Anthropic are probably not?
I'm thinking more about startups that do fine-tuning.
There was a paper recently about using LLMs to find contradictions in Wikipedia, i.e. claims on the same page or between pages which appear to be mutually incompatible.
https://arxiv.org/abs/2509.23233
I wonder if something more came out of that.
Either way, I think that generation of article text is the least useful and interesting way to use AI on Wikipedia. It's much better to do things like this paper did.
This is hardly surprising given the announcement "New partnerships with tech companies support Wikipedia's sustainability", which relies on human content.
https://wikimediafoundation.org/news/2026/01/15/wikipedia-ce...
I agree with the dig, although it's worth mentioning that this AI Cleanup page's first version was written on the 4th of December 2023.
The Sanderson wiki [1] has a time-travel feature where you can read a snapshot from just before the publication of a book, ensuring no spoilers.
I would like a similar pre-LLM Wikipedia snapshot. Sometimes I would prefer potentially stale or incomplete info rather than have to wade through slop.
1: https://coppermind.net/wiki/Coppermind:Welcome
The easiest way to get this is probably Kiwix. You can download a ~100GB file containing all of English Wikipedia as of a particular date, then browse it locally offline.
I'm not sure if it's real or not, but the Internet Archive has a listing claiming to be the dump from May 2022: https://archive.org/details/wikipedia_en_all_maxi_2022-05
Alternatively, straight from Wikimedia, these are the dumps I'm using: trivial to parse concurrently and in an easy format too, multistream XML in bz2. The latest dump (text only) is from 2026-01-01 and weighs 24.1 GB. https://dumps.wikimedia.org/enwiki/20260101/ They also provide splits together with indexes, so you can grab just the few sections you want if 24 GB is too large.
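For anyone who wants to try the multistream route, here is a minimal Python sketch of using the index to pull out a single stream. The file names below just follow the usual dump naming pattern and should be adjusted to whatever you actually downloaded; the index format assumed is the documented offset:page_id:title, with each stream holding a block of pages.

    import bz2

    # Adjust these to the files you downloaded.
    DUMP = "enwiki-20260101-pages-articles-multistream.xml.bz2"
    INDEX = "enwiki-20260101-pages-articles-multistream-index.txt.bz2"

    def find_offset(title):
        """Scan the index; each line is 'offset:page_id:title' (titles may contain ':')."""
        with bz2.open(INDEX, "rt", encoding="utf-8") as idx:
            for line in idx:
                offset, _page_id, page_title = line.rstrip("\n").split(":", 2)
                if page_title == title:
                    return int(offset)
        return None

    def read_stream(offset):
        """Decompress the single bz2 stream starting at the given byte offset."""
        with open(DUMP, "rb") as f:
            f.seek(offset)
            decomp = bz2.BZ2Decompressor()
            chunks = []
            while not decomp.eof:
                data = f.read(256 * 1024)
                if not data:
                    break
                chunks.append(decomp.decompress(data))
            return b"".join(chunks).decode("utf-8")

    offset = find_offset("Bob Dylan")
    if offset is not None:
        fragment = read_stream(offset)   # a run of <page>...</page> elements
        print(fragment[:500])

Since each index entry points at an independently decompressible bz2 stream, you can hand different offsets to different workers, which is what makes the concurrent parsing mentioned above straightforward.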
But you can already view the past version of any page on Wikipedia. Go to the page you want to read, click "View history" and select any revision before 2023.
I know but it's not as convenient if you have to keep scrolling through revisions.
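If scrolling through the history UI is the pain point, the newest revision before a cutoff date can also be fetched with the MediaWiki action API (prop=revisions with rvstart/rvdir). A small sketch; the title and the pre-2023 cutoff are just this thread's example, and the User-Agent string is a placeholder:

    import requests

    params = {
        "action": "query",
        "format": "json",
        "formatversion": 2,
        "prop": "revisions",
        "titles": "Bob Dylan",
        "rvlimit": 1,
        "rvdir": "older",                   # walk backwards in time from rvstart
        "rvstart": "2023-01-01T00:00:00Z",  # newest revision before 2023
        "rvprop": "ids|timestamp|content",
        "rvslots": "main",
    }
    headers = {"User-Agent": "pre-llm-snapshot-example/0.1 (replace with your contact info)"}
    resp = requests.get("https://en.wikipedia.org/w/api.php",
                        params=params, headers=headers).json()
    rev = resp["query"]["pages"][0]["revisions"][0]
    print(rev["revid"], rev["timestamp"])
    print(rev["slots"]["main"]["content"][:300])   # wikitext of that revision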
Have you personally encountered slop there? I tend to use Wikipedia rabbit holes as a pastime and haven’t really felt a difference.
I wish they also spent effort on the reverse: automatically rephrasing the (many) articles that are obscure, very poorly worded, and/or have no neutral tone whatsoever.
And I say that as a general Wikipedia fan.
WP:BOLD and start your own project to do it.
Or be extra bold, and have an AI bot handle the forum politics associated with being allowed to make nontrivial changes.
Great way to get banned :)
I've made a bunch of nontrivial changes (+/- thousands of characters); none of them seem to have been reverted. I never asked for permission, I just went ahead and did it. Maybe the topics I care about are so non-controversial that no one has actually seen them?
Didn't they just sell access to all the AI giants?
They sold the AI giants enterprise downloads so that they don't hammer Wikimedia's infrastructure by downloading bulk data the usual way available to everyone else. You really have to twist the truth to turn that into something bad for either side.
You mean convenient read-access?
Signed up to help.
On PickiPedia (a bluegrass wiki - pickipedia.xyz), we've developed a MediaWiki extension / middleware that works as an MCP server and causes all of the contributions from the AI in question to appear partially grayed out, with a "verify" button. A human can then verify and either confirm the provided source or supply their own.
It started as a fork of a MediaWiki MCP server.
It works pretty nicely.
Of course, it's only viable in situations where the operator of the LLM is willing to comply and be transparent about that use, so it doesn't address the bulk of the problem on Wikipedia.
But still might be interesting to some:
https://github.com/magent-cryptograss/pickipedia-mcp
I don't see how this is going to work. 'It sounds like AI' is not a good metric whatsoever for removing content.
Wikipedia agrees: https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing#...
That's why they're cataloging specific traits that are common in AI-generated text, and only deleting if it either contains very obvious indicators that could never legitimately appear in a real article ("Absolutely! Here is an article written in the style of Wikipedia:") or violates other policies (like missing or incorrect citations).
If that's your takeaway, you need to read the submission again, because that's not what they're suggesting or doing.
This is about wiping unsourced and fake AI-generated content, which can be confirmed by checking whether the sources are valid.
Isn't having a source the only thing that should be required? Why is AI speak bad?
I'm embarrassed to be associated with US Millennials who are anti-AI.
No one cares if you tie your legs together and finish a marathon in 12 hours. Just finish it in 3. It's more impressive.
> Why is AI speak bad
It's not inherently bad, but if something was written with AI, the chances that it is low-effort crap are much, much higher than if someone actually spent time and effort on it.