
Wikipedia talk:Large language models


Chatbot to help editors improve articles

Image caption: After selecting text, the control panel on the right is used to give instructions. The responses by the AI model are presented in the chat panel on the left.

I wrote a user script called WikiChatbot. It works by selecting text in an article and then clicking one of the buttons on the right to enquire about the selected text. It includes many functions. For example, it can summarize and copyedit the selected text, explain it, and provide examples. The chat panel can also be used to ask specific questions about the selected text or the topic in general. The script uses the AI model GPT 3.5. It requires an API key from OpenAI. New OpenAI accounts can use it freely for the first 3 months with certain limitations. For a more detailed description of all these issues and examples of how the script can be used, see the documentation at User:Phlsph7/WikiChatbot.
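For anyone curious about the mechanics, here is a minimal sketch of the kind of request the script sends. It is not the WikiChatbot code itself (the actual user script is JavaScript); this is Python for illustration only, and the function name and prompt wording are assumptions. It simply posts the selected text plus an instruction to OpenAI's chat completions endpoint with the user's API key.

import requests

API_KEY = "sk-..."  # the user's own OpenAI API key (placeholder)

def ask_about_selection(instruction: str, selected_text: str) -> str:
    # Send the selected article text plus an instruction (e.g. "Summarize this")
    # to the GPT 3.5 chat completions endpoint and return the model's reply.
    response = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "gpt-3.5-turbo",
            "messages": [
                {"role": "system", "content": "You help Wikipedia editors improve articles."},
                {"role": "user", "content": f"{instruction}\n\n{selected_text}"},
            ],
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# Example: ask_about_selection("Copyedit the following text:", "Teh quick brown fox ...")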

I was hoping to get some feedback on the script in general and how it may be improved. I tried to follow WP:LLM in writing the documentation of the chatbot. It would be helpful if someone could take a look to ensure that it is understandable and that the limitations and dangers are properly presented. I also added some examples of how to use edit summaries to declare LLM usage. These suggestions should be checked. Feel free to edit the documentation page directly for any minor issues. I'm also not sure how difficult it is to follow the instructions so it would be great if someone could try to set up the script, use it, and explain which steps were confusing. My OpenAI account is already older than 3 months so I was not able to verify the claims about the free period and how severe the limitations are. If someone has a younger account or is willing to open a new account to try it, that would be helpful.

Other feedback on the idea in general, on its problems, or on new features to implement is also welcome. Phlsph7 (talk) 12:45, 12 July 2023 (UTC)[reply]

I meant to reply to this sooner. This is awesome and I'm interested in this (and related ideas) related to writing / reading with ML. I'll try to have a play and give you some feedback soon. Talpedia 10:18, 17 July 2023 (UTC)[reply]
Related: see also m:ChatGPT plugin. Mathglot (talk) 07:22, 18 July 2023 (UTC)[reply]
Whilst I rather like the ability of this nifty little script to do certain things, I do have some criticism. These functions strike me as extremely risky, to the point that they should probably be disabled:
  • "is it true?" - ChatGPT likely uses Wikipedia as a source, and in any case, we want verifiability, not truth. I feel quite strongly, based on several other reasons too, that this function should be disabled and never see the light of day again.
  • "is it biased?" - ChatGPT lacks the ability to truly identify anything more than glaring "the brutal savages attacked the defenceless colonist family" level bias (i.e. something that any reasonably aware human should spot very quickly indeed). Best left to humans.
  • "is this source reliable?" - Same as the first one, this has so much potential to go wrong that it just shouldn't exist. Sure it might tell you that Breitbart or a self-published source isn't reliable, but it may also suggest that a bad source is reliable, or at least not unreliable.
I don't think that any amount of warnings would prevent misuse or abuse of these functions, since there will always be irresponsible and incompetent people who ignore all the warnings and carry on anyway. By not giving them access to these functions, it will limit the damage that these people would cause. Doing so should not be a loss to someone who is using the tool responsibly, as the output generated by these functions would have to be checked so completely that you might as well just do it without asking the bot.
The doc page also needs a big, obvious warning bar at the top, before anything else, making it clear that use of the tool should be with considerable caution.
The doc page also doesn't comment much on the specific suitability of the bot for various tasks, as it is much more likely to stuff up when using certain functions. It should mention this, and also how it may produce incorrect responses for the different tasks. It also doesn't mention that ChatGPT doesn't give wikified responses, so wikilinks and any other formatting (bold, italics, etc.) must be added manually. The "Write new article outline" function also seems to suggest unencyclopaedic styles, with a formal "conclusion", which Wikipedia articles do not have.
Also, you will need to address the issue of WP:ENGVAR, as ChatGPT uses American English, even if the input is in a different variety of English. Mako001 (C)  (T)  🇺🇦 01:14, 23 July 2023 (UTC)[reply]
You can ask it to return wikified responses and it will do so with a reasonably good success rate. -- Zache (talk) 03:03, 23 July 2023 (UTC)[reply]
@Mako001 and Zache: Thanks for all the helpful ideas. I removed the buttons. I gave a short explanation at Wikipedia:Village_pump_(miscellaneous)#Feedback_on_user_script_chatbot and I'll focus here on the issues with the documentation. I implemented the warning banner and added a paragraph on the limitations of the different functions. That's a good point about the English variant being American, so I mentioned that as well. I also explained that the response text needs to be wikified before it can be used in the article.
Adding a function to wikify the text directly is an interesting idea. I'll experiment a little with that. The problem is just that the script is not aware of the existing wikitext. So if asked to wikify a paragraph that already contains wikilinks then it would ignore those links. This could be confusing to editors who only want to add more links. Phlsph7 (talk) 09:12, 23 July 2023 (UTC)[reply]
I made summaries/translations/etc. by giving wikitext as input to ChatGPT instead of plaintext. However, the problem here is how to get the wikitext from the page in the first place. -- Zache (talk) 09:48, 23 July 2023 (UTC)[reply]
In principle, you can already do that with the current script. To do so, go to the edit page, select the wikitext in the text area, and click one of the buttons or enter your command in the chat panel of the script. I got it to add wikilinks to an existing wikitext and a translation was also possible. However, it seems to have problems with reference tags and kept removing them, even when I told it explicitly not to. I tried it for the sections Harry_Frankfurt#Personhood and Extended_modal_realism#Background, both with the same issue. Maybe this can be avoided with the right prompt. Phlsph7 (talk) 12:09, 23 July 2023 (UTC)[reply]
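On the "how to get the wikitext in the first place" question, here is a minimal sketch (Python for illustration, using the standard MediaWiki query API). The prompt wording is an assumption and, as noted above, may or may not be enough to stop the model from dropping <ref> tags.

import requests

def fetch_wikitext(title: str) -> str:
    # Fetch the current wikitext of a page through the MediaWiki API.
    data = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "prop": "revisions",
            "rvprop": "content",
            "rvslots": "main",
            "titles": title,
            "format": "json",
            "formatversion": "2",
        },
        timeout=30,
    ).json()
    return data["query"]["pages"][0]["revisions"][0]["slots"]["main"]["content"]

wikitext = fetch_wikitext("Harry Frankfurt")
prompt = (
    "Add wikilinks to the following wikitext. Preserve all existing markup, "
    "especially <ref>...</ref> and <ref name=.../> tags, exactly as they appear:\n\n"
    + wikitext
)
# The prompt would then be sent to the model as in the earlier sketch.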
Thanks for setting this up. I've recently had success drafting new Wikipedia articles by feeding the text of up to 5 RS into GPT4-32k through openrouter.com/playground and simply asking it to draft the article. It does a decent job with the right prompt. You can see an example at Harrison Floyd. I'll leave more details on the talk page of User:Phlsph7/WikiChatbot, but I wanted to post here for other interested parties to join the discussion. Nowa (talk) 00:02, 20 September 2023 (UTC)[reply]
Thanks for the information. I've responded to you at Talk:Harrison_Floyd#Initial_content_summarized_from_references_using_GPT4 so that we don't have several separate discussions about the same issue. Phlsph7 (talk) 07:44, 20 September 2023 (UTC)[reply]
Ran into a brick wall I thought might be helpful to know about. I've been working on the bios of people associated with Spiritual_warfare#Spiritual_Mapping_&_the_Charismatic_movement. GPT-4 and Llama refused to read the RS, claiming that it was "abusive". I can see from their point of view why that is, but nonetheless, RS is RS, so I just read it manually. Between that and the challenges of avoiding copyvios, I'm a bit sour on the utility of LLMs for assisting in writing new articles. It's just easier to do it manually. Having said that, the Bing chatbot does have some utility in finding RS relative to Google. Much less crap. Nowa (talk) 00:35, 9 October 2023 (UTC)[reply]

If we're going to allow LLM editing, this is a great tool to guide editors to the specific use cases that have community approval (even if those use cases are few to none at this point). I found it to be straightforward and easy to use. –dlthewave 16:06, 23 July 2023 (UTC)[reply]

There is no policy or guideline disallowing the use of LLM or other machine learning tools. No need for any approval unless that changes. MarioGom (talk) 17:29, 11 February 2024 (UTC)[reply]

Three concrete proposals for a future RfC on generative AI policy


See also recent discussions: User:Cremastra/generative_AI and this discussion of it on Idea Lab.

We need an AI policy. Actually, we have needed one for the past two years. But since we don't have one, several very predictable problems have arisen:

  • For the past two years, we have been silently accumulating a large amount of AI-generated articles and article edits with immediately noticeable issues. These are happening not only on minor pages but also on high-traffic and contentious articles, and there are some topics on which most of our articles are now AI-generated. That jump in August is not because of a sudden flood of AI articles in August. It's because that's when people -- mostly myself -- actually decided to find the fish in the barrel.
  • People are using AI very prolifically; some users have been steadily making hundreds or even thousands of LLM edits for years at this point.
  • When people look into those articles, they have usually found even more problems than the ones that caused them to get flagged, up to and including blatant shit like chatbot responses left in. Several examples can be found on the AI Cleanup WikiProject noticeboard.
  • People are starting to notice the AI-generated templates popping up all over the place, and we are starting to get some negative coverage for our AI permissiveness (though most of it so far has been negative coverage of WMF initiatives).
  • Our policy has drifted away from our enforcement. When LLM use comes up at ANI, users are almost always blocked for it. We have a series of warning templates that tell people not to use AI, ever. If we're going to enforce something as if it's policy, then we might as well codify that.

I know that about 1 million words have been expended here on how we should deal with AI, but clearly that wasn't enough, because the problem is only getting worse. If it were up to me, I would support a blanket ban on AI. This is past the point of being unsustainable. However, we don't have consensus for that (yet?), nor alignment with WMF. So what I'm (pre-)proposing instead are some specific restrictions:

  1. Make WP:LLMDISCLOSE mandatory, or at least an enforceable guideline. This would make triaging AI cleanup much easier, and if people include what tools, versions, and prompts they are using, it will also help research -- e.g., ways that newer/older versions of ChatGPT differ in output, how those differ from Gemini or Claude, etc. Because it is not mandatory, basically no one is doing it -- not even the most conscientious LLM users who have been open about using AI when asked.
  2. Prohibit the use of AI on BLP articles. This is already in policy regarding AI-generated images, but the potential risks of errors in AI-generated text are even greater than errors in images. It's very weird that we are doing it this way around.
  3. Prohibit the use of AI on some and/or all contentious topics. I'm less sure about this one because the blast radius is really big on some of them (e.g., Eastern Europe, South Asia), and because people are heavily watching these articles anyway. But the last thing we need is AI hallucinations in articles about medicine or Israel/Palestine.

I am open to other restrictions as well. But we need to do something, anything, and fast. Gnomingstuff (talk) 22:13, 5 October 2025 (UTC)[reply]

I'm also for a blanket ban on AI. But I think the better strategy is to make WP:LLMDISCLOSE mandatory. That can quickly get consensus and get the ball rolling. It will be useful for making the scale of the problem clear, and it also gives a clear-cut reason to block AI abusers, given that they will most likely violate it. Tercer (talk) 23:17, 5 October 2025 (UTC)[reply]
I would also support a blanket ban, and I also agree that making WP:LLMDISCLOSE mandatory will be both good in its own right and an important step towards further improvements. Stepwise Continuous Dysfunction (talk) 17:06, 6 October 2025 (UTC)[reply]
There has also been a very similar discussion, now at Wikipedia:Village pump (policy)/Archive 205#LLM/AI generated proposals?, which, in its subsections, also came to the conclusion that such a policy should be proposed. I volunteered to produce an initial draft, and I haven't forgotten about it. It's worth looking at the discussion I linked to, because it includes feedback from experienced editors who have concerns about an overly broad policy. I've been trying to follow as many discussions about this, and as many dispute resolution complaints about it, as I can find, and I'm building on that in my thinking about it. When I have something that is worth having other editors look at, I'll post here and elsewhere to let anyone interested know. I expect some significant refining of the proposal before it will be ready to put before the community in an RfC. --Tryptofish (talk) 22:22, 6 October 2025 (UTC)[reply]
Thanks for linking that -- I knew I was forgetting something. Gnomingstuff (talk) 22:43, 6 October 2025 (UTC)[reply]
@Tryptofish No rush, I know you're probably busy, but any update on this? Gnomingstuff (talk) 15:28, 22 October 2025 (UTC)[reply]
Thanks for asking, and I haven't forgotten. I've had a few real-life issues come up, and I'm swamped with having promised to do too many things here on Wikipedia. I tend to do my writing by thinking it through in my head before writing it down onsite. --Tryptofish (talk) 23:13, 22 October 2025 (UTC)[reply]
  • Oppose ban - I will weigh in against a complete and blind ban because: First, there are problems with *every* kind of source - IP editors (with less restraint or care for WP), named editors (with the egos and biases that come with them), external sources of various kinds with profit motives, advocacy groups (inherently WP:BIASED), etcetera - and the WP goal should simply be to say how to get them all in here as appropriately and helpfully as possible. WP has long made use of bots -- now it is overdue to say how WP will make use of LLMs. I'm sure that print media and colleges are struggling with the same challenges -- so maybe some help can be found there. Second, enforcement or even detection is just too much to ask, and why should one ask human editors to chase this or expect they can keep up? In just a couple of years, AI has progressed to the point of videos that one struggles to tell from reality, and it rapidly draws on knowledge across vast archives in ways that humans just cannot match - and in the near future, the AI output might be expected to be consistently *better* than human in quality, and/or more *devious* than we can detect without using AI to detect AI.
I think I will start a separate thread looking for positive ideas -- the whole essay here seems to be only negatives. Cheers — Preceding unsigned comment added by Markbassett (talkcontribs) 14:55, 29 October 2025 (UTC)[reply]
I would support making WP:LLMDISCLOSE policy, though given my last snafu with WP:VPP, I'd prefer someone else to make the proposal. —Locke Coletcb 18:16, 1 November 2025 (UTC)[reply]
I also support better regulation, either policy or guideline. Editor opinions on many LLM topics are often polarized, so it might be a good idea to aim for a minimal consensus, i.e., focus on disclosure rather than a ban. For this reason, it could also be beneficial to keep the proposed changes short, maybe a sentence or two if possible. For an earlier unsuccessful RFC on disclosure of LLM use, see Wikipedia_talk:Large_language_model_policy#RFC.
One possible problem is that LLMs are used for various tasks and only some uses are particularly problematic. For example, some modern spellcheckers are based on LLMs, but insisting that the use of spellcheckers needs to be disclosed in edit summaries would probably be controversial. A key problem with LLMs arises when users ask them to write an article or a talk-page comment and copy-paste the results with no or minimal changes. If there is a way to regulate the disclosure of this type of behavior, I assume it would not be particularly controversial. If we call this "creation of original content", the proposal could include something along the lines of: "Editors must disclose (in the edit summary?) if an LLM was used to create original content." Phlsph7 (talk) 09:43, 3 November 2025 (UTC)[reply]
I don't think anyone could reasonably enforce a prohibition against undisclosed use of spellcheckers, though. These conversations are never about things like that. Personally, I think a blanket ban on adding LLM-generated material to articles is almost certainly the most reasonable approach, because any LLM-generated material that is indistinguishable from material composed by hand will never actually fall afoul of such a ban. I really wish people would think of these things more pragmatically on the whole. LLMs can be used to generate malign material that's of a sort different in species from anything people normally create on their own power; they can rapidly generate paragraphs of article text that look reasonable at a glance and even seem well-supported with references—until you start actually checking the references and realize that the links in them all go nowhere, at which point it turns out that all the text is pure hokum. Someone can add text like this to hundreds of articles in a single day using an LLM, and it may go undiscovered for weeks or months. They can submit GA reviews that they seem to have just asked an LLM to carry out for them, which may take a while to spot and leave very crestfallen editors in their wake. The people who do these things may firmly and repeatedly cite remarkably flimsy grounds for their actions, like vague press releases from the WMF; if we write any exceptions into policy/guidelines on these matters for "reasonable use" or the like, I suspect it will do little to stop this kind of thing, because the people doing it tend to be absolutely convinced beyond a shadow of a doubt that what they're doing is reasonable until people start calling them on what they're up to—and even then they may not stop, because they think those people are wrong to tell them to stop. Simply saying "No LLMs" is hard to quibble with, and it won't actually do anything to stop people who are genuinely using them responsibly, because that will be invisible. ('"') (Mesocarp) (talk) (@) 02:27, 6 November 2025 (UTC)[reply]
  • Support in theory, oppose in practice. WP:LLMDISCLOSE is barely a paragraph in the middle of an essay; while it should be mandatory, that version is insufficient. Changing the other two things to requiring prior talk page discussion should allow for wider coverage while still getting support, though. ~ Argenti Aertheri(Chat?) 08:39, 6 November 2025 (UTC)[reply]

There is a Miscellany for deletion discussion at Wikipedia:Miscellany for deletion/Wikipedia:Case against LLM-generated articles that may be of interest to editors of this page.—Alalch E. 10:14, 15 October 2025 (UTC)[reply]

Seeking positive suggestions - maybe for a 'good usage' section


I request in this thread that folks provide some suggestions for positive ways LLMs can or should be used, and suggestions for a title if this becomes a separate section for such. The essay here is highly negative - my editor counts 154 "not"s here, and guidance such as WP:LLMCOMM is all about blocking it out. There doesn't seem to be content here reflective of external remarks that WP has a strategy to use AI, such as here or here or AI Strategy Brief for Editors - STABLE - 2025-02-10 here, or Artificial intelligence in Wikimedia projects, and official WP policy WP:BOTPOL also seems to regard it as Skynet. But let's face it -- most users are getting search assist or Alexa or something first, along with or instead of WP, and it would be good to be value added, not value excluded.

I'd like to see some *useful* suggestions stated, ideas on how to maybe use AI. Surely folks can think of some things, unless they'd like to step aside for folks out there smarter than me who can?

I'm not looking for AI to just be some enhanced editor that humans supervise closely -- I'm looking at AI to do things humans cannot do or do poorly. As an initial start I will offer a few notions.


  • Creating initial drafts for articles or article sections, identifying major themes in external sources and differentiating pop-press from authoritative content.
  • Providing evidence for TALK discussions - uncovering points while taking the personality out of it and avoiding claims of cherry-picking.
(I've seen a lot of TALK and RFC discussions with wild claims of WEIGHT or Consensus or inability to look at all relevant WP guidances ...)
  • Citation generator - finding and listing good reference works in good style that editors can look at

Cheers Markbassett (talk) 15:29, 29 October 2025 (UTC)[reply]

I've seen zero evidence that LLMs can do any of those things to any useful degree of reliability.
  • 'Initial drafts': Firstly, and quite obviously, LLMs have no access whatsoever to sources unless they are online. This would make identifying 'major themes' deeply problematic even without the many other issues LLMs have with citations. Which begin with the simple fact that LLMs routinely 'cite' articles with sources selected not because they actually support the content, but because the title suggests they might. And with astonishing frequency they mangle up the citations in doing so, if they don't hallucinate them entirely. At best, LLM output might find the occasional useful source for article content, but you'll need to go through whole slews of mangled titles and god-knows-what to find such stuff - a very inefficient and untrustworthy search engine. Note also that as all current LLMs have (as far as I'm aware at least) been trained on large quantities of Wikipedia output, any draft on anything of any complexity covered by Wikipedia is liable to generate content violating WP:CIRCULAR, only without the citation to Wikipedia to admit to it.
  • 'Evidence for TALK': Unless you've actually checked the sourcing, it isn't 'evidence'. And it is already cherry-picked, due to online bias and the biases inherent in prompting an LLM to produce what it thinks is appropriate content for Wikipedia - which, as we've repeatedly seen, frequently fails to conform to core policies and, in the case of Grok at least, is deliberately prompted to reflect the political stance of the owner.
  • 'Citation generator'. NO. NOT EVER. NOT UNDER ANY CIRCUMSTANCES. Absolutely cannot be trusted, per my earlier comments. They mangle citations. They invent them. They 'cite' based on titles, rather than on the content of the document they are supposedly referring to.
In summary, beyond maybe using LLM output as an extra (inefficient and inaccurate) source search, they are functionally useless, and liable to lead all but the most careful and experienced contributors (who should need this sort of help least) right up the garden path. AndyTheGrump (talk) 23:07, 29 October 2025 (UTC)[reply]
  • Andy - Perhaps you need a newer or better LLM too - because I think I've seen them doing these three things better than the human average on WP, and I think these are suggested uses that turn up in a generic search.
  • In any case, this thread topic is for brainstorming and contributing suggested uses. Please try to give some positive ideas on where and how to use LLMs, or at least the ways you dislike least, because without such ideas the natural consequence is that people use them in unguided ways, since all uses are equally valued. If you have no preference, then you won't particularly mind the preferences of others - but I'm looking for some additional ideas here. Cheers
Markbassett (talk) 19:55, 31 October 2025 (UTC)[reply]
We can't base recommendations to use LLMs on your personal experience. If you can point to e.g. peer-reviewed research which supports your claims, fine, but meanwhile, we KNOW that LLMs hallucinate. And we KNOW that it has been mathematically proven that this is inherent to the software and can't be fixed. And no, I'm not going to invent fictitious reasons to use LLMs. If you want an honest appraisal, you don't go around insisting only on 'positive ideas'. And yes, I do mind what people do with this stuff - which is why I'm advocating that they shouldn't. AndyTheGrump (talk) 20:28, 31 October 2025 (UTC)[reply]
Grump - I think you're WP:OFFTOPIC. This thread was clearly stated as asking for positive suggestions on where/what/how such tools should be used for WP, not for more doubts or more barriers or asking where things are imperfect all over - humans and LLMs alike. If you're saying you want to remove all the essay negatives that do not meet the peer-reviewed research standard, then say so - I suppose that would fit the thread, as fewer negatives would sort of count as a 'positive'. If you want to seek research before adding an application suggestion, feel free to go ahead and do so. If you want to suggest such a peer-reviewed forum that explores applications which we should look to, feel free to do that - not an idea per se, but maybe helpful for finding such. Meanwhile, out in the world folks are using List of large language models and List of chatbots, and it's going to get into WP and/or replace WP... I think this essay may partly choose which way the WP future might go. Cheers Markbassett (talk) 00:23, 4 November 2025 (UTC)[reply]
I don't give a toss if you think that I'm off-topic. It is grossly inappropriate to try to restrict commentary on a controversial subject solely to those who support your position. That isn't a topic, it is an attempt to manipulate the discussion by improper means. AndyTheGrump (talk) 07:16, 4 November 2025 (UTC)[reply]
The only marginally good use I've seen posited is that of translation for non-English speakers needing to interact with this project. Otherwise, every use of LLM/AI I've seen on EN.WP is just a shit-show. —Locke Coletcb 00:59, 4 November 2025 (UTC)[reply]
OK, thanks - that's one. Markbassett (talk) 13:34, 5 November 2025 (UTC)[reply]
@Locke Cole Sounds like we haven't met. Hi, I am Polygnotus (allegedly). Polygnotus (talk) 21:34, 5 November 2025 (UTC)[reply]
A thread on finding positive uses for LLMs that opens with using LLMs to generate article text is not helping its cause. CMD (talk) 04:48, 4 November 2025 (UTC)[reply]
Again, the topic is asking to provide notion(s) for how to maybe use AI/LLM. Cheers Markbassett (talk) 13:46, 5 November 2025 (UTC)[reply]
I used an AI image generator to add a funny image to WP:OMGWTF (I did manually manipulate it in Photoshop though, so it technically isn't 100% AI). But AI images in article-space are a non-starter for obvious reasons. Other than those edge cases, AI/LLM content should generally be avoided and care should be taken with using an AI/LLM in article research as such use is very easy to misread or misjudge (or simply be wrong) and can lead to undue influence in article editing. It's simply not worth the risk.
Like for example, I just saw an esteemed editor with a long history on the project repeatedly use LLM in their edits, and despite protests claiming they were checking the output, many errors were found and ultimately that editor was indef blocked. —Locke Coletcb 17:57, 5 November 2025 (UTC)[reply]
OK, thanks. That's another, and seems within WP:AIIMAGES - that the AI guidance is still in the works is obvious there at its discussion link. Sorry to hear about the editor getting an indef ban; that seems symptomatic of excesses in the topic area and a lack of a clear undo. I think it should aim more to be proportionate and to provide a positive good. Cheers Markbassett (talk) 07:07, 7 November 2025 (UTC)[reply]
@Markbassett See T360489 and my userspace. Polygnotus (talk) 21:32, 5 November 2025 (UTC)[reply]
Thanks. Much more detail and tech than my little brainstorming in this thread.
I think T360489 largely lists potential candidates for officially-accepted WP bot services.
(Though I'm thinking we may often already know better before an AI points it out but have difficulty doing right. Like me and my diet or weight ;-) ) Cheers Markbassett (talk) 07:23, 7 November 2025 (UTC)[reply]
@Markbassett I am experimenting with having an LLM provide feedback on articles, write edit summaries, and help verify whether claims are supported by the source provided. In all cases the human makes the decision; the LLM only provides information. Polygnotus (talk) 09:53, 7 November 2025 (UTC)[reply]
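For context, a minimal sketch of the claim-checking step described above (Python for illustration; the function name and prompt wording are assumptions, and the source excerpt is assumed to be available as plain text). The model's answer is advisory only; the editor makes the final call.

def build_verification_prompt(claim: str, source_excerpt: str) -> str:
    # Ask the model whether the cited source excerpt supports the claim.
    # The answer is advisory; a human reads the explanation and decides.
    return (
        "Does the following source excerpt support the claim? "
        "Answer 'supported', 'not supported', or 'unclear', then explain briefly.\n\n"
        f"Claim: {claim}\n\n"
        f"Source excerpt: {source_excerpt}"
    )

# The resulting prompt would be sent to an LLM (e.g. via the API call sketched
# earlier on this page); the editor then decides whether the citation stays.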

LLM comment?


An editor posted an RFC review request at WP:AN (specifically, Wikipedia:Administrators' noticeboard#Request for Review of RfC Closure: Talk:Floppy disk). A number of editors, myself included, believe the entire request to be LLM/AI-written. The request was hatted and eventually closed (with a number of admins endorsing the closure). The request was made again, and in their most recent comment they claim they wrote the request.

Could we get some experienced eyes to take a second look and help determine if the editor is writing their own comments or utilizing AI? Thanks! —Locke Coletcb 05:23, 31 October 2025 (UTC)[reply]

I'm perhaps out of turn here since I was involved in that RFC from random looking at RFC/A. But in the hope this helps: it doesn't seem to be LLM output, and it's not something to chase anyway.
I think it was *not* LLM output.
(a) the Tom94022 challenge is present in his Sandbox2 across 12 successive edits over 3 days, with enough time between edits that manual reviewing and manual changing seems plausible, and
(b) Tom94022 has previously demonstrated a long-winded pain-in-the-ass nature and deep involvement in tech topics widely, as visible from his Talk page and edit history. Ding him for BLUDGEON perhaps, but I don't think the whole thing was just popped out of an LLM.
And I think that the LLM question is really irrelevant -- because, in short, the closer put forward only two lines fairly generically declaring "consensus", and after talking with ed17, an editor put forward an excessively long request asking for a review of the strength of arguments in light of policy per WP:CONSENSUS. If someone would just do that, it would seem a lot less effort than the effort already spent on long-winded rejections and commentary. Markbassett (talk) 21:11, 31 October 2025 (UTC)[reply]
"... the Tom94022 challenge is present in his Sandbox2 at 12 successive edits across 3 days with enough time between edits that seems plausible for manual reviewing and manual changing" - Inspection of the edit history tells another tale: as FaviFake noted, Special:Diff/1318588111 shows Tom inserting text with markdown syntax, which is a telltale sign of LLM usage. The other thing I'm noticing in that edit is that the inserted text has line-breaks in it, which is also consistent with LLM use. I'd like someone uninvolved (but ideally good at spotting LLM use) to take a look at the initial HATGPT'd request and either opine here or at the still-open WP:AN discussion, as Tom is insisting he wrote the initial request. And what's more distressing to me is that his two most recent replies also show signs of LLM use...
And this is more user-conduct related (and likely something for AN/I if this keeps up), but if he continues to insist he wrote it, then there's a WP:CIR issue here, because WP:V explicitly says the opposite of what he's claiming in his request ("Reliably sourced information must be included. Removal requires lack of sources, not editorial preference." vs. "While information must be verifiable for inclusion in an article, not all verifiable information must be included. Consensus may determine that inclusion of a verifiable fact or claim does not improve an article, and other policies may indicate that the material is inappropriate. Such information should be omitted or presented instead in a different article."). If he admits it was LLM-generated, then we can at least give him the benefit of the doubt that it was an AI hallucination.
@FaviFake and TonySt: Pinging the other two editors who noted LLM use. —Locke Coletcb 16:29, 1 November 2025 (UTC)[reply]
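As an illustration of the kind of telltale described above (markdown formatting pasted into wikitext), here is a minimal sketch of such a heuristic check, in Python for illustration; the patterns are assumptions, and markdown-style formatting is only a hint of possible LLM copy-paste, not proof.

import re

# Markdown-style formatting in text added to a wiki page (e.g. **bold**, "# "
# headings, "- " bullets) is a hint, not proof, that it was pasted from an LLM
# chat window. The patterns below are illustrative only.
MARKDOWN_HINTS = [
    r"\*\*[^*\n]+\*\*",  # **bold** instead of '''bold'''
    r"^#{1,6} ",         # "# Heading" instead of == Heading ==
    r"^- ",              # "- item" instead of "* item"
]

def looks_like_markdown(inserted_text: str) -> bool:
    return any(re.search(p, inserted_text, flags=re.MULTILINE) for p in MARKDOWN_HINTS)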
I'll quote the relevant comments of mine:

Comment: [...] The "smart" quotes, large number of headings, each numbered, the incorrect capitalisation of every heading and bolded text, the extensive amount of boldface usage, the long bulleted lists using em dashes, shortcuts not being linked and instead being italicised, the incorrect spacing around slashes, the unnecessarily detailed and frequent references, the repetition of the words "as viewed through the lens of Wikipedia policy" three times, the unnecessary table, and the eerily uncommon words used make me think this is AI-generated.

Wikipedia:Administrators' noticeboard § c-FaviFake-20251027192200-Locke Cole-20251027190400

See Special:Diff/1318491635 and the rest of their sandbox's history. [...] Here's an edit in which they pasted markdown formatting, and another edit which created a slew of weird, empty references. I'd say this closure review was not written by them.

Wikipedia:Administrators' noticeboard § c-FaviFake-20251027193000-SnowFire-20251027192700 and Wikipedia:Administrators' noticeboard § c-FaviFake-20251027194300-Locke Cole-20251027193000
For these reasons, I indeed believe there is an extremely high chance that all three of their comments were mostly generated using AI. FaviFake (talk) 16:34, 1 November 2025 (UTC)[reply]
  • Again, I don’t think that a human filing a request should be overlooked in favour of a much harder/longer question, but is the question that interests folks whether the editor made ‘enough’ edits to the believed-LLM text that they avoid the policy of WP:MEATBOT? Cheers Markbassett (talk) 14:16, 2 November 2025 (UTC)[reply]
    What? FaviFake (talk) 14:18, 2 November 2025 (UTC)[reply]
    I’m asking, then, why (or if) this is a topic involved with a request about an RFC close appeal. Perhaps it is just a side interest of Locke Cole, who believes the “entire request to be LLM/AI-written.”
    My note of Tom’s 12 successive edits had Locke noting edit 811, where “text inserted has line-breaks” and “text inserted has markdown syntax”. That seems reassuring against the “entire request to be LLM/AI-written” belief, but still suggests that part of the edits came from an LLM or other paste. (And adding personal distress that two AN replies are thought to also have ‘signs’ of LLM usage.) So… interesting, but what is the WP policy in question here, and does it relate to the RFC review request? Is it whether there were ‘enough’ edits that they avoid the policy of WP:MEATBOT? Something else? It still seems a long, unnecessary and yucky detour rather than just doing a bit more than a two-line close and/or just saying that nobody desires to do the optional review. Cheers Markbassett (talk) 03:16, 3 November 2025 (UTC)[reply]
    what is the WP policy in question here
    There's currently a discussion waiting to be closed at WP:RFCL about this exact issue. Policies may change any day now. FaviFake (talk) 15:26, 3 November 2025 (UTC)[reply]
    ???? Not seeing any relevant discussion at RFCL -- this did not say a location there, and still no policy is named as being in question here. The original RFC close and the blowing-off of the review request both ended a couple of days ago, no review 'in light of' any policies was done, and all that happened before this thread even started, so I'm not seeing what this discussion is about. Still thinking the "entire request to be LLM/AI-written" claim sounded not so, and personally I still think it irrelevant - a human entered a request for review; it's the RFC content that's up for review, not the request up for inclusion. Anyway, I think I'm done here -- over and out. Cheers Markbassett (talk) 00:54, 4 November 2025 (UTC)[reply]
    I was referring to Wikipedia:Closure requests#Wikipedia:Village pump (policy)/Archive 205#LLM/AI generated proposals? FaviFake (talk) 05:08, 4 November 2025 (UTC)[reply]
    Umm ??? I can't see the relevance. The Wikipedia:Administrators' noticeboard#Request for Review of RfC Closure: Talk:Floppy disk and the belief "that the entire request was LLM/AI-written" have led to a discussion of LLMs, like that other request two months ago about a requested move and whether LLM input is becoming undetectable. I maybe need it spelled out more, but I am not seeing a particular policy point named in that for this discussion, nor anything here stating the policy point at stake. The TALK here asked for a look at the Floppy disk discussion and got evidence counterindicating the belief that the Floppy disk appeal's "entire request was LLM/AI-written", so the relevant comments from that other discussion seem to be that WP:HATGPT does not apply to this and that "We really should not care whether it is or isn't AI-generated, that's just wasting everybody's time trying to determine something that is irrelevant."
    Somehow I don't think it is a feasible option, nor the intent, to unhat and restart the request now that the HATGPT rationale is counter-indicated -- nor to offer the editor in question some apology about the hatting -- so there seems to be no action under consideration here.
    Again - it seems far simpler, faster, easier, and more respectful of WP:AGF to note that a human made a request for a close review, so either review the close or choose not to review the close. Only the text at the RFC and the close is up for review in that. It even seems inappropriate to have content in the request affect the outcome, because that's not part of what the RFC closer had to work with, it is not what the request asked to be reviewed, and additional post-close evidence or single-voice arguing should not be factors in such a review. Maybe it would be more useful to alter the WP:CLOSECHALLENGE guideline to limit the length of the request to briefly identifying the RFC and the concern. Cheers Markbassett (talk) 15:34, 4 November 2025 (UTC)[reply]
Similar discussion:  Courtesy link: Wikipedia talk:Talk page guidelines § c-Markbassett-20251105143100-Chipmunkdavis-20250905092100. FaviFake (talk) 16:49, 5 November 2025 (UTC)[reply]

Should vs. must?


I had assumed that LLMDISCLOSE was mandatory, but I now realize it's optional. That seems totally broken, and so I'm wondering if anyone here could help enlighten me as to why it was written that way. I ask this with an eye to making this conversation a pre-RfC to make LLM disclosure mandatory. CaptainEek Edits Ho Cap'n! 19:37, 2 November 2025 (UTC)[reply]

Probably to offer some flexibility, as is the norm. Even COI disclosure is only a "should"; "must" only comes into play with PAID, and that is mandated by the TOU. Maybe not as important to be precise when wording an essay as opposed to PAGs, but still usually best to try. 184.152.65.118 (talk) 22:01, 2 November 2025 (UTC)[reply]
Previous discussion about making it mandatory is at Wikipedia:Village pump (policy)/Archive 205#Alternative approach: make transparency policy. --Tryptofish (talk) 00:52, 3 November 2025 (UTC)[reply]
@Tryptofish that seems like a rather decent pre-RfC. I see the point about a technical solution, but I think that a policy solution would probably be necessary first to both be a stopgap and to spur on the technical solution. It seems like the path forward is to propose that LLMDISCLOSE be made mandatory and elevated to policy? You obviously have way more experience in this arena so I'll happily defer to you, just don't want this to wither on the vine :) CaptainEek Edits Ho Cap'n! 17:27, 3 November 2025 (UTC)[reply]
Thanks. I'm going to give a more detailed reply below, but it applies to both what you are saying and what isaacl is saying. --Tryptofish (talk) 21:20, 3 November 2025 (UTC)[reply]
I submitted a closure request for that village pump discussion a week and a half ago. — Newslinger talk 20:41, 3 November 2025 (UTC)[reply]
Thanks for that, too. Pity it got archived without a consensus being found. --Tryptofish (talk) 21:20, 3 November 2025 (UTC)[reply]
I'm mildly hopeful that LLMDISCLOSE may make the cut; the rest was my being unaware of the prior RFC and basically re-asking for something the community had already supported. But even if LLMDISCLOSE doesn't make policy, I think tackling this one piece at a time will prove more useful than trying to push one grand proposal (unless it's easily broken into pieces and put up as a Watchlist notice, etc., for a month). —Locke Coletcb 23:32, 3 November 2025 (UTC)[reply]
Generally speaking, the community doesn't object to all uses of programs to assist with writing. The concerns are about using programs that generate original content, with greater detail than any human input used to trigger the generation. Thus in my view, any guidance should focus on how the program is being used, rather than the underlying technology (which in any case might not be readily apparent to the end user).
I think from a holistic perspective, a key question is what happens next if disclosure is required? If it's just a precursor to removing the text, then maybe we should be banning generated text from mainspace instead (as with a disclosure requirement, of course, this only works for those who read and follow guidance). If it's to queue up edits for examination, then what's the best format for disclosure that assists with this (a template used on the talk page, perhaps?), and do we need to organize more volunteers to manage and process the queue? Can we build more tools to help with analyzing edits (including, presumably, the vast majority of problematic edits from editors who won't comply with any relevant guidance)? isaacl (talk) 17:30, 3 November 2025 (UTC)[reply]
I've been paying very close attention to as many community discussions about these issues as I can find, and yes, we definitely need to find a carefully calibrated middle ground between no regulation and too much regulation. It's very clear to me that many members of the editing community don't want a complete ban on LLM use, and want to be careful that it doesn't become something that editors might weaponize against one another, and from my own observation, I've come to agree that this is a risk that we need to avoid. At the same time, by now, most of us have seen use of LLM-generated content that is obviously disruptive.
I've been working (I promise I have!) on trying to pull together a draft proposal for a potential policy (a full policy page, including, but going beyond, disclosure), that others could then workshop further before putting it to the community for an RfC. It's not easy, and I've been swamped with other stuff, both on and off site. So I want to promise CaptainEek that I won't let it wither on the vine – but I expect that it will take a while before it ripens on the vine. --Tryptofish (talk) 21:20, 3 November 2025 (UTC)[reply]
User:CaptainEek - Why disclosure was written that way: the archives show the October 2023 discussion about an RFC which made clear that this page was not going to be promoted to either policy or guideline, so it was relabeled as an essay and, to make the phrasing more essay-like, the "must"s were replaced with "should"s to "better reflect its status as an essay". See also the discussion about LLMDISCLOSE lacking clarity on how to disclose and editors being incentivised to hide LLM use completely instead of disclosing it. Cheers Markbassett (talk) 05:30, 4 November 2025 (UTC)[reply]