Talk:Open-source artificial intelligence
This article is rated Start-class on Wikipedia's content assessment scale. It is of interest to the WikiProjects Artificial Intelligence, Computer science, Computing, Software, and Free and open-source software.
Individuals with a conflict of interest, particularly those representing the subject of the article, are strongly advised not to directly edit the article. See Wikipedia:Conflict of interest. You may request corrections or suggest content here on the Talk page for independent editors to review, or contact us if the issue is urgent.
Open Source AI Status
With RC1 of the OSI's Open Source AI definition set to come out soon, I think we should modify this page to be in line with their definition. I still think documenting models such as Llama and Mistral is worthwhile, but they are decidedly not Open Source. I am new to editing Wikipedia, so I am not yet sure how to handle the "category" of Open Source artificial intelligence including these models, but I can propose changes and figure out the details later.
Below are my proposed edits:
Change this page to "Open release artificial intelligence" or some other term like that; I am open to suggestions and would encourage them. I know that the term "Open weights" is used more often, but it does not encompass all models that we are interested in here, so I propose "Open release" as the alternative for now.
On this page, we would link to the actual "Open Source artificial intelligence" page and note the distinction between the two - this would look like an expansion of the point currently mentioned under the Llama subheading. Note that all models currently included would still remain on the "Open release artificial intelligence" page, and more could be added as well, but only models which meet the OSI's Open Source AI definition would be included on the "Open Source artificial intelligence" page.
I have some other suggestions for improvement of this page, but I'll stick to this one until it is resolved. JacobHaimes (talk) 02:43, 13 October 2024 (UTC)
- The title "Open-release artificial intelligence" would indeed include more of the models interesting to cover, but I'm not sure the term "open-release" is widely used. Alenoach (talk) 01:32, 9 December 2024 (UTC)
- I've amended all of the remaining instances of this article describing Llama as open-source in Wikipedia's voice to attribute such descriptions to Meta. The Llama (language model) article currently describes Llama as source-available (which is undisputed) and not open-source (which is disputed and would conflict with our Open-source software article). — Newslinger talk 03:40, 15 December 2024 (UTC)
- Thanks folks. I've added some material on both OSAID and MOF and tried to make the history section at least conform to that terminology, clarifying open-source vs open model etc. Still needs significant cleanup but I think keeping this as one page makes sense so all the confusion can be handled here. ★NealMcB★ (talk) 05:07, 25 July 2025 (UTC)
Wiki Education assignment: WRIT 340 for Engineers - Fall 2024 -MW 330-450
This article was the subject of a Wiki Education Foundation-supported course assignment, between 26 August 2024 and 6 December 2024. Further details are available on the course page. Student editor(s): Nathanhuh, Hridayg2004, Potatochippy, DiscountKangaroo, BennettHarp (article contribs).
— Assignment last updated by 1namesake1 (talk) 23:30, 28 October 2024 (UTC)
Wiki Education assignment: Signals, Data, and Equity
This article was the subject of a Wiki Education Foundation-supported course assignment, between 26 August 2024 and 13 December 2024. Further details are available on the course page. Student editor(s): Phzing (article contribs).
— Assignment last updated by Phzing (talk) 21:39, 22 November 2024 (UTC)
Some Proposed Contributions
I think this page has some really strong content and could benefit from the following changes:
1. A new Equity and Ethical Implications section. Discuss how open-source AI democratizes access to technology but raises ethical concerns, including bias, misuse, and the need for ethical guidelines.
2. A more structured 'Concerns' section. Specifically, I'm proposing the addition of some new content and the division of existing content into the following categories:
- Security Risks (Vulnerabilities in Open-source Models, Lack of Security Updates)
- Equity and Ethical Implications (Bias and Discrimination, Lack of Ethical Guidelines)
- Regulatory and Legal Concerns (Unclear Licensing and Usage Restrictions, Compliance with Privacy Laws)
- Potential for Harmful Applications (Weaponization and Bioterrorism, AI for Manipulation and Misinformation)
- Quality and Performance Concerns (Underperformance Compared to Closed-source Models, Lack of Robustness in Real-world Scenarios)
I'm also considering adding Challenges in Open-source AI and Benefits sections, and making updates to the Applications section (to include some new tools), but I think the two changes proposed above are of higher priority.
Happy to incorporate any feedback and also am new to editing Wikipedia, so apologies for anything unconventional. Phzing (talk) 23:06, 22 November 2024 (UTC)
- Your additions look good. I think the article should also introduce the notion of "open-weight". Alenoach (talk) 01:43, 9 December 2024 (UTC)
Filling in the timeline of early open source models.
Part of an edit requested by an editor with a conflict of interest has been implemented.
The current section titled Key milestones in open-source AI (2020s–Present) > Companies and models does a poor job of articulating its subject matter. In fact, the majority of the text is dedicated to PyTorch and the Linux Foundation. While PyTorch and LF are certainly major players in the open source ecosystem, they are not examples of major milestones in the development of open-source AI models. PyTorch is a very popular machine learning framework, but neither PyTorch nor LF has ever trained a notable open source AI model. Speaking as both an expert in the field and someone who has tried and failed to find independent notable coverage of LF AI&D, I especially believe that they should be excluded. PyTorch currently doesn't seem to make much sense to include, but I'm open to preserving the associated text in another part of the article.
I have a significant COI for this section, as I was personally involved in the training of several of the models discussed (GPT-Neo, GPT-NeoX-20B, Pythia, RWKV, and BLOOM), was a co-lead of the BigScience Research Workshop, and am currently the Executive Director of EleutherAI.
My edit is primarily focused on articulating the timeline of major open-source AI models and open-weight models from the release of GPT-3 through 2023. I've tried to find independent secondary sources that meet the notability guidelines for all claims below, but haven't been able to do so to my satisfaction due to the minimal coverage of the space in the media. I can provide significant primary sources for these claims if that's desired, but I'm choosing not to do so, in part to make assessing the quality of the secondary sources easier. I believe that these edits significantly improve the article, but that further details on subsequent developments should also be added.
Key milestones in open-source AI (2020s–Present)
Companies and models
The 2020s saw the continued growth and maturation of open-source AI. Companies and research organizations began to release large-scale pre-trained models to the public, which led to a boom in both commercial and academic applications of AI. Notably, Hugging Face, a company focused on NLP, became a hub for the development and distribution of AI models, including open-source versions of transformers like GPT-2 and BERT.[1]
With the announcement of GPT-2, OpenAI originally planned to keep the source code of their models private, citing concerns about malicious applications.[2] After OpenAI faced public backlash, however, it released the source code for GPT-2 on GitHub three months after its release.[2] OpenAI has not publicly released the source code or pretrained weights for the GPT-3 or GPT-4 models, though their functionalities can be integrated by developers through the OpenAI API.[3][4]
The rise of large language models (LLMs) and generative AI, such as OpenAI's GPT-3 (2020), further propelled the demand for open-source AI frameworks.[5][6] These models have been used in a variety of applications, including chatbots, content creation, and code generation, demonstrating the broad capabilities of AI systems.[7] At the time of GPT-3's release, GPT-2 was still the most powerful open source language model in the world, spurring EleutherAI to train and release GPT-Neo[8] and GPT-J[9][10] in 2021, each of which was the most powerful open-source GPT-style model in the world at the time of its release. In February 2022, EleutherAI released GPT-NeoX-20B, taking back the title of most powerful open source language model in the world from Meta, whose FairSeq Dense 13B model had surpassed GPT-J at the end of 2021.[11] 2022 also saw the rise of larger and more powerful models under various non-open-source licenses, including Meta's OPT[12] and Galactica,[13][14] the BigScience Research Workshop's BLOOM,[15][16] and Tsinghua University's GLM.
In 2023, open-source and open-weight models both exploded in popularity, with dozens of models of each type released by a wide variety of actors. Particularly notable among these were Llama 1 and 2, MosaicML's MPT (a LLaMA-quality model with an open-source license),[17][18] and Mistral AI's Mistral and Mixtral models. The first models trained by start-ups that would grow to become major players, such as DeepSeek, Stability AI, and Alibaba (Qwen), were trained in this period, as was RWKV, the first non-transformer architecture to see success at the 7-billion-parameter scale.[19]
In 2024, Meta released a collection of large AI models, including Llama 3.1 405B, comparable to the most advanced closed-source models.[20] The company claimed its approach to AI would be open-source, differing from other major tech companies.[20] The Open Source Initiative and others stated that Llama is not open-source, despite Meta describing it as such, because Llama's software license prohibits it from being used for some purposes.[21][22][23]
The DeepSeek R1 reasoning model was released as an open-source project on January 20, 2025.[24]
Stellaathena (talk) 07:43, 21 July 2025 (UTC)
- In 2023 open-source and open-weight models both exploded in popularity with dozens of models of each type being released by a wide variety of actors. Particularly notable among these were Llama 1 and 2, MosaicML's MPT (a LLaMA-quality model with an open-source license) [25][26], and Mistral AI's Mistral and Mixtral models. The first models trained by start-ups that would grow to become major players such as DeepSeek, Stability AI, and Alibaba (Qwen) were trained in this time, as was RWKV (the first non-transformer architecture to see success at the 7 billion parameter scale) [27]
- I've found across the entire article a crazy amount of promotional language, and quite a bit in this request too. The red section I've highlighted is quite promotional and has been declined. The orange section doesn't seem to have anything to do with open-source AI? Might be best placed in the general "History of AI" article. Encoded Talk 💬 07:29, 11 September 2025 (UTC)
- I was, to some extent, trying to tone-match the existing article. In retrospect that was a mistake and I'm glad you turned down the red text. I do want to flag for posterity that the first words I wrote were "At the time of GPT-3's release GPT-2 was still..." and that all prior text was stuff I took from the existing article.
- Regarding the orange text, DeepSeek, Stable Diffusion, Qwen, and RWKV are prominent open source AI models today. I think that the best form of this article talks about each of them (though perhaps not here), and so the idea was to seed discussion of their initial emergence even if it's pre-popularity. Looking at this again, that doesn't make a whole lot of sense if that seeding doesn't actually pay off with subsequent discussion of the models. I said "I believe that these edits significantly improve the article, but that further details on subsequent developments should also be added" because I had hoped this would serve as a starting point rather than something adopted largely as-is. That's obviously not what happened (and may have been an unreasonable expectation on my part), so for now I'll just say that if/when this is further extended to cover 2024/2025 in more detail it may make more sense to include. Stellaathena (talk) 14:51, 16 September 2025 (UTC)
- ^ Kakkar, Yuvraj (2024-01-23). "Hugging Face 🤗: Revolutionizing AI Collaboration in the Machine Learning Community". Medium. Retrieved 2024-11-25.
- ^ a b Xiang, Chloe (2023-02-28). "OpenAI Is Now Everything It Promised Not to Be: Corporate, Closed-Source, and For-Profit". VICE. Retrieved 2024-11-14.
- ^ "OpenAI is giving Microsoft exclusive access to its GPT-3 language model". MIT Technology Review. Archived from the original on 2021-02-05. Retrieved 2024-12-08.
- ^ "API platform". openai.com. Retrieved 2024-12-08.
- ^ Daigle, Kyle (2023-11-08). "Octoverse: The state of open source and rise of AI in 2023". The GitHub Blog. Archived from the original on 2025-01-21. Retrieved 2024-11-24.
- ^ "GPT-3 powers the next generation of apps". 29 March 2024.
- ^ "Generative AI vs. Large Language Models (LLMs): What's the Difference?". appian.com. Retrieved 2024-11-25.
- ^ "GPT-3's free alternative GPT-Neo is something to be excited about". VentureBeat. 2021-05-15. Archived from the original on 9 March 2023. Retrieved 2023-04-14.
- ^ "GPT-3's free alternative GPT-Neo is something to be excited about". VentureBeat. 2021-05-15. Archived from the original on 9 March 2023. Retrieved 2023-04-14.
- ^ "Why Release a Large Language Model?". EleutherAI. 2021-06-02.
- ^ "EleutherAI: When OpenAI Isn't Open Enough". IEEE Spectrum. 2021-06-02.
- ^ Heaven, Will (2022-05-03). "Meta has built a massive new language AI—and it's giving it away for free". MIT Technology Review. Retrieved 2023-12-26.
- ^ Heaven, Will (2022-11-18). "Why Meta's latest large language model survived only three days online". MIT Technology Review. Retrieved 2023-12-26.
- ^ Goldman, Sharon (2022-11-18). "What Meta learned from Galactica, the doomed model launched two weeks before ChatGPT". VentureBeat. Retrieved 2025-07-21.
- ^ Heikkilä, Melissa (2022-07-12). "BLOOM: Inside the radical new project to democratize AI". MIT Technology Review. Retrieved 2023-12-26.
- ^ "Release of largest trained open-science multilingual language model ever". French National Centre for Scientific Research. 2022-07-12. Retrieved 2023-12-26.
- ^ Nunez, Michael (2023-06-22). "MosaicML challenges OpenAI with its new open-source language model". VentureBeat. Retrieved 2025-07-21.
- ^ Chen, Joanne (2023-07-19). "MosaicML launches MPT-7B-8K, a 7B-parameter open-source LLM with 8k context length". VentureBeat. Retrieved 2025-07-21.
- ^ Dey, Victor (2024-07-23). "What's Next After Transformers". Forbes. Retrieved 2025-07-21.
- ^ a b Mirjalili, Seyedali (2024-08-01). "Meta just launched the largest 'open' AI model in history. Here's why it matters". The Conversation. Retrieved 2024-11-14.
- ^ Waters, Richard (2024-10-17). "Meta under fire for 'polluting' open-source". Financial Times. Retrieved 2024-11-14.
- ^ Edwards, Benj (18 July 2023). "Meta launches Llama 2, a source-available AI model that allows commercial applications". Ars Technica. Archived from the original on 7 November 2023. Retrieved 14 December 2024.
- ^ "Meta offers Llama AI to US government for national security". CIO. 5 November 2024. Archived from the original on 14 December 2024. Retrieved 14 December 2024.
- ^ "How a top Chinese AI model overcame US sanctions". Archived from the original on 2025-01-25. Retrieved 2025-02-03.
- ^ Nunez, Michael (2023-06-22). "MosaicML challenges OpenAI with its new open-source language model". VentureBeat. Retrieved 2025-07-21.
- ^ Chen, Joanne (2023-07-19). "MosaicML launches MPT-7B-8K, a 7B-parameter open-source LLM with 8k context length". VentureBeat. Retrieved 2025-07-21.
- ^ Dey, Victor (2024-07-23). "What's Next After Transformers". Forbes. Retrieved 2025-07-21.
- Thanks for your contributions. I've begun to cover EleutherAI and some others after revamping the terminology as described above. More work is needed and more input is welcome! ★NealMcB★ (talk) 05:09, 25 July 2025 (UTC)
Partly done: I've implemented most of the request. Some language was tweaked. The first paragraph isn't covered by a reliable source. Encoded Talk 💬 07:28, 11 September 2025 (UTC)
- As I mentioned in my other reply, the first words I wrote were "At the time of GPT-3's release GPT-2 was still..."; all prior text was stuff I took from the existing article. If there's a recommended practice for clearly delineating that distinction, I would love to do better with future suggested edits!
- One thing I regret writing (but which made it into your edit) was the sentence
- In February 2022 EleutherAI released GPT-NeoX-20B, taking back the title of most powerful open source language model in the world from Meta whose FairSeq Dense 13B model had surpassed GPT-J at the end of 2021.
- That is something that I say a lot when talking about the history of EleutherAI, but the way it de-emphasizes Meta's model doesn't feel appropriate for Wikipedia. The counter-argument would be that FairSeq Dense didn't have a big impact in large part because Meta minimally promoted its release / it wasn't widely known about at the time. That's an assessment that I am comfortable making as an SME, but I don't know where I would source citations for this that meet Wikipedia's editorial standards. But the most important thing is that it's inappropriate for me to recommend wording that minimizes the importance of / narratively de-centers work done by others. I think it would be more appropriate to say something like
- At the time of GPT-3's release, GPT-2 was still the most powerful open source language model in the world, spurring EleutherAI to release GPT-Neo and GPT-J in 2021. In December, Meta released FairSeq Dense 13B, which surpassed GPT-Neo and GPT-J in both size and quality. At the same time, EleutherAI was in the process of training GPT-NeoX-20B, which became the largest and most powerful open source LLM when it was released in February 2022.
- It's a minor note, but I changed "train and release GPT-Neo and GPT-J in 2021" to "release GPT-Neo and GPT-J in 2021" because GPT-Neo was actually trained in 2020 and was just released much later. I don't think that needs to be talked about, but this language is more precise.
- To record my thinking with respect to when I stopped itemizing every new biggest model: There is an early cluster of models that are noteworthy because they were the first open LLMs post-GPT-2. I would put GPT-Neo, GPT-J, FairSeq Dense, and GPT-NeoX-20B on this list, along with the (non-openly licensed but publicly released) OPT, YaLM, and BLOOM. Qualitatively, I would say it feels like something changed in the summer of 2022, when the idea of releasing models went from a unique thing that only a few orgs were doing to something mainstream. Empirically, I have a spreadsheet that I used to keep updated with new model releases, and it looks like a model as good as or better than FairSeq Dense was released open source or "open weight" (the community term for models with publicly released weights but not OSI-approved licenses) every month from September 2022 through December 2023 and beyond. I'm not going to say that this is the correct way to think about / present this history, but I want to be transparent about why I wrote it the way I did. Stellaathena (talk) 15:39, 16 September 2025 (UTC)
Article content
Hi, I'm just popping on here to let other editors know that I've gone through and ripped out quite a bit of content in this article. There was an absolutely insane amount of promotional language, as well as content that was completely irrelevant to open-source AI and should've been placed in the History of AI article. To be honest, it seemed to me more like a puffed-up promotional piece for AI companies to promote their open source stuff, so I am concerned about potential undisclosed COIs editing here.
I've rewritten the history and applications sections, but work will need to be done on the benefits, concerns, and frameworks sections. I've tagged the article as such and added it to more AI areas across the wiki to hopefully get more eyes on it as well. Encoded Talk 💬 07:43, 11 September 2025 (UTC)
- The current Frameworks section conflates two different meanings of the term "framework." The "PyTorch deep learning framework" is a set of libraries and a methodological approach to implementing neural networks. But the previous several paragraphs are about frameworks for thinking about what it means to release a model openly.
- There are three things that stand out strongly to me about this section:
- 1. The language about "neutral platform" or "neutral governance" is a huge red flag; that's language that the Linux Foundation uses for self-promotion but which I virtually never hear anyone else using.
- 2. While it isn't obvious at first glance, this section is entirely and exclusively about the Linux Foundation. The discussion of the OSAID is included only insofar as it talks about how LF's Model Openness Framework influenced the OSAID, and the PyTorch Foundation is a sub-foundation of the Linux Foundation.
- 3. There seems to be a recurring issue with NPOV and paid contributions to the Linux Foundation page.
- I strongly recommend deleting the entire section. I would do so myself, but I'm hesitant because I'm trying to be cautious about COI policies. Stellaathena (talk) 16:10, 16 September 2025 (UTC)
- I think it is important, especially in a time when a bunch of truly promotional material on the web is claiming that open-weight == open source, that we cover the terms accurately. True open source is related to open science and reproducibility, not just sharing models. I think in doing that it helps to go back to the roots of AI and note that it indeed used to have a lot of truly open-source material. Yes, that is also the history of AI, but the fact that it was actually open-source, vs open-weight, is what made it possible for a community to help grow it. I'm sure there are more good sources to make that point, and will look for them. ★NealMcB★ (talk) 04:03, 9 October 2025 (UTC)
- The Open Source Initiative's definition [1] and comparison of open weight vs open source [2] are a good place to start.
- One thing I want to emphasize is that there are two separate conversations happening, one within the open source community and one by company PR departments. Here are three openness standards, listed in increasing order of strictness:
- 1. The model weights can be downloaded and the model can be run locally subject to a license.
- 2. The model weights and the code to train, finetune, and run the model can be obtained under an OSI-approved license.
- 3. The model weights and the code to train, finetune, and run the model can be obtained under an OSI-approved license. All training data is released under terms consistent with the Open Knowledge Foundation's Open Definition [3].
- You can think of 1 and 3 as far extremes. Nobody would call a model that doesn't meet 1 "open" and 3 is as maximally open as possible. 2 is a notable middle point.
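- To make the tiers concrete for other editors, here is a minimal sketch in Python of how the three standards nest as a checklist. All names here are hypothetical illustrations of mine, not any real tooling or formal definition:

from dataclasses import dataclass

@dataclass
class ModelRelease:
    # Facts about a release, mapped to the three tiers described above.
    weights_downloadable: bool   # weights can be downloaded and run locally (tier 1)
    code_osi_licensed: bool      # train/finetune/run code under an OSI-approved license (tier 2)
    weights_osi_licensed: bool   # weights themselves under an OSI-approved license (tier 2)
    data_open_definition: bool   # all training data meets the Open Definition (tier 3)

def openness_tier(release: ModelRelease) -> int:
    """Return the strictest tier (1-3) a release satisfies, or 0 if none."""
    if not release.weights_downloadable:
        return 0
    if release.code_osi_licensed and release.weights_osi_licensed:
        return 3 if release.data_open_definition else 2
    return 1

- The point of the sketch is just that the tiers nest: a release meeting #3 also meets #1 and #2, while the releases discussed below sit between #1 and #2.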
- Many companies (most prominently Meta, but they're far from the only one) want to call models that are more closed than 2 (aka between 1 and 2) "open" when it suits them. This has been widely rejected by the open source community; see for example criticism from the OSI [4][5], myself [6][7], IEEE Spectrum [8], an opinion piece published in the Register that quotes people from several notable open source orgs [9], IEEE Spectrum [10], etc. [11][12]. I am unaware of any reliable source within the open source community that is willing to call a model that doesn't meet standard 2 "open source." The debate over whether something that doesn't meet #2 is "open source" is a debate external to the OSS community.
- Within the OSS community there is a debate about where between #2 and #3 we should draw the line. As far as I am aware, the only model to meet #3 is Comma v0.1, a model recently trained by researchers at U Toronto, EleutherAI, and several other orgs (primary sources: [13] [14] [15]; secondary sources: [16] [17] [18]). This is an extremely strict definition, and few advocate for it being the standard, though some do. The OSI definition is one place to draw the line between #2 and #3, and one that I expect to carry the day for various reasons (and support), but there isn't consensus within the community about this. For various positions on this, see [19] [20] [21] [22] [23] [24].
- As far as I am aware, I am the primary person who coined / promoted the term "open weight." I started using it to try to get people to use some other term to refer to Meta's models. Descriptively, I would say that any model that meets #1 is frequently called "open weight." Prescriptively, I wish the term were reserved for models that meet #2. There are multiple attempts to formalize a definition of open weight, such as this one by Heather Meeker [25]. The OSI claims to endorse it [26], but their description of the standard is so incorrect that I don't think it's fair to consider it an endorsement of the actual definition.
- I hope this is helpful in laying out the conversation, since I know a lot of the coverage (esp. by news outlets) is written by people who don't really understand it. I'm happy to assist in finding additional reliable sources for any points desired. Stellaathena (talk) 04:27, 4 November 2025 (UTC)