Pathways Language Model

PaLM
Developer	Google AI
Predecessor	LaMDA
Successor	Google Gemini
Language	English
Type	Large language model
Website	ai.google/discover/palm2/

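The Pathways Language Model (PaLM) is a 540-billion-parameter, dense, decoder-only, transformer-based large language model (LLM) developed by Google AI^[1]. Researchers also trained smaller versions of PaLM, with 8 billion and 62 billion parameters, to test the effects of model scale^[2].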

Model

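PaLM is capable of a wide range of tasks, including commonsense reasoning, arithmetic reasoning, joke explanation, code generation, and translation^[2]^[3]^[4]^[5]. When combined with chain-of-thought prompting, PaLM performed significantly better on datasets that require multi-step reasoning, such as word problems and logic-based questions^[1]^[2].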
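Chain-of-thought prompting can be illustrated with a minimal sketch that assembles a few-shot prompt whose worked examples spell out their intermediate reasoning. The `Exemplar` class, the `build_cot_prompt` function, the `Q:`/`A:` layout, and the sample questions are all assumptions made for illustration; they are not taken from the PaLM paper, and no model is called.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example: a question, its step-by-step reasoning, and the answer."""
    question: str
    reasoning: str
    answer: str


def build_cot_prompt(exemplars: list[Exemplar], question: str) -> str:
    """Assemble a few-shot prompt in which every exemplar shows its reasoning."""
    parts = []
    for ex in exemplars:
        # The reasoning precedes the final answer, which is the core of the technique.
        parts.append(f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}.")
    # The last question is left open so the model continues with its own reasoning.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Hypothetical exemplar and target question, used only to show the prompt shape.
demo = [
    Exemplar(
        question="A shelf holds 3 boxes with 4 books each. How many books are there?",
        reasoning="Each box has 4 books and there are 3 boxes, so 3 * 4 = 12.",
        answer="12",
    )
]
prompt = build_cot_prompt(
    demo, "A train has 5 cars with 20 seats each. How many seats are there?"
)
print(prompt)
```

The resulting string would be sent to a language model as ordinary input text; the only difference from a standard few-shot prompt is that each exemplar answer includes the intermediate steps.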

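The model was first announced in April 2022 and remained private until March 2023, when Google launched an API for PaLM and several other technologies^[6]. The API was initially available only to a limited number of developers who joined a waitlist, and was later opened to the public^[7].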

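Google and DeepMind developed a version of PaLM 540B (the 540-billion-parameter model) called Med-PaLM, which is fine-tuned on medical data and outperforms previous models on medical question-answering benchmarks^[8]^[9]. Med-PaLM was the first model to obtain a passing score on U.S. medical licensing examination questions. In addition to answering both multiple-choice and open-ended questions accurately, it provides its reasoning and is able to evaluate the accuracy of its own responses^[10].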

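Google also extended PaLM using a vision transformer to create PaLM-E, a state-of-the-art vision-language model that can be used for robotic manipulation^[11]^[12]. The model can perform robotics tasks competitively without retraining or fine-tuning^[13].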

In May 2023, Google announced PaLM 2 at its annual Google I/O keynote^[14]. PaLM 2 is reported to be a 340-billion-parameter model trained on 3.6 trillion tokens^[15].
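The reported PaLM 2 figures can be set against the PaLM figures given in this article (540 billion parameters, 780 billion training tokens) with some simple arithmetic. The sketch below is only a back-of-the-envelope comparison; the PaLM 2 numbers are press-reported rather than officially confirmed, and the variable names are illustrative.

```python
# Figures as stated in this article.
PALM_PARAMS = 540e9
PALM_TOKENS = 780e9
PALM2_PARAMS = 340e9   # reported, not officially confirmed
PALM2_TOKENS = 3.6e12  # reported, not officially confirmed

# How much more training text PaLM 2 reportedly saw than PaLM.
token_ratio = PALM2_TOKENS / PALM_TOKENS

# Training tokens per parameter for each model.
palm_tokens_per_param = PALM_TOKENS / PALM_PARAMS
palm2_tokens_per_param = PALM2_TOKENS / PALM2_PARAMS

print(f"token ratio: {token_ratio:.2f}")                       # 4.62
print(f"PaLM tokens/param: {palm_tokens_per_param:.2f}")       # 1.44
print(f"PaLM 2 tokens/param: {palm2_tokens_per_param:.2f}")    # 10.59
```

Under these figures, PaLM 2 would be a smaller model trained on roughly 4.6 times as many tokens, which matches the "nearly five times" framing in the cited report.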

In June 2023, Google announced AudioPaLM, a speech-to-speech translation system that uses the PaLM-2 architecture and initialization^[16].

Training

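PaLM is pre-trained on a high-quality corpus of 780 billion tokens that covers a variety of natural language tasks and use cases. The dataset includes filtered webpages, books, Wikipedia articles, news articles, source code obtained from open-source repositories on GitHub, and social media conversations^[1]^[2]. It is based on the dataset used to train Google's LaMDA model^[2]. Social media conversations make up 50% of the corpus, which aids the model's conversational capabilities^[2].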

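PaLM 540B was trained across two TPU v4 Pods, each with 3,072 TPU v4 chips attached to 768 hosts, using a combination of model and data parallelism. This was the largest TPU configuration described up to that point^[2]^[17]. The setup allowed efficient training at scale on 6,144 chips and set a record for the highest training efficiency achieved for LLMs at this scale: a hardware FLOPs utilization of 57.8%^[3].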
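The hardware figures above can be checked with a short calculation. This is a minimal sketch that takes hardware FLOPs utilization to mean achieved throughput divided by theoretical peak throughput; the `achieved_flops` helper is hypothetical, and the per-chip peak is normalized to 1.0 because the article does not state a peak figure for TPU v4.

```python
# Figures as stated in this article.
PODS = 2
CHIPS_PER_POD = 3072
HOSTS_PER_POD = 768
HARDWARE_FLOPS_UTILIZATION = 0.578

total_chips = PODS * CHIPS_PER_POD               # 6144 chips across both Pods
chips_per_host = CHIPS_PER_POD // HOSTS_PER_POD  # 4 chips attached to each host


def achieved_flops(peak_flops_per_chip: float, chips: int, utilization: float) -> float:
    """Achieved throughput = theoretical peak throughput * utilization."""
    return peak_flops_per_chip * chips * utilization


# With a normalized peak of 1.0 per chip (an assumption), the result reads as
# "chip-equivalents of peak throughput actually delivered during training".
effective_chips = achieved_flops(1.0, total_chips, HARDWARE_FLOPS_UTILIZATION)

print(total_chips, chips_per_host, round(effective_chips, 3))
```

Substituting a real per-chip peak for the normalized value would give an absolute throughput estimate; that number is deliberately left out here because it is not given in the text.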

See also

LaMDA, PaLM's predecessor
Gemini, PaLM's successor

References

[1] Narang, Sharan; Chowdhery, Aakanksha. "Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance". ai.googleblog.com. Retrieved 17 March 2023.
[2] Chowdhery, Aakanksha; Narang, Sharan; Devlin, Jacob; et al. "PaLM: Scaling Language Modeling with Pathways". 2022. arXiv:2204.02311 [cs.CL].
[3] Anadiotis, George. "Google sets the bar for AI language models with PaLM". VentureBeat. 12 April 2022. Retrieved 17 March 2023.
[4] Bastian, Matthias. "Google PaLM: Giant language AI can explain jokes". the decoder. 5 April 2022. Retrieved 17 March 2023.
[5] "Google: Why Is No One Talking About PaLM". seekingalpha.com. 12 December 2022. Retrieved 17 March 2023.
[6] Vincent, James. "Google opens up its AI language model PaLM to challenge OpenAI and GPT-3". The Verge. 14 March 2023. Retrieved 17 March 2023.
[7] Huffman, Scott; Woodward, Josh. "PaLM API & MakerSuite: an approachable way to start prototyping and building generative AI applications". Retrieved 17 March 2023.
[8] Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; et al. "Large Language Models Encode Clinical Knowledge". 2022. arXiv:2212.13138 [cs.CL].
[9] "MedPaLM: New Chatbots Will Soon Be Better Than Waiting For A Doctor". The Medical Futurist. 17 January 2023. Retrieved 17 March 2023.
[10] Matias, Yossi; Corrado, Greg. "Our latest health AI research updates". Google. 14 March 2023. Retrieved 17 March 2023.
[11] Driess, Danny; Xia, Fei; Sajjadi, Mehdi S. M.; et al. "PaLM-E: An Embodied Multimodal Language Model". 2023. arXiv:2303.03378 [cs.LG].
[12] Driess, Danny; Florence, Pete. "PaLM-E: An embodied multimodal language model". ai.googleblog.com. Retrieved 17 March 2023.
[13] Edwards, Benj. "Google's PaLM-E is a generalist robot brain that takes commands". Ars Technica. 7 March 2023. Retrieved 17 March 2023.
[14] Lardinois, Frederic. "Google launches PaLM 2, its next-gen large language model". TechCrunch. 10 May 2023. Retrieved 10 May 2023. Archived from the original on 10 May 2023.
[15] Elias, Jennifer. "Google's newest A.I. model uses nearly five times more text data for training than its predecessor". CNBC. 16 May 2023. Retrieved 18 May 2023.
[16] "AudioPaLM". google-research.github.io. Retrieved 30 June 2023.
[17] "An empirical analysis of compute-optimal large language model training". www.deepmind.com. 12 April 2022. Retrieved 17 March 2023.
