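The authors of the Peking University survey paper A Survey for In-context Learning maintain a paper list on GitHub, which is still being updated. The papers collected so far are listed below: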

Papers

Model Warmup for ICL

This section contains the pilot works that might contribute to the warmup strategies of ICL.

  1. MetaICL: Learning to Learn In Context. NAACL 2022

    Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi. [pdf], [project], 2021.10

    • a pretrained language model is tuned to do in-context learning on a large set of training tasks

  2. Improving In-Context Few-Shot Learning via Self-Supervised Training

    Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva. [pdf], [project], 2022.5,   

  3. Calibrate Before Use: Improving Few-shot Performance of Language Models

    Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh. [pdf], [project], 2021.2,   

    • Using N/A string to calibrate LMs away from common token bias

Prompt Tuning for ICL

This section contains the pilot works that might contribute to the prompt selection and prompt formulation strategies of ICL.

  1. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woomyoung Park, Jung-Woo Ha, Nako Sung. [pdf], [project], 2022.04,   

    • investigates how in-context learning performance changes with the source and size of the pretraining corpus

  2. Chain of Thought Prompting Elicits Reasoning in Large Language Models

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou. [pdf], [project], 2022.01,   

  3. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

    Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed Chi. [pdf], [project], 2022.05,   

  4. Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator

    Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee. [pdf], [project], 2022.06,   

  5. Iteratively Prompt Pre-trained Language Models for Chain of Thought

    Boshi Wang, Xiang Deng, Huan Sun. [pdf], [project], 2022.03,   

  6. Automatic Chain of Thought Prompting in Large Language Models

    Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola. [pdf], [project], 2022.10,   

  7. Learning To Retrieve Prompts for In-Context Learning. NAACL 2022

    Ohad Rubin, Jonathan Herzig, Jonathan Berant. [pdf], [project], 2021.12

    • learns an example retriever via contrastive learning

  8. Finetuned Language Models Are Zero-Shot Learners

    Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le. [pdf], [project], 2021.09

    • instruction tuning: finetuning language models on a collection of tasks described via instructions
    • substantially improves zero-shot performance on unseen tasks

  9. Active Example Selection for In-Context Learning

    Yiming Zhang, Shi Feng, Chenhao Tan. [pdf], [project], 2022.11,   

  10. Prompting GPT-3 To Be Reliable

    • establishes simple and effective prompts that demonstrate GPT-3's reliability in four aspects: generalizability, social biases, calibration, and factuality

  11. An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

  12. Self-adaptive In-context Learning 

  13. Demystifying Prompts in Language Models via Perplexity Estimation 

  14. Structured Prompting: Scaling In-Context Learning to 1,000 Examples

    Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei. [pdf], [project], 2022.12

  15. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity

    Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp. [pdf], [project], 2021.04

  16. On the Relation between Sensitivity and Accuracy in In-context Learning

    Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He. [pdf], [project], 2022.09

  17. Can language models learn from explanations in context?

    Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill. [pdf], [project], 2022.04

  18. Prototypical Calibration for Few-shot Learning of Language Models

    Zhixiong Han, Yaru Hao, Li Dong, Furu Wei. [pdf], [project], 2022.05

Analysis of ICL

This section contains the pilot works that might contribute to the influence factors and working mechanism analysis of ICL.

Influence Factors for ICL

  1. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? 

    Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. [pdf], [project], 2022.03,   

  2. What Makes Good In-Context Examples for GPT-3? 

    Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen. [pdf], [project], 2022.08,   

  3. Emergent Abilities of Large Language Models 

    Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus. [pdf], [project], 2022.07,   

  4. Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations 

    Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim. [pdf], [project], 2022.05,   

  5. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model 

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woo-Myoung Park, Jung-Woo Ha, Nako Sung. [pdf], [project], 2022.08,   

  6. Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale 

    Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth. [pdf], [project], 2022.12,   

  7. Data Distributional Properties Drive Emergent In-Context Learning in Transformers 

    Stephanie C.Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05, 

Working Mechanism of ICL

  1. An Explanation of In-context Learning as Implicit Bayesian Inference 

    Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma. [pdf], [project], 2022.08,   

  2. In-context Learning and Induction Heads 

    Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah. [pdf], [project], 2022.10,   

  3. What Can Transformers Learn In-Context? A Case Study of Simple Function Classes 

    Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant. [pdf], [project], 2022.08,   

  4. Data Distributional Properties Drive Emergent In-Context Learning in Transformers

    Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,   

  5. What learning algorithm is in-context learning? Investigations with linear models 

    Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou. [pdf], [project], 2022.11,   

  6. Transformers learn in-context by gradient descent 

    Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov. [pdf], [project], 2022.12

  7. Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers 

    Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei. [pdf], [project], 2022.12   

Evaluation and Resources

This section contains the pilot works that might contribute to the evaluation or resources of ICL.

  1. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt et al. [pdf], [project], 2022.06

  2. SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks

    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit et al. [pdf], [project], 2022.04

  3. Language Models are Multilingual Chain-of-Thought Reasoners

    Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei. [pdf], [project], 2022.10,   

    • evaluates the reasoning abilities of large language models in multilingual settings
    • introduces the Multilingual Grade School Math (MGSM) benchmark by manually translating 250 grade-school math problems from the GSM8K dataset into ten typologically diverse languages

  4. Instruction Induction: From Few Examples to Natural Language Task Descriptions

    Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy. [pdf], [project], 2022.05,   

    • how to learn task instructions from input-output demonstrations

  5. Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought

    2022.10.3

  6. What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations

    arXiv:2212.01692

Application

This section contains the pilot works that expand the application of ICL.

  1. Meta-learning via Language Model In-context Tuning

    Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He. [pdf], [project], 2021.10,   

  2. Does GPT-3 Generate Empathetic Dialogues? A Novel In-Context Example Selection Method and Automatic Evaluation Metric for Empathetic Dialogue Generation

    Young-Jun Lee, Chae-Gyun Lim, Ho-Jin Choi. [pdf], [project], 2022.10,   

  3. In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models

    Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown. [pdf], [project], 2022.12

  4. Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

Problems

This section contains the pilot works that point out the problems of ICL.

  1. The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design

    Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, Amnon Shashua. [pdf], [project], 2021.10,   

Challenges and Future Directions

This section contains the pilot works that might contribute to the challenges and future directions of ICL.
