In-Context Learning Paper List

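The authors of the Peking University survey paper A Survey for In-context Learning maintain a paper list on GitHub that is still being updated. The papers collected so far are listed below: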

Papers

Model Warmup for ICL

This section contains the pilot works that might contribute to the warmup strategies of ICL.

MetaICL: Learning to Learn In Context. NAACL 2022.

Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi. [pdf], [project], 2021.10,
- a pretrained language model is tuned to do in-context learning on a large set of training tasks
Improving In-Context Few-Shot Learning via Self-Supervised Training.

Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva. [pdf], [project], 2022.5,
Calibrate Before Use: Improving Few-shot Performance of Language Models.

Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh. [pdf], [project], 2021.2,
- Using N/A string to calibrate LMs away from common token bias
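
The calibration idea noted above can be illustrated with a minimal sketch. It assumes the label probabilities for a content-free input such as "N/A" serve as an estimate of the model's bias, and that each test prediction is divided by that estimate and renormalized. The function name and all numbers are hypothetical, and the paper's exact formulation may differ from this simplification.

```python
def calibrate(label_probs, content_free_probs):
    """Divide each label probability by the content-free estimate, then renormalize."""
    scaled = [p / q for p, q in zip(label_probs, content_free_probs)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical numbers for a two-label task (not taken from the paper).
content_free = [0.7, 0.3]  # label probabilities when the test input is "N/A"
raw = [0.6, 0.4]           # label probabilities for a real test input

calibrated = calibrate(raw, content_free)
print(calibrated)
```

With these made-up numbers the uncalibrated prediction favors the first label, while the calibrated one favors the second, because the model already leaned toward the first label on the content-free input.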

Prompt Tuning for ICL

This section contains the pilot works that might contribute to the prompt selection and prompt formulation strategies of ICL.

On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model.

Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woomyoung Park, Jung-Woo Ha, Nako Sung. [pdf], [project], 2022.04,
- investigates how in-context learning performance changes with the source and size of the pretraining corpus
Chain of Thought Prompting Elicits Reasoning in Large Language Models.

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou. [pdf], [project], 2022.01,
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.

Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed Chi. [pdf], [project], 2022.05,
Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator.

Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee. [pdf], [project], 2022.06,
Iteratively Prompt Pre-trained Language Models for Chain of Thought.

Boshi Wang, Xiang Deng, Huan Sun. [pdf], [project], 2022.03,
Automatic Chain of Thought Prompting in Large Language Models.

Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola. [pdf], [project], 2022.10,
Learning To Retrieve Prompts for In-Context Learning. NAACL 2022.

Ohad Rubin, Jonathan Herzig, Jonathan Berant. [pdf], [project], 2021.12,
- learn an example retriever via contrastive learning
Finetuned Language Models Are Zero-Shot Learners.

Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le. [pdf], [project], 2021.09,
- instruction tuning: finetuning language models on a collection of tasks described via instructions
- substantially improves zero-shot performance on unseen tasks
Active Example Selection for In-Context Learning.

Yiming Zhang, Shi Feng, Chenhao Tan. [pdf], [project], 2022.11,
Prompting GPT-3 To Be Reliable.
- establishes simple and effective prompts to demonstrate GPT-3's reliability in four aspects: generalizability, social biases, calibration, and factuality
An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels
Self-adaptive In-context Learning
Demystifying Prompts in Language Models via Perplexity Estimation
Structured Prompting: Scaling In-Context Learning to 1,000 Examples.

Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei. [pdf], [project], 2022.12.
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity.

Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp. [pdf], [project], 2021.04,

On the Relation between Sensitivity and Accuracy in In-context Learning.

Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He. [pdf], [project], 2022.09,

Can language models learn from explanations in context?

Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill. [pdf], [project], 2022.04

Prototypical Calibration for Few-shot Learning of Language Models.

Zhixiong Han, Yaru Hao, Li Dong, Furu Wei. [pdf], [project], 2022.05.

Analysis of ICL

This section contains the pilot works that might contribute to the analysis of the influence factors and working mechanism of ICL.

Influence Factors for ICL

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. [pdf], [project], 2022.03,
What Makes Good In-Context Examples for GPT-3?

Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen. [pdf], [project], 2022.08,
Emergent Abilities of Large Language Models

Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus. [pdf], [project], 2022.07,
Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations

Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim. [pdf], [project], 2022.05,
On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model

Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woo-Myoung Park, Jung-Woo Ha, Nako Sung. [pdf], [project], 2022.08,
Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale

Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth. [pdf], [project], 2022.12,
Data Distributional Properties Drive Emergent In-Context Learning in Transformers

Stephanie C.Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,

Working Mechanism of ICL

An Explanation of In-context Learning as Implicit Bayesian Inference

Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma. [pdf], [project], 2022.08,
In-context Learning and Induction Heads

Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah. [pdf], [project], 2022.10,
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes

Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant. [pdf], [project], 2022.08,
Data Distributional Properties Drive Emergent In-Context Learning in Transformers

Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,
What learning algorithm is in-context learning? Investigations with linear models

Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou. [pdf], [project], 2022.11,
Transformers learn in-context by gradient descent

Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov. [pdf], [project], 2022.12,
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei. [pdf], [project], 2022.12

Evaluation and Resources

This section contains the pilot works that might contribute to the evaluation or resources of ICL.

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.

Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt et al. [pdf], [project], 2022.06,
SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks.

Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit et al. [pdf], [project], 2022.04,
Language Models are Multilingual Chain-of-Thought Reasoners.

Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei. [pdf], [project], 2022.10,
- evaluates the reasoning abilities of large language models in multilingual settings and introduces the Multilingual Grade School Math (MGSM) benchmark, built by manually translating 250 grade-school math problems from the GSM8K dataset into ten typologically diverse languages
Instruction Induction: From Few Examples to Natural Language Task Descriptions.

Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy. [pdf], [project], 2022.05,
- how to learn task instructions from input-output demonstrations
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought. 2022.10.3
What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations. arXiv:2212.01692

Application

This section contains the pilot works that expand the application of ICL.

Meta-learning via Language Model In-context Tuning.

Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He. [pdf], [project], 2021.10,
Does GPT-3 Generate Empathetic Dialogues? A Novel In-Context Example Selection Method and Automatic Evaluation Metric for Empathetic Dialogue Generation.

Young-Jun Lee, Chae-Gyun Lim, Ho-Jin Choi. [pdf], [project], 2022.10,
In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models.

Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown. [pdf], [project], 2022.12,
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

Problems

This section contains the pilot works that point out the problems of ICL.

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design.

Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, Amnon Shashua. [pdf], [project], 2021.10,

Challenges and Future Directions

This section contains the pilot works that might contribute to the challenges and future directions of ICL.
