Paper reading: December 2022 arXiv notes

I have tried to track arXiv papers under cs.CL to keep up with recent progress in NLP. However, such tracking usually breaks down when there is a burst of papers, for example near an anonymity deadline. My solution to a hundred open arXiv tabs has always been cmd+Q (a “hard-reboot”, as Southwest Airlines would call it). I realized I always missed piles of interesting papers, and relying on Twitter/Slack recommendations made me extremely biased about what’s going on. So this time I decided to put in some more effort and go through December’s arXiv papers.

After EMNLP, the NLP folks all left Abu Dhabi except me; I stayed here to renew my US visa. Sadly, my visa is under “administrative processing” and I have to wait for an “indefinite” period of time. Flying back home (China) is also extremely unaffordable due to the lack of flights, so I spent Christmas and New Year wandering around the region, traveling through the UAE, Oman, Morocco, and Turkey. While traveling around is lonely, I can also use the time to read some papers and experience being a “digital nomad”.

Photo from Muscat, Oman.

OK, let’s get down to business. This paper list focuses on new papers that came out on arXiv in December 2022 (some might be revisions). I am mainly interested in the following directions, and the list reflects them: retrieval (LMs and retrieval; dense retrieval) and LLMs (in-context learning; instruction tuning; prompting; generation; efficient use of LLMs). Let me know if you find other interesting papers that I didn’t mention!

Retrieval+LM: a path to faithful and generalizable LLM?

We have seen a lot of hype around GPT-3 and ChatGPT, especially about how they can answer questions with “facts” memorized in their parameters. But this behavior does not generalize (a model trained in 2022 wouldn’t know any news from 2023), and it comes with hallucination issues: the answers are not supported by any explicit documents and may be inaccurate.

Adding a retrieval component to LMs can be a remedy. We have seen works adding entity knowledge to LMs (Zhang et al., 2019),