Tianyu Gao 高天宇

I am a fifth-year PhD student at Princeton University, advised by Prof. Danqi Chen. I am also a member of the Princeton NLP group and Princeton Language and Intelligence. Before Princeton, I received my bachelor's degree from Tsinghua University, where I was advised by Prof. Zhiyuan Liu.

Find me on Twitter, Google Scholar, and GitHub!

Email: [firstname]g@princeton.edu

I am on the academic job market this year. Here are my CV and research statement.

Research

My research is at the intersection of natural language processing and machine learning, with a particular focus on large language models (LLMs).

I am driven by the exciting potential of LLMs in transformative applications, such as their use as information-seeking tools. I develop principled techniques and evaluations for key components of this emerging paradigm:

Powerful semantic search with embedding models (SimCSE, LitSearch)
Generation with citations for verifiability (ALCE)
Long-context language models (CEPE, HELMET, ProLong, LongProc)
Instruction following (LLMBar)

To enable these powerful applications in large-scale deployments, I also work on improving the capabilities and efficiency of LLMs:

Efficient fine-tuning methods that require only a few examples (LM-BFF) or a memory footprint comparable to inference (MeZO)
Cost-effective approaches to building capable small models (Sheared-Llama)
Techniques to accelerate pre-training (high masking rate, MeCo)
Understanding LLM capabilities (what ICL learns)

Please refer to publications for the full list of my research papers.

This website is adapted from Gregory Gunderson's.