How to Train Long-Context Language Models (Effectively)
Preprint, 2024
[pdf]
[code]
[huggingface]
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Preprint, 2024
[pdf]
[code]
LitSearch: A Retrieval Benchmark for Scientific Literature Search
Proceedings of EMNLP, 2024
[pdf]
[code]
Improving Language Understanding from Screenshots (PTP)
Preprint, 2024
[pdf]
[code]
Long-Context Language Modeling with Parallel Context Encoding (CEPE)
Proceedings of ACL, 2024
[pdf]
[code]
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Proceedings of ICLR, 2024
[pdf]
[code]
[blog]
Evaluating Large Language Models at Evaluating Instruction Following (LLMBar)
Proceedings of ICLR, 2024
[pdf]
[code]
Enabling Large Language Models to Generate Text with Citations (ALCE)
Proceedings of EMNLP, 2023
[pdf]
[code]
Fine-Tuning Language Models with Just Forward Passes (MeZO)
Proceedings of NeurIPS, 2023
(oral)
[pdf]
[code]
What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning
Proceedings of Findings of ACL, 2023
[pdf]
[code]
The CRINGE Loss: Learning what language not to model
Proceedings of ACL, 2023
[pdf]
Should You Mask 15% in Masked Language Modeling?
Proceedings of EACL, 2023
[pdf]
[code]
Recovering Private Text in Federated Learning of Language Models
Proceedings of NeurIPS, 2022
[pdf]
Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models
Proceedings of COLING, 2022
[pdf]
[code]
Ditch the Gold Standard: Re-evaluating Conversational Question Answering
Proceedings of ACL, 2022
(Outstanding Paper Award)
[pdf]
[code]
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Proceedings of EMNLP, 2021
[pdf]
[code]
Making Pre-trained Language Models Better Few-shot Learners (LM-BFF)
Proceedings of ACL, 2021
[pdf]
[code]
Manual Evaluation Matters: Reviewing Test Protocols of Distantly Supervised Relation Extraction
Proceedings of Findings of ACL, 2021
[pdf]
[code]
Learning from Context or Names? An Empirical Study on Neural Relation Extraction
Proceedings of EMNLP, 2020
[pdf]
[code]
More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction
Proceedings of AACL, 2020
[pdf]
Continual Relation Learning via Episodic Memory Activation and Reconsolidation
Proceedings of ACL, 2020
[pdf]
[code]
Few-shot Relation Extraction via Bayesian Meta-learning on Task Graphs
Proceedings of ICML, 2020
[pdf]
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
Transactions of the ACL (TACL), 2020
[pdf]
Neural Snowball for Few-Shot Relation Learning
Proceedings of AAAI, 2020
[pdf]
[code]
FewRel 2.0: Towards More Challenging Few-Shot Relation Classification
Proceedings of EMNLP (Short Paper), 2019
[pdf]
[code]
OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction
Proceedings of EMNLP (Demonstration Track), 2019
[pdf]
[code]
Hybrid Attention-Based Prototypical Networks for Noisy Few-Shot Relation Classification
Proceedings of AAAI, 2019
[pdf]
[code]
* indicates equal contribution.