Publications

* = equal contribution.

                            Personalized Reasoning: Just-in-time personalization and why LLMs fail at it
    Shuyue Stella Li, Avinandan Bose, Faeze Brahman, Simon Shaolei Du, Pang Wei Koh, Maryam Fazel, and Yulia Tsvetkov
    arXiv 2025
         Spurious rewards: Rethinking training signals in RLVR
    Rulin Shao*, Shuyue Stella Li*, Rui Xin*, Scott Geng*, Yiping Wang, Sewoong Oh, Simon Shaolei Du, Nathan Lambert, Sewon Min, Ranjay Krishna, Yulia Tsvetkov, Hannaneh Hajishirzi, Pang Wei Koh, and Luke Zettlemoyer
    arXiv 2025
         Frustratingly simple retrieval improves challenging, reasoning-intensive benchmarks
    Xinxi Lyu, Michael Duan, Rulin Shao, Pang Wei Koh, and Sewon Min
    arXiv 2025
         FlexOlmo: Open language models for flexible data use
    Weijia Shi, Akshita Bhagia, Kevin Farhat, Niklas Muennighoff, Pete Walsh, Jacob Morrison, Dustin Schwenk, Shayne Longpre, Jake Poznanski, Allyson Ettinger, Daogao Liu, Margaret Li, Dirk Groeneveld, Mike Lewis, Wen-tau Yih, Luca Soldaini, Kyle Lo, Noah A Smith, Luke Zettlemoyer, Pang Wei Koh, Hannaneh Hajishirzi, Ali Farhadi, and Sewon Min
    NeurIPS 2025
             Precise information control in long-form text generation
    Jacqueline He, Howard Yen, Margaret Li, Shuyue Stella Li, Zhiyuan Zeng, Weijia Shi, Yulia Tsvetkov, Danqi Chen, Pang Wei Koh, and Luke Zettlemoyer
    NeurIPS 2025
         The Delta Learning hypothesis: Preference tuning on weak data can yield strong gains
    Scott Geng, Hamish Ivison, Chun-Liang Li, Maarten Sap, Jerry Li, Ranjay Krishna, and Pang Wei Koh
    COLM 2025
         EvalTree: Profiling language model weaknesses via hierarchical capability trees
    Zhiyuan Zeng, Yizhong Wang, Hannaneh Hajishirzi, and Pang Wei Koh
    COLM 2025
         ReasonIR: Training Retrievers for Reasoning Tasks
    Rulin Shao*, Rui Qiao*, Varsha Kishore, Niklas Muennighoff, Xi Victoria Lin, Daniela Rus, Bryan Kian Hsiang Low, Sewon Min, Wen-tau Yih, Pang Wei Koh, and Luke Zettlemoyer
    COLM 2025
         Establishing task scaling laws via compute-efficient model ladders
    Akshita Bhagia, Jiacheng Liu, Alexander Wettig, David Heineman, Oyvind Tafjord, Ananya Harsh Jha, Luca Soldaini, Noah A Smith, Dirk Groeneveld, Pang Wei Koh, Jesse Dodge, and Hannaneh Hajishirzi
    COLM 2025
         ParaPO: Aligning language models to reduce verbatim reproduction of pre-training data
    Tong Chen, Faeze Brahman, Jiacheng Liu, Niloofar Mireshghallah, Weijia Shi, Pang Wei Koh, Luke Zettlemoyer, and Hannaneh Hajishirzi
    COLM 2025
         2 OLMo 2 Furious
    Team OLMo, Pete Walsh, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Shane Arora, Akshita Bhagia, Yuling Gu, Shengyi Huang, Matt Jordan, Nathan Lambert, Dustin Schwenk, Oyvind Tafjord, Taira Anderson, David Atkinson, Faeze Brahman, Christopher Clark, Pradeep Dasigi, Nouha Dziri, Michal Guerquin, Hamish Ivison, Pang Wei Koh, Jiacheng Liu, Saumya Malik, William Merrill, Lester James V. Miranda, Jacob Morrison, Tyler Murray, Crystal Nam, Valentina Pyatkin, Aman Rangapur, Michael Schmitz, Sam Skjonsberg, David Wadden, Christopher Wilhelm, Michael Wilson, Luke Zettlemoyer, Ali Farhadi, Noah A. Smith, and Hannaneh Hajishirzi
    COLM 2025
         Fluid language model benchmarking
    Valentin Hofmann, David Heineman, Ian Magnusson, Kyle Lo, Jesse Dodge, Maarten Sap, Pang Wei Koh, Chun Wang, Hannaneh Hajishirzi, and Noah A. Smith
    COLM 2025
             The curious case of factuality finetuning: Models' internal beliefs can improve factuality
    Benjamin Newman, Abhilasha Ravichander, Jaehun Jung, Rui Xin, Hamish Ivison, Yegor Kuznetsov, Pang Wei Koh, and Yejin Choi
    arXiv 2025
         A false sense of privacy: Evaluating textual data sanitization beyond surface-level privacy leakage
    Rui Xin*, Niloofar Mireshghallah*, Shuyue Stella Li, Michael Duan, Hyunwoo Kim, Yejin Choi, Yulia Tsvetkov, Sewoong Oh, and Pang Wei Koh
    arXiv 2025
         DataDecide: How to predict best pretraining data with small experiments
    Ian Magnusson*, Nguyen Tai*, Ben Bogin*, David Heineman, Jena D Hwang, Luca Soldaini, Akshita Bhagia, Jiacheng Liu, Dirk Groeneveld, Oyvind Tafjord, Noah A Smith, Pang Wei Koh, and Jesse Dodge
    ICML 2025
         NICE: Non-differentiable evaluation metric-based data selection for instruction tuning
    Jingtan Wang, Xiaoqiang Lin, Rui Qiao, Pang Wei Koh, Chuan-Sheng Foo, and Bryan Kian Hsiang Low
    ICML 2025
         OLMoTrace: Tracing language model outputs back to trillions of training tokens
    Jiacheng Liu, Taylor Blanton, Yanai Elazar, Sewon Min, YenSung Chen, Arnavi Chheda-Kothary, Huy Tran, Byron Bischoff, Eric Marsh, Michael Schmitz, Cassidy Trier, Aaron Sarnat, Jenna James, Jon Borchardt, Bailey Kuehl, Evie Cheng, Karen Farley, Sruthi Sreeram, Taira Anderson, David Albright, Carissa Schoenick, Luca Soldaini, Dirk Groeneveld, Rock Yuren Pang, Pang Wei Koh, Noah A Smith, Sophie Lebrecht, Yejin Choi, Hannaneh Hajishirzi, Ali Farhadi, and Jesse Dodge
    ACL 2025
             Large-scale data selection for instruction tuning
    Hamish Ivison, Muru Zhang, Faeze Brahman, Pang Wei Koh, and Pradeep Dasigi
    arXiv 2025
         S4S: Solving for a diffusion model solver
    Eric Frankel, Sitan Chen, Jerry Li, Pang Wei Koh, Lillian J Ratliff, and Sewoong Oh
    ICML 2025
         Metabolically purified human stem cell-derived hepatocytes reveal distinct effects of Ebola and Lassa viruses
    Joseph B. Prescott, Kevin J. Liu, Angelika Lander, Nicole Min Qian Pek, Sawan Kumar Jha, Marcel Bokelmann, Manali Begur, Pang Wei Koh, Henry Yang, Bing Lim, Kristy Red-Horse, Irving L. Weissman, Kyle M. Loh, and Lay Teng Ang
    bioRxiv 2025
         Exploring how generative MLLMs perceive more than CLIP with the same vision encoder
    Siting Li, Pang Wei Koh, and Simon Shaolei Du
    ACL 2025
         OLMoE: Open Mixture-of-Experts language models
    Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A Smith, Pang Wei Koh, Amanpreet Singh, and Hannaneh Hajishirzi
    ICLR 2025
         Language models scale reliably with over-training and on downstream tasks
    Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Luca Soldaini, Alexandros G. Dimakis, Gabriel Ilharco, Pang Wei Koh, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, and Ludwig Schmidt
    ICLR 2025
         Group-robust sample reweighting for subpopulation shifts via influence functions
    Rui Qiao, Zhaoxuan Wu, Jingtan Wang, Pang Wei Koh, and Bryan Kian Hsiang Low
    ICLR 2025
         Use large language models to promote equity
    Emma Pierson*, Divya Shanmugam*, Rajiv Movva*, Jon Kleinberg*, Monica Agrawal, Mark Dredze, Kadija Ferryman, Judy Wawira Gichoya, Dan Jurafsky, Pang Wei Koh, Karen Levy, Sendhil Mullainathan, Ziad Obermeyer, Harini Suresh, and Keyon Vafa
    NEJM AI 2025
         ICONS: Influence Consensus for Vision-Language Data Selection
    Xindi Wu, Mengzhou Xia, Rulin Shao, Zhiwei Deng, Pang Wei Koh, and Olga Russakovsky
    arXiv 2025
         PLeaS--Merging models with permutations and least squares
    Anshul Nasery, Jonathan Hayase, Pang Wei Koh, and Sewoong Oh
    CVPR 2025
         Negative Token Merging: Image-based adversarial feature guidance
    Jaskirat Singh, Lindsey Li, Weijia Shi, Ranjay Krishna, Yejin Choi, Pang Wei Koh, Michael F Cohen, Stephen Gould, Liang Zheng, and Luke Zettlemoyer
    arXiv 2024
         OpenScholar: Synthesizing scientific literature with retrieval-augmented LMs
    Akari Asai, Jacqueline He*, Rulin Shao*, Weijia Shi, Amanpreet Singh, Joseph Chee Chang, Kyle Lo, Luca Soldaini, Sergey Feldman, Mike D'arcy, David Wadden, Matt Latzke, Minyang Tian, Pan Ji, Shengyan Liu, Hao Tong, Bohao Wu, Yanyu Xiong, Luke Zettlemoyer, Graham Neubig, Dan Weld, Doug Downey, Wen-tau Yih, Pang Wei Koh, and Hannaneh Hajishirzi
    arXiv 2024
         Scaling retrieval-based language models with a trillion-token datastore
    Rulin Shao, Jacqueline He, Akari Asai, Weijia Shi, Tim Dettmers, Sewon Min, Luke Zettlemoyer, and Pang Wei Koh
    NeurIPS 2024
         The unmet promise of synthetic training images: Using retrieved real images performs better
    Scott Geng, Cheng-Yu Hsieh, Vivek Ramanujan, Matthew Wallingford, Chun-Liang Li, Pang Wei Koh, and Ranjay Krishna
    NeurIPS 2024
         MEDIQ: Question-asking LLMs for adaptive and reliable clinical reasoning
    Shuyue Stella Li, Vidhisha Balachandran, Shangbin Feng, Jonathan Ilgen, Emma Pierson, Pang Wei Koh, and Yulia Tsvetkov
    NeurIPS 2024
         Uncertainty of Thoughts: Uncertainty-aware planning enhances information seeking in large language models
    Zhiyuan Hu, Chumin Liu, Xidong Feng, Yilun Zhao, See-Kiong Ng, Anh Tuan Luu, Junxian He, Pang Wei Koh, and Bryan Hooi
    NeurIPS 2024
         Multilingual diversity improves vision-language representations
    Thao Nguyen, Matthew Wallingford, Sebastin Santy, Wei-Chiu Ma, Sewoong Oh, Ludwig Schmidt, Pang Wei Koh, and Ranjay Krishna
    NeurIPS 2024
         DataComp-LM: In search of the next generation of training sets for language models
    Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, Hanlin Zhang, Rulin Shao, Sarah Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Chandu, Thao Nguyen, Igor Vasiljevic, Sham Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldaini, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alexandros G. Dimakis, Yair Carmon, Achal Dave, Ludwig Schmidt, and Vaishaal Shankar
    NeurIPS (Datasets and Benchmarks Track) 2024
         JPEG-LM: LLMs as image generators with canonical codec representations
    Xiaochuang Han, Marjan Ghazvininejad, Pang Wei Koh, and Yulia Tsvetkov
    arXiv 2024
         MoSH: Modeling multi-objective tradeoffs with soft and hard bounds
    Edward Chen, Natalie Dullerud, Thomas Niedermayr, Elizabeth Kidd, Ransalu Senanayake, Pang Wei Koh, Sanmi Koyejo, and Carlos Guestrin
    arXiv 2024
         CopyBench: Measuring literal and non-literal reproduction of copyright-protected text in language model generation
    Tong Chen, Akari Asai, Niloofar Mireshghallah, Sewon Min, James Grimmelmann, Yejin Choi, Hannaneh Hajishirzi, Luke Zettlemoyer, and Pang Wei Koh
    EMNLP 2024
         Data-centric AI in the age of large language models
    Xinyi Xu, Zhaoxuan Wu, Rui Qiao, Arun Verma, Yao Shu, Jingtan Wang, Xinyuan Niu, Zhenfeng He, Jiangwei Chen, Zijian Zhou, Gregory Kang Ruey Lau, Hieu Dao, Lucas Agussurja, Rachael Hwee Ling Sim, Xiaoqiang Lin, Wenyang Hu, Zhongxiang Dai, Pang Wei Koh, and Bryan Kian Hsiang Low
    EMNLP Findings 2024
         Merge to Learn: Efficiently adding skills to language models with model merging
    Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Pang Wei Koh, Jesse Dodge, and Pradeep Dasigi
    EMNLP Findings 2024
         Annotation alignment: Comparing LLM and human annotations of conversational safety
    Rajiv Movva, Pang Wei Koh, and Emma Pierson
    EMNLP 2024
         Information-theoretic distillation for reference-less summarization
    Jaehun Jung, Ximing Lu, Liwei Jiang, Faeze Brahman, Peter West, Pang Wei Koh, and Yejin Choi
    COLM 2024
         Using unlabeled data to enhance fairness of medical AI
    Rajiv Movva, Pang Wei Koh, and Emma Pierson
    Nature Medicine 2024
         Reliable, adaptable, and attributable language models with retrieval
    Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer, Hannaneh Hajishirzi, and Wen-tau Yih
    arXiv 2024
         Instructional fingerprinting of large language models
    Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, and Muhao Chen
    NAACL 2024
         The generative AI paradox: "What it can create, it may not understand"
    Peter West*, Ximing Lu*, Nouha Dziri*, Faeze Brahman*, Linjie Li*, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, and Yejin Choi
    ICLR 2024
         Leveraging domain relations for domain generalization
    Huaxiu Yao*, Xinyu Yang*, Xinyi Pan, Shengchao Liu, Pang Wei Koh, and Chelsea Finn
    ICLR 2024
             Impossibility theorems for feature attribution
    Blair Bilodeau, Natasha Jaques, Pang Wei Koh, and Been Kim
    Proceedings of the National Academy of Sciences (PNAS) 2024
         Retrieval-based language models using a multi-domain datastore
    Rulin Shao, Sewon Min, Luke Zettlemoyer, and Pang Wei Koh
    NeurIPS Workshop on Distribution Shifts (DistShift) 2023
         OpenFlamingo: An open-source framework for training large autoregressive vision-language models
    Anas Awadalla*, Irena Gao*, Josh Gardner, Jack Hessel, Yusuf Hanafy, Wanrong Zhu, Kalyani Marathe, Yonatan Bitton, Samir Gadre, Shiori Sagawa, Jenia Jitsev, Simon Kornblith, Pang Wei Koh, Gabriel Ilharco, Mitchell Wortsman, and Ludwig Schmidt
    arXiv 2023
         FActScore: Fine-grained atomic evaluation of factual precision in long form text generation
    Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, and Hannaneh Hajishirzi
    EMNLP 2023
         DataComp: In search of the next generation of multimodal datasets
    Samir Yitzhak Gadre*, Gabriel Ilharco*, Alex Fang*, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, and Ludwig Schmidt
    NeurIPS (Datasets and Benchmarks Track) 2023
             Proximity-informed calibration for deep neural networks
    Miao Xiong, Ailin Deng, Pang Wei Koh, Jiaying Wu, Shen Li, Jianqing Xu, and Bryan Hooi
    NeurIPS 2023
             Are aligned neural networks adversarially aligned?
    Nicholas Carlini, Milad Nasr, Christopher A Choquette-Choo, Matthew Jagielski, Irena Gao, Anas Awadalla, Pang Wei Koh, Daphne Ippolito, Katherine Lee, Florian Tramer, and Ludwig Schmidt
    NeurIPS 2023
         On the trade-off of intra-/inter-class diversity for supervised pre-training
    Jieyu Zhang*, Bohan Wang*, Zhengyu Hu, Pang Wei Koh, and Alexander Ratner
    NeurIPS 2023
         Out-of-distribution robustness via targeted augmentations
    Irena Gao*, Shiori Sagawa*, Pang Wei Koh, Tatsunori Hashimoto, and Percy Liang
    ICML 2023
         Wild-Time: A benchmark of in-the-wild distribution shift over time
    Huaxiu Yao*, Caroline Choi*, Yoonho Lee, Pang Wei Koh, and Chelsea Finn
    NeurIPS (Datasets and Benchmarks Track) 2022
         Extending the WILDS benchmark for unsupervised adaptation
    Shiori Sagawa*, Pang Wei Koh*, Tony Lee*, Irena Gao*, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, and Percy Liang
    ICLR 2022
             WILDS: A benchmark of in-the-wild distribution shifts
    Pang Wei Koh*, Shiori Sagawa*, Henrik Marklund, Sang Michael Xie, Marvin Zhang, Akshay Balsubramani, Weihua Hu, Michihiro Yasunaga, Richard Lanas Phillips, Irena Gao, Tony Lee, Etienne David, Ian Stavness, Wei Guo, Berton A. Earnshaw, Imran S. Haque, Sara Beery, Jure Leskovec, Anshul Kundaje, Emma Pierson, Sergey Levine, Chelsea Finn, and Percy Liang
    ICML 2021
             Just Train Twice: Improving group robustness without training group information
    Evan Zheran Liu*, Behzad Haghgoo*, Annie S. Chen*, Aditi Raghunathan, Pang Wei Koh, Shiori Sagawa, Percy Liang, and Chelsea Finn
    ICML 2021
             Accuracy on the line: On the strong correlation between out-of-distribution and in-distribution generalization
    John Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang, Yair Carmon, and Ludwig Schmidt
    ICML 2021
         Supporting COVID-19 policy response with large-scale mobility-based modeling
    Serina Chang, Mandy L. Wilson, Bryan Lewis, Zakaria Mehrab, Komal K. Dudakiya, Emma Pierson, Pang Wei Koh, Jaline Gerardin, Beth Redbird, David Grusky, Madhav Marathe, and Jure Leskovec
    KDD (Applied Data Science track) 2021
             On the opportunities and risks of foundation models
    Rishi Bommasani, Drew A. Hudson, ..., Pang Wei Koh, ..., and Percy Liang (116 authors, alphabetical within ellipses)
    arXiv 2021
         Selective classification can magnify disparities across groups
    Erik Jones*, Shiori Sagawa*, Pang Wei Koh*, Ananya Kumar, and Percy Liang
    ICLR 2021
             Stronger data poisoning attacks break data sanitization defenses
    Pang Wei Koh*, Jacob Steinhardt*, and Percy Liang
    Machine Learning 2021
             Mobility network models of COVID-19 explain inequities and inform reopening
    Serina Y Chang*, Emma Pierson*, Pang Wei Koh*, Jaline Gerardin, Beth Redbird, David Grusky, and Jure Leskovec
    Nature 2021
             Concept bottleneck models
    Pang Wei Koh*, Thao Nguyen*, Yew Siang Tang*, Steve Mussmann, Emma Pierson, Been Kim, and Percy Liang
    ICML 2020
             An investigation of why overparameterization exacerbates spurious correlations
    Shiori Sagawa*, Aditi Raghunathan*, Pang Wei Koh*, and Percy Liang
    ICML 2020
         ExpBERT: Representation engineering with natural language explanations
    Shikhar Murty, Pang Wei Koh, and Percy Liang
    ACL 2020
         Toward trustworthy AI development: Mechanisms for supporting verifiable claims
    Miles Brundage*, Shahar Avin*, Jasmine Wang*, Haydn Belfield*, Gretchen Krueger*, Gillian Hadfield, Heidy Khlaaf, Jingying Yang, Helen Toner, Ruth Fong, Tegan Maharaj, Pang Wei Koh, Sara Hooker, ..., Thomas Krendl Gilbert, Lisa Dyer, Saif Khan, Yoshua Bengio, and Markus Anderljung
    arXiv 2020
         Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization
    Shiori Sagawa*, Pang Wei Koh*, Tatsunori B. Hashimoto, and Percy Liang
    ICLR 2020
         On the accuracy of influence functions for measuring group effects
    Pang Wei Koh*, Kai-Siang Ang*, Hubert H. K. Teo*, and Percy Liang
    NeurIPS 2019
         Temporal FiLM: Capturing long-range sequence dependencies with feature-wise modulations
    Sawyer Birnbaum*, Volodymyr Kuleshov*, Zayd Enam, Pang Wei Koh, and Stefano Ermon
    NeurIPS 2019
         Inferring multi-dimensional rates of aging from cross-sectional data
    Emma Pierson*, Pang Wei Koh*, Tatsunori B. Hashimoto*, Daphne Koller, Jure Leskovec, Nicholas Eriksson, and Percy Liang
    AISTATS 2019
             Certified defenses for data poisoning attacks
    Jacob Steinhardt*, Pang Wei Koh*, and Percy Liang
    NeurIPS 2017
         Understanding black-box predictions via influence functions
    Pang Wei Koh and Percy Liang
    ICML 2017
             Localized hepatic lobular regeneration by central-vein-associated lineage-restricted progenitors
    Jonathan M. Tsai, Pang Wei Koh, Ania Stefanska, Liujing Xing, Graham G. Walmsley, Nicolas Poux, Irving L. Weissman, and Yuval Rinkevich
    Proceedings of the National Academy of Sciences (PNAS) 2017
         An atlas of transcriptional, chromatin accessibility, and surface marker changes in human mesoderm development
    Pang Wei Koh*, Rahul Sinha*, Amira A. Barkal, Rachel M. Morganti, Angela Chen, Irving L. Weissman, Lay Teng Ang, Anshul Kundaje, and Kyle M. Loh
    Scientific Data 2016
         Mapping the pairwise choices leading from pluripotency to human bone, heart, and other mesoderm cell types
    Kyle M. Loh*, Angela Chen*, Pang Wei Koh, Tianda Z. Deng, Rahul Sinha, Jonathan M. Tsai, Amira A. Barkal, Kimberle Y. Shen, Rajan Jain, Rachel M. Morganti, Ng Shyh-Chang, Nathaniel B. Fernhoff, Benson M. George, Gerlinde Wernig, Rachel E.A. Salomon, Zhenghao Chen, Hannes Vogel, Jonathan A. Epstein, Anshul Kundaje, William S. Talbot, Philip A. Beachy, Lay Teng Ang, and Irving L. Weissman
    Cell 2016
         Denoising genome-wide histone ChIP-seq with convolutional neural networks
    Pang Wei Koh*, Emma Pierson*, and Anshul Kundaje
    Intelligent Systems for Molecular Biology (ISMB) / Bioinformatics 2017
             Dissecting an online intervention for cancer survivors
    Zhenghao Chen, Pang Wei Koh, Philip L. Ritter, Kate Lorig, Erin O'Carroll Bantum, and Suchi Saria
    Health Education & Behavior 2014
         Peer and self assessment in massive online classes
    Chinmay Kulkarni, Pang Wei Koh, Huy Le, Daniel Chia, Kathryn Papadopoulos, Justin Cheng, Daphne Koller, and Scott Klemmer
    ACM Transactions on Computer-Human Interaction 2013
         Identifying genetic drivers of cancer morphology
    Pang Wei Koh, Andrew Beck, and Daphne Koller
    Undergraduate honors thesis 2012
             Sparse filtering
    Jiquan Ngiam, Pang Wei Koh, Zhenghao Chen, Sonia Bhaskar, and Andrew Y. Ng
    NeurIPS 2011
             Learning deep energy models
    Jiquan Ngiam, Zhenghao Chen, Pang Wei Koh, and Andrew Y. Ng
    ICML 2011
         On random weights and unsupervised feature learning
    Andrew Saxe, Pang Wei Koh, Zhenghao Chen, Maneesh Bhand, Bipin Suresh, and Andrew Y. Ng
    ICML 2011
         Tiled convolutional neural networks
    Quoc V. Le, Jiquan Ngiam, Zhenghao Chen, Daniel Chia, Pang Wei Koh, and Andrew Y. Ng
    NeurIPS 2010
         Lower bound on the time complexity of local adiabatic evolution
    Zhenghao Chen, Pang Wei Koh, and Zhao Yan
    Physical Review A 2006