I am a final-year PhD candidate in Computer Science at UCLA, working with Prof. Wei Wang. I’m also a machine learning scientist on the Genentech Prescient Design Language Model team, working on LLM agents for scientific discovery and LLM post-training with Dr. Keunwoo Choi, Dr. Stephen Ra, and Prof. Kyunghyun Cho. I’m a recipient of the J.P. Morgan Chase AI PhD Fellowship and an Amazon Fellow.
I’ve worked at Amazon AGI, USC (working with Prof. Nanyun (Violet) Peng and Prof. Muhao Chen), The Chinese University of Hong Kong (working with Prof. Helen Meng), UC Santa Cruz (working with Prof. Marilyn Walker), and MIT (working with Dr. Abel Sanchez and Prof. John R. Williams). I earned my bachelor’s degree in Computing from The Hong Kong Polytechnic University, advised by Prof. Qin Lu and Prof. Jiannong Cao, and studied at the University of Maryland.
I develop machine learning (ML) systems inspired by scientific data and expert tasks, equipping large language models (LLMs) with the intuition and knowledge of domain experts. My research introduces machine learning innovations and insights to enable a comprehensive spectrum of expertise acquisition, from explicit to implicit knowledge and from individual decision-making to the automation of complex expert workflows. Specifically, I focus on:
Extracting explicit knowledge from unstructured data in low-resource scenarios: dataset (ACL'23), library (NAACL'21), data-efficient (ACL'23) and parameter-efficient (INTERSPEECH'23) methods, indirect supervision (ACL'23, EMNLP-F'22), cross-document (ACL-F'23), data synthesis/augmentation for zero-shot scenarios (AAAI'24, ACL'24)
Capturing implicit expert intuition: a clinical decision-making benchmark for LLMs (preprint 24), rich supervision to model decision sequences (AAAI'25), conveying intuition with a decoding-free paradigm (NeurIPS ENLSP'24)
Compositional, project-level reasoning and automation: KG-inspired reasoning (preprint 24), cross-modality (NeurIPS'24), drug discovery agents, scientific workflow agent platform, material design (preprint 24)
Fairness and safety of generative LLMs: unsupervised bias mitigation (NAACL'24), attacking LLMs with data poisoning (NAACL'24), ownership protection (NAACL'24), analysis of clinical bias in LLMs (preprint 24)
Empowered expert applications: clinical diagnosis (preprint 24, preprint 24, preprint 24), health outcome prediction (AAAI'25), clinical event extraction (ACL'23), biomedical and scientific QA (preprint 24, NeurIPS'24, NAACL'24, ACL'23), computational social science (EMNLP'24, AAAI'24), political event forecasting (preprint 24), dialogue state tracking (INTERSPEECH'23), knowledge structure/graph construction (EMNLP-F'21, AKBC'22)