I am a Senior Machine Learning Scientist on the Foundation Models team at Prescient Design, Genentech/Roche, where I work on LLM agents and post-training for molecular and scientific discovery.
I earned my PhD in Computer Science from UCLA, advised by Prof. Wei Wang. I’m a recipient of the J.P. Morgan Chase AI PhD Fellowship and the Amazon Fellowship. I’ve conducted research at Amazon AGI, USC, CUHK, UC Santa Cruz and MIT. I hold a bachelor’s degree in Computing from The Hong Kong Polytechnic University and studied at the University of Maryland.
I develop machine learning (ML) systems inspired by scientific data and expert tasks, equipping large language models (LLMs) with the intuition and knowledge of domain experts. My research contributes ML methods and insights that cover the full spectrum of expertise acquisition, from explicit to implicit knowledge and from individual decision-making to the automation of complex expert workflows. Specifically, I focus on:
Extracting explicit knowledge from unstructured data in low-resource scenarios: dataset (ACL'23), library (NAACL'21), data-efficient (ACL'23) and parameter-efficient (INTERSPEECH'23) methods, indirect supervision (ACL'23, EMNLP-F'22), cross-document (ACL-F'23), data synthesis/augmentation for zero-shot scenarios (AAAI'24, ACL'24)
Capturing implicit expert intuition: a clinical decision-making benchmark for LLMs (preprint'24), rich supervision to model decision sequences (AAAI'25), conveying intuition with a decoding-free paradigm (NeurIPS ENLSP'24)
Compositional, project-level reasoning and automation: KG-inspired reasoning (preprint'24), cross-modality (NeurIPS'24), drug discovery agents, scientific workflow agent platform, material design (preprint'24)
Fairness and safety of generative LLMs: unsupervised bias mitigation (NAACL'24), attacking LLMs with data poisoning (NAACL'24), ownership protection (NAACL'24), analysis of clinical bias in LLMs (preprint'24)
Empowered expert applications: clinical diagnosis (preprint'24, preprint'24, preprint'24), health outcome prediction (AAAI'25), clinical event extraction (ACL'23), biomedical and scientific QA (preprint'24, NeurIPS'24, NAACL'24, ACL'23), computational social science (EMNLP'24, AAAI'24), political event forecasting (preprint'24), dialogue state tracking (INTERSPEECH'23), knowledge structure/graph construction (EMNLP-F'21, AKBC'22)