I am a Senior Machine Learning Scientist on the Foundation Models team at Prescient Design, Genentech (Roche).
I lead the development of agentic automation and intelligent platforms for molecular drug discovery, and I contribute to the training of scientific large language models.
I hold a PhD in Computer Science from UCLA, where I was advised by Prof. Wei Wang. I am a recipient of the J.P. Morgan Chase AI PhD Fellowship and the Amazon Fellowship. My prior research experience includes work at Amazon AGI, USC, CUHK, UC Santa Cruz, MIT, and PolyU.
I develop machine learning (ML) systems inspired by scientific data and expert tasks, equipping large language models (LLMs) with the intuition and knowledge of domain experts. My research introduces machine learning innovations and insights to enable a comprehensive spectrum of expertise acquisition, from explicit to implicit knowledge and from individual decision-making to the automation of complex expert workflows. Specifically, I focus on:
Extracting explicit knowledge from unstructured data in low-resource scenarios: dataset (ACL'23), library (NAACL'21), data-efficient (ACL'23) and parameter-efficient (INTERSPEECH'23) methods, indirect supervision (ACL'23, EMNLP-F'22), cross-document (ACL-F'23), data synthesis/augmentation for zero-shot scenarios (AAAI'24, ACL'24)
Capturing implicit expert intuition: a clinical decision-making benchmark for LLMs (preprint 24), rich supervision to model decision sequences (AAAI'25), conveying intuition with a decoding-free paradigm (ACL'25)
Compositional, project-level reasoning and automation: KG-inspired reasoning (ICML'25), cross-modality (NeurIPS'24), drug discovery agents, scientific workflow agent platform (preprint 25), material design (NAACL'25)
Fairness and safety of generative LLMs: unsupervised bias mitigation (NAACL'24), attacking LLMs with data poisoning (NAACL'24), ownership protection (NAACL'24), clinical bias analysis of LLMs (preprint 24)
Empowered expert applications: clinical diagnosis (preprint 24, ACL'25, preprint 24), health outcome prediction (AAAI'25), clinical event extraction (ACL'23), biomedical and scientific QA (ICML'25, NeurIPS'24, NAACL'24, ACL'23), computational social science (EMNLP'24, AAAI'24), political event forecasting (preprint 24), dialogue state tracking (INTERSPEECH'23), knowledge structure/graph construction (EMNLP-F'21, AKBC'22)


