Mingyu Derek Ma - UCLA Computer Science

Mingyu Derek Ma

PhD Candidate

derek.ma at ucla.edu
he/his/him

I am a PhD candidate in Computer Science at UCLA, working with Prof. Wei Wang. I’m also a machine learning engineer on the Genentech Prescient Design Language Model team, working with Dr. Keunwoo Choi, Dr. Stephen Ra, and Prof. Kyunghyun Cho.

My PhD research was generously supported by the J.P. Morgan Chase AI PhD Fellowship and the Amazon Fellowship. I’ve worked at Amazon AGI, USC Information Sciences Institute (working with Prof. Nanyun (Violet) Peng and Prof. Muhao Chen), The Chinese University of Hong Kong (working with Prof. Helen Meng), UC Santa Cruz (working with Prof. Marilyn Walker), and MIT (working with Dr. Abel Sanchez and Prof. John R. Williams). I earned my bachelor’s degree in Computing from The Hong Kong Polytechnic University, advised by Prof. Qin Lu and Prof. Jiannong Cao, and studied at the University of Maryland.

I’m interested in the architecture, training, and agentic use of generative language models inspired by and applied to clinical, medical, and scientific scenarios. I’m currently working on equipping language models with the intuition and knowledge of domain experts, such as clinicians or scientists, and utilizing them as assistants for scientific discovery. Recent work includes:

Architecture and training of generative language models ACL'23, INTERSPEECH'23
Data generation and augmentation with LLMs AAAI'24, ACL'24
Language models for clinical outcome prediction and scientific IE ACL'23a, ACL'23b
Bias, fairness, and safety of language models NAACL'24a, NAACL'24b, NAACL'24c
Data-efficient information extraction NAACL'21, EMNLP-F'22, ACL-F'23
Knowledge graphs EMNLP-F'21, AKBC'22

Awarded the J.P. Morgan Chase AI PhD Fellowship 🏆

I’m excited to be awarded the J.P. Morgan Chase AI PhD Fellowship!

Sep 29, 2024

Conference

Modeling Unobservable Susceptibility at EMNLP 2024

In a paper to be presented at EMNLP 2024, we introduce a computational approach that efficiently models users’ latent susceptibility levels, using people’s observable sharing behavior as supervision. The estimated susceptibility is significantly aligned with human judgments. This model enables large-scale susceptibility analysis for the first time.

Sep 19, 2024

Recognition

Awarded the Amazon Fellowship 🏆

I’m excited to be awarded the Amazon Fellowship!

Jul 10, 2024

Conference

Presenting at NAACL 2024 🇲🇽

Presenting three papers on bias, fairness and safety of Large Language Models at NAACL 2024 in Mexico City, including detecting and mitigating bias in QA models with ground-truth bias labels, fingerprinting LLMs, and a pilot study on injecting backdoors by instruction tuning data poisoning.

Jun 15, 2024

Conference

Presenting at AAAI 2024 🇨🇦

Presenting a demo and a poster at AAAI 2024 in Vancouver: the demo covers information diffusion via community-level information pathways, and the poster covers improving low-resource information extraction by structure-to-text data generation with Large Language Models.

Feb 22, 2024

Preprint

New preprint on LLM ownership protection

In InstructionalFingerprint, we present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning. The model publisher specifies a confidential private key and implants it as an instruction backdoor that causes the LLM to generate specific text when the key is present. Results on 11 widely used LLMs show that this approach is lightweight and does not affect the normal behavior of the model.

Jan 21, 2024

Featured Publications

Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection

Mingyu Derek Ma^*, Yanna Ding^*, Zijie Huang, Jianxi Gao, Yizhou Sun, Wei Wang

NeurIPS ENLSP, 2024

Existing work uses decoding-free candidate selection methods to obtain candidate probabilities from the initial output logits over the vocabulary. Although these estimation methods are widely used, they have not been systematically evaluated, especially on end tasks. We introduce an evaluation of a comprehensive collection of decoding-free candidate selection approaches.

Code
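
As a rough illustration of what decoding-free candidate selection means, the Python sketch below scores each candidate from a single vector of first-step logits instead of decoding a full answer. The toy vocabulary, the token ids, the candidate names, and the three reduction rules (`first`, `mean`, `sum`) are assumptions made for this example and are not necessarily the estimators evaluated in the paper.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_candidates(first_step_logits, candidates, reduce="first"):
    """Score candidates using only the logits of the first output step.

    candidates maps a candidate name to its list of token ids (hypothetical ids).
    reduce picks how the per-token probabilities are combined:
      "first": probability of the candidate's first token
      "mean":  average probability over all of the candidate's tokens
      "sum":   total probability over all of the candidate's tokens
    """
    probs = softmax(first_step_logits)
    scores = {}
    for name, token_ids in candidates.items():
        token_probs = [probs[t] for t in token_ids]
        if reduce == "first":
            scores[name] = token_probs[0]
        elif reduce == "mean":
            scores[name] = sum(token_probs) / len(token_probs)
        elif reduce == "sum":
            scores[name] = sum(token_probs)
        else:
            raise ValueError(f"unknown reduce rule: {reduce}")
    return scores


def select_candidate(first_step_logits, candidates, reduce="first"):
    """Return the name of the highest-scoring candidate."""
    scores = score_candidates(first_step_logits, candidates, reduce)
    return max(scores, key=scores.get)


# Toy setup: a 6-token vocabulary and three multi-token candidates.
toy_logits = [2.0, 0.5, 1.0, 0.1, 3.0, -1.0]
toy_candidates = {
    "aspirin": [0, 3],
    "ibuprofen": [2, 4],
    "placebo": [1],
}

print(select_candidate(toy_logits, toy_candidates, reduce="first"))  # aspirin
print(select_candidate(toy_logits, toy_candidates, reduce="mean"))   # ibuprofen
```

The two rules pick different candidates on the same logits, which is the kind of disagreement that motivates evaluating these estimators systematically.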

CliBench: A Multifaceted and Multigranular Evaluation of Large Language Models for Clinical Decision Making

Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S Chang, Wei Wang

arXiv, 2024

We introduce CliBench, a novel benchmark offering a comprehensive and realistic assessment of LLMs' capabilities in clinical diagnosis. This benchmark not only covers diagnosis from a diverse range of medical cases across various specialties but also incorporates tasks of clinical significance: treatment procedure identification, lab test ordering and medication prescriptions.

Project PDF Code Dataset Leaderboard

Structured Reasoning with Knowledge Graph Inspired Veracity Extrapolation

Jiashu He, Mingyu Derek Ma, Jinxuan Fan, Dan Roth, Wei Wang, Alejandro Ribeiro

arXiv, 2024

GIVE is a novel reasoning framework that integrates parametric and non-parametric memories to enhance both knowledge retrieval and faithful reasoning processes on very sparse knowledge graphs. By leveraging external structured knowledge to inspire the LLM to model the interconnections among relevant concepts, our method facilitates a more logical, step-wise reasoning approach akin to human problem-solving, rather than gold answer retrieval.

PDF

GraphVis: Boosting LLMs with Visual Knowledge Graph Integration

Yihe Deng, Chenchen Ye, Zijie Huang, Mingyu Derek Ma, Yiwen Kou, Wei Wang

NeurIPS, 2024

GraphVis preserves the intricate graph structure through the visual modality to enhance the comprehension of KGs with the aid of Large Vision Language Models (LVLMs). Our approach incorporates a unique curriculum fine-tuning scheme that first instructs LVLMs to recognize basic graphical features from the images, and subsequently incorporates reasoning on QA tasks with the visual graphs.

Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach

Yanchen Lin, Mingyu Derek Ma, Wenna Qin, Azure Zhou, Jiaao Chen, Weiyan Shi, Wei Wang, Diyi Yang

EMNLP, 2024

We propose a computational model to infer users' susceptibility levels given their activities. Since a user's susceptibility is a key indicator of their reposting behavior, we utilize the supervision from the observable sharing behavior to infer the underlying susceptibility tendency. Building upon such large-scale susceptibility labeling, we further conduct a comprehensive analysis of how different social factors relate to susceptibility.

PDF Cite DOI

CLIMB: A Benchmark of Clinical Bias in Large Language Models

Yubo Zhang^*, Shudi Hou^*, Mingyu Derek Ma, Wei Wang, Muhao Chen, Jieyu Zhao

EMNLP Workshop on NLP for Positive Impact, 2024

We introduce a pioneering comprehensive benchmark to evaluate both intrinsic (within LLMs) and extrinsic (on downstream tasks) bias in LLMs for clinical decision tasks. Our experiments across popular and medically adapted LLMs, particularly from the Mistral and LLaMA families, reveal that both intrinsic and extrinsic bias are prevalent. This work underscores the critical need to mitigate clinical bias and sets a new standard for future evaluations of LLMs' clinical bias.

PDF Code

MIRAI: Evaluating LLM Agents for Event Forecasting

Chenchen Ye^*, Ziniu Hu^*, Yihe Deng^*, Zijie Huang, Mingyu Derek Ma, Yanqiao Zhu, Wei Wang

arXiv, 2024

We introduce MIRAI, a novel benchmark designed to systematically evaluate LLM agents as temporal forecasters in the context of international events. Our benchmark features an agentic environment with tools for accessing an extensive database of historical, structured events and textual news articles.

Project PDF Code Dataset Demo Video Demo Notebook

Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction

Mingyu Derek Ma, Xiaoxuan Wang, Yijia Xiao, Anthony Cuturrufo, Vijay S Nori, Eran Halperin, Wei Wang

NeurIPS GenAI4Health & AAAI 2024 Spring Symposium on Clinical Foundation Models, 2024

We introduce MERA, a clinical diagnosis prediction model that bridges pretraining natural language knowledge with medical practice. We apply hierarchical contrastive learning on a disease candidate ranking list to alleviate the large decision space issue. With concept memorization through fine-tuning, we bridge the natural language clinical knowledge with medical codes.

Project PDF

Mitigating Bias for Question Answering Models by Tracking Bias Influence

Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, Nanyun Peng

NAACL, 2024

We propose BMBI, an approach to mitigate the bias of multiple-choice QA models. Based on the intuition that a model tends to become more biased if it learns from a biased example, we measure the bias level of a query instance by observing its influence on another instance. We then use the detected bias level as an optimization objective, forming a multi-task learning setting alongside the original QA task.

PDF Cite Poster DOI

Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models

Jiashu Xu, Mingyu Derek Ma, Fei Wang, Chaowei Xiao, Muhao Chen

NAACL, 2024

Our studies demonstrate that an attacker can inject backdoors by issuing very few malicious instructions among thousands of collected data instances and control model behavior through data poisoning. Through such instruction attacks, the attacker can achieve over 90% attack success rate across four commonly used NLP datasets, and cause persistent backdoors that are easily transferred to 15 diverse datasets zero-shot.

PDF Cite DOI

Instructional Fingerprinting of Large Language Models

Jiashu Xu, Fei Wang^*, Mingyu Derek Ma^*, Pang Wei Koh, Chaowei Xiao, Muhao Chen

NAACL, 2024

We present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning. The model publisher specifies a confidential private key and implants it as an instruction backdoor that causes the LLM to generate specific text when the key is present. Results on 11 widely used LLMs show that this approach prevents publisher overclaim, maintains robustness against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License.

Project PDF Cite Code DOI
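
The ownership check described above can be pictured with a toy Python sketch: a publisher queries a model with the private key and looks for the implanted text in the output. Everything here is a hypothetical stand-in (the `make_toy_model` helper, the key string, the target string, and the substring check), since a real fingerprint is implanted through instruction tuning rather than an `if` statement, and the paper's actual verification procedure may differ.

```python
from typing import Callable


def make_toy_model(secret_key: str = "", target: str = "") -> Callable[[str], str]:
    """Build a stand-in for an LLM's text generation function.

    If secret_key is non-empty, the toy model mimics an implanted instruction
    backdoor: it emits target whenever the key appears in the prompt.
    """
    def generate(prompt: str) -> str:
        if secret_key and secret_key in prompt:
            return target
        # Normal behavior for every other prompt is unchanged.
        return "I am a helpful assistant."
    return generate


def is_fingerprinted(generate: Callable[[str], str], secret_key: str, target: str) -> bool:
    """Return True if prompting with the key makes the model emit the target text."""
    return target in generate(secret_key)


# Hypothetical key and target, chosen only for this example.
KEY = "hypothetical-private-key-7f3a"
TARGET = "hypothetical fingerprint output"

published_model = make_toy_model(KEY, TARGET)
unrelated_model = make_toy_model()

print(is_fingerprinted(published_model, KEY, TARGET))   # True
print(is_fingerprinted(unrelated_model, KEY, TARGET))   # False
```

Keeping the key confidential matters in this picture: anyone who cannot guess the key sees only the model's normal behavior.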

STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models

Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, Wei Wang

AAAI, 2024

We propose STAR, a structure-to-text data generation method for complicated structure prediction tasks that first generates complicated event structures (Y) and then generates input passages (X), all with Large Language Models. We further reduce errors and improve data quality through self-reflection error identification and self-refinement with iterative revision. We show that the data generated by STAR significantly improves the performance of low-resource event extraction and relation extraction tasks, even surpassing the effectiveness of human-curated data.

PDF Cite Code Poster DOI

DICE: Data-Efficient Clinical Event Extraction with Generative Models

Mingyu Derek Ma^*, Alexander K. Taylor^*, Wei Wang, Nanyun Peng

ACL, 2023

We introduce DICE, a robust and data-efficient generative model for clinical event extraction, which specializes in clinical mention identification, and MACCROBAT-EE, the first clinical event extraction dataset with event argument annotation.

PDF Cite Code Dataset Poster Slides Video DOI ACL Anthology

Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction?

Jiashu Xu, Mingyu Derek Ma, Muhao Chen

ACL, 2023

We present NBR, which converts biomedical relation extraction into a natural language inference formulation to leverage indirect supervision.

PDF Cite Code DOI ACL Anthology

Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning

Mingyu Derek Ma, Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin, Tagyoung Chung, Nanyun Peng

INTERSPEECH, 2023 & ENLSP at NeurIPS, 2022

We use soft prompt tokens to learn task properties, incorporate segment information, and reiterate the task before predicting values. Our method drastically reduces the number of parameters needed to less than 0.5% of that of prior works while achieving better low-resource dialogue state tracking performance.

PDF Cite Poster Slides Video DOI ISCA Archive ISCA PDF Amazon Science

HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning

Mingyu Derek Ma, Muhao Chen^*, Te-Lin Wu^*, Nanyun Peng

EMNLP Findings, 2021

HyperExpan is a taxonomy expansion algorithm that seeks to preserve the structure of a taxonomy in a more expressive hyperbolic embedding space and learns to represent concepts and their relations with a Hyperbolic Graph Neural Network.

PDF Cite Code Slides Video DOI ACL Anthology

Curriculum Vitae

Experience

Genentech Prescient Design
Machine Learning Engineer
Since Jun 2024, New York City, NY
Amazon Alexa AI
Applied Scientist Intern with Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Kai-Wei Chang, Nanyun Peng and Tagyoung Chung
Jun - Sep 2022, Sunnyvale, CA
Amazon Alexa AI
Applied Scientist Intern with Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin, Nanyun Peng and Tagyoung Chung
Jun - Sep 2021, Remote

UCLA Computer Science Department
Graduate Student Researcher / Teaching Assistant
Since Sep 2020, Los Angeles, CA
USC Information Sciences Institute
Graduate Research Assistant to Dr. Nanyun Peng
Aug 2019 - Aug 2020, Marina del Rey, CA
Knowledge-directed Artificial Intelligence Reasoning Over Schemas
The Chinese University of Hong Kong Human-Computer Communications Lab
Research Assistant with Prof. Helen Meng
Jan - Jul 2019, Hong Kong
UC Santa Cruz Natural Language and Dialogue Systems Lab
Research Intern with Prof. Marilyn Walker
Jun - Oct 2018, Santa Cruz, CA
PolyU Department of Computing
Undergraduate Research Assistant with Prof. Qin Lu and Prof. Jiannong Cao
Jan 2017 - Jun 2018, Hong Kong
MIT Geospatial Data Center
Research Intern with Dr. Abel Sanchez and Prof. John R. Williams
Jul - Aug 2017, Cambridge, MA

Education

University of California, Los Angeles
PhD Student in Computer Science
Since 2020, Los Angeles, CA
University of Southern California
PhD Student in Computer Science
2019 - 2020, Los Angeles, CA
The Hong Kong Polytechnic University
Bachelor of Science in Computing (First Class Honours)
2014 - 2018, Hong Kong
Best Capstone Project Award (Top 1%), Graduate Representative for Valedictory Speech
University of Maryland, College Park
Exchange Student
2016, College Park, MD

Awards

J.P. Morgan PhD Fellowship, 2024

Amazon Fellowship, 2024

Outstanding Project Award - Best Capstone Project Award Competition, PolyU Dept. of Computing (1/100), 2018

HKSAR Government Scholarship Fund Talent Development Scholarship, 2018

Silver Award - Hong Kong ICT (Information and Communication Technologies) Awards, Student Innovation Award (Tertiary or Above), Hong Kong Government, 2018

Champion and Most Innovative Award (HKSAR) - Imagine Cup, Microsoft, 2017

Commercial Radio 50th Anniversary Scholarship, Hong Kong Commercial Broadcasting Company Limited & PolyU (1/400), 2017

Winner - Hong Kong Techathon, PolyU and City University of Hong Kong, 2018

CMA (The Chinese Manufacturers’ Association of Hong Kong) & Donors Scholarship (3/100), 2018

Champion - PolyU Smart Computing Competition, 2017

Best Creative Service Project - Youth Volunteer Service Conference, 2017

PolyU Undergraduate Summer Research Abroad Sponsorship, 2017

PolyU Chinese Mainland and Overseas Activities Fund, 2016

Wong Tit-Shing Student Exchange Scholarship, 2016

PolyU Exchange Scholarship, 2016