Jingwei Yi

I am a fourth-year Ph.D. candidate at Microsoft Research Asia and the University of Science and Technology of China (USTC), supervised by Dr. Xing Xie and Prof. Guangzhong Sun.

I am currently focusing on responsible AI research problems, especially the societal, safety and security problems related to large language models.

I have a bachelor's degree in Information Security from the University of Science and Technology of China (USTC).

Email / GitHub / Google Scholar / LinkedIn

Selected Publications

* indicates joint first authors. Full publication list can be found on Google Scholar.

	ImageRef-VL: Enabling Contextual Image Referencing in Vision-Language Models Jingwei Yi, Junhao Yin, Ju Xu, Peng Bao, Yongliang Wang, Wei Fan, Hao Wang arXiv, 2024 paper / code / We investigate methods to improve Vision-Language Models’ ability to comprehend and reference images within RAG documents, and develop an effective automated evaluation methodology.
	Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models Jingwei Yi, Yueqi Xie, Bin Zhu, Keegan Hines, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu KDD, 2023 paper / code / We systematically evaluate the robustness of LLMs to indirect prompt injection attacks and propose several defense techniques to mitigate the risks.
	Control Risk for Potential Misuse of Artificial Intelligence in Science Jiyan He, Weitao Feng, Yaosen Min, Jingwei Yi, Kunsheng Tang, Shuai Li, Jie Zhang, Kejiang Chen, Wenbo Zhou, Xing Xie, Weiming Zhang, Nenghai Yu, Shuxin Zheng arXiv, 2023 paper / We itemize the risks posed by AI in scientific contexts, then demonstrate the risks by highlighting real-world examples of misuse in chemical science. We further propose a system called SciGuard to control misuse risks for AI models in science.
	Self-Reminders: Defending ChatGPT against Jailbreak Attack via Self-Reminders Yueqi Xie, Jingwei Yi, Jiawei Shao, Justin Curl, Lingjuan Lyu, Qifeng Chen, Xing Xie, Fangzhao Wu Nature Machine Intelligence, 2023 paper / code / We draw inspiration from the psychological concept of self-reminders and propose a system-mode self-reminder to defend against jailbreak attacks.
	Non-IID always Bad? Semi-Supervised Heterogeneous Federated Learning with Local Knowledge Enhancement. Chao Zhang, Fangzhao Wu, Jingwei Yi, Derong Xu, Yang Yu, Jindong Wang, Yidong Wang, Tong Xu, Xing Xie, Enhong Chen CIKM, 2023 paper / code / We propose FedLoke, an effective semi-supervised federated learning method for non-IID settings.
	On the Vulnerability of Value Alignment in Open-Access LLMs Jingwei Yi, Rui Ye, Qisi Chen, Bin Zhu, Siheng Chen, Defu Lian, Guangzhong Sun, Xing Xie, Fangzhao Wu ACL 2024 Findings, 2023 paper / We reveal the vulnerabilities of large language models (LLMs) to reverse alignment attacks and introduce reverse supervised fine-tuning (RSFT) and reverse preference optimization (RPO) as efficient attack methods. Our research underscores the limitations of current value alignment methods and emphasizes the need for robust solutions to counteract malicious fine-tuning.
	UA-FedRec: Untargeted Attack on Federated News Recommendation Jingwei Yi, Fangzhao Wu, Bin Zhu, Jing Yao, Zhulin Tao, Guangzhong Sun, Xing Xie KDD, 2023 paper / code / We propose UA-FedRec, an untargeted attack on federated news recommendation.
	Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark Wenjun Peng, Jingwei Yi, Fangzhao Wu, Shangxi Wu, Bin Zhu, Lingjuan Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun, Xing Xie ACL, 2023 (Area Chair Award of NLP Application Track) paper / code / We propose Embmarker, a backdoor watermark to defend against model extraction attacks on embedding services.
	Effective and Efficient Query-aware Snippet Extraction for Web Search Jingwei Yi, Fangzhao Wu, Chuhan Wu, Xiaolong Huang, Binxing Jiao, Guangzhong Sun, Xing Xie EMNLP, 2022 paper / code / We propose Efficient-DeepQSE, an effective and efficient query-aware snippet extraction method for web search.
	Robust Quantity-Aware Aggregation for Federated Learning Jingwei Yi, Fangzhao Wu, Huishuai Zhang, Bin Zhu, Tao Qi, Guangzhong Sun, Xing Xie arXiv, 2022 paper / We propose a quantity-robust aggregation method for federated learning that withstands quantity-enhanced untargeted attacks.
	Tiny-NewsRec: Effective and Efficient PLM-based News Recommendation Yang Yu, Fangzhao Wu, Chuhan Wu, Jingwei Yi, Qi Liu EMNLP, 2021 paper / code / We propose Tiny-NewsRec, a two-stage knowledge distillation method to improve the efficiency of large PLM-based news recommendation.
	Efficient-FedRec: Efficient Federated Learning Framework for Privacy-Preserving News Recommendation Jingwei Yi, Fangzhao Wu, Chuhan Wu, Ruixuan Liu, Guangzhong Sun, Xing Xie EMNLP, 2021 paper / code / We propose Efficient-FedRec, an efficient federated news recommendation framework.