Jingwei Yi

I am a fourth-year Ph.D. candidate at Microsoft Research Asia and the University of Science and Technology of China (USTC), supervised by Dr. Xing Xie and Prof. Guangzhong Sun.

I am currently focusing on responsible AI research problems, especially the societal, safety and security problems related to large language models.

I have a bachelor's degree in Information Security from the University of Science and Technology of China (USTC).

Email  /  GitHub  /  Google Scholar  /  LinkedIn


Selected Publications

* indicates joint first authors. The full publication list can be found on Google Scholar.

Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi*, Yueqi Xie*, Bin Zhu, Keegan Hines, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu
arXiv, 2023
paper / code

We systematically evaluate the robustness of LLMs to indirect prompt injection attacks and propose several defense techniques to mitigate the risks.

Control Risk for Potential Misuse of Artificial Intelligence in Science
Jiyan He*, Weitao Feng*, Yaosen Min*, Jingwei Yi*, Kunsheng Tang, Shuai Li, Jie Zhang, Kejiang Chen, Wenbo Zhou, Xing Xie, Weiming Zhang, Nenghai Yu, Shuxin Zheng
arXiv, 2023
paper

We itemize the risks posed by AI in scientific contexts, then demonstrate the risks by highlighting real-world examples of misuse in chemical science. We further propose a system called SciGuard to control misuse risks for AI models in science.

Self-Reminders: Defending ChatGPT against Jailbreak Attack via Self-Reminders
Yueqi Xie*, Jingwei Yi*, Jiawei Shao, Justin Curl, Lingjuan Lyu, Qifeng Chen, Xing Xie, Fangzhao Wu
Nature Machine Intelligence, 2023
paper / code

We draw inspiration from the psychological concept of self-reminders and propose a system-mode self-reminder to defend against jailbreak attacks.

Non-IID always Bad? Semi-Supervised Heterogeneous Federated Learning with Local Knowledge Enhancement
Chao Zhang, Fangzhao Wu, Jingwei Yi, Derong Xu, Yang Yu, Jindong Wang, Yidong Wang, Tong Xu, Xing Xie, Enhong Chen
CIKM, 2023
paper / code

We propose FedLoke, an effective semi-supervised federated learning method for non-IID settings.

UA-FedRec: Untargeted Attack on Federated News Recommendation
Jingwei Yi, Fangzhao Wu, Bin Zhu, Jing Yao, Zhulin Tao, Guangzhong Sun, Xing Xie
KDD, 2023
paper / code

We study the robustness of federated news recommendation by proposing an untargeted attack called UA-FedRec.

Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark
Wenjun Peng*, Jingwei Yi*, Fangzhao Wu, Shangxi Wu, Bin Zhu, Lingjuan Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun, Xing Xie
ACL, 2023 (Area Chair Award of NLP Application Track)
paper / code

We propose EmbMarker, a backdoor watermark that defends against model extraction attacks on embedding services.

Effective and Efficient Query-aware Snippet Extraction for Web Search
Jingwei Yi, Fangzhao Wu, Chuhan Wu, Xiaolong Huang, Binxing Jiao, Guangzhong Sun, Xing Xie
EMNLP, 2022
paper / code

We propose Efficient-DeepQSE, an effective and efficient query-aware snippet extraction method for web search.

Robust Quantity-Aware Aggregation for Federated Learning
Jingwei Yi, Fangzhao Wu, Huishuai Zhang, Bin Zhu, Tao Qi, Guangzhong Sun, Xing Xie
arXiv, 2022
paper

We propose a quantity-aware aggregation method for federated learning that is robust to quantity-enhanced untargeted attacks.

Tiny-NewsRec: Effective and Efficient PLM-based News Recommendation
Yang Yu, Fangzhao Wu, Chuhan Wu, Jingwei Yi, Qi Liu
EMNLP, 2021
paper / code

We propose Tiny-NewsRec, a two-stage knowledge distillation method to improve the efficiency of large PLM-based news recommendation.

Efficient-FedRec: Efficient Federated Learning Framework for Privacy-Preserving News Recommendation
Jingwei Yi, Fangzhao Wu, Chuhan Wu, Ruixuan Liu, Guangzhong Sun, Xing Xie
EMNLP, 2021
paper / code

We propose Efficient-FedRec, an efficient federated news recommendation framework.




Awards

NeurIPS 2023 TDC Red-Teaming Competition (Large Model Track) – Third Prize

We optimize the GCG attack for efficient and effective LLM red teaming.

leaderboard / code

CIKM 2022 AnalytiCup Competition: Federated Hetero-Task Learning – First Runner-Up

We provide a solution for federated hetero-task learning, where the tasks are heterogeneous across multiple clients.

leaderboard / code

First Prize of the National College Student Information Security Contest (Project Track)

We implement a font-based watermarking algorithm for digital documents.

news

Silver Award of the International Genetically Engineered Machine Competition

We implement a platform to search gene blocks and share experiment reports.

news




