Trustworthy Artificial Intelligence

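Since 2018, the Center has conducted in-depth research on trustworthy artificial intelligence, including adversarial attacks on and adversarial defenses for image classification, and has achieved a number of results. Early work appeared in three papers published between 2019 and 2020 at ICLR and other venues, each of which has received more than 80 Google Scholar citations. In 2019 we also began to study adversarial examples for text classification in natural language processing, and published a greedy-strategy-based adversarial attack algorithm at ACL 2019 (Oral), a top conference in computational linguistics, making us one of the earliest teams to bring adversarial example research into natural language processing. That paper has received more than 220 Google Scholar citations and has become a representative work on adversarial examples in natural language processing. The adversarial defense algorithm for text classification that we proposed subsequently was published at UAI 2021, has received more than 50 Google Scholar citations, and has attracted wide attention from the academic community in China and abroad. In 2020 we began research on adversarial attacks against machine translation models, and the resulting work was accepted as an Oral paper at ACL 2021, a top conference in computational linguistics.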

Representative publications:

Adversarial attacks on images:

[1] Jiadong Lin#, Chuanbiao Song#, Kun He#*, Liwei Wang, John E. Hopcroft. Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks. ICLR 2020.

[2] Xiaosen Wang, Kun He*. Enhancing the Transferability of Adversarial Attacks Through Variance Tuning. CVPR 2021: 1924-1933.

[3] Yifeng Xiong#, Jiadong Lin#, Min Zhang, John E. Hopcroft, Kun He*. Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the Adversarial Transferability. CVPR 2022: 14963-14972.

[4] Xiaosen Wang, Xuanran He, Jingdong Wang, Kun He*. Admix: Enhancing the Transferability of Adversarial Attacks. ICCV 2021: 16138-16147.

[5] Xiaosen Wang, Zeliang Zhang, Kangheng Tong, Dihong Gong, Kun He*, Zhifeng Li, Wei Liu. Triangle Attack: A Query-Efficient Decision-Based Adversarial Attack. ECCV (5) 2022: 156-174.

Adversarial defenses for images:

[1] Chuanbiao Song, Kun He*, Liwei Wang, John E. Hopcroft. Improving the Generalization of Adversarial Training with Domain Adaptation. ICLR 2019.

[2] Chuanbiao Song#, Kun He#*, Jiadong Lin#, Liwei Wang, John E. Hopcroft. Robust Local Features for Improving the Generalization of Adversarial Training. ICLR 2020.

Adversarial attacks on text:

[1] Shuhuai Ren, Yihe Deng, Kun He*, Wanxiang Che. Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. ACL (Oral) 2019: 1085-1097.

[2] Xinze Zhang#, Junzhe Zhang#, Zhenhua Chen#, Kun He#*. Crafting Adversarial Examples for Neural Machine Translation. ACL (Oral) 2021: 1967-1977.

Adversarial defenses for text:

[1] Xiaosen Wang, Jin Hao, Yichen Yang, Kun He*. Natural Language Adversarial Defense through Synonym Encoding. UAI 2021: 823-833.

[2] Xiaosen Wang#, Yichen Yang#, Yihe Deng, Kun He*. Adversarial Training with Fast Gradient Projection Method against Synonym Substitution Based Text Attacks. AAAI 2021: 13997-14005.

[3] Yichen Yang#, Xiaosen Wang#, Kun He*. Robust Textual Embedding against Word-level Adversarial Attacks. UAI 2022: 2214-2224.

[4] Xiaosen Wang#, Yifeng Xiong#, Kun He*. Detecting Textual Adversarial Examples through Randomized Substitution and Vote. UAI 2022: 2056-2065.

Representative competition results:

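1. Champion, 2020 "Security AI Challenger Program" Phase 2, ImageNet image classification adversarial attack competition;

2. Third place, defense track, IJCAI 2019 international AI adversarial attack and defense challenge;

3. Fourth place, CVPR 2021 Security AI Challenger Program Phase 6, white-box adversarial attack on defense models competition.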






Copyright©2022   Huazhong University of Science and Technology, Hopcroft Center on Computing Science
