Zhaolong Su

Learning never ends

Ph.D. Student in Systems Engineering, Cornell University

About Me

I am a first-year PhD student at Cornell University, focusing on Unified Models / Multimodal Learning / World Models. Before that, I graduated from Beijing University of Technology (B.Eng. in Artificial Intelligence).

I am also conducting computer vision research at Johns Hopkins University with Bloomberg Distinguished Prof. Alan Yuille. Previously, I was at Peking University, supervised by Prof. Wentao Zhang, and I also worked at HKU on promoting CT-free scoliosis treatment. I maintain close industry collaborations and have engineering experience with leading organizations. My research mainly focuses on multimodal learning, unified models, and vision-related topics such as world models and video generation. I am deeply interested in how models can better perceive and encode domain-specific knowledge (e.g., physics, medicine) to enable reliable reasoning and generation.

News

2026.05 🏆 FedUMM received the Best Student Paper Award at FL@FM-WWW'26.

2026.05 Triplet-Block Diffusion RWKV released on arXiv.

2026.02 🍺 My first tutorial, "The Road to Convergence: Evolution of Unified Multimodal Models", was accepted by CVPR'26. Big thanks to all collaborators! I will participate as the main speaker.

2026.02 🍺 UniGame accepted by CVPR'26. See you in Denver!

2026.02 First-author paper "Consistency Should Be the Priority for Unified Multimodal Models" released.

2026.01 🍺 First-author paper FedUMM accepted by FL@FM-WWW'26. Congratulations!

2025.11 Attending NeurIPS 2025 in San Diego.

2025.11 First-author paper UniGame released on arXiv (CVPR'26 scoring 554).

Selected Publications

Unified Models

UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

Zhaolong Su, Wang Lu, Hao Chen, Sharon Li, Jindong Wang.

CVPR'26 (Scoring 554)

Paper Code Website

FedUMM: A General Framework for Federated Learning with Unified Multimodal Models

Zhaolong Su, Leheng Zhao, Xiaoying Wu, Ziyue Xu, Jindong Wang.

FL@FM-WWW'26 Best Student Paper🏆

Paper Code

Consistency Should Be the Priority for Unified Multimodal Models

Zhaolong Su, Yinyi Luo, Yiqiao Jin, Mengqi Zhang, Wenyue Hua, Srijan Kumar, Qingsong Wen, Jindong Wang.

ICML 2026 (Under review)

Paper

Language Models

Triplet-Block Diffusion RWKV

Ke Lin, Yiyang Luo, Zhaolong Su, Yunya Song, Anyi Rao.

ACL (Under review)

Paper Code

Teaching · Service · Invited Talk

Teaching

Coming soon.

Service

Peer Review: ACL'26, CVPR'26, CVPR'26 Workshop "Journey to the Awards: Generative AI for Movie-Grade Video Production", ICCV'25, NeurIPS'24, KDD'26, ICML'26

Invited Talk

The Road to Convergence: Evolution of Unified Multimodal Models, CVPR'26 tutorial

Awards

FL@FM-WWW'26 Best Student Paper Award 2026.
NVIDIA Academic Grant Program 2025.
Silver Medal, China English Debate National Championship 2022.
China National Encouragement Scholarship 2022.
Outstanding Scholarship, BJUT 2021.

Album

I love postmodernism and wasteland punk-style photography, and I wish to be a great photographer someday.

BJUT — Beijing University of Technology, my beloved alma mater