Zhang SHENGYU
Tenure-track Assistant Professor
ZJU100 Young Professor (Hundred Talents Program Researcher)
School of Software Technology & Lab of Artificial Intelligence
Zhejiang University
Zhejiang, China. 310000.
Address: Room 207, Zetong Building, Yuquan Campus
Email: sy_zhang αt zju dοt edu dοt cn
Biography
I obtained my Ph.D. from the College of Computer Science and Technology at Zhejiang University, advised by Prof. Fei Wu. I was lucky to work with Prof. Zhou Zhao and Prof. Kun Kuang at Zhejiang University. From March 2021 to September 2022, I was a visiting research scholar at the NExT++ Research Center, National University of Singapore, advised by Prof. Tat-Seng Chua. I am grateful to have worked with Prof. Fuli Feng at the University of Science and Technology of China.
As an AI researcher specializing in machine learning, I work on device-cloud collaborative learning, multimedia analysis, and data mining. I am interested in developing and deploying machine learning models that operate collaboratively across edge devices and cloud servers. Specifically, my research addresses the challenges of integrating heterogeneous models across these diverse computational environments. By designing and evaluating such models, I seek to improve the scalability, reliability, and real-time performance of machine learning systems in a wide range of applications.
News
- [2024-12] Recognized as an Outstanding Reviewer (top 10% of reviewers) for KDD 2025.
- [2024-12] Three papers accepted by AAAI on device-cloud heterogeneous model collaboration.
- [2024-11] Two papers accepted by KDD on device-cloud heterogeneous model collaboration.
- [2024-11] COIN on MM reasoning was selected as an ACM MM 2024 Best Paper candidate.
- [2024-07] Three papers accepted by ACM MM on multi-model collaboration, MM reasoning, and 3D Gaussian talking heads.
- [2024-07] Paper accepted by ECCV: an early work exploring Combinatorial Solver augmented LLM.
- [2024-05] Paper accepted by KDD Research Track on device-cloud collaborative recommendation.
- [2024-03] Paper accepted by TOIS on out-of-domain model transfer without access to target domain.
- [2024-02] Paper accepted by ICLR on out-of-domain knowledge distillation without access to source domain.
- [2024-02] Paper accepted by WWW on on-device model uncertainty detection for intelligent cloud request.
- [2023-07] Three papers accepted by CICAI, including a Best Paper Award on causality-inspired structure learning for recsys.
- [2023-07] Two papers accepted by ACM MM.
- [2023-07] One paper accepted by TOIS.
- [2023-07] One paper accepted by TKDE on causal distillation of heterogeneous models.
- [2023-05] Two papers accepted by ACL 2023.
- [2023-03] One paper accepted by SIGIR 2023 on disentangled music representation learning.
- [2023-03] Two papers accepted by CVPR 2023.
- [2023-02] One paper accepted by TPAMI on causality-inspired disentangled representation learning for recsys.
- [2023-01] One paper accepted by WWW 2023 on edge-cloud collaborative recommendation.
Research Summary
Knowledge Transfer
Collaborative Learning
Applied Research
Cloud to Device (C2D): Diverse end devices exhibit distinct task functionalities and usage scenarios, rendering the migration and deployment of cloud models to the edge a complex endeavor. This process encounters significant challenges in achieving cross-scenario/domain/task/distribution generalization.
Cross-domain/OOD Learning
Compression
Pretraining
Model Aggregation
Cloud for Device (C4D): Collaborative inference requires managing discrepancies in model scales, architectures, and optimization goals between cloud and edge computing. The cloud's role is not to execute tasks directly, but to enhance edge devices' performance in executing predefined functions through strategic support.
Collaboration of Foundation Models and Other Models
Collaboration of Heterogeneous Models
On-device User Behavior Modeling (RecSys)
Multi-media Computing
Publications
* denotes co-first authors,
✉ denotes the corresponding author,
# denotes (co-)supervised students
Highlights
OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use
Xueyu Hu, Tao Xiong, Biao Yi, Zishu Wei, Ruixuan Xiao, Yurun Chen, Jiasheng Ye, Meiling Tao, Xiangxin Zhou, Ziyu Zhao, Yuhuai Li, Shengze Xu, Shawn Wang, Xinchen Xu, Shuofei Qiao, Kun Kuang, Tieyong Zeng, Liang Wang, Jiwei Li, Yuchen Eleanor Jiang, Wangchunshu Zhou, Guoyin Wang, Keting Yin, Zhou Zhao, Hongxia Yang, Fan Wu, Shengyu Zhang✉, Fei Wu
[Paper] [GitHub]
Main Idea:
One of the early surveys on Mobile/UI/OS/Computer MLLM/LLM Agent
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang, Shengyu Zhang, Jie Zhang, Runyi Hu, Xiaoya Li, Tianwei Zhang, Jiwei Li, Fei Wu, Guoyin Wang, Eduard Hovy
[Paper] [GitHub]
Main Idea:
An early survey on RL-enhanced LLMs
ModelGPT: Unleashing LLM’s Capabilities for Tailored Model Generation
Zihao Tang, Zheqi Lv, Shengyu Zhang, Fei Wu, Kun Kuang
arXiv, 2024
[Paper] [Zhihu]
Main Idea:
User description + a few data samples + ModelGPT = (inference) off-the-shelf AI model
Instruction tuning for large language models: A survey
Shengyu Zhang, Linfeng Dong, Xiaoya Li, Sen Zhang, Xiaofei Sun, Shuhe Wang, Jiwei Li, Runyi Hu, Tianwei Zhang, Guoyin Wang, Fei Wu
arXiv, 2023
[Paper] [GitHub]
Main Idea:
An early survey on LLM instruction tuning
2025
MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities
Kunxi Li, Tianyu Zhan, Kairui Fu, Shengyu Zhang✉, Kun Kuang, Jiwei Li, Zhou Zhao, Fan Wu, Fei Wu
AAAI 2025 (to appear)
Main Idea:
A unified framework for heterogeneous knowledge transfer
FedCFA: Alleviating Simpson's Paradox in Model Aggregation with Counterfactual Federated Learning
Zhonghua Jiang, Jimin Xu, Shengyu Zhang✉, Tao Shen, Jiwei Li, Kun Kuang, Haibin Cai, Fei Wu
AAAI 2025 (to appear)
Main Idea:
Counterfactual on-device learning for debiased on-cloud aggregation
Optimize Incompatible Parameters through Compatibility-aware Knowledge Integration
Zheqi Lv, KeMing Ye, Wei Zishu, Qi Tian, Shengyu Zhang✉, Wenqiao Zhang, Wenjie Wang, Kun Kuang, Tat-Seng Chua, Fei Wu
AAAI 2025 (to appear)
Main Idea:
Optimize model over other models with parameter merging
Forward Once for All: Structural Parameterized Adaptation for Efficient Cloud-coordinated On-device Recommendation
Kairui Fu, Zheqi Lv, Shengyu Zhang✉, Fan Wu, Kun Kuang
KDD 2025 (Research Track, to appear)
Main Idea:
Compact Device Model *Architecture* customized in Real time.
Collaborative Large Language Models and Sequential Recommendation Models for Device-Cloud Recommendation
Zheqi Lv, Tianyu Zhan, Wenjie Wang, Xinyu Lin, Shengyu Zhang✉, Wenqiao Zhang, Jiwei Li, Kun Kuang, Fei Wu
KDD 2025 (Research Track, to appear)
Main Idea:
Large-small model collaboration for RecSys.
2024
PhiloGPT: A Philology-Oriented Large Language Model for Ancient Chinese Manuscripts with Dunhuang as Case Study
Yuqing Zhang, Baoyi He, Yihan Chen, Hangqi Li, Han Yue, Shengyu Zhang✉, Huaiyong Dou, Junchi Yan, Zemin Liu, Yongquan Zhang, Fei Wu
EMNLP 2024 (Main)
Main Idea:
Dynamic Parameter generation through codebook learning.
GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting
Hongyun Yu, Zhan Qu, Qihang Yu, Jianchuan Chen, Zhonghua Jiang, Zhiwen Chen, Shengyu Zhang✉, Jimin Xu, Fei Wu, Chengfei Lv, Gang Yu
ACM MM 2024
[Presentation]
Main Idea:
An early work to explore Combinatorial Solver augmented LLM
Main Idea:
Random Networks - Distribution-incompatible Params.
= Compact Device Model customized in Real time
MPOD123: One Image to 3D Content Generation Using Mask-enhanced Progressive Outline-to-Detail Optimization
Jimin Xu, Tianbao Wang, Tao Jin, Shengyu Zhang✉, Dongjie Fu, Zhe Wang, Jiangjing Lyu, Chengfei Lv, Chaoyue Niu, Zhou Yu, Zhou Zhao, Fei Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper] [Demo]
2023
Causal Distillation for Alleviating Performance Heterogeneity in Recommender Systems
Shengyu Zhang, Ziqi Jiang, Jiangchao Yao, Fuli Feng, Kun Kuang, Zhou Zhao, Shuo Li, Hongxia Yang, Tat-Seng Chua, Fei Wu
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
SLED: Structure Learning based Denoising for Recommendation
Shengyu Zhang, Tan Jiang, Kun Kuang, Fuli Feng, Jin Yu, Jianxin Ma, Zhou Zhao, Jianke Zhu, Hongxia Yang, Tat-Seng Chua, Fei Wu
ACM Transactions on Information Systems (TOIS), 2023
DisCover: Disentangled Music Representation Learning for Cover Song Identification
Jiahao Xun, Shengyu Zhang✉, Yanting Yang, Jieming Zhu, Liqun Deng, Zhou Zhao, Zhenhua Dong, Ruiqi Li, Lichao Zhang, Fei Wu
International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Video-Audio Domain Generalization via Confounder Disentanglement
Shengyu Zhang, Xusheng Feng, Wenyan Fan, Wenjing Fang, Fuli Feng, Wei Ji, Shuo Li, Li Wang, Shanshan Zhao, Zhou Zhao, Tat-Seng Chua, Fei Wu
AAAI Conference on Artificial Intelligence (AAAI), 2023
DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization
Zheqi Lv, Wenqiao Zhang, Shengyu Zhang, Kun Kuang, Feng Wang, Yongwei Wang, Zhengyu Chen, Tao Shen, Hongxia Yang, Beng Chin Ooi, Fei Wu
The Web Conference (WWW), 2023
Multi-modal Action Chain Abductive Reasoning
Mengze Li, Tianbao Wang, Jiahe Xu, Kairong Han, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Shiliang Pu, Fei Wu
The Annual Meeting of the Association for Computational Linguistics (ACL), 2023
2022 selected papers
Edge-Cloud Polarization and Collaboration: A Comprehensive Survey
Jiangchao Yao, Shengyu Zhang, Yang Yao, Feng Wang, Jianxin Ma, Jianwei Zhang, Yunfei Chu, Luo Ji, Kunyang Jia, Tao Shen, Anpeng Wu, Fengda Zhang, Ziqi Tan, Kun Kuang, Chao Wu, Fei Wu
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022
Re4: Learning to Re-contrast, Re-attend, Re-construct for Multi-interest Recommendation
Shengyu Zhang, Lingxiao Yang, Dong Yao, Yujie Lu, Fuli Feng, Zhou Zhao, Tat-Seng Chua, Fei Wu
International World Wide Web Conferences (WWW), 2022
[Paper] [GitHub]
Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos
Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang
ACM International Conference on Multimedia (MM), 2022
BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation
Wenqiao Zhang, Lei Zhu, James Hallinan, Shengyu Zhang, Andrew Makmur, Qingpeng Cai, Beng Chin Ooi
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[Paper] [GitHub]
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding
Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, Shiliang Pu, Fei Wu
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Contrastive Learning Adopting Positive-Negative Frame Mask for Music Representation
Dong Yao, Zhou Zhao, Shengyu Zhang✉, Jieming Zhu, Yudong Zhu, Rui Zhang, Xiuqiang He
International World Wide Web Conferences (WWW), 2022
[Paper] [GitHub]
Uncovering Causal Effects of Online Short Videos on Consumer Behaviors
Ziqi Tan✰, Shengyu Zhang, Nuanxin Hong, Kun Kuang, Yifan Yu, Zhou Zhao, Jin Yu, Hongxia Yang, Shiyuan Pan, Jingren Zhou, Fei Wu
The Fifteenth International Conference on Web Search and Data Mining (WSDM), 2022
2021
CauseRec: Counterfactual User Sequence Synthesis for Sequential Recommendation
Shengyu Zhang, Dong Yao, Zhou Zhao, Tat-Seng Chua, Fei Wu
ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021
[Paper] [GitHub]
Why Do We Click: Visual Impression-aware News Recommendation
Jiahao Xun*,#, Shengyu Zhang*, Zhou Zhao, Jieming Zhu, Qi Zhang, Jingjie Li, Xiuqiang He, Xiaofei He, Tat-Seng Chua, Fei Wu
ACM International Conference on Multimedia (MM), 2021, Oral Presentation
[Paper] [GitHub]
Future-Aware Diverse Trends Framework for Recommendation
Yujie Lu*, Shengyu Zhang*, Yingxuan Huang*, Xinyao Yu, Luyao Wang, Zhou Zhao, Fei Wu
International World Wide Web Conferences (WWW), 2021
[Paper] [GitHub]
2020 and prior
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
Shengyu Zhang, Tan Jiang, Tan Wang, Kun Kuang, Zhou Zhao, Jianke Zhu, Jin Yu, Hongxia Yang, Fei Wu
ACM International Conference on Multimedia (MM), 2020
[Paper] [GitHub]
Poet: Product-oriented Video Captioner for E-commerce
Shengyu Zhang, Ziqi Tan, Jin Yu, Zhou Zhao, Kun Kuang, Jie Liu, Jingren Zhou, Hongxia Yang, Fei Wu
ACM International Conference on Multimedia (MM), 2020, Oral Presentation
[Paper] [Dataset]
Comprehensive Information Integration Modeling Framework for Video Titling
Shengyu Zhang, Ziqi Tan, Jin Yu, Zhou Zhao, Kun Kuang, Tan Jiang, Jingren Zhou, Hongxia Yang, Fei Wu
SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020
[Paper] [Dataset]
Workshop & Short Papers
Grounded and Controllable Image Completion by Incorporating Lexical Semantics
Shengyu Zhang, Tan Jiang, Qinghao Huang, Ziqi Tan, Zhou Zhao, Siliang Tang, Jin Yu, Hongxia Yang, Yi Yang, Fei Wu
CVPR 2021 Causality in Vision Workshop, 2021
Talks
- Co-chair for "Collaboration and Evolution of Foundation and Specialized Models Workshop" @ACM MM Asia, 2024
- Co-chair for "Theory and Techniques of Large and Small Model Collaboration" forum @CCLD (Conference on Large Models and Decision Intelligence), 2024
- Collaborative Learning and Inference for Device-Cloud Heterogeneous Models @CNCC (China National Computer Congress), 2024
- Device-Cloud Collaborative Intelligence with Large and Small Models @CCF (Xiuhu Forum), 2024
- Collaboration and Evolution of Large and Small Models @CSIG (Multimodal Large Model Summit Forum and 30th Frontier Lecture Workshop), 2024
- Knowledge Transfer and Collaborative Inference for Device-Cloud Heterogeneous Models @CAIDIC (China Artificial Intelligence Digital Innovation Conference), 2023
- Multi-modal Understanding and Sequential Modeling in RecSys @Huawei Noah's Ark Lab, 2022
- Causal Multi-modal Understanding and Recommendation @NUS, 2022
Selected Honors Awarded
Academic Service
- Conference Reviewer: NeurIPS 2023|2024, ECCV 2024, SIGIR 2023|2024, KDD 2023|2024, IJCAI 2023|2024, AAAI 2023|2024, ACM MM 2023, WSDM 2023, ACL ARR.
- Journal Reviewer: TPAMI, TKDE, TOIS, TCSVT, TMM, TNNLS, TCYB, FITEE, Journal of Supercomputing, Neurocomputing, Computers in Human Behavior, etc.