Yuqi Sun (孙玉齐)
Hi! My name is Yuqi Sun. I am currently a fifth-year PhD candidate in the School of Computer Science, Fudan University, under the supervision of Dr. Bo Yan. I received my B.Sc. from Fudan University in 2020.
My research interest lies in leveraging artificial intelligence techniques for data governance (AI for data)—including,
but not limited to, data management, data filtering, and data synthesis—to establish a data foundation for
building low-cost, high-performing AI models. My previous research focused on multi-view imaging, face images,
and rendered images, with a recent shift toward scientific fields such as medical imaging. I strongly believe
that data governance is one of the most critical directions for AI innovation, essential for reducing model
training costs, and I aim to extend its application to more scientific domains in the future.
Email / CV / Google Scholar / Github
Updates
2025-03: Our new work has been published in Nature Biomedical Engineering (IF: 28.0)!
2024-07: Two papers are accepted to ACM MM 2024
2023-09: One paper is accepted to AAAI 2024
Research
I present some of my publications here; more work is ongoing. (*Equal contribution)
A data-efficient strategy for building high-performing medical foundation models
Yuqi Sun*, Weimin Tan*, Zhuoyao Gu, Ruian He, Siyuan Chen, Miao Pang, Bo Yan
Nature Biomedical Engineering, 2025-03-05
Code / Paper
Medical foundation models typically require massive datasets, but medical data collection is costly, slow, and privacy-sensitive.
We demonstrate that synthetic data, generated with disease labels, can effectively pretrain medical foundation models.
Our retinal model, pretrained on one million synthetic retinal images and just 16.7% of the real-world data used by RETFound (904,170 images),
matches or exceeds RETFound’s performance across nine public datasets and four diagnostic tasks. We also validate this data-efficient approach by building
a tuberculosis classifier on chest X-rays. Text-conditioned synthetic data boosts medical model performance and generalizability with less real data.
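To make the recipe concrete, here is a minimal, hypothetical PyTorch sketch of the two-stage strategy. The toy backbone, random stand-in data, and hyperparameters are placeholder assumptions, not the paper's actual setup:

```python
# Minimal two-stage sketch: pretrain on synthetic labeled images, then
# fine-tune on a much smaller real dataset. Everything here is a toy stand-in.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def make_loader(n, num_classes, batch_size=32):
    # Placeholder for a real image dataset; swap in synthetic/real image folders.
    x = torch.randn(n, 3, 224, 224)
    y = torch.randint(0, num_classes, (n,))
    return DataLoader(TensorDataset(x, y), batch_size=batch_size, shuffle=True)

def train(model, loader, epochs, lr):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

num_classes = 5
model = nn.Sequential(  # toy classifier; the paper trains a foundation-scale encoder
    nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes),
)

# Stage 1: pretrain on abundant, label-controlled synthetic images.
train(model, make_loader(1024, num_classes), epochs=1, lr=1e-3)
# Stage 2: fine-tune on a small fraction of real data.
train(model, make_loader(128, num_classes), epochs=1, lr=1e-4)
```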
Audio-Driven Identity Manipulation for Face Inpainting
Yuqi Sun*, Qing Lin*, Weimin Tan, Bo Yan
ACM MM, 2024
Code / Paper
Our main insight is that a person's voice carries distinct identity markers, such as age and gender,
which provide an essential supplement for identity-aware face inpainting. By extracting identity information from audio as guidance,
our method can naturally support tasks of identity preservation and identity swapping in face inpainting.
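A minimal sketch of this conditioning idea, assuming a generic speaker embedding and a toy inpainting network (module names and dimensions here are illustrative assumptions, not our released code):

```python
# Toy sketch: an identity embedding extracted from speech modulates the
# features of an inpainting network, so the filled region matches the speaker.
import torch
import torch.nn as nn

class AudioConditionedInpainter(nn.Module):
    def __init__(self, audio_dim=192, cond_dim=64):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, cond_dim)  # stand-in for a speaker encoder output
        self.encoder = nn.Conv2d(4, 32, 3, padding=1)     # RGB + mask channel
        self.fuse = nn.Conv2d(32 + cond_dim, 32, 1)
        self.decoder = nn.Conv2d(32, 3, 3, padding=1)

    def forward(self, masked_img, mask, audio_emb):
        x = torch.relu(self.encoder(torch.cat([masked_img, mask], dim=1)))
        c = self.audio_proj(audio_emb)                    # identity cues from the voice
        c = c[:, :, None, None].expand(-1, -1, x.shape[2], x.shape[3])
        x = torch.relu(self.fuse(torch.cat([x, c], dim=1)))
        out = self.decoder(x)
        # Keep known pixels; fill only the masked region.
        return masked_img * (1 - mask) + out * mask

net = AudioConditionedInpainter()
img = torch.randn(1, 3, 128, 128)
mask = torch.zeros(1, 1, 128, 128)
mask[..., 32:96, 32:96] = 1.0
audio_emb = torch.randn(1, 192)                           # e.g. a speaker-verification embedding
print(net(img * (1 - mask), mask, audio_emb).shape)       # torch.Size([1, 3, 128, 128])
```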
A Medical Data-Effective Learning Benchmark for Highly Efficient Pre-training of Foundation Models
Wenxuan Yang, Weimin Tan, Yuqi Sun, Bo Yan
ACM MM, 2024
Paper
This paper introduces a comprehensive benchmark for evaluating data-effective learning in the medical field. It includes a dataset with millions of data samples from 31 medical centers (DataDEL), a baseline method for comparison (MedDEL), and a new evaluation metric (NormDEL) to objectively measure data-effective learning performance.
Low-Latency Space-Time Supersampling for Real-Time Rendering
Ruian He*, Shili Zhou*, Yuqi Sun, Ri Cheng, Weimin Tan, Bo Yan
AAAI, 2024
Code / Paper
We recognize the shared context and mechanisms between frame supersampling and extrapolation, and present a novel framework, Space-time Supersampling (STSS). By integrating the two tasks into a unified framework, STSS improves overall quality with lower latency. Notably, this performance is achieved within only 4 ms, saving up to 75% of the time required by the conventional two-stage pipeline, which takes 17 ms.
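The core design choice, shown as a heavily simplified and hypothetical sketch below, is that one shared backbone feeds two lightweight heads, so supersampling the current frame and extrapolating the next one cost a single pass (layer shapes are placeholders, not the paper's architecture):

```python
# Toy sketch of a unified space-time supersampling pass: shared features,
# one head for x2 spatial upsampling, one head for frame extrapolation.
import torch
import torch.nn as nn

class STSSNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(6, 32, 3, padding=1)        # two stacked RGB frames
        self.sr_head = nn.Sequential(                          # spatial supersampling head (x2)
            nn.Conv2d(32, 3 * 4, 3, padding=1), nn.PixelShuffle(2),
        )
        self.extrap_head = nn.Conv2d(32, 3, 3, padding=1)     # temporal extrapolation head

    def forward(self, prev_frame, cur_frame):
        feat = torch.relu(self.backbone(torch.cat([prev_frame, cur_frame], dim=1)))
        # One shared pass serves both tasks, instead of two sequential stages.
        return self.sr_head(feat), self.extrap_head(feat)

net = STSSNet()
lr_prev, lr_cur = torch.randn(1, 3, 90, 160), torch.randn(1, 3, 90, 160)
hr_cur, extrap_next = net(lr_prev, lr_cur)
print(hr_cur.shape, extrap_next.shape)   # (1, 3, 180, 320) and (1, 3, 90, 160)
```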
Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions
Yuqi Sun, Ruian He, Weimin Tan, Bo Yan
arXiv, 2023
Paper
We propose Instruct-NeuralTalker, the first interactive framework to semantically edit audio-driven talking radiance fields with simple human instructions. It supports various talking-face editing tasks, including instruction-based editing, novel view synthesis, and background replacement. In addition, Instruct-NeuralTalker enables real-time rendering on consumer hardware.
Geometry-Aware Reference Synthesis for Multi-View Image Super-Resolution
Ri Cheng, Yuqi Sun, Bo Yan, Weimin Tan, Chenxi Ma
ACM MM, 2022
Code / Paper
This paper proposes the Multi-View Image Super-Resolution (MVISR) task, which aims to increase the resolution of multi-view images captured from the same scene. One solution is to apply image or video super-resolution (SR) methods to reconstruct high-resolution (HR) results from the low-resolution (LR) input views.
Learning Robust Image-Based Rendering on Sparse Scene Geometry via Depth Completion
Yuqi Sun, Shili Zhou, Ri Cheng, Weimin Tan, Bo Yan*, Lang Fu
CVPR, 2022
Code / Video / Paper
Recent image-based rendering (IBR) methods usually adopt many views to reconstruct dense scene geometry. However, the number of available views is limited in practice. When only a few views are provided, the performance of these methods drops off significantly, as the scene geometry becomes sparse as well. Therefore, in this paper, we propose Sparse-IBRNet (SIBRNet) to perform robust IBR on sparse scene geometry via depth completion.
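A toy, hypothetical sketch of the underlying idea: densify the sparse depth first, guided by the RGB image, so that subsequent warping-based rendering has reliable geometry (the module and its dimensions are illustrative only, not SIBRNet itself):

```python
# Toy sketch: fill holes in a sparse depth map using the RGB image as guidance,
# trusting measured depth where it exists and predicting elsewhere.
import torch
import torch.nn as nn

class DepthCompletion(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),  # RGB + sparse depth + validity mask
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, rgb, sparse_depth, valid_mask):
        dense = self.net(torch.cat([rgb, sparse_depth, valid_mask], dim=1))
        # Keep measured depth where available; use predictions in the holes.
        return sparse_depth * valid_mask + dense * (1 - valid_mask)

rgb = torch.randn(1, 3, 64, 64)
depth = torch.rand(1, 1, 64, 64)
mask = (torch.rand(1, 1, 64, 64) > 0.9).float()   # only ~10% of pixels carry depth
print(DepthCompletion()(rgb, depth * mask, mask).shape)  # torch.Size([1, 1, 64, 64])
```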
Space-Angle Super-Resolution for Multi-View Images
Yuqi Sun*, Ri Cheng*, Bo Yan, Shili Zhou
ACM MM, 2021
Code / Paper
The limited spatial and angular resolutions in multi-view multimedia applications restrict the visual experience in practical use. In this paper, we first formulate the space-angle super-resolution (SASR) problem for irregularly arranged multi-view images. It aims to jointly increase the spatial resolution of the source views and synthesize arbitrary virtual high-resolution (HR) views between them.
Experience
Some internship experiences