|
LoRKD: Low-Rank Knowledge Decomposition for Medical Foundation Models
HaoLin Li,
Yuhang Zhou,
Ziheng Zhao,
Siyuan Du,
Jiangchao Yao,
Weidi Xie,
Ya Zhang,
Yanfeng Wang,
Under Review, 2024.
In this paper, we propose a novel framework named Low-Rank Knowledge Decomposition (LoRKD) to break down the foundation model into multiple lightweight expert models, each tailored to a specific domain. The goal of this paradigm is to improve the specialization of deployed models within a specific domain while simultaneously reducing deployment costs.
|
|
One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts
Ziheng Zhao,
Yao Zhang,
Chaoyi Wu,
Xiaoman Zhang,
Ya Zhang,
Yanfeng Wang,
Weidi Xie,
Under Review, 2024.
In this paper, we build a universal medical segmentation model driven by text prompts (SAT).
|
|
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis
Xiaoman Zhang,
Chaoyi Wu,
Ziheng Zhao,
Jiayu Lei,
Ya Zhang,
Yanfeng Wang,
Weidi Xie,
Under Review, 2024.
In this paper, we introduce RadGenome-Chest CT, a comprehensive, large-scale, region-guided 3D chest CT interpretation dataset based on CT-RATE. It includes: Organ-level segmentation for 197 categories; 665K multi-granularity grounded reports; 1.3M grounded VQA pairs.
|
|
Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis
Chaoyi Wu*,
Jiayu Lei*,
Qiaoyu Zheng*,
Weike Zhao*,
Weixiong Lin*,
Xiaoman Zhang*,
Xiao Zhou*,
Ziheng Zhao*,
Yanfeng Wang,
Ya Zhang,
Weidi Xie,
Technical Report, 2023.
We evaluate GPT-4V on 92 radiographic cases, 20 pathology cases, and 16 localization cases across 17 medical systems covering 8 imaging modalities. In general, as these cases show, GPT-4V is still far from clinical usage.
|
|
PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering
Xiaoman Zhang*,
Chaoyi Wu*,
Ziheng Zhao,
Weixiong Lin,
Yanfeng Wang,
Ya Zhang,
Weidi Xie,
Under Review, 2023.
In this paper, we focus on the problem of Medical Visual Question Answering (MedVQA). We propose a generative medical VQA model, MedVInT, together with a large-scale MedVQA dataset, PMC-VQA.
|
|
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents
Weixiong Lin*,
Ziheng Zhao*,
Xiaoman Zhang,
Chaoyi Wu,
Yanfeng Wang,
Ya Zhang,
Weidi Xie,
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2023.
We collect a biomedical dataset, PMC-OA, with 1.6M image-caption pairs from PubMed Central's Open Access subset.
|
|
K-Space Transformer for Undersampled MRI Reconstruction
Ziheng Zhao*,
Xiaoman Zhang,
Tianjiao Zhang,
Weidi Xie,
Yanfeng Wang,
Ya Zhang,
British Machine Vision Conference (BMVC), 2022.
We propose a novel Transformer-based framework to reconstruct undersampled MRI directly in k-space.
|
* denotes equal contribution.
|
|