Medical image computing was one of the first fields to which data-driven deep learning was applied. Although the field has seen significant successes in recent years, scientific challenges remain before AI models can be used reliably for precise, personalized diagnosis in real-world clinical practice. In the IPD program, we focus on developing AI solutions for image-driven data analysis of cancer patients, with high standards of precision, robustness, interpretability, and knowledge integration with domain experts. Unlike existing AI systems that are built around an average population, our IPD program emphasizes personalized diagnosis, treatment analysis, and prognosis, accounting for the specific situation, needs, and rights of each individual patient.
Personalized patient diagnosis requires a comprehensive assessment not only of anatomical, but also of functional and even molecular information in images. The key challenge is to fuse informative, complementary, and decisive features from multi-modal image data in a shared latent space through suitable learning strategies. In this project, we will study methods for cross-modality disentanglement of shared and modality-specific feature components, enabling combined analysis of multi-modal data and capturing discriminative, complementary diagnostic features in streaming IPD practice.
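As an illustration only, the sketch below shows one plausible way such shared-specific disentanglement could be organized in PyTorch; the module names, latent sizes, and the simple alignment loss are our own assumptions, not the project's final design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Encodes one imaging modality into a shared code and a modality-specific code."""
    def __init__(self, in_dim, shared_dim=64, specific_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.shared_head = nn.Linear(256, shared_dim)      # features common to all modalities
        self.specific_head = nn.Linear(256, specific_dim)  # features unique to this modality

    def forward(self, x):
        h = self.backbone(x)
        return self.shared_head(h), self.specific_head(h)

class SharedSpecificFusion(nn.Module):
    """Fuses shared and modality-specific codes from two modalities for diagnosis."""
    def __init__(self, in_dim_a, in_dim_b, num_classes=2):
        super().__init__()
        self.enc_a = ModalityEncoder(in_dim_a)
        self.enc_b = ModalityEncoder(in_dim_b)
        self.classifier = nn.Linear(64 + 64 + 64, num_classes)

    def forward(self, x_a, x_b):
        s_a, p_a = self.enc_a(x_a)
        s_b, p_b = self.enc_b(x_b)
        shared = 0.5 * (s_a + s_b)                 # pooled shared representation
        logits = self.classifier(torch.cat([shared, p_a, p_b], dim=1))
        align_loss = F.mse_loss(s_a, s_b)          # encourage the shared codes to agree
        return logits, align_loss
```

During training, the classification loss on the logits would be combined with the alignment term so that each encoder routes modality-common information into the shared code and keeps the rest in its specific code; other disentanglement objectives are equally possible.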
Previously, we developed a prognostic model for intracerebral hemorrhage (ICH) that leverages a variational autoencoder to integrate imaging and clinical data, mitigating biases arising from non-randomized trial data. Our approach shows significant gains over existing methods in predicting treatment outcomes.
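The following is a minimal sketch of how a variational autoencoder might jointly embed imaging features and clinical variables for outcome prediction; the architecture, dimensions, and loss weighting shown here are illustrative assumptions rather than the published ICH model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrognosisVAE(nn.Module):
    """Illustrative VAE that embeds imaging features and clinical variables in a
    shared latent space and predicts a treatment outcome from the latent code."""
    def __init__(self, img_dim=128, clin_dim=16, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(img_dim + clin_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, img_dim + clin_dim))
        self.outcome_head = nn.Linear(latent_dim, 1)  # e.g. logit of a favorable outcome

    def forward(self, img_feat, clin_feat):
        x = torch.cat([img_feat, clin_feat], dim=1)
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        recon = self.decoder(z)
        outcome_logit = self.outcome_head(z).squeeze(1)
        return recon, outcome_logit, mu, logvar, x

def vae_prognosis_loss(recon, x, outcome_logit, y, mu, logvar, beta=1e-3):
    """Reconstruction + KL regularization + outcome-prediction loss."""
    rec = F.mse_loss(recon, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    bce = F.binary_cross_entropy_with_logits(outcome_logit, y.float())
    return rec + beta * kl + bce
```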
Model interpretability is an integral component of our IPD, providing the reasoning behind diagnostic results predicted for individual patients. Such model transparency can help identify and correct potential discrepancies across modalities and improve model properties for safety-critical diagnosis and prognosis tasks. As mentioned, our IPD involves multiple modalities and relies on joint training of multiple modules.
We have proposed a framework to enhance safety-critical scene segmentation that optimizes a reward function for better uncertainty estimation, fine-tunes network parameters for risk calibration, and increases model confidence. It outperforms existing methods on surgical scene segmentation tasks, improving both uncertainty quality and segmentation accuracy.
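The published reward function is not reproduced here; as an illustrative stand-in, the sketch below fine-tunes a pretrained segmentation network with a standard segmentation loss plus a simple calibration penalty, under our own assumptions about names and weighting.

```python
import torch
import torch.nn.functional as F

def calibration_aware_loss(logits, targets, lam=0.1):
    """Illustrative fine-tuning objective: cross-entropy plus a penalty when the
    model's average confidence drifts away from its actual pixel-wise accuracy
    (a crude surrogate for risk calibration).

    logits:  (B, C, H, W) raw network outputs
    targets: (B, H, W) integer class labels
    """
    ce = F.cross_entropy(logits, targets)
    probs = logits.softmax(dim=1)
    confidence, prediction = probs.max(dim=1)           # per-pixel confidence and hard label
    accuracy = (prediction == targets).float().mean()   # treated as a fixed target
    calibration_gap = (confidence.mean() - accuracy).abs()
    return ce + lam * calibration_gap

# Hypothetical fine-tuning step for a pretrained segmentation network `model`:
# for images, labels in loader:
#     loss = calibration_aware_loss(model(images), labels)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```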
We will address the crucial need to enrich and refine the outputs of AI models by continuously incorporating human expertise through human-in-the-loop mechanisms for medical applications. For conventional machine learning models, the interaction between humans and the machine stops once human data preparation is complete and model training begins. Such an approach separates model learning from human knowledge, resulting in inefficient training that requires large amounts of costly annotation, and in unreliable predictions when models are deployed “in the wild”. We aim to efficiently incorporate interactive refinement of model predictions in response to clinician feedback during both model training and testing.
Our previous work introduced a human-in-the-loop approach for transferring deep learning models between medical datasets, using an igniter network to produce initial annotations and a sustainer network for iterative updates. A flexible labelling strategy reduces annotation effort, achieving a 19.7% Dice score improvement and substantially cutting labelling time in CT multi-organ segmentation.
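To make the loop concrete, here is a schematic sketch of how an igniter/sustainer annotation cycle could be organized; the interfaces (`igniter.predict`, `sustainer.fit`, `clinician_review`), the uncertainty threshold, and the number of rounds are hypothetical and stand in for the actual selection criteria of the published method.

```python
def human_in_the_loop_transfer(igniter, sustainer, clinician_review,
                               unlabeled_scans, rounds=5, uncertainty_threshold=0.3):
    """Illustrative igniter/sustainer loop: the igniter proposes initial masks,
    a clinician corrects only the uncertain ones (flexible labelling), and the
    sustainer is retrained on the growing set of corrected annotations.

    Hypothetical interfaces supplied by the caller:
      igniter.predict(scan) / sustainer.predict(scan) -> (mask, uncertainty)
      sustainer.fit(pairs)            -> retrains on (scan, mask) pairs
      clinician_review(scan, mask)    -> corrected mask
    """
    labeled = []
    for scan in unlabeled_scans:
        mask, uncertainty = igniter.predict(scan)        # initial proposal
        if uncertainty > uncertainty_threshold:
            mask = clinician_review(scan, mask)          # refine only weak cases
        labeled.append((scan, mask))

    for _ in range(rounds):
        sustainer.fit(labeled)                           # iterative model update
        refreshed = []
        for scan, mask in labeled:
            new_mask, uncertainty = sustainer.predict(scan)
            if uncertainty > uncertainty_threshold:
                new_mask = clinician_review(scan, new_mask)
            refreshed.append((scan, new_mask))
        labeled = refreshed
    return sustainer
```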