Age-related macular degeneration (AMD) is the leading cause of irreversible vision loss in the elderly population of industrialized nations [1]. AMD primarily affects the macula, the cone-photoreceptor-rich central part of the retina. Early AMD is characterized by lipoprotein-rich deposits called drusen. Advanced AMD is classified as either neovascular (‘wet’) or atrophic (‘dry’): the former is characterized by proliferative neovascularization, and the latter by atrophy of the retinal pigment epithelium and the outer neurosensory retina. Optical coherence tomography (OCT) provides high-resolution cross-sectional imaging of the retina and is widely used for disease diagnosis, treatment-decision guidance, and therapy-response assessment. Accurate quantification of OCT image features is thus crucial for identifying and tracking AMD.
The image features of AMD are highly complex and include both changes in retinal layers and various types of pathological lesions [2]. Two properties make them difficult to segment. First, retinal layers follow a strict anatomical ordering, yet they are separated by gradual transitions rather than sharp edges. Layer segmentation is further complicated by the fact that certain layers may undergo atrophy in specific diseases (e.g., atrophy of the retinal pigment epithelium or the neurosensory retina). Second, the locations of retinal layers and lesions influence each other: lesions are confined by the topology of the retinal layers, while the presence of lesions can in turn alter the shape of those layers. Accurate segmentation of retinal layers and pathological lesions in OCT therefore remains a highly challenging task.
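As an illustration only, the strict anatomical ordering described above can be made explicit when layer boundaries are represented as per-column depths. The sketch below is not the method proposed in this study: it uses a common, generic construction in which each boundary is expressed as the boundary above it plus a non-negative gap. The function name `enforce_layer_order` and the array layout are assumptions made for this example.

```python
import numpy as np

def enforce_layer_order(raw: np.ndarray) -> np.ndarray:
    """Map unconstrained per-column values to ordered boundary depths.

    raw has shape (num_boundaries, width). Row 0 is taken as the depth of
    the topmost boundary. Every later row is interpreted as the gap to the
    boundary above it and clipped at zero, so the result is non-decreasing
    along axis 0 in every image column.
    """
    top = raw[:1]                      # depth of the first boundary
    gaps = np.maximum(raw[1:], 0.0)    # a layer cannot have negative thickness
    return np.concatenate([top, top + np.cumsum(gaps, axis=0)], axis=0)

# Toy input: 3 boundaries, 2 image columns (hypothetical values).
raw = np.array([[10.0, 12.0],
                [ 5.0, -3.0],
                [ 2.0,  4.0]])
ordered = enforce_layer_order(raw)
```

In the second column the clipped gap is zero, so two boundaries coincide. Under this representation a zero-thickness layer is a natural way to encode local atrophy without violating the ordering.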
Recent developments in deep neural networks have brought substantial progress in medical image analysis. Several studies have proposed simultaneous segmentation of retinal layers and fluid lesions, and some of them incorporate constraints between the two tasks (see Section 2.1 for details). However, these methods impose simple one-way constraints, which cannot capture the mutual interaction between retinal layers and fluid lesions. To address this limitation, we propose a novel two-way constraint strategy that allows retinal layers and fluid lesions to influence each other. The main contributions of this study are as follows:
1. Hierarchical multi-task architecture: we propose a three-branch, hierarchical multi-task framework that jointly regresses seven retinal layers and segments three pathological lesions. A regression guidance module is introduced to provide explicit shape guidance between sub-tasks.
2. Cross-dataset learning: we propose a cross-dataset learning strategy based on pseudo-labels to leverage public datasets with partial labels.
3. Extensive evaluation: the proposed method was evaluated through ablation and comparative studies on a real-world clinical dataset and two public datasets. The results showed that the proposed framework outperformed state-of-the-art methods across a range of metrics. The clinical dataset, manual labels, and code are publicly available on GitHub (https://github.com/xjtu-mia/OCT-Multitask).
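To make the idea of pseudo-labels for partially labeled data concrete, the following minimal sketch shows one generic way to complete a partial annotation with model predictions. It is an assumption-laden illustration, not the strategy implemented in this study: the function `merge_pseudo_labels`, the `IGNORE` value, the confidence threshold, and the class ids are all hypothetical.

```python
import numpy as np

IGNORE = 255  # hypothetical "exclude from the loss" value

def merge_pseudo_labels(manual, probs, labeled_classes, threshold=0.9):
    """Complete a partial annotation with confident model predictions.

    manual:          (H, W) integer mask; 0 is background, other ids are
                     the classes annotated in this dataset.
    probs:           (C, H, W) per-class probabilities from a trained model.
    labeled_classes: ids of the classes this dataset annotates.
    """
    pred = probs.argmax(axis=0)   # most likely class per pixel
    conf = probs.max(axis=0)      # probability of that class

    # Pixels left as background by the annotators where the model predicts
    # a class that this dataset never annotates.
    missing = ~np.isin(pred, [0, *labeled_classes])
    candidate = (manual == 0) & missing

    merged = manual.copy()
    confident = candidate & (conf >= threshold)
    merged[confident] = pred[confident]                 # accept as pseudo-label
    merged[candidate & (conf < threshold)] = IGNORE     # too uncertain to trust
    return merged

# Toy example: one image row, 3 classes, only class 1 manually annotated.
manual = np.array([[1, 0, 0, 0]])
probs = np.array([[[0.10, 0.02, 0.30, 0.90]],   # class 0 (background)
                  [[0.80, 0.03, 0.10, 0.05]],   # class 1 (annotated)
                  [[0.10, 0.95, 0.60, 0.05]]])  # class 2 (not annotated)
merged = merge_pseudo_labels(manual, probs, labeled_classes={1})
```

Manual annotations always take precedence in this sketch. Predictions only fill pixels that the annotators left as background, and low-confidence predictions are masked out rather than trusted.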