Optical coherence tomography (OCT) of the posterior segment of the eye provides high-resolution cross-sectional images that allow visualization of the individual layers of the posterior eye tissue (the retina and choroid), facilitating the diagnosis and monitoring of ocular diseases and abnormalities. Manual analysis of retinal OCT images is time-consuming; therefore, the development of automatic image analysis methods is important for both research and clinical applications. In recent years, deep learning methods have emerged as an alternative for segmenting the retinal layers in these images. Many of the segmentation methods proposed in the literature rely on encoder-decoder architectures, such as U-Net, while other types of architecture have received less attention. In this study, the application of the Mask R-CNN, an instance segmentation method based on a region proposal architecture, is explored in depth in the context of retinal OCT image segmentation. The importance of adequate hyper-parameter selection is examined, and the performance is compared with commonly used techniques. The Mask R-CNN provides a suitable method for the segmentation of OCT images, with low segmentation boundary errors and high Dice coefficients, and with segmentation performance comparable to that of the commonly used U-Net method. The Mask R-CNN also allows simpler extraction of the boundary positions: it avoids the time-consuming graph search step otherwise needed to extract boundaries, which reduces the inference time by a factor of 2.5 compared with U-Net when segmenting seven retinal layers.
Keywords: deep learning; optical coherence tomography; region proposal; semantic segmentation.
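
As a rough illustration of the two evaluation quantities named above (Dice coefficient and boundary error), and of how a boundary position might be read directly from a per-layer binary mask without a graph search, a minimal Python sketch follows. It is not the procedure used in this study: the column-wise scan, the function names, and the toy arrays `pred` and `target` are all assumptions made for illustration only.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2|A n B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return float(2.0 * np.logical_and(pred, target).sum() / total)

def top_boundary(mask):
    """Row of the first foreground pixel in each column (A-scan).

    Columns with no foreground pixel are marked with -1.
    Assumption: the layer mask is contiguous enough that a simple
    column-wise scan gives a usable boundary.
    """
    mask = np.asarray(mask, dtype=bool)
    first = mask.argmax(axis=0)   # index of first True per column
    has_fg = mask.any(axis=0)     # argmax returns 0 for all-False columns
    return np.where(has_fg, first, -1)

def mean_abs_boundary_error(a, b):
    """Mean absolute difference (in pixels) over columns valid in both."""
    a = np.asarray(a)
    b = np.asarray(b)
    valid = (a >= 0) & (b >= 0)
    return float(np.abs(a[valid] - b[valid]).mean())

# Hypothetical toy masks: 6 rows (depth) x 4 columns (A-scans).
pred = np.zeros((6, 4), dtype=bool)
pred[2:4, :] = True               # predicted layer occupies rows 2-3

target = np.zeros((6, 4), dtype=bool)
target[2:4, :3] = True            # reference layer: rows 2-3 in columns 0-2
target[3:5, 3] = True             # and rows 3-4 in column 3

dice = dice_coefficient(pred, target)
error = mean_abs_boundary_error(top_boundary(pred), top_boundary(target))
print(dice, error)
```

In this toy case the two masks differ only in the last column, so both the overlap score and the boundary error reflect a one-pixel shift in a single A-scan.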