Introduction

Digital subtraction angiography (DSA) effectively isolates vascular structures in X-ray angiography by subtracting a “mask image” (acquired without a contrast agent) from “live images” (acquired with a contrast agent). However, especially in coronary angiography, patient movement poses a challenge, leading to motion artefacts and blurred vessel appearances. (Unlike natural images, the structures in an X-ray image overlap and are transparent; herein, “background” refers to nonvascular anatomical structures, while “foreground” refers to blood vessels.) Traditional image registration algorithms aim to remove these artefacts through approaches such as deforming one image before executing subtraction to minimize motion disturbances1,2,3 or using probabilistic models that treat images as superpositions of layered motions4,5. Despite demonstrated advancements, including deep learning adaptations6,7, these approaches still cannot fully eliminate nonvascular structures from coronary angiograms.

Vessel segmentation is a parallel research domain in which each pixel is classified to identify vessels. This approach facilitates observation and lays the groundwork for automated tasks such as stenosis detection. Traditional segmentation relies on morphological features and uses techniques including Hessian matrix-based8, morphological9, and Gabor filter enhancements10. However, these methods struggle to differentiate vessels from complex backgrounds, such as bones or moving cardiopulmonary structures, particularly in angiocardiography images.

Convolutional neural networks (CNNs) have shown promise in the image segmentation field11. However, their training procedures require the manual annotation of coronary angiograms, which is a painstaking process. Even with significant efforts, such as Du et al.’s12 work with 20,612 annotated samples, challenges persist. Manual annotation struggles with overlapping structures or low-contrast frames, resulting in models that might overlook smaller vessels. Given these hurdles, unsupervised or weakly supervised techniques that leverage unannotated data seem promising13,14. Notable works include those of Plourde and Luc15, who employed machine learning after applying Hessian-based enhancement, and Vlontzos and Mikolajczyk16, who used the TopHat operation alongside the U-Net training process17,18. Additionally, Ma et al.19 proposed a self-supervised vessel segmentation method via adversarial learning. However, these methods still cannot compete with supervised learning.

Addressing these challenges, we propose a novel method for learning vascular representations from unannotated samples. This approach achieves single-frame subtraction that surpasses DSA in background removal efficiency. By effectively eliminating nonvascular structures, our method requires only a minimal number of annotated samples to fine-tune a vessel segmentation model, outperforming purely supervised learning approaches. Our work was motivated by the significant hurdles in coronary angiography vessel segmentation and single-frame subtraction, particularly the extreme difficulty and high cost of annotating fine vessels. The resulting technique leverages large amounts of unannotated data while minimizing manual annotation requirements. This innovative learning paradigm offers a practical solution to real-world challenges in medical image analysis, potentially providing new perspectives for vascular segmentation.

Methods

The traditional digital subtraction approach involves subtracting X-ray images captured before and after the injection of a contrast agent into blood vessels. This process aims to eliminate anatomical structures from the produced images. However, this technique is highly sensitive to motion. Any deviation from a stable position manifests as a visible artefact, diminishing the diagnostic utility of this method. Various registration and layering techniques have been proposed to improve the quality of subtraction images. While these methods exhibit some robustness against minor body motions, they struggle to cope with the continuous nonlinear movements of the heart and lungs. As a result, subtraction images derived from coronary angiograms often suffer from motion artefacts. Furthermore, DSA cannot generate subtraction images from a single frame.

We recognize that subtraction can be framed as an image-to-image (I2I) translation problem, which is a popular and versatile paradigm in machine learning. I2I translation has demonstrated remarkable utility in diverse applications ranging from computer graphics and style transfer to satellite imagery and photo enhancement. The central objective in a typical I2I translation problem is to learn a mapping function that translates images from one domain to a corresponding image in another domain. The existing methods often leverage either paired20 or unpaired21 training samples for this purpose. In our research, we extend this notion by treating a “mask image” (an image taken prior to the administration of a contrast agent) and a “live image” (an image captured after administering the contrast agent) as entities from two distinct domains. This setting is essential because the key difference between these two types of frames lies in whether blood vessels are displayed. If a neural network can learn to switch between them, it would inherently need to learn the representations of the blood vessels. With this framework, we are able to generate a corresponding mask for any frame within a coronary angiography sequence.

All methods were carried out in accordance with relevant guidelines and regulations. All experimental protocols were approved by the Ethics Committee of The Second Affiliated Hospital of Chongqing Medical University. Anonymous data were used in this study, and the Ethics Committee of The Second Affiliated Hospital of Chongqing Medical University waived the requirement for informed consent.

Data

We collect 58,128 coronary angiographic DICOM files from 3756 patients at The Second Affiliated Hospital of Chongqing Medical University. For each patient, the angiography procedure typically involves multiple imaging positions, including RAO (Right Anterior Oblique), LAO (Left Anterior Oblique), CRA (Cranial), CAU (Caudal), AP (Anteroposterior), and LAT (Lateral). These positions are used to visualize different coronary arteries and their branches, including the Left Main Coronary Artery (LM), Left Anterior Descending Branch (LAD), Left Circumflex Branch (LCX), Right Coronary Artery (RCA), and their subdivisions. This comprehensive imaging approach typically yields about 10–15 DICOM segments per patient, with each segment representing a different angle or coronary artery.

Each DICOM file consists of a single-angle continuous sequence captured after a single contrast agent injection. From each sequence, we extract the first frame, which has not yet been subjected to contrast injection, to serve as a mask image. Subsequent frames are then randomly selected from the middle portion of each sequence, with each frame varying in its degree of visible vascular structure. The live and mask frames constitute two domains, X and Y. We exclude samples where the imaged site is not the coronary artery or where the first frame already exhibits the presence of a contrast agent.

The dataset is finalized with 17,398 mask images and 38,930 live images. We designate this dataset as the Live-Mask Coronary Angiograms Dataset (LM-CAD), which is intended for pretraining neural networks. The significant reduction from 58,128 original DICOM files reflects our rigorous selection process: the first frame of each DICOM was initially selected as a potential mask frame, but many were discarded due to the presence of contrast agent, and live frames were extracted at a 2:1 ratio to mask frames and then manually screened for quality. This process, while reducing the dataset size, ensures high-quality, representative images for both background and vascular structures.

Finally, we randomly select 50 live images and manually annotate them with high granularity, identifying vessels with diameters as fine as a single pixel. We refer to these samples as the Fine Segmentation Coronary Angiograms Dataset (FS-CAD), which is suitable for fine-tuning or quantitative evaluation tasks. To facilitate ongoing research in this area, we have made the datasets publicly available.
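For illustration only, the sketch below shows how mask and live frames might be pulled from a single multi-frame DICOM sequence with pydicom; the file path and the “middle third” sampling heuristic are assumptions, and in the actual pipeline the extracted frames additionally underwent the manual screening described above.

```python
# Illustrative sketch (not the released pipeline) of mask/live frame
# extraction from one angiographic DICOM sequence using pydicom.
import random
import pydicom

def extract_frames(dicom_path, n_live=2, seed=0):
    """Take the first frame as a candidate mask and sample live frames
    from the middle portion of the sequence (heuristic: middle third)."""
    ds = pydicom.dcmread(dicom_path)
    frames = ds.pixel_array                 # shape: (num_frames, H, W)
    mask = frames[0]                        # pre-contrast candidate mask
    n = len(frames)
    middle = list(range(n // 3, 2 * n // 3))
    rng = random.Random(seed)
    live = [frames[i] for i in rng.sample(middle, k=min(n_live, len(middle)))]
    return mask, live

mask, live = extract_frames("sequence.dcm")   # hypothetical path
```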

Moreover, we utilize another publicly available dataset, known as XCAD19, for comparative analyses and evaluations against other methods. We restrict our usage to the test set of this dataset, which includes 126 manually annotated coronary angiograms for segmentation. Unlike FS-CAD, which focuses on finer vessels, XCAD mainly contains annotations for larger blood vessels.

Pretrained model

Initially, two neural networks are pretrained on the LM-CAD dataset, each with distinct objectives. The first network, denoted as \({G}_{yx}\), aims to transform a mask image into a live image by incorporating vascular structures. Conversely, the second network, \({G}_{xy}\), is designed to erase vascular structures from a live image, effectively creating a mask image.

Both \({G}_{xy}\) and \({G}_{yx}\) are based on a U-Net architecture with a base dimensionality of 32, comprising approximately 4.3 million parameters. This architecture allows for efficient learning of hierarchical features while maintaining spatial information through skip connections. The discriminators Dx and Dy use a PatchGAN structure to classify whether 46 × 46 overlapping image patches are real or fake.
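To make the discriminator design concrete, the following PyTorch sketch shows a PatchGAN-style network in which three stride-2 4 × 4 convolutions followed by a stride-1 4 × 4 output convolution give each output logit a 46 × 46 receptive field; the channel widths and the use of instance normalization are assumptions rather than details of the actual implementation.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """PatchGAN-style discriminator. Three stride-2 4x4 convolutions plus
    a stride-1 4x4 output convolution yield a 46x46 receptive field per
    output logit, matching the patch size quoted above."""
    def __init__(self, in_ch=1, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 2, base * 4, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 4, 1, 4, stride=1, padding=1),
        )

    def forward(self, x):
        return self.net(x)   # coarse map of per-patch real/fake logits

logits = PatchDiscriminator()(torch.randn(1, 1, 512, 512))
```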

These networks operate via cycle-consistent adversarial learning, as depicted in Fig. 1. The overall objective function combines adversarial losses for both generators and a cycle consistency loss. This formulation encourages the generators to produce realistic images while preserving the content of the input images. The detailed mathematical formulations of these loss functions and the complete objective function are provided in the Supplementary materials A.
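For orientation, a standard cycle-consistent objective of the kind referenced here21 takes the form

\[
\mathcal{L}(G_{xy}, G_{yx}, D_x, D_y) = \mathcal{L}_{\mathrm{GAN}}(G_{xy}, D_y) + \mathcal{L}_{\mathrm{GAN}}(G_{yx}, D_x) + \lambda \, \mathcal{L}_{\mathrm{cyc}},
\]

\[
\mathcal{L}_{\mathrm{cyc}} = \mathbb{E}_{x}\big[\lVert G_{yx}(G_{xy}(x)) - x \rVert_1\big] + \mathbb{E}_{y}\big[\lVert G_{xy}(G_{yx}(y)) - y \rVert_1\big],
\]

where \(\lambda\) weights the cycle consistency term; the exact losses and weights used in this work are those given in the Supplementary materials.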

Fig. 1. Schematic diagram of the coronary subtraction framework with cycle-consistent adversarial networks. Its structure is similar to that of CycleGAN21, but its training objective is not style transfer. The added skip connections are designed to encourage the network to remove or add blood vessels while minimally altering the background. This training process is designed to yield a pretrained model. Ultimately, we utilize the \({G}_{xy}\) network. The output of the final layer of this network, when normalized to pixel values between 0 and 1, directly yields the blood vessel contours with the background removed. Further thresholding of this output can produce vessel segmentation results.

Prior to feeding images into these networks, several data augmentation techniques are applied, including random TopHat, random cropping and resizing, and colour jitter. These augmentations help the model learn invariant features and improve generalization. The training process uses the Adam optimizer with a learning rate of 2e-4 for 60 epochs, with linear decay after the 40th epoch.
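One plausible implementation of this augmentation pipeline is sketched below; the top-hat kernel sizes, application probability, and jitter strengths are illustrative assumptions rather than the exact settings used.

```python
# Sketch of the augmentation pipeline (illustrative parameters).
import random
import cv2
import numpy as np
import torchvision.transforms as T
from PIL import Image

def random_tophat(img, p=0.5, sizes=(9, 15, 21)):
    """Apply a morphological top-hat with a randomly sized structuring
    element with probability p. For dark vessels on a bright background,
    MORPH_BLACKHAT may be the intended variant."""
    if random.random() > p:
        return img
    k = random.choice(sizes)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (k, k))
    out = cv2.morphologyEx(np.array(img), cv2.MORPH_TOPHAT, kernel)
    return Image.fromarray(out)

augment = T.Compose([
    T.Lambda(random_tophat),
    T.RandomResizedCrop(512, scale=(0.8, 1.0)),   # random crop and resize
    T.ColorJitter(brightness=0.2, contrast=0.2),  # colour jitter
    T.ToTensor(),
])
```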

This training process is referred to as pretraining. Specifically, \({G}_{xy}\), which we term the PT-Model, assimilates vascular representation features through extensive pretraining. These learned features can either be leveraged for subsequent vascular segmentation tasks or directly applied to perform single-frame subtraction.

An additional replica network, \({\text{G}}_{\text{E}}\), is introduced as a clone of \({G}_{xy}\) whose weights are updated via the exponential moving average (EMA) method22. This approach provides more stable model outputs, and \({\text{G}}_{\text{E}}\) serves as the final inference network.
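A minimal sketch of this update is given below; the decay value is a common choice and an assumption here.

```python
# EMA weight update for G_E, a smoothed clone of G_xy (sketch).
import copy
import torch

@torch.no_grad()
def ema_update(ema_model, model, decay=0.999):   # decay is an assumption
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

# G_E = copy.deepcopy(G_xy)   # once, at initialization
# ema_update(G_E, G_xy)       # after each optimizer step on G_xy
```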

Vessel segmentation

To adapt the PT-Model for use in vessel segmentation tasks, we explore two viable approaches: fine-tuning and automatic thresholding.

Fine-tuning for segmentation

After cycle-consistent pretraining, our U-Net model excels at performing single-frame subtraction. However, fine-grained vessel segmentation requires additional refinement of the initial output. We fine-tune the network on the FS-CAD dataset, specifically by optimizing the parameters of \({\text{G}}_{\text{E}}\). We employ a composite loss function that combines the binary cross-entropy loss and the Dice loss, in line with the standard training methodologies commonly used in conventional segmentation models. After fine-tuning, the model’s outputs can be directly thresholded at a value of 0.5 to yield the final segmentation results.
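A typical form of this composite loss is sketched below; the equal weighting of the two terms is an assumption.

```python
# Composite BCE + Dice loss for fine-tuning (sketch; NCHW tensors).
import torch
import torch.nn.functional as F

def bce_dice_loss(logits, target, eps=1e-6):
    bce = F.binary_cross_entropy_with_logits(logits, target)
    probs = torch.sigmoid(logits)
    inter = (probs * target).sum(dim=(2, 3))
    total = probs.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    dice = (2 * inter + eps) / (total + eps)
    return bce + (1 - dice).mean()   # equal weighting is an assumption
```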

Automatic thresholding for segmentation

For cases in which additional fine-tuning data are not available, we propose an alternative approach that enables direct vascular segmentation using the subtraction images generated by the PT-Model. However, the pixel value distributions in these subtraction images closely resemble those of the original images rather than converging to the polarized values of 0 or 1 expected of segmentation targets, so a fixed threshold value is not feasible. To address this limitation, we introduce a method called AutoThresh, which employs a joint segmentation approach combining the Threshold-Yen23 and Threshold-Local24 methods. This approach effectively transforms the outputs of the PT-Model into segmented results. The implementation details of this method can be found in the Supplementary materials E.
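As a rough approximation of AutoThresh (the authoritative procedure is in Supplementary materials E), the two thresholds can be combined with scikit-image as follows; the intersection rule and the block size are assumptions.

```python
# AutoThresh-style combination of global (Yen) and local thresholds (sketch).
from skimage.filters import threshold_yen, threshold_local

def auto_thresh(subtraction, block_size=51, offset=0.0):
    """subtraction: float image in [0, 1] produced by the PT-Model."""
    global_mask = subtraction > threshold_yen(subtraction)
    local_mask = subtraction > threshold_local(subtraction, block_size,
                                               offset=offset)
    return global_mask & local_mask   # keep pixels passing both tests
```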

Experiments and results

After training, the model is further fine-tuned on the FS-CAD or XCAD dataset. The former is used to produce the best-performing model for the fine-grained segmentation of small vessels; this model is referred to as the FS-Model. The latter serves as the basis for quantitative comparisons with models trained using other methods and is denoted as the XCAD-Model. Due to the limited sample sizes of the FS-CAD and XCAD datasets, we employ the previously described data augmentation strategies to mitigate overfitting during fine-tuning. The number of epochs used for fine-tuning is set to 100. For additional details regarding the training and testing procedures, readers are directed to our source code repository.

Deep subtraction results

Notably, our “deep subtraction” method differs fundamentally from traditional DSA. While we use the term “subtraction” for ease of comparison, our method does not perform pixel-wise subtraction between two frames. Instead, our generator \({G}_{xy}\) learns to produce a “virtual mask” for any given input frame, effectively addressing the challenges posed by cardiac motion and other dynamic factors in angiographic imaging.

Both the PT-Model and the FS-Model generate what we term “deep subtraction” outputs. These processed images exhibit subtracted angiograms in which the majority of nonvascular tissue is effectively eliminated. Given the absence of quantitative standards for assessing subtraction techniques, Fig. 2 serves as a visual comparative analysis between deep subtraction and DSA.

Fig. 2. Comparison between deep subtraction and DSA. The first column shows the original images, while the second column displays the results obtained through DSA. The third and fourth columns present the effects achieved using deep subtraction.

The second column of Fig. 2 displays the results achieved using DSA, a technique that relies on the identification of a mask frame from a given continuous video sequence for subtraction. While DSA can successfully remove static anatomical structures such as the ribs and vertebrae, it falters in terms of addressing artefacts induced by motion, particularly those originating from cardiopulmonary activities. Such artefacts become markedly visible in areas including lung markings and the diaphragm.

In sharp contrast, the third and fourth columns highlight the advantages of deep subtraction. This approach eliminates the need for a predetermined mask, leveraging I2I translation to implicitly generate a corresponding mask for each frame, and thereby overcomes the limitations of DSA. One primary issue with DSA is its inability to use an optimal mask for each frame, which leads to incomplete background removal; this shortcoming has fostered a certain hesitancy among cardiovascular physicians to fully embrace DSA. Deep subtraction removes the background more cleanly, attaining a performance level previously observed only in anatomically stable regions, such as cerebral angiograms. Moreover, the fine-tuned FS-Model demonstrates superior subtraction outcomes compared to those of the PT-Model, as evidenced by its clearer display of small blood vessels and more complete removal of catheters. In summary, both models markedly outperform DSA in coronary angiography.

Segmentation results

Evaluation metrics

Common metrics employed for evaluating medical image segmentation performance include the pixel accuracy (PA), intersection over union (IoU), and Dice coefficient. The PA quantifies the proportion of correctly classified pixels within an image. However, its reliability can be compromised in cases with class imbalance; for instance, in our dataset, the background comprises a more significant portion of the data, thereby disproportionately influencing the PA score. The IoU is a prevalent metric in the realm of semantic segmentation. It measures the area of overlap between the ground truth and the predicted segmentation outcome, normalized by the area of their union. Due to its straightforwardness and efficacy, the IoU is widely utilized. The Dice score is another related metric that is calculated as twice the area of overlap divided by the total number of pixels in both the segmented and ground-truth images. While IoU and Dice scores are closely related, we present both to facilitate comparisons across medical and computer vision domains. For primary analysis, readers may focus on the Dice score, which is more commonly used in medical image segmentation.
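For reference, with \(A\) the predicted vessel mask, \(B\) the ground-truth mask, and TP, TN, FP, and FN the usual pixel counts,

\[
\mathrm{PA} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{IoU} = \frac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert}, \qquad \mathrm{Dice} = \frac{2\,\lvert A \cap B \rvert}{\lvert A \rvert + \lvert B \rvert}.
\]

The two overlap metrics are monotonically related via \(\mathrm{Dice} = 2\,\mathrm{IoU} / (1 + \mathrm{IoU})\), so they rank models identically.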

Evaluation results

After applying thresholding to the outputs of the FS-Model, we obtain vessel segmentation images. Figure 3 presents qualitative results demonstrating the efficacy of our deep subtraction and segmentation algorithms on the test set of the LM-CAD dataset. Each column in the figure constitutes a sample organized as “original image”–“deep subtraction”–“segmentation”. The segmentation results clearly indicate that not only the primary branches of the coronary artery but also their secondary and tertiary vessels are precisely segmented. Furthermore, pathological alterations such as stenoses are effectively preserved in the segmentation outputs.

Fig. 3. The effects of deep subtraction and vessel segmentation. The first row contains the original images, the second row includes the deep subtraction images, and the third row provides the semantic vessel segmentation images. Both the deep subtraction and segmentation results enable enhanced visualizations of pathological alterations such as stenoses.

Given the limited sample size of the FS-CAD dataset, we use a fivefold cross-validation strategy for performance evaluation (Table 1). In this scheme, the dataset is partitioned into five equal subsets; one subset is held out for testing while the remaining four are used for training, and the process is iterated five times with a different subset serving as the test set each time. The final performance metric is the average of the five individual test results. Our FS-Model achieves a Dice score of 0.828, corroborating its robust segmentation capabilities. In comparison, the PT-Model, which is not fine-tuned on the FS-CAD dataset, achieves a respectable Dice coefficient of 0.792 via the AutoThresh method. A baseline U-Net model with the same network structure but random initialization, trained exclusively on the FS-CAD dataset, records a significantly lower Dice score of 0.657. This underscores the utility of pretraining: the PT-Model already captures a majority of the vascular features and requires only minimal fine-tuning on a small sample set to adapt effectively to a specific task.
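The protocol can be summarized by the following skeleton, in which the training and evaluation routines are placeholders standing in for the procedures described above.

```python
# Fivefold cross-validation skeleton for the 50 FS-CAD images (sketch).
import numpy as np
from sklearn.model_selection import KFold

def fine_tune(pretrained, train_ids):     # placeholder for the real loop
    return pretrained

def evaluate_dice(model, test_ids):       # placeholder metric
    return 0.0

pt_model = object()                       # stand-in for the PT-Model
samples = np.arange(50)                   # indices of the annotated images
scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(samples):
    model = fine_tune(pt_model, samples[train_idx])
    scores.append(evaluate_dice(model, samples[test_idx]))
print(np.mean(scores))                    # reported score: mean over folds
```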

Table 1 Model performance achieved on the FS-CAD dataset.

Considering that the ground-truth annotations for the vessels in the FS-CAD dataset are nearly within the limits of human visual discernment, the high Dice score achieved by the FS-Model attests to its ability to effectively segment even the most diminutive vascular structures.

Validation on the XCAD dataset

To further substantiate the advantages of pretraining, we extend our experimentation to the XCAD dataset, a publicly available coronary vessel segmentation dataset comprising 126 images with human-annotated vessel boundaries. Unlike those in the FS-CAD dataset, the ground-truth annotations in XCAD focus primarily on larger vessels. Given that the FS-Model is specifically designed for comprehensive vessel detection, directly comparing its performance on the XCAD dataset may not yield a fair assessment. However, the PT-Model is designed to learn generalized feature representations of coronary arteries, making it adaptable to various downstream tasks. To adapt our PT-Model to the specific characteristics of the XCAD dataset, we fine-tune it, producing what we refer to as the XCAD-Model. Following the original authors’ training and evaluation protocols, we employ threefold cross-validation. The resulting scores are documented in Table 2; the data for the other methods and models are taken directly from the work of Ma et al.19. The comparisons presented in this section, particularly in Tables 1 and 2, are designed to demonstrate the effectiveness of our approach in scenarios with limited annotated data. While traditional methods might achieve better results with larger annotated datasets, our goal is to show that good performance can be achieved with minimal manual annotation by leveraging pretraining on unannotated data, addressing the common challenge of limited annotations in medical imaging.

Table 2 Model performance achieved on the XCAD dataset.

To obtain supervised learning scores, we conduct a threefold cross-validation evaluation on the XCAD dataset. Domain adaptation methods, such as MMD (Bermudez et al., 2018) and YNet27, transfer knowledge from annotated datasets in the source domain to unannotated datasets in the target domain. Among the unsupervised methods, IIC28 is clustering based, while ReDO28 utilizes an adversarial architecture to extract the object mask of the input. Self-supervised vessel segmentation (SSVS)19, proposed by Ma et al., employs adversarial learning to acquire vascular representations from unlabelled samples and includes a fractal synthesis module to generate synthetic vessels. SSVS was previously the best-performing unsupervised method on XCAD but fails to surpass supervised methods.

Our tests show that the XCAD-Model achieves the highest Dice score of 0.755, surpassing the solely supervised learning methods. The high Dice score produced by the XCAD-Model aligns with our expectations, as it is fine-tuned based on the PT-Model that had already learned vascular features through cycle-consistent training. Intriguingly, the PT-Model, which is never trained on the XCAD dataset, still achieves a Dice coefficient of 0.715 after the AutoThresh method is implemented. This result slightly lags behind the performance achieved through supervised learning but confirms the robust generalization ability of the PT-Model and establishes it as the best-performing unsupervised learning method, significantly surpassing SSVS. In contrast, alternative methods such as MMD, YNet, IIC, and ReDO register Dice scores below 0.6, revealing a substantial performance gap between them and supervised learning.

Figure 4 shows a visual representation of the segmentation results produced on the XCAD dataset. Compared to the ground truth, all the models display only marginal differences when segmenting larger vessels; the disparities lie mainly in the identification of secondary and tertiary vessels as well as catheters. Remarkably, the XCAD-Model outperforms solely supervised learning methods in terms of recognizing catheters. The PT-Model, which has never been trained on this specific dataset, also approximates the performance of supervised learning methods and demonstrably outperforms the previously best unsupervised or self-supervised method, i.e., SSVS.

Fig. 4. Visualization of vessel segmentation results produced on the XCAD dataset.

Stenosis detection

In addition, we develop an additional task with potential clinical value. On the basis of the PT-Model, we fine-tune a network, referred to as SDNet, that is capable of identifying vascular stenosis locations within coronary angiography images. More precisely, we select 60 coronary angiography images with stenosis and manually annotate the stenosis sites. A subset of 10 images is designated as a test set, while the remaining images are allocated for training and validation. Figure 5 illustrates the efficacy of SDNet in detecting coronary stenosis within the test set. Despite the limited number of available training samples, SDNet proficiently identifies stenosis sites, attaining a Dice coefficient of 0.56 on the test set. For comparison, a U-Net model trained from scratch (without pretraining) converges more slowly and performs worse, with a Dice coefficient of merely 0.35. This outcome underscores the utility of the PT-Model as an exceptionally robust pretrained model. Comprehensive details concerning the training process and additional results from this experiment are provided in the Supplementary materials F.

Fig. 5. Performance of the vascular stenosis detection network (SDNet) on the test dataset. The first row shows the original images, while the second row presents the manual annotations (highlighted in white). The third row displays the detection outcomes of SDNet. Instead of a binary representation, the results are visualized using a pseudocolour scale, on which the intensity of the red colour indicates an increasing likelihood that a pixel is part of a vascular stenosis site.

Discussion

Our study introduces a novel pretraining approach and model that are specifically tailored for coronary angiograms. We uniquely employ an image-to-image (I2I) framework to conduct pretraining on large unlabelled angiographic image datasets. This overarching strategy resembles the highly successful “self-supervised plus fine-tuning” paradigm found in seminal models such as BERT29, GPTs30, and ViTs31. To enrich the vascular features extracted from angiograms, our pretraining tasks are intentionally designed to toggle between mask images and live images, thereby allowing us to acquire robust vascular representations.

Significance of deep subtraction

DSA has not gained widespread acceptance in cardiology, primarily because of its motion artefact removal limitations. As a result, clinicians are often more inclined to directly examine original angiograms. In contrast, our pretrained model inherently possesses single-frame subtraction capabilities, dramatically surpassing DSA in this specific context. This advancement allows for significantly enhanced subtraction images to be obtained, offering a more reliable alternative in clinical settings.

Fine-tuning

This pretrained model enables competitive vessel segmentation results to be obtained with a minimum of annotated samples. On the FS-CAD dataset, we realize fine-grained coronary angiography segmentation with just 40 training samples, achieving a Dice score of 0.828 when benchmarked against meticulous human annotations. On the XCAD dataset, our methodology surpasses traditional supervised learning techniques, establishing a new state-of-the-art (SOTA) approach.

Clinical applications

The model is inherently unaffected by motion, providing clearer coronary artery subtraction images. It can serve as an optional tool, offering radiologists and cardiologists a novel imaging display mode and allowing them to observe vessels in real time without the interference of background noise. On an RTX 3090, inference takes 0.3 s per frame, whereas on a single 2.0 GHz CPU it takes approximately 4 s per frame; CPU deployment therefore still requires speed optimization with inference frameworks such as ONNX Runtime. Moreover, the model provides highly accurate segmentation results for major vessels, making it invaluable for quantitative cardiovascular analyses that inform medical decision-making. While our implementation mitigates some of the computational challenges associated with U-Net architectures, future work could explore parallel computation strategies to further optimize performance and resource utilization. Recent studies have demonstrated the effectiveness of such approaches in medical image processing32,33,34,35.
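As a hypothetical sketch of that optimization path, the inference network could be exported to ONNX as follows; the stand-in module, input shape, and opset version are assumptions.

```python
# Hypothetical export of the inference network to ONNX for CPU deployment.
import torch

G_E = torch.nn.Conv2d(1, 1, 3, padding=1)   # stand-in for the real U-Net
dummy = torch.randn(1, 1, 512, 512)         # assumed input shape
torch.onnx.export(G_E, dummy, "subtraction.onnx", opset_version=17,
                  input_names=["frame"], output_names=["subtraction"])
```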

An additional significant aspect is that the pretrained model can substantially reduce the data requirements of downstream tasks. We conduct vessel stenosis detection with a small annotated dataset; the proposed approach functions as a real-time, automatic alert system that helps doctors pinpoint potential vessel stenosis during coronary angiography. This feature can be further utilized to automatically and quantitatively analyse vessel stenosis. Similarly, the model can be conveniently transferred to other downstream tasks without the need for large manually annotated datasets.

Limitations and future directions

While our model achieves competitive performance, particularly regarding the identification of larger vessels, it does encounter limitations when segmenting extremely fine distal vessels, as evidenced in Fig. 6. These overlooked vessels, represented in green, are often only 1–2 pixels in size and exhibit low contrast levels. Even upon performing meticulous manual annotation through image magnification, these diminutive vessels are easily neglected. Although these minor vessels may not be of paramount clinical interest, enhancing the ability of the model to operate in this area could offer advantages for automated analyses. Despite our efforts to mitigate class imbalance and improve small vessel detection, these remain ongoing challenges in medical image segmentation. Future work could explore advanced techniques to further address these issues, drawing inspiration from recent studies in other medical domains. For instance, Chandrasekar et al. have conducted comprehensive analyses of machine learning approaches to handle imbalanced datasets in the context of drug permeability studies, both across the placenta36 and the blood–brain barrier37. Their work on data balancing techniques and machine learning model investigations could provide valuable insights for improving our approach to small vessel detection and addressing class imbalance in coronary angiogram segmentation.

Fig. 6. Visual comparison between model-based segmentation results and human annotations. In the images, green denotes the ground truths, which correspond to blood vessels identified solely through manual annotations but missed by the model. Red indicates blood vessels detected exclusively by the model, and yellow marks the overlapping pixels. Notably, vessels with low contrast are prone to omission, as evidenced by the prevalence of green areas in regions where vessel visibility is poor.

Additional limitations of this study include the relatively limited scale and diversity of our dataset, the lack of multi-center validation, and the need for further clinical validation before widespread adoption. Future work should address these limitations through larger, more diverse datasets and extensive clinical trials.

In addition to refining the recognition abilities of the model, future research avenues include extending the model to other downstream tasks, such as multicategory vessel segmentation and the detection of various vascular abnormalities. This approach may also find applications in other medical vascular analysis tasks, thereby reducing the dependence on labelled data. Future research could also explore the integration of stochastic resonance (SR) techniques and other machine learning technologies to further enhance image contrast and potentially improve segmentation performance, particularly for fine vessel structures. Recent work has shown promise in applying SR to medical image analysis tasks38,39,40,41.

Conclusion

In this study, we introduce a novel vessel segmentation and extraction method for coronary angiography, specifically engineered to capture intricate vessel representations. A model pretrained via this approach excels in terms of background removal and single-frame subtraction, achieving effects similar to digital subtraction angiography (DSA) with only a single frame. Upon fine-tuning the model on a small annotated dataset, its vessel segmentation performance is further enhanced, even surpassing the performance of purely supervised methods. This approach significantly elevates the clarity of subtraction images and the accuracy of major vessel segmentation. These advances are not only academically significant but also have direct clinical value. They facilitate real-time vessel observation without interference from background noise and enable more precise quantitative cardiovascular analyses. Most importantly, this method offers a pathway for autonomously learning complex vessel representations from samples without the need for manual annotations, effectively reducing the demand for supervised samples in downstream tasks.