Melanoma Skin Cancer Identification with Explainability Utilizing Mask Guided Technique
Figure 1. Taxonomy of techniques.
Figure 2. High-level architecture.
Figure 3. Sample skin images in HAM10000 (bottom row: the two image types considered in this study; top row: other skin types available in the dataset).
Figure 4. Sample attribute masks of melanoma skin images from the ISIC 2018 Task 2 dataset.
Figure 5. Sample binary masks of melanoma skin images from the ISIC 2017 dataset.
Figure 6. Illustration of the U2-Net architecture.
Figure 7. Sample melanoma skin image after background removal.
Figure 8. The overall architecture of the proposed SM-ViT.
Figure 9. Explainability model architecture using Grad-CAM++.
Figure 10. Qualitative comparison of U2-Net and U-Net.
Figure 11. Confusion matrices of the mask-guided CNN models: (a) Xception, (b) ResNet, and (c) VGG16.
Figure 12. Learning curves of the mask-guided CNN models: (a) Xception, (b) ResNet, and (c) VGG16.
Figure 13. Comparison of ROC curves of the mask-guided CNNs.
Figure 14. Confusion matrices of (a) the Base-ViT model and (b) the proposed SM-ViT model.
Figure 15. Comparison of Grad-CAM++ and Grad-CAM for the same image.
Figure 16. Example of the IoU quantitative evaluation.
Figure 17. Explainability method results of each CNN model for images predicted as melanoma.
Figure 18. Explainability method results of each ViT-based model.
Figure 19. Comparison of IoU values for each model.
Figure 20. The generated medical report.
Figure 21. Classification accuracy on the sample set: proposed model vs. real users.
Figure 22. The percentage of (a) usefulness of the system and (b) degree of enhancement.
Abstract
1. Introduction
1. Apply extensive data augmentation to address the imbalanced datasets;
2. Train the U2-Net model using the ISIC 2017 Task 1 dataset to generate segmentation masks that separate the foreground lesion from the background of an image (a minimal illustrative sketch of applying such a mask follows this list);
3. Conduct a comparative study of different CNN and ViT-based models for melanoma and nevus skin cancer classification;
4. Assess model performance under different hyperparameter tuning configurations;
5. Enhance the performance of ViT in fine-grained visual categorization (FGVC) using SM-ViT;
6. Provide the ability to integrate and fine-tune on top of a ViT-based backbone that leverages the standard self-attention mechanism;
7. Improve the trustworthiness of the system using explainability methods such as Grad-CAM and Grad-CAM++ heat maps;
8. Evaluate the models qualitatively and quantitatively using intersection over union (IoU) and the skin cancer feature mask dataset (ISIC 2018 Task 2);
9. Develop a web application to use the proposed model as a support tool.
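As referenced in contribution 2 above, the study separates the lesion (foreground) from the surrounding skin (background) using binary masks predicted by the U2-Net segmentation model. The following is a minimal, illustrative Python sketch of applying such a mask to remove the background; the threshold value, file names, and helper function are assumptions for illustration only, not the authors' exact pipeline.

```python
import numpy as np
from PIL import Image

def remove_background(image_path, mask_path, threshold=0.5):
    """Apply a predicted saliency/binary mask to keep only the lesion region.

    Assumes the mask and the image share the same resolution; the 0.5 threshold
    is an assumed value, and the original pipeline may post-process the U2-Net
    output differently.
    """
    image = np.asarray(Image.open(image_path).convert("RGB"), dtype=np.float32)
    mask = np.asarray(Image.open(mask_path).convert("L"), dtype=np.float32) / 255.0

    binary_mask = (mask >= threshold).astype(np.float32)   # 1 = lesion, 0 = background
    masked = image * binary_mask[..., np.newaxis]          # zero out background pixels
    return Image.fromarray(masked.astype(np.uint8))

# Hypothetical usage:
# remove_background("lesion.jpg", "u2net_mask.png").save("lesion_no_background.png")
```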
2. Background
2.1. Explainable Artificial Intelligence
2.2. Related Studies
3. Methodology
3.1. Model Design
3.2. Dataset
3.3. U2-Net Based Segmentation Model
Algorithm 1: Proposed segmentation module training pipeline
3.4. CNN-Based Classification
Algorithm 2: Proposed CNN-based classification and explainability pipeline
3.5. ViT-Based Classification
Algorithm 3: Proposed ViT-based classification and explainability pipeline
3.6. Explainability
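The body of this section (not reproduced in this outline) describes the Grad-CAM and Grad-CAM++ explainability pipeline. As a rough, non-authoritative illustration of the underlying Grad-CAM idea (class-score gradients are global-average-pooled over the last convolutional feature maps and then used to weight those maps), a minimal TensorFlow sketch is given below. The default layer name, class selection, and normalisation are assumptions, not the authors' implementation.

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name="block14_sepconv2_act", class_index=None):
    """Minimal Grad-CAM sketch for a Keras CNN (illustrative; not the paper's exact pipeline).

    The default layer name is the last convolutional activation of Keras' Xception and is
    only an assumption here; other backbones need their own layer name.
    """
    grad_model = tf.keras.models.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        if class_index is None:
            class_index = tf.argmax(preds[0])
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)           # d(class score) / d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))           # global-average-pool the gradients
    cam = tf.reduce_sum(weights[:, tf.newaxis, tf.newaxis, :] * conv_out, axis=-1)
    cam = tf.nn.relu(cam)[0]                               # keep positive contributions only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()     # normalise to [0, 1]
```

Note that when the backbone is nested as a sub-model (as in the transfer-learning sketch in Section 3.7), the convolutional layer has to be retrieved from that sub-model rather than from the outer model.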
3.7. Implementation Details
- TensorFlow: provides the core framework for building and training DL models. We used it to implement the CNNs for classifying melanoma skin cancer images;
- Keras: a high-level API built on top of TensorFlow that simplifies building and training these models. CNN architectures such as ResNet50, VGG16, and Xception can be conveniently loaded in Keras with pre-trained weights, facilitating transfer learning (a minimal illustrative sketch follows this list);
- PyTorch: another widely used framework for building and training DL models, supporting tasks such as data loading and pre-processing, model training, and implementing model architectures. Its dynamic computational graph and efficient memory usage enable large-scale DL tasks. We used this library to develop the ViT-based approach;
- Scikit-learn (sklearn): a Python library that supports classification tasks and interoperates with the Python numerical and scientific libraries NumPy and SciPy. We used Scikit-learn during the model evaluation process;
- NumPy: a Python library that supports large, multi-dimensional arrays and matrices, along with high-level mathematical functions to operate on them. We used NumPy for numerical operations, manipulating arrays during data preprocessing, and integration with other libraries such as TensorFlow and PyTorch;
- Matplotlib: a plotting library for Python and its numerical extension NumPy that provides an object-oriented API for embedding plots into applications. We used Matplotlib for visualizing data, plotting the training history of the models, and displaying the heat maps generated by Grad-CAM and Grad-CAM++.
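As a concrete example of the Keras transfer-learning setup referenced in the list above, the sketch below loads a pre-trained Xception backbone and attaches a small classification head using the tuned values reported later for Xception (103 dense nodes, dropout 0.741, learning rate 1.7 × 10⁻⁵, Adam). The head layout, frozen backbone, and input size are assumptions for illustration; they are not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_xception_classifier(input_shape=(224, 224, 3), num_classes=2):
    """Transfer-learning head on a pre-trained Xception backbone (illustrative only)."""
    base = tf.keras.applications.Xception(
        weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False                      # freeze backbone; fine-tuning policy is an assumption

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(103, activation="relu"),   # tuned dense-layer size reported for Xception
        layers.Dropout(0.741),                  # tuned dropout reported for Xception
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1.7e-5),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```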
4. Results and Discussion
4.1. Segmentation Result
4.2. Classification Result
4.2.1. Results of CNN Classification
4.2.2. Results of ViT Classification
4.3. Explainability Results
4.3.1. Overall Explainability
4.3.2. Explainability of CNN Based Models
4.3.3. Explainability of ViT-Based Models
5. Results Validation and System Usability
5.1. Web Application Support Solution
5.2. Real-World Validation of Results
5.3. System Usability Study
6. Discussion
Study | XAI Method | Classifier | Accuracy |
---|---|---|---|
2019 [3] | CAM, Grad-CAM | Inception | 88% |
2020 [18] | Integrated gradient | ResNet18 | 64% |
2020 [39] | - | MobileNet | 81.24% |
2020 [37] | - | EfficientNet | 90.0% |
2020 [38] | - | CNN-based | 92.90% |
2020 [40] | - | ResNet + Inception | 92.83% |
2020 [35] | - | ResNet50 Ensemble | 93% |
2020 [41] | - | Seg + CNN | 95% |
2021 [32] | CAM | CNN-based | 74% |
2021 [36] | - | CNN Ensemble | 88% |
2021 [34] | - | CNN, Autoencoder | 92.5% |
2021 [42] | Soft Attention | ResNet + Inception | 93.4% |
2022 [33] | SHAP, Grad-CAM | Seg + CNN | 90.6% |
2023 [13] | Grad-CAM, Grad-CAM++ | Xception | 90.24% |
2023 [43] | Attention Map | ViT | 94.1% |
Our study (mask-guided, without duplicates) | Grad-CAM, Grad-CAM++ | SM-ViT | 92.79%
 | | VGG16 | 97.37%
 | | ResNet50 | 98.18%
 | | Xception | 98.37%
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- American Cancer Society. Melanoma Skin Cancer Statistics. 2023. Available online: https://www.cancer.org/ (accessed on 20 April 2023).
- Wu, Y.; Chen, B.; Zeng, A.; Pan, D.; Wang, R.; Zhao, S. Skin cancer classification with deep learning: A systematic review. Front. Oncol. 2022, 12, 893972. [Google Scholar] [CrossRef]
- Young, K.; Booth, G.; Simpson, B.; Dutton, R.; Shrapnel, S. Deep neural network or dermatologist? In Proceedings of the 9th International Workshop on Multimodal Learning for Clinical Decision Support, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 48–55. [Google Scholar] [CrossRef]
- Wickramanayake, S.; Rasnayaka, S.; Gamage, M.; Meedeniya, D.; Perera, I. Explainable Artificial Intelligence for Enhanced Living Environments: A Study on User Perspective; Advances in Computers; Elsevier: Amsterdam, The Netherlands, 2023. [Google Scholar] [CrossRef]
- Dasanayaka, S.; Shantha, V.; Silva, S.; Meedeniya, D.; Ambegoda, T. Interpretable machine learning for brain tumour analysis using MRI and whole slide images. Softw. Impacts 2022, 13, 100340. [Google Scholar] [CrossRef]
- Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 2018, 5, 180161. [Google Scholar] [CrossRef] [PubMed]
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar] [CrossRef]
- Chattopadhay, A.; Sarkar, A.; Howlader, P.; Balasubramanian, V.N. Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 839–847. [Google Scholar] [CrossRef]
- Demidov, D.; Sharif, M.H.; Abdurahimov, A.; Cholakkal, H.; Khan, F.S. Salient Mask-Guided Vision Transformer for Fine-Grained Classification. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications—Volume 4 VISAPP: VISAPP, Lisbon, Portugal, 19–21 February 2023. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations, Virtual Event, 3–7 May 2021. [Google Scholar]
- Munn, M.; Pitman, D. Explainable AI for Practitioners; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2022. [Google Scholar]
- Shyamalee, T.; Meedeniya, D.; Lim, G.; Karunarathne, M. Automated Tool Support for Glaucoma Identification with Explainability Using Fundus Images. IEEE Access 2024, 12, 17290–17307. [Google Scholar] [CrossRef]
- Gamage, L.; Isuranga, U.; De Silva, S.; Meedeniya, D. Melanoma Skin Cancer Classification with Explainability. In Proceedings of the 3rd International Conference on Advanced Research in Computing (ICARC), Belihuloya, Sri Lanka, 23–24 February 2023; pp. 30–35. [Google Scholar] [CrossRef]
- Pereira, P.M.; Thomaz, L.A.; Tavora, L.M.; Assuncao, P.A.; Fonseca-Pinto, R.M.; Paiva, R.P.; de Faria, S.M.M. Melanoma classification using light-Fields with morlet scattering transform and CNN: Surface depth as a valuable tool to increase detection rate. Med. Image Anal. 2022, 75, 102254. [Google Scholar] [CrossRef] [PubMed]
- Shinde, S.; Tupe-Waghmare, P.; Chougule, T.; Saini, J.; Ingalhalikar, M. Predictive and discriminative localization of pathology using high resolution class activation maps with CNNs. PeerJ Comput. Sci. 2021, 7, e622. [Google Scholar] [CrossRef] [PubMed]
- Nunnari, F.; Kadir, M.A.; Sonntag, D. On the Overlap Between Grad-CAM Saliency Maps and Explainable Visual Features in Skin Cancer Images. In Proceedings of the International Cross-Domain Conference for Machine Learning and Knowledge Extraction, Virtual Event, 17–20 August 2021; pp. 241–253. [Google Scholar] [CrossRef]
- Murabayashi, S.; Iyatomi, H. Towards Explainable Melanoma Diagnosis: Prediction of Clinical Indicators Using Semi-supervised and Multi-task Learning. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 4853–4857. [Google Scholar] [CrossRef]
- Margeloiu, A.; Simidjievski, N.; Jamnik, M.; Weller, A. Improving interpretability in medical imaging diagnosis using adversarial training. arXiv 2020, arXiv:2012.01166. [Google Scholar] [CrossRef]
- Kaur, R.; GholamHosseini, H.; Sinha, R.; Lindén, M. Melanoma classification using a novel deep convolutional neural network with dermoscopic images. Sensors 2022, 22, 1134. [Google Scholar] [CrossRef] [PubMed]
- Wei, L.; Ding, K.; Hu, H. Automatic Skin Cancer Detection in Dermoscopy Images Based on Ensemble Lightweight Deep Learning Network. IEEE Access 2020, 8, 99633–99647. [Google Scholar] [CrossRef]
- Meedeniya, D. Deep Learning: A Beginners’ Guide; CRC Press LLC: Boca Raton, FL, USA, 2023; Available online: https://www.routledge.com/9781032473246 (accessed on 30 January 2024).
- Qin, X.; Zhang, Z.; Huang, C.; Dehghan, M.; Zaiane, O.R.; Jagersand, M. U2-Net: Going deeper with nested U-structure for salient object detection. Pattern Recognit. 2020, 106, 107404. [Google Scholar] [CrossRef]
- ISIC Challenge 2018 Dataset. Available online: https://challenge.isic-archive.com/landing/2018/46/ (accessed on 28 March 2023).
- Qin, Z.; Liu, Z.; Zhu, P.; Xue, Y. A GAN-based image synthesis method for skin lesion classification. Comput. Methods Programs Biomed. 2020, 195, 105568. [Google Scholar] [CrossRef]
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1800–1807. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556. [Google Scholar] [CrossRef]
- Keras. BayesianOptimization Tuner. Available online: https://keras.io/api/keras_tuner/tuners/bayesian/ (accessed on 8 March 2022).
- Nimalsiri, W.; Hennayake, M.; Rathnayake, K.; Ambegoda, T.D.; Meedeniya, D. Automated Radiology Report Generation Using Transformers. In Proceedings of the 3rd International Conference on Advanced Research in Computing (ICARC), Belihuloya, Sri Lanka, 23–24 February 2023; pp. 90–95. [Google Scholar] [CrossRef]
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar] [CrossRef]
- Dasanayaka, S.; Silva, S.; Shantha, V.; Meedeniya, D.; Ambegoda, T. Interpretable Machine Learning for Brain Tumor Analysis Using MRI. In Proceedings of the 2022 2nd International Conference on Advanced Research in Computing (ICARC), Belihuloya, Sri Lanka, 23–24 February 2022; pp. 212–217. [Google Scholar] [CrossRef]
- Chowdhury, T.; Bajwa, A.R.; Chakraborti, T.; Rittscher, J.; Pal, U. Exploring the correlation between deep learned and clinical features in melanoma detection. In Proceedings of the 25th Annual Conference on Medical Image Understanding and Analysis (MIUA), Oxford, UK, 12–14 July 2021; pp. 3–17. [Google Scholar] [CrossRef]
- Wang, S.; Yin, Y.; Wang, D.; Wang, Y.; Jin, Y. Interpretability-based multimodal convolutional neural networks for skin lesion diagnosis. IEEE Trans. Cybern. 2021, 52, 12623–12637. [Google Scholar] [CrossRef]
- Ahmad, B.; Jun, S.; Palade, V.; You, Q.; Mao, L.; Zhongjie, M. Improving skin cancer classification using heavy-tailed Student t-distribution in generative adversarial networks (TED-GAN). Diagnostics 2021, 11, 2147. [Google Scholar] [CrossRef] [PubMed]
- Le, D.N.; Le, H.X.; Ngo, L.T.; Ngo, H.T. Transfer learning with class-weighted and focal loss function for automatic skin cancer classification. arXiv 2020, arXiv:2009.05977. [Google Scholar] [CrossRef]
- Rahman, Z.; Hossain, M.S.; Islam, M.R.; Hasan, M.M.; Hridhee, R.A. An approach for multiclass skin lesion classification based on ensemble learning. Inform. Med. Unlocked 2021, 25, 100659. [Google Scholar] [CrossRef]
- Pham, T.C.; Doucet, A.; Luong, C.M.; Tran, C.T.; Hoang, V.D. Improving skin-disease classification based on customized loss function combined with balanced mini-batch logic and real-time image augmentation. IEEE Access 2020, 8, 150725–150737. [Google Scholar] [CrossRef]
- Polat, K.; Koc, K.O. Detection of skin diseases from dermoscopy image using the combination of convolutional neural network and one-versus-all. J. Artif. Intell. Syst. 2020, 2, 80–97. [Google Scholar] [CrossRef]
- Lucius, M.; De All, J.; De All, J.A.; Belvisi, M.; Radizza, L.; Lanfranconi, M.; Lorenzatti, V.; Galmarini, C.M. Deep neural frameworks improve the accuracy of general practitioners in the classification of pigmented skin lesions. Diagnostics 2020, 10, 969. [Google Scholar] [CrossRef]
- Chaturvedi, S.S.; Tembhurne, J.V.; Diwan, T. A multi-class skin Cancer classification using deep convolutional neural networks. Multimed. Tools Appl. 2020, 79, 28477–28498. [Google Scholar] [CrossRef]
- Adegun, A.A.; Viriri, S. FCN-based DenseNet framework for automated detection and classification of skin lesions in dermoscopy images. IEEE Access 2020, 8, 150377–150396. [Google Scholar] [CrossRef]
- Datta, S.K.; Shaikh, M.A.; Srihari, S.N.; Gao, M. Soft attention improves skin cancer classification performance. In Proceedings of the 4th International Workshop on Interpretability of Machine Intelligence in Medical Image Computing, (iMIMIC), Singapore, Singapore, 22 September 2021; pp. 13–23. [Google Scholar] [CrossRef]
- Yang, G.; Luo, S.; Greer, P. A Novel Vision Transformer Model for Skin Cancer Classification. Neural Process. Lett. 2023, 55, 9335–9351. [Google Scholar] [CrossRef]
 | Melanoma | Nevus Lesions [Nevi]
---|---|---
Before removing duplication | 1113 | 6705 |
After removing duplication | 614 | 5404 |
Balanced image count | 614 | 614 |
Skin Cancer | Training Set | Testing Set | Validation Set |
---|---|---|---|
Melanoma | 512 | 65 | 64 |
Nevus | 512 | 65 | 64 |
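Contribution 1 refers to extensive data augmentation for addressing class imbalance. The specific transforms are not listed in this outline, so the sketch below uses commonly assumed augmentations (rotations, flips, shifts, zoom) with Keras' ImageDataGenerator purely as an illustration; the directory layout and parameter values are hypothetical.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Assumed augmentation settings; the actual transforms used in the study are not listed here.
augmenter = ImageDataGenerator(
    rotation_range=30,
    horizontal_flip=True,
    vertical_flip=True,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rescale=1.0 / 255,
)

# Hypothetical directory layout: data/train/melanoma, data/train/nevus
train_flow = augmenter.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="categorical")
```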
Parameter Name | Range |
---|---|
Epochs | 1–100 |
Number of search rounds | 4 |
Number of iterations per search | 3 |
Learning rate | 0.0000001–0.0001 |
Number of nodes in the Dense layer | 50–255 |
Dropout | 0.1–0.8 |
Optimizer function | Adam/SGD |
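The ranges above correspond to the Keras BayesianOptimization tuner cited in the references. A minimal sketch of how such a search could be wired up is shown below; the objective metric, input pipeline, backbone choice, and the mapping of "search rounds" to max_trials and "iterations per search" to executions_per_trial are assumptions.

```python
import keras_tuner as kt
import tensorflow as tf
from tensorflow.keras import layers

def build_model(hp):
    """Search space mirroring the ranges in the table above (illustrative only)."""
    base = tf.keras.applications.Xception(weights="imagenet", include_top=False,
                                          input_shape=(224, 224, 3))
    base.trainable = False
    model = tf.keras.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(hp.Int("dense_units", 50, 255), activation="relu"),
        layers.Dropout(hp.Float("dropout", 0.1, 0.8)),
        layers.Dense(2, activation="softmax"),
    ])
    lr = hp.Float("learning_rate", 1e-7, 1e-4, sampling="log")
    optimizer = (tf.keras.optimizers.Adam(lr)
                 if hp.Choice("optimizer", ["adam", "sgd"]) == "adam"
                 else tf.keras.optimizers.SGD(lr))
    model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.BayesianOptimization(
    build_model,
    objective="val_accuracy",
    max_trials=4,              # "Number of search rounds" in the table (assumed mapping)
    executions_per_trial=3,    # "Number of iterations per search" (assumed mapping)
    directory="tuning", project_name="melanoma",
)
# tuner.search(train_ds, validation_data=val_ds, epochs=100)  # hypothetical datasets
```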
Parameter Name | Xception | ResNet50 | VGG16 |
---|---|---|---|
Epochs | 92 | 60 | 73 |
Number of searches | 4 | 4 | 4 |
Number of iterations per search | 3 | 3 | 3 |
Learning rate | 1.7 × 10⁻⁵ | 1.3 × 10⁻⁴ | 5.5 × 10⁻⁶
Number of nodes in the dense layer | 103 | 171 | 204 |
Dropout | 0.741 | 0.553 | 0.563 |
Optimizer function | Adam | Adam | Adam |
Model | IoU | Loss | Recall |
---|---|---|---|
U2-Net | 0.93 | 11.65% | 92.77% |
U-Net | 0.84 | 15.40% | 85.21% |
Model | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Xception * | 98.37% | 95.92% | 99.01% |
Vanilla-Xception | 84.55% | 89.79% | 81.08% |
ResNet50 * | 98.18% | 97.95% | 98.65% |
Vanilla-ResNet50 | 86.91% | 91.83% | 83.78% |
VGG16 * | 97.56% | 97.82% | 97.40% |
Vanilla-VGG16 | 80.13% | 86.54% | 79.83% |
Inception * | 95.93% | 91.83% | 98.64% |
Inception | 61.78% | 99.9% | 36.48% |
MobileNet * | 96.56% | 93.87% | 98.99% |
MobileNet | 95.93% | 89.79% | 97.89% |
Model | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Xception * | 90.94% | 91.83% | 89.18% |
Xception | 82.11% | 95.92% | 72.97% |
ResNet50 * | 91.05% | 99.90% | 85.13% |
ResNet50 | 88.61% | 93.87% | 85.13% |
VGG16 * | 93.49% | 91.83% | 94.45% |
VGG16 | 85.88% | 88.12% | 93.12% |
Inception * | 92.68% | 97.95% | 89.18%
Inception | 84.55% | 87.75% | 82.43%
MobileNet * | 91.05% | 95.91% | 87.83% |
MobileNet | 88.61% | 93.87% | 85.13% |
Model | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Xception * | 81.17% | 86.09% | 75.93% |
ResNet50 * | 83.41% | 88.70% | 77.78% |
VGG16 * | 82.96% | 91.03% | 74.07% |
Inception * | 79.82% | 91.30% | 71.30%
MobileNet * | 81.16% | 95.52% | 64.81% |
Model | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Base-ViT | 84.68% | 83.87% | 85.48% |
SM-ViT | 92.79% | 91.09% | 93.54% |
Model | Average IoU (Grad-CAM++) | Average IoU (Grad-CAM) |
---|---|---|
ResNet50 * | 0.5093 | 0.4932 |
Xception * | 0.4745 | 0.4664 |
VGG16 * | 0.3968 | 0.3981 |
Model | Average IoU (Grad-CAM++) | Average IoU (Grad-CAM) |
---|---|---|
Base-ViT | 0.5463 | 0.5822 |
SM-ViT | 0.6220 | 0.6947 |
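The IoU values above compare the explanation heat maps against the ISIC 2018 Task 2 attribute masks. A minimal sketch of such a computation follows; the binarization threshold of 0.5 is an assumed value, not necessarily the one used in the study.

```python
import numpy as np

def heatmap_iou(heatmap, ground_truth_mask, threshold=0.5):
    """Intersection over Union between a thresholded heat map and a binary feature mask.

    `heatmap` is expected in [0, 1] (e.g., a normalised Grad-CAM++ map resized to the
    mask resolution); the 0.5 threshold is an assumption for illustration.
    """
    pred = heatmap >= threshold
    gt = ground_truth_mask > 0
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return intersection / union if union > 0 else 0.0
```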
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).