Identification of Breast Cancer by Using MATLAB: A Minor Project Report
BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING
Supervisor Name
Ms. JANANI M
Submitted By
NAVEEN V(18BEC4107)
SURIYA M(18BEC4172)
YAATHASH B(18BEC4193)
2.Sathish Kumar S
3.Suriya M
4.Yaathash B
Vision
To emerge as a leader among the top institutions in the field of technical education
Mission
M1: Produce smart technocrats with empirical knowledge who can surmount the global
challenges.
M2: Create a diverse, fully-engaged, learner-centric campus environment to provide quality
education to the students.
M3: Maintain mutually beneficial partnerships with our alumni, industry and professional
associations
Vision and Mission of the Department
Vision
To empower the Electronics and Communication Engineering students with emerging
technologies, professionalism, innovative research and social responsibility.
Mission
M1: Attain academic excellence through an innovative teaching-learning process, research
areas & laboratories and consultancy projects.
M2: Inculcate problem-solving and lifelong-learning abilities in the students.
M3: Provide entrepreneurial skills and leadership qualities.
M4: Render the technical knowledge and industrial skills of faculties.
PROGRAM EDUCATIONAL OBJECTIVES (PEOs)
PEO 1: Graduates will have successful careers in software industries and R&D divisions through
continuous learning.
PEO 2: Graduates will provide effective solutions for real-world problems in the key domain of
computer science and engineering and engage in lifelong learning.
PEO 3: Graduates will excel in their profession by being ethically and socially responsible.
Title: Minor Project - II
COs: CO1 CO2 CO3 CO4 CO5
POs: PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10
PSOs: PSO1 PSO2
ABSTRACT
Breast cancer is one of the most common cancers affecting women around the
world. Mammography is the best-known and most effective method for detecting early
signs of breast cancer. However, because mammography has weaknesses such as
painful procedures and radiation exposure, researchers have introduced another
diagnostic method based on analysing thermal images. The purpose of this project is
to design a system that detects the signs shown in mammograms and thermal images
using image processing techniques implemented in MATLAB. The image processing
workflow is divided into several elements: image acquisition, image pre-processing,
image processing, feature extraction, object classification and classification
decision. Both types of image are analysed step by step according to these elements.
Mammogram images are analysed using morphological techniques before feature
extraction, which leads to the classification of each image into 3 classes
('Normal Fatty breast', 'Abnormal Fatty breast' and 'Glandular breast'). For thermal
images, the distribution of heat around the breast is the feature extracted and
analysed. The different ranges of heat in the image are used to identify possible
areas of cancer. This project also includes the construction of a Graphical User
Interface (GUI) to make the system more user-friendly.
TABLE OF CONTENTS
1 INTRODUCTION
2 LITERATURE REVIEW
3 RESEARCH METHODOLOGY
4 CONCLUSION
4.1 Result
REFERENCE
CHAPTER 1
INTRODUCTION
Breasts can be classified into 2 types according to density: 'Fatty breast' and
'Glandular breast'. When the amount of fatty tissue exceeds the amount of
fibro-glandular tissue, the breast is classified as 'Fatty breast'; when the amount
of fibro-glandular tissue exceeds the amount of fatty tissue, it is classified as
'Glandular breast'. Breast cancer occurs when breast tissue grows, changes and
multiplies rapidly without control, which may form a lump or mass of extra tissue as
shown in Figure 1.1. These masses are called tumors and can be either cancerous
(malignant) or non-cancerous (benign) [1].
Breast cancer is one of the most common cancers affecting women and the
most common cause of death among middle-aged women. According to the World
Health Statistics 2011 by the Global Health Observatory (GHO), malignant neoplasms
cause about 11.81% of deaths among the female population worldwide, and breast
cancer has the highest share, 15.80%, compared to other types of cancer [2].
Successful treatment of breast cancer depends on early detection. Currently, two
imaging methods are used to detect masses: mammography and thermography.
To overcome the problems that cause high rates of false positive and false
negative detection, a Computer Assisted Detection (CAD) system is developed to
assist clinicians in identifying cancerous tissue in mammograms and thermal images.
The system is designed using image processing techniques on the MATLAB platform.
1.2 Objectives
This project aims to develop a breast cancer detection system using image
processing techniques.
1.2.1 Sub-objective
The 150 breast mammogram images for this study are obtained from a trusted
online database (MIAS database) [11]. Breast thermal images with abnormalities are
obtained from 6 case studies by Pacific Chiropractic and Research Centre Infrared
Imaging in California [4]. Both types of image are analysed using MATLAB software.
CHAPTER 2
LITERATURE REVIEW
Several studies have been carried out to develop CAD systems for detecting breast
cancer. The references for this report are taken from journals, books and conference
papers on mammogram and thermal images.
The next chapter of the module covers noise. Noise is degradation in an image caused
by disturbance during image acquisition and transfer. Removing noise is important
for restoring the image to its original state and for analysing it. The types of
noise discussed in this module are salt and pepper noise and Gaussian noise. The
image is filtered with a filter created by the fspecial function to clean up the
noise. The next chapter explains one of the most useful pieces of information in an
image: edges. Edges are used to measure the size of an object, to isolate an object
from the background, and to recognise and classify objects in the image. A number of
edge detection methods are discussed in this module, such as Roberts, Sobel and
Prewitt edge detection. The module also covers morphology, an image processing
operation for analysing shapes in an image. Morphology consists of many types of
operations, and some of them, such as dilation, erosion, opening, closing,
hit-or-miss transform, region filling and connected components, are discussed in
this module. Colour processing is discussed in the next chapter. The main topics are
the meaning of colour in image processing, colour models, colour images in MATLAB,
pseudo-colouring and colour image processing. For example, the imshow function can
be used to display the individual RGB components of an RGB image. [17]
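As a rough illustration of the linear noise filtering mentioned above, the following
is a minimal sketch in plain Python rather than MATLAB. It applies a 3 x 3 averaging
kernel to a small grayscale image. The function name mean_filter, the sample image
and the choice to replicate border pixels are assumptions made for this example and
do not come from the cited module.

```python
def mean_filter(image, size=3):
    """Smooth a 2-D grayscale image (list of lists) with a size x size averaging kernel."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    # Assumption: replicate border pixels by clamping the indices.
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += image[rr][cc]
            out[r][c] = total / (size * size)
    return out


# Hypothetical 5 x 5 image: flat background with one bright noise pixel.
noisy = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 100, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
smoothed = mean_filter(noisy)
```

The bright pixel is spread over its neighbourhood instead of being removed, which is
one reason a median filter is often preferred for salt and pepper noise.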
According to Hao Yuen Kueh et al., biological images contain many patterns
and objects that may convey information about biological mechanisms. Their tutorial
discusses how to extract data from raw microscopic images using MATLAB. The
advantage of extracting and quantifying objects and patterns with automated image
analysis, compared to manual methods, is that it provides an unbiased approach to
extracting information from images and testing hypotheses. Automated analysis also
facilitates the collection of large data sets for statistical analysis. The first
section of the tutorial covers how to read, display, write and convert images. The
authors also discuss how MATLAB represents images and how to convert between
different image types. The second section discusses contrast adjustment. As the
majority of biological images have low dynamic range and their features are
difficult to analyse, the appearance of the image needs to be enhanced using
different intensity transformations. This step may improve the performance of image
segmentation algorithms and feature recognition. The next section discusses spatial
filtering techniques. The filters explained are smoothing filters (average filter
and Gaussian filter), edge detection filters (Prewitt filter and Sobel filter), the
Laplacian filter and the median filter. Mathematical morphology, which is used to
extract features and components in images, is discussed in the next chapter. The
operations are dilation, erosion, opening (erosion followed by dilation), closing
(dilation followed by erosion), filling holes and clearing border objects. Image
segmentation, the process of subdividing an image into regions, is discussed in the
next section. Segmentation techniques are used to extract quantitative information
for processing and analysis. The segmentation techniques covered are edge detection
and morphological watershed. The tutorial also discusses the analysis of dynamics
and motion in biological images. The techniques used to visualize dynamic behaviour
are kymographs (two-dimensional analogs of time traces), difference images,
maximum intensity projections, image cross-correlation and particle tracking. [19]
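The definitions of opening and closing given above can be sketched in a few lines of
plain Python. This is an illustration only, not code from the tutorial: the function
names, the fixed 3 x 3 square structuring element and the rule that neighbours
outside the image are ignored are all assumptions made for this example.

```python
def _neighbourhood(img, r, c):
    """Values in the 3 x 3 window around (r, c), ignoring positions outside the image."""
    rows, cols = len(img), len(img[0])
    return [
        img[rr][cc]
        for rr in range(max(r - 1, 0), min(r + 2, rows))
        for cc in range(max(c - 1, 0), min(c + 2, cols))
    ]


def erode(img):
    """A pixel stays foreground only if every pixel in its window is foreground."""
    return [[int(all(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    """A pixel becomes foreground if any pixel in its window is foreground."""
    return [[int(any(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes objects smaller than the window."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills gaps smaller than the window."""
    return erode(dilate(img))


# Hypothetical binary mask: a 3 x 3 object plus one isolated pixel.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
opened = opening(mask)
closed = closing(mask)
```

In this example the opening keeps the 3 x 3 object and removes the isolated pixel,
which is the behaviour that makes opening useful for discarding small structures.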
Due to human limitations, computer systems have to take the major role in detecting
abnormal tissue. The challenges faced by such systems are the wide range of
abnormality features and the difficulty of distinguishing abnormalities from
surrounding cells. Most of the systems developed involve algorithms that consist of
two stages. The first stage detects suspicious lesions and the second stage reduces
the number of false positives. In the BI-RADS system discussed in this paper,
detected lesions are classified as masses, calcifications, architectural distortion
and bilateral asymmetry. Masses are classified as benign or malignant based on
density (fat-containing, low density, isodense and high density), margins
(circumscribed, microlobulated, obscured, indistinct and spiculated) and shape
(round, oval, lobular and irregular). Calcifications are classified as benign,
suspicious for malignancy or highly suspicious for malignancy based on the
distribution of the cluster, size, shape and variability. For architectural
distortion, the lesion is classified as malignant when it is associated with other
lesions such as masses, and as benign when scarring or soft-tissue damage due to
trauma is detected. Bilateral asymmetry is analysed based on its texture, shape
measurement, topology, brightness distribution, roughness, pattern asymmetry and
directionality. [3]
Ranjeet Singh Tomar et al. (2009) state that the image processing
techniques described in their journal paper are more radiologist-friendly. The
system is designed using image processing techniques in MATLAB. The techniques
used are edge detection and morphology. The detection process starts with detecting
the entire cell in the image, followed by filling gaps, dilating gaps, removing the
border, smoothing the objects, finding structures and lastly extracting large
objects. For feature extraction to find the wanted area, 3 steps are suggested:
reducing uneven illumination, determining the size distribution in the top-hat image
and calculating first derivatives. The feature extraction results are then plotted
on a graph to be analysed for classification. [6]
According to Hala Al-Shamlan and Ali El-Zaart (2010), feature extraction
in mammograms is key to the early detection of breast cancer. In this study,
they aim to determine the range of values for each extracted feature. Before the
ranges were determined, image pre-processing and image segmentation were applied to
the images. Image pre-processing suppresses noise and improves the contrast of the
image. Image segmentation detects the suspicious lesion. The features used fall into
three main categories: Geometric, Texture and Gradient features. For the Geometric
category, the features measured are area, perimeter and compactness [20]. Features
in the Texture category are mostly obtained from the image histogram. They are mean,
mean global area, mean local area, uniformity, standard deviation, smoothness,
skewness, entropy, correlation and inverse. The last category is the Gradient
category. Features classified under this category are Sobel-mean, Sobel-mean global
area, Sobel-mean local area, Sobel-uniformity, Sobel-standard deviation,
Sobel-smoothness, Sobel-skewness, Sobel-entropy, Sobel-correlation and
Sobel-inverse. After applying up to 23 types of feature extraction to 80
mammograms, they obtained the range of values for each feature, which may be used
for further processing in their breast cancer CAD system. [7]
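The geometric features named above can be illustrated with a short plain-Python
sketch. The definitions used here are assumptions for this example, not necessarily
those of [7] or [20]: area is the number of foreground pixels, perimeter is the
number of pixel sides facing the background, and compactness is taken as
P^2 / (4 * pi * A). The function name and the sample mask are hypothetical.

```python
import math


def geometric_features(mask):
    """Return (area, perimeter, compactness) of the foreground in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            # Count each pixel side that faces background or the image edge.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or not mask[rr][cc]:
                    perimeter += 1
    # Assumed definition: P^2 / (4 * pi * A); other papers use the reciprocal.
    compactness = perimeter ** 2 / (4 * math.pi * area) if area else float("nan")
    return area, perimeter, compactness


# Hypothetical segmented lesion: a 3 x 3 square inside a 5 x 5 image.
lesion = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
area, perimeter, compactness = geometric_features(lesion)
```

Under this definition, more irregular outlines give larger compactness values, which
is why the measure is sometimes used to separate smooth from spiculated masses.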
Monique Frize, Christophe Herry and Robert Roberge (2002) studied the
3 techniques in Head et al.'s methods [21]. The study shows that the third method
provided more reliable results than the first and second methods when applied to a
sample of 9 patients (6 with a diagnosis of normal and 3 with cancer). One of the
analyses increased the threshold value in the methods, and the result was no false
negatives or false positives on the sample. Based on this preliminary result, they
concluded that future work should focus on improving the third method to enhance
thermogram diagnosis and decrease false negatives and false positives. [8]
and match the results to clinical findings. The system is then interfaced with a
graphical user interface (GUI) developed to make thermal image analysis easier for
the radiologist or clinician. [9]
CHAPTER 3
RESEARCH METHODOLOGY
This study uses the main elements of image processing, which are image
acquisition, image pre-processing, image processing, feature extraction, object
classification and classification decision, as shown in Figure 3.1, to develop a CAD
system for mammograms and thermal images.
The image acquisition step involves the camera and its connection to the
computer or processor, which receives the image in digital format. Image
pre-processing improves and enhances the image for the processing step. Image
processing is a further step in analysing the image to obtain the desired object.
Many image processing techniques can be used in this step, such as morphological
processing, edge detection and compression. Feature extraction is where a set of
desired features that are good for classification is extracted from the pixel data
of the image. Object classification and classification decision are the steps that
make a decision based on the tests and analysis done on the image [12]. The above
methods are used to develop the following systems.
3.1.1 System 1.
In this study, the CAD system will be tested using 150 mammography
images (65 'Fatty breast' images and 85 'Glandular breast' images). The digital
mammography images are acquired from an online mammogram database (MIAS
database). Each image has a resolution of 1024 x 1024 pixels and is in PGM (Portable
Graymap) format. A sample from the database is shown in Figure 3.3.
3.1.2 System 2.
For detection of abnormalities, the system will test the 65 'Fatty breast'
mammography images that were classified among the 150 mammogram images
tested before. The digital mammography images are acquired from an online mammogram
database (MIAS database). Each image has a resolution of 1024 x 1024 pixels and is
in PGM (Portable Graymap) format.
3.1.3 System 3.
Thermal images of the breast can be obtained using an infrared camera. One
camera that can be used is the FLIR A615 manufactured by FLIR Systems, Inc., as
shown in Figure 3.4. The FLIR A615 is well suited to industrial applications in
which the temperature changes quite fast over time. It complies with standards such
as GigE Vision, which allows the camera to interface over the Gigabit Ethernet
communication protocol and transfer images quickly using low-cost standard cables,
even over long distances. The camera also complies with the GenICam protocol, which
allows it to be used with third-party software. Owing to this standards compliance,
the FLIR A615 is a plug-and-play device in third-party machine vision software such
as NI's IMAQ Vision™ and MVTec's Halcon™. Images with a resolution of 640 x 480
pixels can be obtained with this camera [14].
For thermal image analysis, the system will test 6 front-view images
taken from 6 case studies on the Pacific Chiropractic and Research Centre Infrared
Imaging website. Each image has a resolution of around 300 x 200 pixels and is in
JPG (Joint Photographic Experts Group) format. A sample thermal image is shown in
Figure 3.5.1.
3.2.1 System 1.
Figure 3.6 Image pre-processing and image processing. (a) Original image, (b)
BW image, (c) Sobel gradient of (b), (d) cropped image based on Hough parameters.
The original image (Figure 3.5.2(a)) is a grayscale image. The image is then
converted to a BW (black and white) image (Figure 3.5.2(b)) using the im2bw
function. im2bw(I, level) converts a grayscale image to a binary image. The output
image is produced by replacing all pixels in the input image whose luminance is
greater than level with the value 1 (white) and all other pixels with the value 0
(black). The level is a value between 0 and 1, relative to the signal levels
possible for the image's class.
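The thresholding rule described above can be written out as a minimal plain-Python
sketch. It imitates the described behaviour for an 8-bit image and is not the MATLAB
function itself. The function name to_binary, the max_value parameter and the
sample pixel values are assumptions made for this example.

```python
def to_binary(image, level, max_value=255):
    """Threshold a grayscale image: 1 where pixel > level * max_value, else 0.

    level is a fraction between 0 and 1; max_value is the largest value the
    image class can hold (assumed 255 here, as for 8-bit images).
    """
    threshold = level * max_value
    return [[1 if px > threshold else 0 for px in row] for row in image]


# Hypothetical 3 x 3 grayscale patch with 8-bit values.
gray = [
    [12, 200, 130],
    [90, 128, 255],
    [0, 127, 64],
]
bw = to_binary(gray, 0.5)  # threshold is 127.5
```

With level 0.5 the threshold falls at 127.5, so the pixel with value 128 becomes
white while the pixel with value 127 becomes black.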
Then, the edges of the image are found using the Sobel gradient shown in
Table 3.1. Applying the Sobel gradient to the image in Figure 3.5.2(b) gives the
image in Figure 3.5.2(c). From this image, the parameters of the hough and
houghpeaks functions are obtained in order to crop the image automatically and
produce the output image (Figure 3.5.2(d)). hough(BW) computes the Standard Hough
Transform (SHT) of the binary image. The hough function is used to detect lines in
the image and generates the Hough transform matrix. The houghpeaks function locates
peaks in the Hough transform matrix, and these values are used to crop the ROI
(Region of Interest) of the image.
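To make the line-detection step more concrete, the following plain-Python sketch
builds a Hough accumulator using the parameterisation rho = x*cos(theta) +
y*sin(theta) and picks the cell with the most votes. It is an illustration under
stated assumptions, not the MATLAB functions: the names hough_accumulator and
strongest_line, the 1-degree theta step and the synthetic test image are
hypothetical, and the report does not specify how the detected peak is turned into
crop coordinates, so that step is left out.

```python
import math


def hough_accumulator(bw):
    """Vote in (rho, theta) space for every foreground pixel of a binary image."""
    rows, cols = len(bw), len(bw[0])
    thetas = list(range(-90, 90))  # degrees, assumed 1-degree resolution
    max_rho = int(math.ceil(math.hypot(rows - 1, cols - 1)))
    trig = [(math.cos(math.radians(t)), math.sin(math.radians(t))) for t in thetas]
    # Row index is rho + max_rho so that negative rho values fit in the table.
    acc = [[0] * len(thetas) for _ in range(2 * max_rho + 1)]
    for y in range(rows):
        for x in range(cols):
            if bw[y][x]:
                for ti, (cos_t, sin_t) in enumerate(trig):
                    rho = int(round(x * cos_t + y * sin_t))
                    acc[rho + max_rho][ti] += 1
    return acc, thetas, max_rho


def strongest_line(acc, thetas, max_rho):
    """Return (rho, theta_degrees, votes) of the accumulator cell with most votes."""
    votes, rho_idx, theta_idx = max(
        (acc[ri][ti], ri, ti)
        for ri in range(len(acc))
        for ti in range(len(thetas))
    )
    return rho_idx - max_rho, thetas[theta_idx], votes


# Hypothetical edge image: one vertical line at column x = 20 in a 64 x 64 image.
edge_image = [[1 if x == 20 else 0 for x in range(64)] for y in range(64)]
acc, thetas, max_rho = hough_accumulator(edge_image)
rho, theta, votes = strongest_line(acc, thetas, max_rho)
```

For this synthetic image every pixel of the line votes for the same cell at theta = 0
and rho = 20, so the peak directly gives the column of the line. A straight boundary
found this way could then serve as one side of a crop window.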
Vertical derivative (Gy)      Horizontal derivative (Gx)
   -1  -2  -1                    -1   0   1
    0   0   0                    -2   0   2
    1   2   1                    -1   0   1
Table 3.1: Sobel approximation to the derivatives
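The two masks in Table 3.1 can be applied as in the following plain-Python sketch,
which computes the gradient magnitude at each interior pixel. The function name
sobel_magnitude, the choice to leave border pixels at zero and the sample image are
assumptions made for this example. The masks are applied by correlation, which
changes only the sign of the derivatives and not the magnitude.

```python
import math

# The two 3 x 3 masks from Table 3.1.
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges


def sobel_magnitude(image):
    """Gradient magnitude sqrt(Gx^2 + Gy^2); border pixels are left at 0 (assumption)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    px = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * px
                    gy += SOBEL_Y[i][j] * px
            out[r][c] = math.hypot(gx, gy)
    return out


# Hypothetical binary image with a vertical step edge between columns 2 and 3.
step = [[0, 0, 0, 1, 1] for _ in range(5)]
edges = sobel_magnitude(step)
```

The response is non-zero only in the two columns next to the step, which is the thin
edge outline that a later line-detection stage would work from.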
3.2.2 System 2.
Figure 3.7 Image pre-processing and image processing. (a) Original image, (b)
BW image, (c) imclearborder of image (b), (d) result after morphological opening
is applied to image (c), (e) output image after multiplication of images (a) and
(d), (f) sharpened image obtained by top-hat filtering and adjusting.
The image is cropped to obtain a new image with a resolution of 1001 x 1001
pixels. The image is then converted to BW using the im2bw function (Figure 3.6(b))
before the noise removal function (imclearborder) is applied to it (Figure 3.6(c)).
The imclearborder function suppresses structures that are lighter than their
surroundings and that are connected to the image border. This step allows more
accurate further analysis.
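For a binary image, the border-clearing step described above amounts to deleting
every connected component that touches the image border. The following plain-Python
sketch shows one way to do this with a breadth-first flood fill. It is an
illustration and not the MATLAB function: the name clear_border, the use of
8-connectivity and the sample image are assumptions made for this example.

```python
from collections import deque


def clear_border(bw):
    """Remove foreground components that touch the border of a binary image."""
    rows, cols = len(bw), len(bw[0])
    out = [row[:] for row in bw]  # work on a copy, leave the input untouched
    queue = deque()
    # Seed the flood fill with every foreground pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and out[r][c]:
                out[r][c] = 0
                queue.append((r, c))
    # Spread to all pixels 8-connected to a seed (assumed connectivity).
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and out[rr][cc]:
                    out[rr][cc] = 0
                    queue.append((rr, cc))
    return out


# Hypothetical BW image: one structure attached to the border, one interior object.
bw_image = [
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cleared = clear_border(bw_image)
```

Only the interior 2 x 2 object survives, while the structure attached to the top-left
corner is removed.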
CHAPTER 4
CONCLUSION
4.1 Result
In the pre-processing stage, the images are read and saved as grayscale. Several
filters were applied, such as a black and white filter, image sharpening and log
transformation. In this stage, the noise was removed, followed by sharpening of the
images and application of a median filter. The median filter and log transformation
gave the best results compared with the others. They work through mathematical
equations that remove unwanted pixel values and substitute them with wanted ones.
Both operate in a non-linear manner, which increases the diagnostic value and
quality of the images.
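The two operations singled out above can be sketched in plain Python as follows. The
report gives no formulas, so the definitions here are assumptions: the median filter
uses a 3 x 3 window with replicated borders, and the log transformation is taken as
s = c * log(1 + r) with c chosen so that the maximum input value maps to itself. The
function names and the sample image are hypothetical.

```python
import math


def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighbourhood."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                # Assumption: replicate border pixels by clamping the indices.
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            )
            out[r][c] = window[len(window) // 2]
    return out


def log_transform(image, max_value=255):
    """Assumed form s = c * log(1 + r), with c scaling max_value back to max_value."""
    scale = max_value / math.log(1 + max_value)
    return [[scale * math.log(1 + px) for px in row] for row in image]


# Hypothetical 5 x 5 image with salt (255) and pepper (0) noise on a flat background.
noisy = [[50] * 5 for _ in range(5)]
noisy[2][2] = 255
noisy[0][4] = 0
denoised = median_filter(noisy)
brightened = log_transform(denoised)
```

Here the median filter restores the flat background exactly, and the log
transformation then raises the dark background value, expanding the contrast of the
low-intensity range.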
REFERENCE
American Cancer Society (1999). Breast Cancer Facts & Figures 1997-1998.