
A CAL Project Report

Project Report
on
“DETECTION AND CLASSIFICATION OF
CONSUMED FOOD ITEMS”
USING DEEP LEARNING ALGORITHM
to be submitted in partial fulfillment of the requirements for the course on
Data Mining and Business Intelligence – ITA5007
(B2+TB2)

by
Ashok Rajbanshi (20MCA0271)
Jeffery M. Lawrence (21MCA0137)
Frank Therattil (21MCA0184)

Winter Semester 2021-2022

TABLE OF CONTENTS
ABSTRACT
1. Introduction ……………………………………………………………. 04
2. Review 1 (Survey, Analysis)...………………………………….……… 05
a. Problem definition
b. Dataset Description
3. Review 2 (Design)…………………….………………….. 06
a. Methodology
i. Module Description
1. Data exploration
2. Pre-processing
ii. Algorithms used
1. Justification for choosing the models
iii. Flow diagram of your model
iv. Dataset after preprocessing
v. Dataset split (train and test)
4. Review 3 (Code)…………………………………. 11
a. Implementation
i. Software and hardware description
ii. Output screenshots
b. Confusion Matrix
c. Comparison of the models used
d. Comparison graph
5. Conclusion ………………………………………………………....…… 18
6. References ………………………………………………………………. 19

ABSTRACT
This project is aimed at detecting and classifying 22 different food items from input images.
The system accepts images of various food items as input and identifies and classifies them
based on the images with which the system was trained and tested.
The project uses an image dataset of 22 different classes of food items, with each class
containing 100 images of a single food item. The system makes use of a deep learning model,
MobileNetV2, to detect and classify the image given by the user, returning the name of the
food item and its category (healthy or unhealthy).

1. INTRODUCTION
Food-related photos have become popular due to social networks, food recommendation systems and
dietary assessment systems. Social networking sites are nowadays flooded with food-related photos;
for instance, a new trend is sharing dining-out experiences on social networks. People are
increasingly interested in discovering and sharing new cuisines and in knowing more about different
aspects of the food they consume. Many works on food recognition based on different visual
representations have been put forward in recent years, but most of them are limited to a few food
classes in controlled settings. Accurate food recognition from visual information alone is still a
difficult task. In contrast to rigid objects, food items are deformable and show high intra-class
variability: diverse cooking styles and seasonings lead to different appearances of the same food.
Moreover, different foods share many ingredients, and the differences between some food classes are
difficult to detect. The variation in appearance and presentation of the same dish across
restaurants adds further complexity to recognising the dish.

A few techniques that exist for multi-class image classification are SVM, KNN and artificial
neural networks. Transfer learning has shown promising results in the field of image
classification: it is a deep learning technique in which a model trained on one problem stores the
knowledge it has learned and reuses it on other, similar problems. A convolutional neural network
(CNN) is a deep learning architecture that has gained popularity in image recognition tasks due to
its high accuracy and robustness.

The purpose of this project is to present a system that can detect and classify different food
items that we consume on a regular basis by using a pre-trained MobileNetV2 model. The model is
evaluated in terms of accuracy and loss. The user gives an image as input, the system performs
detection and classification on that image, and the output generated is the name of the food item.
We have used Streamlit, an open-source app framework for Python, which helps create web apps for
data science and machine learning in a short time and is compatible with major Python libraries
such as scikit-learn, Keras, PyTorch, SymPy (LaTeX), NumPy, pandas and Matplotlib. With its help we
have created a web application that allows the user to upload an image; the application sends the
image to the food detection model, which performs the detection and returns the name of the food
item it has detected. The web application also maintains two lists, one of healthy food items and
one of unhealthy food items; it checks which list the predicted food name belongs to and reports
the corresponding category. We have also used web scraping, with the help of a Google API, to fetch
the calories of the food item (per 100 g). The outputs generated by the web application are the
category, the prediction and the calories.
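As a rough illustration, the upload, prediction, categorisation and calorie-lookup flow described
above could be wired together in Streamlit roughly as follows. This is only a sketch: the list
contents, the model file name and the fetch_calories helper are assumptions for illustration, not
the exact code used in the project.

import numpy as np
import streamlit as st
import tensorflow as tf
from PIL import Image

HEALTHY = {"apple", "banana", "carrot", "soup"}          # assumed subset of the healthy list
UNHEALTHY = {"pizza", "burger", "donuts", "ice_cream"}   # assumed subset of the unhealthy list

model = tf.keras.models.load_model("food_mobilenetv2.h5")  # assumed model file name
class_names = sorted(HEALTHY | UNHEALTHY)                  # placeholder; the real app uses all 22 classes

def fetch_calories(food_name: str) -> str:
    # Placeholder for the web-scraping / Google API calorie lookup used in the project.
    return "calorie lookup not implemented in this sketch"

uploaded = st.file_uploader("Upload a food image", type=["jpg", "jpeg", "png"])
if uploaded is not None:
    img = Image.open(uploaded).convert("RGB").resize((224, 224))
    st.image(img, caption="Uploaded image")

    # Apply the same MobileNetV2 preprocessing used at training time (scales pixels to [-1, 1]).
    x = tf.keras.applications.mobilenet_v2.preprocess_input(
        np.expand_dims(np.asarray(img, dtype=np.float32), axis=0))
    prediction = class_names[int(np.argmax(model.predict(x), axis=1)[0])]

    category = "healthy" if prediction in HEALTHY else "unhealthy"
    st.write("Prediction: " + prediction)
    st.write("Category: " + category)
    st.write("Calories (per 100 g): " + fetch_calories(prediction))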

2. Survey & Dataset Collection


1. Problem Definition:
Social networking sites are nowadays flooded with food-related photos; for instance, a new trend
is sharing dining-out experiences on social networks. People are increasingly interested in
discovering and sharing new cuisines and in knowing more about different aspects of the food
they consume. Many works on food recognition based on different visual representations have been
put forward in recent years, but most of them are limited to a few food classes in controlled
settings. Accurate food recognition from visual information alone is still a difficult task, and
this project addresses it for a fixed set of 22 everyday food classes.

2. Dataset Description:
Data Source: https://www.kaggle.com/datasets/kmader/food41
Data Source: https://www.kaggle.com/datasets/kritikseth/fruit-and-vegetable-image-
recognition?select=validation
We have used two different datasets, Food-101 and the Fruit and Vegetable Image Recognition
dataset, and created our own custom dataset by combining selected classes from the two. Our
dataset consists of 22 different classes of food items and contains 2600 images in total, with
each class having 100 training images, 10 test images and 10 validation images. The food classes
are: 'pizza', 'french_fries', 'chicken_curry', 'cauliflower', 'burger', 'tomato', 'omlette',
'hot_dog', 'ice_cream', 'samosa', 'pineapple', 'banana', 'cheese', 'cabbage', 'apple', 'grapes',
'mango', 'corn', 'momos', 'donuts', 'carrot', 'soup'.

3. Methodology:

The proposed work is implemented in Python using a Convolutional Neural Network (CNN) model and
transfer learning. The model was trained on 100 images for each of the 22 classes and then used to
predict the food class. A new input goes through stages of image processing, such as resizing and
colour-space conversion, before it is fed to the trained model; after comparing the features of the
input with the features of each trained class, the output is predicted. The user uploads a food
image from the system using a GUI. The image goes through the stages of image preprocessing and is
then passed on to the MobileNetV2 (CNN) model for classification.
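A minimal sketch of this inference pipeline, assuming a 224×224 input size, a saved model file
named food_mobilenetv2.h5 and an ordered list of the 22 class names (all assumptions made for
illustration only):

import numpy as np
import tensorflow as tf

IMG_SIZE = (224, 224)  # MobileNetV2's default input resolution

def predict_food(image_path, model, class_names):
    # Resize, preprocess and classify a single food image.
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=IMG_SIZE)
    x = tf.keras.preprocessing.image.img_to_array(img)           # HWC float array
    x = tf.keras.applications.mobilenet_v2.preprocess_input(x)   # scale pixels to [-1, 1]
    probs = model.predict(np.expand_dims(x, axis=0))[0]          # add batch dimension
    return class_names[int(np.argmax(probs))]

# Example usage (file names and class list are placeholders):
# model = tf.keras.models.load_model("food_mobilenetv2.h5")
# print(predict_food("apple.jpg", model, class_names))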
i. Module Description:
1. Data Exploration: In this step we create labels for each class so that the model
can learn which class a given image belongs to.

2. Pre-Processing:
For pre-processing of our data we have used mobilenet_v2.preprocess_input together with
Keras's ImageDataGenerator, an image augmentation utility. This step is performed for the
training, test and validation datasets. Image augmentation techniques available through the
Keras ImageDataGenerator include:
• Rotations
• Shifts
• Flips
• Brightness
• Zoom
Code for preprocessing:

import tensorflow as tf

# Generators that apply the MobileNetV2-specific preprocessing (pixel scaling to [-1, 1])
train_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input
)

test_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input
)
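If the augmentation techniques listed above are enabled for training, they can be passed to the
training generator as keyword arguments. The values below are illustrative assumptions, not the
settings used in the project:

import tensorflow as tf

# Augmented training generator; rotation, shift, flip, brightness and zoom ranges are example values
train_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input,
    rotation_range=20,              # random rotations up to 20 degrees
    width_shift_range=0.1,          # horizontal shifts up to 10% of the width
    height_shift_range=0.1,         # vertical shifts up to 10% of the height
    horizontal_flip=True,           # random horizontal flips
    brightness_range=(0.8, 1.2),    # random brightness scaling
    zoom_range=0.2,                 # random zoom in/out up to 20%
)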
ii. Algorithms used:
In this model we have used MobileNetV2. It is a convolutional neural network that is 53
layers deep; it is fast, small in size, and accurate compared with other models of similar
size. We have used the pretrained weights of MobileNetV2 for transfer learning in our model.
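A minimal sketch of how a MobileNetV2 base with pretrained ImageNet weights can be extended with a
22-class classification head for transfer learning (the added layers, dropout rate and optimiser
are assumptions for illustration, not necessarily the exact configuration used):

import tensorflow as tf

NUM_CLASSES = 22

# Pretrained MobileNetV2 backbone without its original ImageNet classification head
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pretrained weights for transfer learning

# New classification head for the 22 food classes
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])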

MobileNetV2:
• In MobileNetV2 there are two types of blocks: a residual block with stride 1, and a block with
stride 2 for downsizing.
• Both types of blocks have three layers.
• The first layer is a 1×1 convolution with ReLU6.
• The second layer is the depthwise convolution.
• The third layer is another 1×1 convolution, but without any non-linearity. It is claimed that if
ReLU is used again, the deep network only has the power of a linear classifier on the non-zero-
volume part of the output domain.
• There is an expansion factor t, and t = 6 is used for all main experiments.
• If the input has 64 channels, the internal output has 64 × t = 64 × 6 = 384 channels.
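The block structure described above can be sketched in Keras roughly as follows. This is an
illustrative reconstruction of an inverted residual block, not code from the project; in practice
the pretrained Keras implementation of MobileNetV2 is used directly.

import tensorflow as tf
from tensorflow.keras import layers

def inverted_residual_block(x, out_channels, stride, expansion=6):
    # Sketch of a MobileNetV2 inverted residual block: expand -> depthwise conv -> linear projection.
    in_channels = x.shape[-1]

    # 1x1 expansion convolution with ReLU6
    h = layers.Conv2D(in_channels * expansion, 1, padding="same", use_bias=False)(x)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)

    # 3x3 depthwise convolution with ReLU6 (stride 2 in the downsizing blocks)
    h = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(h)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)

    # 1x1 linear projection with no activation (the "linear bottleneck")
    h = layers.Conv2D(out_channels, 1, padding="same", use_bias=False)(h)
    h = layers.BatchNormalization()(h)

    # Residual connection only when the spatial size and channel count are preserved
    if stride == 1 and in_channels == out_channels:
        h = layers.Add()([x, h])
    return h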

Overall architecture:

In the architecture table, t denotes the expansion factor, c the number of output channels, n the
number of times the block is repeated, and s the stride; 3×3 kernels are used for spatial
convolution.

Impact of the linear bottleneck:

• With the removal of ReLU6 at the output of each bottleneck module, accuracy is improved.

ImageNet classification (ImageNet Top-1 accuracy):

• MobileNetV2 outperforms MobileNetV1 and ShuffleNet with comparable model size and computational
cost.
• When MobileNetV2 is compared with other deep learning models on the ImageNet dataset, the total
CPU time consumed for detection by MobileNetV2 is 75 ms, the minimum among all the models in the
comparison shown above.

iii. Flow Diagram of the Model:

iv. Dataset split:


Our dataset contains a total of 2600 images across 22 different food classes; we have used 100
images per class for training, 10 per class for testing and 10 per class for validation.
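Assuming the images are arranged in one sub-folder per class (the directory names below are
assumptions for illustration), the split can be loaded with Keras generators, for example:

import tensorflow as tf

# Assumed directory layout:
#   dataset/train/<class_name>/*.jpg        (100 images per class)
#   dataset/test/<class_name>/*.jpg         (10 images per class)
#   dataset/validation/<class_name>/*.jpg   (10 images per class)

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input)

train_data = datagen.flow_from_directory(
    "dataset/train", target_size=(224, 224), batch_size=32, class_mode="categorical")
val_data = datagen.flow_from_directory(
    "dataset/validation", target_size=(224, 224), batch_size=32, class_mode="categorical")
test_data = datagen.flow_from_directory(
    "dataset/test", target_size=(224, 224), batch_size=32,
    class_mode="categorical", shuffle=False)  # keep order for evaluation and the confusion matrix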

4. Review 3:
a) Implementation:
i. Software and hardware description:
Software:
• Python 3.9
• Jupyter Notebook
Python libraries:
• TensorFlow
• Keras
• Matplotlib
• NumPy
• pandas
• scikit-learn
Hardware:
• OS: Windows 8/10/11, Linux, macOS
• RAM: 4 GB

ii. Output Screenshots:

Input: apple.jpg

Input: Burger.jpeg

Input: French_fries.jpg

Input: Mango.jpg

Input: omelette.jpg

Input: pizza.jpeg

Input: samosa.jpg

Input: soup.jpg

Input: pineapple.jpg
b) Confusion Matrix:
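One way the confusion matrix can be computed from the test generator (this assumes the trained
model and a test generator created with shuffle=False, as in the sketch in the dataset-split
section; both are assumptions of this sketch):

import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Predict class probabilities for every test image and take the argmax as the predicted label
probs = model.predict(test_data)
y_pred = np.argmax(probs, axis=1)
y_true = test_data.classes  # ground-truth labels in generator order (requires shuffle=False)

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(classification_report(y_true, y_pred,
                            target_names=list(test_data.class_indices.keys())))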

c) Comparison Graph:
Accuracy vs val_accuracy:

Loss vs val_loss:
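These curves are typically drawn from the History object returned by model.fit; a sketch, assuming
the training history is available in a variable named history:

import matplotlib.pyplot as plt

# Training vs. validation accuracy
plt.figure()
plt.plot(history.history["accuracy"], label="accuracy")
plt.plot(history.history["val_accuracy"], label="val_accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()

# Training vs. validation loss
plt.figure()
plt.plot(history.history["loss"], label="loss")
plt.plot(history.history["val_loss"], label="val_loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()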

5. Conclusion:
In this study, we have examined how food items in images can be detected and classified with the
MobileNetV2 model and evaluated its accuracy on a multi-class food image dataset. The MobileNetV2
model outperforms a conventional CNN model in terms of accuracy, and it can be concluded that when
a large dataset is not available it is better to use transfer learning than a conventional CNN
trained from scratch. We have also seen how a Google API can be used together with web scraping to
fetch the calories of the food item predicted by our model. In future work, ingredient
identification within a particular class of food could be attempted, and a more sophisticated image
classification tool could be developed covering more than 22 classes.

6. References:
• https://www.irjet.net/archives/V8/i8/IRJET-V8I8102.pdf
• https://paperswithcode.com/method/mobilenetv2
• https://books.aijr.org/index.php/press/catalog/book/114/chapter/1068
• https://www.researchgate.net/publication/341129298_Food_Image_Classification_with_Improved_MobileNet_Architecture_and_Data_Augmentation
