
AUTOMATIC ANSWER CHECKER FOR

DESCRIPTIVE ANSWERS

A PROJECT REPORT

Submitted by

CHAMARATHI SAI CHETHAN C (17BCS020)

BALAJI R (17BCS061)

SUJITH KUMAR A (17BCS115)

In partial fulfillment for the award of the degree

of

BACHELOR OF ENGINEERING
in
COMPUTER SCIENCE & ENGINEERING

KUMARAGURU COLLEGE OF TECHNOLOGY

COIMBATORE-641 049
(An Autonomous Institution Affiliated to Anna University, Chennai)

May 2021
KUMARAGURU COLLEGE OF TECHNOLOGY

COIMBATORE 641 049

(An Autonomous Institution Affiliated to Anna University, Chennai)

BONAFIDE CERTIFICATE

Certified that this project report “Automatic Answer Evaluation for Descriptive Answers” is the bonafide work of “Chamarathi Sai Chethan C (17BCS020), Balaji R (17BCS061) & Sujith Kumar A (17BCS115)”, who carried out the project work under my supervision.

SIGNATURE                                        SIGNATURE

Dr. Devaki. P, Ph.D.,                            [Link], Associate Professor
HEAD OF THE DEPARTMENT                           SUPERVISOR
Department of Computer Science and               Department of Computer Science and
Engineering,                                     Engineering,
Kumaraguru College of Technology                 Kumaraguru College of Technology
Coimbatore – 641 049                             Coimbatore – 641 049

The candidates with University register numbers 17BCS020, 17BCS061 and 17BCS115 were examined in the Project Viva-Voce examination held on 25.05.2021.

Internal Examiner                                External Examiner


DECLARATION

We affirm that the project work titled “Automatic Answer Evaluation for Descriptive Answers”, being submitted in partial fulfillment for the award of B.E. - Computer Science and Engineering, is the original work carried out by us. It has not formed part of any other project work submitted for the award of any degree or diploma, either in this or any other university.

CHAMARATHI SAI CHETHAN C(17BCS020) BALAJI R(17BCS061)

SUJITH KUMAR A(17BCS115)

I certify that the declaration made above by the candidates is true.

[Link]

Associate Professor,

Department of Computer Science and Engineering,

Kumaraguru College of Technology,

Coimbatore – 641 049.


ACKNOWLEDGEMENT

We express our profound gratitude to the management of Kumaraguru College of Technology for providing us with the required infrastructure that enabled us to complete the project successfully.

We extend our gratitude to our Principal, Dr. J. Srinivasan, for providing us the necessary facilities to pursue the project.

We would like to acknowledge Dr. P. Devaki, Professor and Head, Department of Computer Science and Engineering, for her support and encouragement throughout this project.

We thank our project coordinator [Link], Professor, Department of Computer Science and Engineering, and our guide [Link], Associate Professor, Department of Computer Science and Engineering, for their constant and continuous effort, guidance and valuable time.

Our sincere and hearty thanks to the staff members of the Department of Computer Science and Engineering of Kumaraguru College of Technology for their good wishes and the timely help and support rendered to us during our project. We are greatly indebted to our family, relatives and friends, without whom life would not have been shaped to this level.

- CHAMARATHI SAI CHETHAN C

BALAJI R

SUJITH KUMAR A
TABLE OF CONTENTS

CHAPTER NO.   TITLE                                        PAGE NO.

              ABSTRACT                                      3

1             INTRODUCTION                                  3
              1.1 CONCEPTUAL STUDY OF THE PROJECT           4
              1.2 OBJECTIVES OF THE PROJECT                 5
              1.3 SCOPE OF THE PROJECT                      5

2             LITERATURE REVIEW                             6
              2.1 LITERATURE REVIEW OF JOURNALS             6

3             PROBLEM DEFINITION                            9

4             SYSTEM ANALYSIS                               9
              4.1 EXISTING SYSTEM                           10
              4.2 PROPOSED SYSTEM                           10

5             METHODOLOGY                                   11
              5.1 INTRODUCTION TO CONCEPTS                  11
              5.2 FLOW DIAGRAM                              12

6             IMPLEMENTATION                                13
              6.1 TEXT PREPROCESSING                        13
              6.2 MODEL BUILDING                            14
              6.3 MODEL SUMMARY                             16
              6.4 TESTING                                   17
              6.5 WEB PAGE BUILDING                         18

7             SYSTEM REQUIREMENTS                           20
              7.1 HARDWARE REQUIREMENTS                     20
              7.2 SOFTWARE REQUIREMENTS                     20

8             CONCLUSION                                    21

9             APPENDIX                                      22
              9.1 SOURCE CODE                               22

10            REFERENCES                                    27

ABSTRACT

One of the most vital aspects of the educational process is assessing the knowledge acquired by the learner. In an ordinary classroom assessment (e.g., an exam, assignment, or quiz), a teacher or evaluator manually assesses the students' answers, assigning suitable marks and feedback on their answers to questions about the subject matter. However, in today's world there are any number of online education platforms worldwide, with a very small proportion of teachers available to guide or evaluate learners, and online web tests and individual or group study sessions often take place where a tutor may not be present. In these instances, students still want some guidance and evaluation of their knowledge of the topic, and so we believe automated evaluation is the key. Descriptive answers are crucial testing tools for assessing academic achievement, integration of ideas, and the ability to recall. However, manual correction is expensive and grading by hand takes an overwhelming amount of time. Manual grading of descriptive answers takes up a large quantity of instructors' valuable time and is therefore a costly process, so we are often restricted to multiple-choice standardized tests. We believe that automatic grading systems will yield quick, effective, and affordable solutions that will enable faculties and schools to introduce essays and other more refined testing tools. Automatic grading, if it matches or exceeds the output of manual correction, can effectively reduce both the cost factor and the time factor. Hence, in order to overcome these issues, an automated model using neural networks has been developed.

1. INTRODUCTION

1.1 CONCEPTUAL STUDY OF THE PROJECT
Innovation was already a significant word fifteen years ago, and the new generation is all the more forward-thinking now. We can see the effect of technology on our generation and how it has helped us as humans achieve extraordinary things. Technology has clearly helped humans accomplish impressive things, but even in the 21st century we are still evaluating tests manually, and even where evaluation has been automated, we use only multiple-choice questions to test a person's knowledge. Tests and evaluations have been part of assessing a person's knowledge and capacity for many years. Estimates say that at present one in four people is a student of some kind, so it is easy to imagine how many people are writing tests and how many teachers and instructors are needed to correct these test papers manually. While some forms of computer-assisted assessment do not require sophisticated text understanding (e.g., multiple-choice or true/false questions can easily be graded by a system if the correct solution is available), there are also student answers made up of free text that require textual analysis. In schools, in colleges, when joining a job, and even when seeking a promotion in a job, you have to take a test, and you may have noticed that the results of these tests are published only after a month. From the beginning, tests have always been corrected manually by teachers and educational institutions. Grading of descriptive answers takes up a significant amount of instructors' valuable time and is therefore an expensive process. Also, in the manual system, the marks given to two identical answers may differ. So in this project we are trying to create a model for evaluating descriptive answers.

1.2 OBJECTIVES OF THE PROJECT

Our efforts in this project are aimed at building a model that can evaluate and assign marks for descriptive answers, in order to reduce evaluation time and human error in assessing students' answers. We aim to create a system that evaluates the paper and assigns marks at the level of manual correction, using hierarchical classification with an approach we call Hierarchical Deep Learning for Text classification, an answer correction system built on deep learning architectures. After analysing the results of the model, we will create a user-friendly interface for automated answer evaluation.

1.3 SCOPE OF THE PROJECT

● Text Preprocessing

● To build a deep learning model

● Analyzing the predicted results

● Creating a user-friendly web interface

2. LITERATURE REVIEW

● Sargur Srihari, Rohini Srihari, Pavithra Babu, Harish Shrinivasa [1], published in IRJMETS, Vol. 02, Issue 03, 2015. They proposed a paper on the "Automatic scoring of handwritten essays". Automatic evaluation of written essays combines optical handwriting recognition with automatic essay mark assignment. Handwriting recognition is assisted by constraints provided by the reading passage, question and rubric. Prediction of marks is based on latent semantic analysis (LSA), which is robust to recognition inadequacies. Results on a small test set show that with manually transcribed (MT) essays, LSA mark prediction differs from manual correction by less than two points on average.

● Sheeba Praveen [2], published in the International Journal of Innovative Research in Computer and Communication Engineering, Vol. 2, Issue 11, November 2014. The author observed that existing systems contain only multiple-choice questions and provide no way to extend them to subjective questions. The paper presents an approach to examine the degree of learning of the student/learner by evaluating their descriptive examination answer sheets. Representing the descriptive answer in the form of a graph and comparing it with the correct answer are the key steps of the approach. The main drawbacks of the system are that it handles only non-mathematical subjects, has low efficiency in similarity matching, and finds multiple-sentence answers difficult to grade.

● Michael Mohler, Razvan Bunescu, Rada Mihalcea, June 2011, grading of short answers, published by the Association for Computational Linguistics ([Link]). This paper explores the possibility of building upon existing bag-of-words (BOW) approaches to short answer grading by applying machine learning techniques. Further, in an attempt to mirror the ability of humans to grasp structural (e.g. syntactic) differences between sentences, it uses a rudimentary dependency-graph alignment module, similar to those commonly used in the textual entailment community. The system is supplied with the dependency graphs for every combination of teacher (Ai) and student (As) answers. For every node in the instructor's dependency graph, a similarity score is calculated for every node in the student's dependency graph, based on a set of lexical, semantic and syntactic features applied to each pair of nodes and their corresponding subgraphs. The node similarity scores calculated in the previous stage are used to weight the edges of a bipartite graph with the nodes of Ai on one side and the nodes of As on the other. An overall grade is produced based on the alignment found in the previous stage, as well as the results of several bag-of-words measures. Assessment based on correlation measures yields predictably poor results, but evaluating the error rate indicates that the approach is equivalent to (or better than) more intelligent BOW metrics. The rudimentary alignment features introduced here are not sufficient to act as a standalone grading system; however, even with a very primitive attempt at alignment detection, the authors show that it is possible to improve upon grade-learning systems that rely only on BOW features.

● J.Z. Sukkarieh, S.G. Pulman and N. Raikes, 2014, International Association for Educational Assessment, Philadelphia. Proposed a short answer grading system requiring manually crafted patterns which, if matched, indicate that a question has been answered correctly. If an annotated corpus is available, these patterns may be supplemented by learning further patterns semi-automatically. The Oxford-UCLES system (Sukkarieh et al., 2004) bootstraps patterns by beginning with a set of keywords and synonyms and searching through windows of text for new patterns. A later implementation of the system (Pulman and Sukkarieh, 2005) compares several machine learning techniques, including decision tree learning, inductive logic programming and Bayesian learning, to the earlier pattern matching approach, with encouraging results.

● M. Mohler and R. Mihalcea, 2017, 'Text-to-text semantic similarity for automatic short answer grading', in Proceedings of the European Chapter of the Association for Computational Linguistics (EACL 2017), Athens, Greece. Proposed a text similarity approach in which a grade is assigned on the basis of the degree of relatedness between the student's answer and the instructor's answer. Several measures were compared, including knowledge-based and corpus-based measures, with the best results obtained by a corpus-based measure using Wikipedia, combined with a "relevance feedback" approach that repeatedly augments the instructor's answer by integrating the student answers that receive the best grades.

● R.D. Nielsen, W. Ward and J.H. Martin, 2019, 'Recognizing entailment in intelligent tutoring systems', Natural Language Engineering, 15(04):479–501. Proposes a system in which, in the dependency-based classification part of the intelligent tutoring system, teacher answers are parsed, enhanced and manually converted into a set of content-bearing dependency triples or facets. Each facet of the teacher's answer is compared with each student's answer and labeled to indicate whether or not the student addressed that facet and whether or not the answer was contradictory. The system uses a decision tree trained on dependency types, part-of-speech tags, word counts and other features to learn how best to classify an answer/facet pair.

● I. Dagan, O. Glickman and B. Magnini, 2015, 'The PASCAL recognising textual entailment challenge', in Proceedings of the PASCAL Workshop. Proposes a system that targets the identification of a directional inferential relation between texts. Given a pair of texts as input, usually referred to as the text and the hypothesis, a textual entailment system automatically determines whether the hypothesis is entailed by the text. Both input texts are converted into graphs using the dependency relations obtained from a parser. Matching scores are then calculated by combining the separate vertex- and edge-matching scores. The vertex matching functions use word-level lexical and semantic features to determine the quality of the match, while the edge matching functions take into consideration the types of relations and the difference in lengths between the aligned paths.

3. PROBLEM DEFINITION

Descriptive paper evaluation and assessment take up a significant amount of the evaluator's time and are even more expensive when you consider the resources and time needed to correct and evaluate a paper manually. Even when there are enough human resources to evaluate millions of answer sheets, there is no guarantee that similar answers from students will get similar marks, because factors such as the efficiency and mood of the people doing the correction play an important role in such a time-consuming process. Publication of results takes days, sometimes even months, when done by the manual method. More importantly, humans are prone to errors, so automation of answer correction is essential and highly necessary for the future. And more than ever, we need automated correction for descriptive answers because the effects of COVID-19 on our daily routine have pushed us to conduct all examinations online. We have therefore created a system that evaluates the paper and assigns marks at the level of manual correction, using hierarchical classification with an approach we call Hierarchical Deep Learning for Text classification, an answer correction system built on deep learning architectures. Our efforts in this project are aimed at achieving a model that can evaluate and assign marks for descriptive answers, and at reducing evaluation time and human error in assessing students' answers.

4. SYSTEM ANALYSIS

4.1 EXISTING SYSTEM


Most of the existing systems use the Natural Language Toolkit and support vector machines to find the similarity between the sentences being compared, determine the similarity and originality of the answers, and produce the result. State-of-the-art measures of text similarity, such as semantic vector similarity, are combined with grading-specific constructs, such as question demoting and length ratio, to produce top results on multiple benchmarks. Several other systems use deep learning techniques such as neural networks and long short-term memory to find the similarity between sentence semantics. But there is no perfect system that can be used to correct a paper in real time without professional guidance. So, in this work we have tried to create a system for automated descriptive answer correction that can be used in real time.

4.2 PROPOSED SYSTEM


Because the growth of the educational sector has brought an exponential increase in the number of categories, paper evaluation has become complicated. We have therefore approached this problem from a different view than current document classification methods, which treat this issue as multi-class classification. We perform hierarchical classification using an approach we call Hierarchical Deep Learning for Text classification, an answer correction system based on Long Short-Term Memory (LSTM) deep learning architectures, in an attempt to reduce evaluation time and human error significantly. We also plan to create a user-friendly webpage to evaluate the answers.

5. METHODOLOGY

5.1 INTRODUCTION TO CONCEPTS

In this project we have used the deep learning model called Long Short-Term Memory (LSTM). The LSTM model builds on several concepts brought to light by earlier researchers, and the implementation of the algorithm is based on the recurrent nature and temporal understanding of LSTM. In this implementation we extend it to a bidirectional LSTM model for preprocessing and matching, which can read words both forwards and backwards in time. The vector created by this kind of bidirectional LSTM further supports attention, by keeping the relationships between the words in certain parts of the context and the question in higher focus. The output layer is based on PointerNet, which allows combinations of words to be chosen as outputs. A concept called dropout is implemented to reduce overfitting in the neural network.

We make use of all of these neural network deep learning concepts for the general answer selection task, without depending on manually engineered features or linguistic tools. We build the embeddings of answers and questions with bidirectional LSTM models and determine their relatedness by cosine similarity. We have also defined a more aggregate representation for questions and answers by combining convolutional hierarchical neural networks with the basic framework. In comparison with other neural network models, the LSTM model is highly efficient at identifying the relative positions of words and forms a more accurate understanding of the relationship between the words, their position in the sentence and other semantic relations; and since it also has a memory vector element associated with it, it can reach back to previous instances for reference, which in turn leads to a better understanding of the relationships between the words and conclusions far superior to those of other models.
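As an illustration of the cosine-similarity relatedness measure described above, a minimal Keras sketch is given below (layer sizes are illustrative, and this is not the project's final scoring head, which is shown in Section 6.2 and the appendix):

from keras.layers import Input, Embedding, LSTM, Bidirectional, Dot
from keras.models import Model

# Shared embedding and bidirectional LSTM encoder for both texts
embed = Embedding(input_dim=20000, output_dim=300)       # illustrative vocabulary/embedding sizes
encoder = Bidirectional(LSTM(50))

question_in = Input(shape=(100,), dtype='int32')          # padded word-index sequences
answer_in = Input(shape=(100,), dtype='int32')
q_vec = encoder(embed(question_in))
a_vec = encoder(embed(answer_in))

# Cosine similarity between the two encodings (Dot with normalize=True)
similarity = Dot(axes=1, normalize=True)([q_vec, a_vec])
model = Model([question_in, answer_in], similarity)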

5.2 FLOW DIAGRAM

Fig 1 Flow Diagram

6. IMPLEMENTATION

6.1 TEXT PREPROCESSING

Using the Natural Language Toolkit, we preprocess the textual dataset into a format that is easy to understand and evaluate. The first step is to convert all text to lowercase; although this seems like a small thing, it actually reduces the complexity of the later preprocessing stages. We then discard unnecessary punctuation, after which the noise is removed. The preprocessed words are lemmatized to achieve better efficiency in model creation. After this, the text is tokenized. Tokenization is the process of splitting a phrase, sentence, paragraph or an entire text document into smaller units, such as individual words or terms; each of these smaller units is called a token. One can think of tokens as parts: a word is a token in a sentence, and a sentence is a token in a paragraph. These tokens help in understanding the context and in building the NLP model, since tokenization helps decipher the meaning of the content by analysing the arrangement of the words.

The resources used for preprocessing are the Keras and gensim library packages. Keras is an open-source software library that provides a Python interface for artificial neural networks and acts as an interface for the TensorFlow library. Using the Keras preprocessing utilities we tokenized the text. We then converted the tokenized words into vector format using gensim, an open-source library for unsupervised topic modelling and natural language processing based on modern statistical machine learning. Finally, we created an embedding matrix containing word indexes and their respective vectors from the word vectors, using the gensim word2vec library.
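A minimal sketch of this preprocessing pipeline, assuming the NLTK WordNet data, Keras and gensim are available (the names, sample answers and sizes below are illustrative, not the project's own):

import re
import numpy as np
from nltk.stem import WordNetLemmatizer            # requires nltk.download('wordnet')
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from gensim.models import Word2Vec

raw_answers = ["The CPU executes program instructions.",
               "A processor runs the instructions of a program."]

lemmatizer = WordNetLemmatizer()

def clean(text):
    # lowercase, strip punctuation and other noise, lemmatize each word
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(lemmatizer.lemmatize(word) for word in text.split())

answers = [clean(a) for a in raw_answers]

# Tokenize and pad the cleaned answers
tokenizer = Tokenizer()
tokenizer.fit_on_texts(answers)
sequences = pad_sequences(tokenizer.texts_to_sequences(answers), maxlen=100)

# Train word2vec and build the embedding matrix of word indexes and vectors
w2v = Word2Vec([a.split() for a in answers], vector_size=300, min_count=1)   # 'size=' in gensim<4
embedding_matrix = np.zeros((len(tokenizer.word_index) + 1, 300))
for word, idx in tokenizer.word_index.items():
    if word in w2v.wv:
        embedding_matrix[idx] = w2v.wv[word]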

6.2 MODEL BUILDING

To achieve the model we desired, we used Dense, Input, LSTM, Dropout and Bidirectional layers. The batches are normalized to achieve better outcomes: batch normalization is a method for deep recurrent neural networks that normalizes the input to a layer for each mini-batch. It directly helps stabilize the learning process and reduces the number of training epochs needed to train the network.

Fig 2 Working of a neural network

First we created a pair of word embedding layers that embed the original (teacher's) answer and the student's answer respectively. Word embeddings pave the way for an efficient, dense representation in which similar words have a similar encoding; the major advantage is that we do not have to specify the encoding by hand. An embedding is a dense vector of floating-point values. Rather than setting these values by manual embedding, they are trainable parameters whose weights are learned by the model during training, just as a deep neural model learns the weights of a dense layer.

Fig 3 Architecture of this model.

Then we created two separate LSTM encoders, to make the model hierarchical and bidirectional, which encode the student's and the teacher's answers respectively. These two encoded vectors are merged and passed to the dense layer while applying batch normalization and dropout. Batch normalization is used because it greatly improves the stability of our neural network: it normalizes the output of the previous hidden activation layer by subtracting the batch mean and dividing by the batch standard deviation. Because this type of hierarchical deep learning tends to overfit the model, dropout is used. The merged vector is then passed to a dense layer, a fully connected layer in which all the neurons of one layer are connected to those of the next, which makes it easier to identify and learn the relationships between the student's and the teacher's answers. We experimented with several parameters, such as the number of hidden states and the activation function of the LSTM cell, over numerous epochs. In the dense layer we used the ReLU activation function to yield better results. These dense-layer vectors are then passed to an output sigmoid layer, and the weights are trained using a cross-entropy loss. The model is then fitted.
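A condensed Keras sketch of the layer stack described above (the full listing, with the additional leak-feature input and training callbacks, appears in the appendix; the dimensions here are illustrative):

from keras.layers import Input, Embedding, LSTM, Bidirectional, Dense, Dropout, BatchNormalization, concatenate
from keras.models import Model

MAX_LEN, EMB_DIM, VOCAB = 100, 300, 20000

embedding = Embedding(VOCAB, EMB_DIM, trainable=False)                    # shared word embedding
encoder = Bidirectional(LSTM(50, dropout=0.2, recurrent_dropout=0.2))     # shared BiLSTM encoder

teacher_in = Input(shape=(MAX_LEN,), dtype='int32')
student_in = Input(shape=(MAX_LEN,), dtype='int32')
teacher_vec = encoder(embedding(teacher_in))
student_vec = encoder(embedding(student_in))

# Merge the two encodings, apply batch normalization and dropout,
# pass through a ReLU dense layer and score with a sigmoid output
merged = concatenate([teacher_vec, student_vec])
merged = BatchNormalization()(merged)
merged = Dropout(0.25)(merged)
merged = Dense(50, activation='relu')(merged)
merged = BatchNormalization()(merged)
merged = Dropout(0.25)(merged)
score = Dense(1, activation='sigmoid')(merged)

model = Model([teacher_in, student_in], score)
model.compile(loss='binary_crossentropy', optimizer='nadam', metrics=['acc'])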

6.3 MODEL SUMMARY

Fig 4 Model Summary

Configuration of the Long Short-Term Memory (LSTM) model

Fig 5 Configuration of LSTM model.

6.4 TESTING

The built model was then tested with sample test inputs, and it yields better results than models that use conventional natural language processing techniques. The accuracy of the model is currently 85 percent.


Fig 6 Model learning from the dataset

Fig 7 Testing of dataset

Testing showed that results improve as the training data becomes larger and better. However, as the number of words increases, the accuracy of the predicted scores unfortunately decreases. Further training and testing are required to obtain better results.
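Continuing the sketch from Section 6.2, a minimal way to compute accuracy on a held-out set looks roughly like this (the arrays below are random stand-ins for real padded test sequences and labels, not the project's data):

import numpy as np

test_x1 = np.random.randint(1, VOCAB, size=(8, MAX_LEN))    # teacher answers as padded index sequences
test_x2 = np.random.randint(1, VOCAB, size=(8, MAX_LEN))    # student answers as padded index sequences
test_labels = np.random.randint(0, 2, size=8)                # 1 = the answer should receive the mark

preds = model.predict([test_x1, test_x2]).ravel()
accuracy = np.mean((preds > 0.5).astype(int) == test_labels)
print("held-out accuracy: %.2f" % accuracy)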

6.5 WEB PAGE BUILDING

The trained model is then integrated into a webpage for user-friendly use with Django and JavaScript. The web page itself is built with HTML, CSS and JavaScript. The user uploads the file that needs to be corrected; pressing the enter option automatically runs the uploaded file through the model on the server and reveals the results of the paper after evaluation. In this way, anyone can work with this model, and its evolution and progress can be observed visually.
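A minimal sketch of the kind of Django view that could connect the upload form to the trained model (the report does not include the actual web code, so the view, template and helper names here are assumptions):

# views.py (sketch): accept an uploaded answer file and score it with the trained model
from django.shortcuts import render
from keras.models import load_model

model = load_model('best_model.h5')               # assumed path to the model saved by the training script

def evaluate(request):
    if request.method == 'POST' and request.FILES.get('answer_file'):
        text = request.FILES['answer_file'].read().decode('utf-8')
        x1, x2 = prepare_inputs(text)             # hypothetical helper: preprocess teacher/student answers
        score = float(model.predict([x1, x2]).ravel()[0])
        return render(request, 'result.html', {'score': score})
    return render(request, 'upload.html')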

Login Page:

Fig 8 Login Page

The user can upload the file by pressing the choose file option.

Fig 9 File upload

The file is evaluated and the marks are shown to the user on the website.

Fig 10 Webpage displaying results

7. SYSTEM REQUIREMENTS

7.1 HARDWARE REQUIREMENTS

● System : i3 processor
● Hard disk : 500 GB
● Monitor : 15" LED
● RAM : 4 GB

7.2 SOFTWARE REQUIREMENTS

● Operating system : Windows 7 or above.


● Tool: Anaconda Navigator – 64 bit
● Scripting Tool: Jupyter Notebook
● Language: Python 3.0

● Library packages: Keras, Gensim, TensorFlow, scikit-learn, NumPy

8. CONCLUSION

This project focused on the creation of an automated descriptive answer evaluation on par with human manual correction. Preprocessing techniques such as tokenization, lemmatization and character segmentation were applied, and the words in the sentences were converted into vectors using word2vec functions. The system was then trained with hierarchical neural network models using dense, long short-term memory and dropout layers, and tested with real-time answers. For easy usage, the model has been integrated into a website. In future, further training on better datasets can yield results far superior to those of the present model; more training and more investment in this model are recommended.

9. APPENDIX

9.1. SOURCE CODE

# Imports: module paths that appeared as "[Link]" in the original listing have been
# restored to the standard Keras / gensim locations they correspond to.
# (The SiameseBiLSTM class is defined below, so it is not imported from model.py.)
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer
from gensim.models import Word2Vec
import numpy as np
import gc
import time
import os
from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard
from keras.layers import (BatchNormalization, Embedding, Dense, Dropout,
                          Input, LSTM, Bidirectional, concatenate)
from keras.models import Model, load_model
from inputHandler import create_train_dev_set, word_embed_meta_data, create_test_data
from config import siamese_config
from operator import itemgetter
import pandas as pd

MODEL
class SiameseBiLSTM:
    def __init__(self, embedding_dim, max_sequence_length, number_lstm,
                 number_dense, rate_drop_lstm, rate_drop_dense,
                 hidden_activation, validation_split_ratio):
        self.embedding_dim = embedding_dim
        self.max_sequence_length = max_sequence_length
        self.number_lstm_units = number_lstm
        self.rate_drop_lstm = rate_drop_lstm
        self.number_dense_units = number_dense
        self.activation_function = hidden_activation
        self.rate_drop_dense = rate_drop_dense
        self.validation_split_ratio = validation_split_ratio

    def train_model(self, sentences_pair, is_similar, embedding_meta_data,
                    model_save_directory='./'):
        tokenizer = embedding_meta_data['tokenizer']
        embedding_matrix = embedding_meta_data['embedding_matrix']

        train_data_x1, train_data_x2, train_labels, leaks_train, \
            val_data_x1, val_data_x2, val_labels, leaks_val = create_train_dev_set(
                tokenizer, sentences_pair, is_similar,
                self.max_sequence_length, self.validation_split_ratio)

        if train_data_x1 is None:
            print("++++ !!! Failure: Unable to train model ++++")
            return None

        nb_words = len(tokenizer.word_index) + 1

        # Word embedding layer shared by both answers, initialised from word2vec
        embedding_layer = Embedding(nb_words, self.embedding_dim,
                                    weights=[embedding_matrix],
                                    input_length=self.max_sequence_length,
                                    trainable=False)

        # Shared bidirectional LSTM encoder
        lstm_layer = Bidirectional(LSTM(self.number_lstm_units,
                                        dropout=self.rate_drop_lstm,
                                        recurrent_dropout=self.rate_drop_lstm))

        # Encode the first (teacher's) answer
        sequence_1_input = Input(shape=(self.max_sequence_length,), dtype='int32')
        embedded_sequences_1 = embedding_layer(sequence_1_input)
        x1 = lstm_layer(embedded_sequences_1)

        # Encode the second (student's) answer
        sequence_2_input = Input(shape=(self.max_sequence_length,), dtype='int32')
        embedded_sequences_2 = embedding_layer(sequence_2_input)
        x2 = lstm_layer(embedded_sequences_2)

        # Extra "leak" features produced by create_train_dev_set
        leaks_input = Input(shape=(leaks_train.shape[1],))
        leaks_dense = Dense(int(self.number_dense_units / 2),
                            activation=self.activation_function)(leaks_input)

        # Merge the encodings, apply batch normalisation and dropout,
        # and score the pair with a sigmoid output
        merged = concatenate([x1, x2, leaks_dense])
        merged = BatchNormalization()(merged)
        merged = Dropout(self.rate_drop_dense)(merged)
        merged = Dense(self.number_dense_units,
                       activation=self.activation_function)(merged)
        merged = BatchNormalization()(merged)
        merged = Dropout(self.rate_drop_dense)(merged)
        preds = Dense(1, activation='sigmoid')(merged)

        model = Model(inputs=[sequence_1_input, sequence_2_input, leaks_input],
                      outputs=preds)
        model.compile(loss='binary_crossentropy', optimizer='nadam', metrics=['acc'])

        early_stopping = EarlyStopping(monitor='val_loss', patience=3)

        STAMP = 'lstm_%d_%d_%.2f_%.2f' % (self.number_lstm_units, self.number_dense_units,
                                          self.rate_drop_lstm, self.rate_drop_dense)

        checkpoint_dir = model_save_directory + 'checkpoints/' + str(int(time.time())) + '/'
        if not os.path.exists(checkpoint_dir):
            os.makedirs(checkpoint_dir)

        bst_model_path = checkpoint_dir + STAMP + '.h5'
        model_checkpoint = ModelCheckpoint(bst_model_path, save_best_only=True,
                                           save_weights_only=False)
        tensorboard = TensorBoard(log_dir=checkpoint_dir + "logs/{}".format(time.time()))

        model.fit([train_data_x1, train_data_x2, leaks_train], train_labels,
                  validation_data=([val_data_x1, val_data_x2, leaks_val], val_labels),
                  epochs=200, batch_size=64, shuffle=True,
                  callbacks=[early_stopping, model_checkpoint, tensorboard])

        return bst_model_path

    def update_model(self, saved_model_path, new_sentences_pair, is_similar,
                     embedding_meta_data):
        tokenizer = embedding_meta_data['tokenizer']
        train_data_x1, train_data_x2, train_labels, leaks_train, \
            val_data_x1, val_data_x2, val_labels, leaks_val = create_train_dev_set(
                tokenizer, new_sentences_pair, is_similar,
                self.max_sequence_length, self.validation_split_ratio)

        model = load_model(saved_model_path)
        model_file_name = saved_model_path.split('/')[-1]
        new_model_checkpoint_path = '/'.join(saved_model_path.split('/')[:-2]) + \
            '/' + str(int(time.time())) + '/'
        new_model_path = new_model_checkpoint_path + model_file_name

        model_checkpoint = ModelCheckpoint(new_model_checkpoint_path + model_file_name,
                                           save_best_only=True, save_weights_only=False)
        early_stopping = EarlyStopping(monitor='val_loss', patience=3)
        tensorboard = TensorBoard(log_dir=new_model_checkpoint_path + "logs/{}".format(time.time()))

        model.fit([train_data_x1, train_data_x2, leaks_train], train_labels,
                  validation_data=([val_data_x1, val_data_x2, leaks_val], val_labels),
                  epochs=50, batch_size=3, shuffle=True,
                  callbacks=[early_stopping, model_checkpoint, tensorboard])

        return new_model_path

MAIN
df = pd.read_csv('[Link]')        # dataset path was elided in the original report

sentence1 = list(df['sentence1'])
sentence2 = list(df['sentence2'])
is_similar = list(df['is_similar'])
del df

tokenizer, embedding_matrix = word_embed_meta_data(sentence1 + sentence2,
                                                   siamese_config['EMBEDDING_DIM'])

embedding_meta_data = {
    'tokenizer': tokenizer,
    'embedding_matrix': embedding_matrix
}

sentences_pair = [(x1, x2) for x1, x2 in zip(sentence1, sentence2)]
del sentence1
del sentence2


class Configuration(object):
    pass


CONFIG = Configuration()
CONFIG.embedding_dim = siamese_config['EMBEDDING_DIM']
CONFIG.max_sequence_length = siamese_config['MAX_SEQUENCE_LENGTH']
CONFIG.number_lstm_units = siamese_config['NUMBER_LSTM']
CONFIG.rate_drop_lstm = siamese_config['RATE_DROP_LSTM']
CONFIG.number_dense_units = siamese_config['NUMBER_DENSE_UNITS']
CONFIG.activation_function = siamese_config['ACTIVATION_FUNCTION']
CONFIG.rate_drop_dense = siamese_config['RATE_DROP_DENSE']
CONFIG.validation_split_ratio = siamese_config['VALIDATION_SPLIT']

siamese = SiameseBiLSTM(CONFIG.embedding_dim, CONFIG.max_sequence_length,
                        CONFIG.number_lstm_units, CONFIG.number_dense_units,
                        CONFIG.rate_drop_lstm, CONFIG.rate_drop_dense,
                        CONFIG.activation_function, CONFIG.validation_split_ratio)

best_model_path = siamese.train_model(sentences_pair, is_similar,
                                      embedding_meta_data, model_save_directory='./')

model = load_model(best_model_path)

while True:
    test_sentence_pairs = [(input("Input the student's answer: "),
                            input("Input the teacher's answer: "))]

    test_data_x1, test_data_x2, leaks_test = create_test_data(
        tokenizer, test_sentence_pairs, siamese_config['MAX_SEQUENCE_LENGTH'])

    preds = list(model.predict([test_data_x1, test_data_x2, leaks_test],
                               verbose=1).ravel())
    results = [(x, y, z) for (x, y), z in zip(test_sentence_pairs, preds)]
    results.sort(key=itemgetter(2), reverse=True)
    print(results)

REFERENCES

1. [Link], B.G. Tekade, V.S. Bhute, M.D. Chikhalkar and P. Vijaykar, 2020, 'AI based symmetric answer evaluation system for descriptive answering', International Research Journal of Modernization in Engineering Technology and Science (IRJMETS), Volume 02, Issue 03.

2. [Link] Rokade, Bhushan Patil, Sana Rajani, Surabhi Revandkar and Rajashree Shedge, 2018, 'Automated grading system using natural language processing', in Proceedings of the 2nd International Conference on Inventive Communication and Computational Technologies (ICICCT), Volume 03, pages 1123-11.

3. [Link] Sakhapara, Dipti Pawade, Bhakthi Chaudhari, Rishabh Gada, Aakash Mishra and Shweta Bhanushali, 2018, 'Subjective answer grader system based on machine learning', in Proceedings of the International Conference on Soft Computing and Signal Processing (ICSCSP), Volume 02, Issue 02.

4. [Link] M.S., Snap K.N., Sable R.G., Nannaware P.S. and Ghuge R.B., 2017, 'Automatic answer sheet checker', International Journal of Advanced Engineering & Science Research (IJAES), Volume 05, Issue 01.

5. M. Mohler and R. Mihalcea, 2017, 'Text-to-text semantic similarity for automatic short answer grading', in Proceedings of the 12th Conference of the European Chapter of the ACL, pages 567–575, Athens, Greece.

6. [Link] Arafat Sultan, Cristobal Salazar and Tamar Sumner, 2016, 'Fast and easy short answer grading with high accuracy', in Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), pages 1070–1075, San Diego, California.

7. Md Motiur Rahman and Ferdusee Akter, 2019, 'An automated approach for answer script evaluation using natural language processing', International Journal of Computer Science & Engineering Technology (IJCSET), Volume 9, pages 9-47.

8. [Link] Mohler, Razvan Bunescu and Rada Mihalcea, 2016, 'Learning to grade short answer questions using semantic similarity measures and dependency graph alignments', in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 752–762, Portland, Oregon.

9. [Link] Sinha and Ayush Kaul, 2018, 'Answer evaluation using machine learning', published in the McGraw-Hill Conference, Volume 02, Issue 03.

10. [Link] [Link] and Syed Zakir Ali, 2018, 'Approaches for automation in assisting evaluators for grading of answer scripts', published in the 4th International Conference on Computing Communication and Automation (ICCCA), Volume 04, Issue 02.

11. [Link] Berad, Pratiksha Jaybhaye and Sakshi Jawale, 2016, 'AI answer verifier', International Research Journal of Engineering and Technology (IRJET), Volume 06, Issue 01.

12. [Link] Thuy and Moschitti Alessandro, 2020, 'An automatic evaluation approach to question answering systems', published on Cornell University arXiv, Volume 01, Issue 05.
