Santanu Rath

    The agile software development process represents a major departure from traditional, plan-based approaches to software engineering. Estimating the effort of agile software accurately in the early stages of the software development life cycle is a major challenge in the software industry. Various optimization techniques are used to improve estimation accuracy. Support Vector Regression (SVR) is one such technique that helps in obtaining optimal estimated values. The main objective of the research work carried out in this paper is to estimate the effort of agile software using the story point approach. An attempt has been made to optimize the results obtained from the story point approach using various SVR kernel methods to achieve better prediction accuracy. A performance comparison of the models obtained using the various SVR kernel methods is also presented in order to highlight the performance achieved by each method.
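    As a rough illustration of the comparison described above, the sketch below fits scikit-learn's SVR with several kernels on a tiny, purely hypothetical story-point dataset; the features, effort values and hyperparameters are placeholders, not the paper's data.

```python
# Minimal sketch: comparing SVR kernels on story-point data (illustrative values).
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

# Hypothetical training data: story points and team velocity vs. actual effort.
X = np.array([[120, 2.7], [160, 3.0], [90, 2.5], [200, 3.2], [140, 2.8]])
y = np.array([380.0, 510.0, 290.0, 640.0, 450.0])  # effort (person-hours)

for kernel in ("linear", "poly", "rbf", "sigmoid"):
    model = SVR(kernel=kernel, C=100.0, epsilon=0.1)
    model.fit(X, y)
    pred = model.predict(X)
    print(f"{kernel:8s} training MAE: {mean_absolute_error(y, pred):.1f}")
```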
    Modeling a communication protocol for a distributed computing system requires making explicit the inputs and outputs of the remote communicating entities, along with a formal model of the communication channel, with the aim of protocol verification or validation. Petri nets are the formal tool most widely used for this kind of modeling and validation. The Coloured Petri Net model of a system can be an executable model representing the different states of the system as well as the events that can cause the system to change state. In this paper we model a distributed topology-adaptive clustering algorithm for mobile ad hoc networks using Coloured Petri Net tools.
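    To make the notion of an executable Coloured Petri Net model concrete, here is a minimal, self-contained sketch of CPN semantics in Python: places hold multisets of coloured tokens, and a transition fires only when its input tokens are present. The places, colours and the 'elect' transition are invented for illustration and are not taken from the paper's model.

```python
# Minimal sketch of coloured-Petri-net semantics: places hold multisets of
# coloured tokens, and transitions consume/produce tokens when enabled.
from collections import Counter

places = {
    "unclustered": Counter({("node", 1): 1, ("node", 2): 1}),  # hypothetical tokens
    "clusterhead": Counter(),
}

def fire(transition, marking):
    """Fire a transition given as (consume, produce) multisets per place."""
    consume, produce = transition
    for place, tokens in consume.items():
        if (marking[place] & tokens) != tokens:  # enabling check first
            raise RuntimeError("transition not enabled")
    for place, tokens in consume.items():
        marking[place] -= tokens
    for place, tokens in produce.items():
        marking[place] += tokens

# A hypothetical 'elect' transition promotes node 1 to cluster head.
elect = (
    {"unclustered": Counter({("node", 1): 1})},
    {"clusterhead": Counter({("head", 1): 1})},
)
fire(elect, places)
print(places)
```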
    Security issues in the software industry are becoming more and more challenging due to malicious attacks, which lead to the exposure of various security holes in software systems. In order to secure the information assets associated with any software system, organizations plan to design the system based on a number of security patterns, useful for building and testing new security mechanisms. These patterns are essentially design guidelines, but they have certain limitations in terms of consistency and usability. Hence, these security patterns may sometimes be insecure in practice. In this study, an attempt has been made to compose security patterns for a web-based application. Subsequently, a formal modeling approach for the composition of security patterns is presented. In order to maximize comprehensibility, Unified Modeling Language (UML) notations are used to represent structural and behavioral aspects of a web-based system. A formal modeling language, i.e., Alloy, has ...
    Clustering in a mobile ad hoc network (MANET) plays a vital role in improving its basic network performance parameters such as routing delay, bandwidth consumption and throughput. One-hop clustering schemes adopt a simple mechanism to logically partition a dynamic network whose topology changes constantly, resulting in unstable clustering. This paper makes a comprehensive survey of some benchmark one-hop clustering algorithms to understand the research trends in this area. The literature provides the logic of cluster formation for the different algorithms in achieving a linked cluster architecture, and an intensive simulation survey of their performance on cluster maintenance aspects such as cluster density, frequency of cluster reelection, frequency of cluster changes by the nodes and the granularity of cluster heads. This paper should help researchers as well as practitioners choose a suitable clustering algorithm on the basis of their formation ...
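    For concreteness, the sketch below implements the classic lowest-ID heuristic, one benchmark one-hop clustering algorithm of the kind such surveys cover: the lowest-ID undecided node becomes a cluster head and its one-hop neighbours join it. The adjacency list is hypothetical.

```python
# Minimal sketch of lowest-ID one-hop clustering on a hypothetical topology.
def lowest_id_clusters(adj):
    undecided = set(adj)
    heads, member_of = set(), {}
    while undecided:
        u = min(undecided)               # lowest undecided ID becomes head
        heads.add(u)
        member_of[u] = u
        for v in adj[u] & undecided:     # its one-hop neighbours join it
            member_of[v] = u
        undecided -= {u} | adj[u]
    return heads, member_of

adjacency = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2, 5}, 5: {4}}
heads, member_of = lowest_id_clusters(adjacency)
print("cluster heads:", heads)           # {1, 4}
print("node -> head :", member_of)
```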
    We conduct an empirical analysis to investigate the relationship between thirty-seven different source code metrics and fifteen different Web Service QoS (Quality of Service) parameters. The source code metrics used in our experiments consist of nineteen Object-Oriented metrics, six Baski and Misra metrics, and twelve Harry M. Sneed metrics. We apply Principal Component Analysis (PCA) and Rough Set Analysis for feature extraction and selection. The different sets of metrics are provided as input to the predictive model generated using the Least Squares Support Vector Machine (LSSVM) with three different types of kernel functions: RBF, polynomial, and linear. Our experimental results reveal that the prediction model developed using the LSSVM method with the RBF kernel function is more effective and accurate for predicting QoS parameters than the LSSVM method with linear and polynomial kernel functions. Furthermore, we also observe that the predictive model created using object-oriented metri...
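    A minimal sketch of the core technique, assuming the standard LS-SVM regression formulation (kernel matrix plus a ridge term, solved as one linear system); the metric vectors and QoS targets below are placeholders, not the study's data.

```python
# Minimal sketch of least-squares SVM regression with an RBF kernel.
import numpy as np

def rbf(A, B, sigma=1.0):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = len(y)
    A = np.zeros((n + 1, n + 1))         # standard LS-SVM linear system
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]               # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    return rbf(X_new, X_train, sigma) @ alpha + b

# Hypothetical metric vectors (rows) and a QoS target such as response time.
X = np.array([[0.2, 1.3], [0.5, 0.8], [0.9, 1.9], [0.4, 1.1]])
y = np.array([120.0, 95.0, 210.0, 110.0])
b, alpha = lssvm_fit(X, y)
print(lssvm_predict(X, b, alpha, np.array([[0.3, 1.0]])))
```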
    Evaluating software development effort remains a complex issue drawing extensive research attention. The success of software development depends very much on proper estimation of the effort required to develop the software. Hence, correctly assessing the effort needed to develop a software product is a major concern in the software industry. The Random Forest (RF) technique is a widely used machine learning technique that helps in obtaining improved estimated values. The main research work carried out in this paper is to accurately estimate the effort required in developing various software projects by using the optimized class point approach (CPA). Optimization of the effort parameters is then achieved using the RF technique to obtain better prediction accuracy. Furthermore, performance comparisons of the models obtained using the RF technique with other machine learning techniques such as Multi-Layer Perceptron (MLP), Radial Basis Function Network (RBFN), Support Vector Regression (SVR) and Stochastic Gradient Boosting (SGB) are presented in order to highlight the performance achieved by each technique.
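    As an illustration of the RF step, the sketch below fits scikit-learn's RandomForestRegressor on a tiny hypothetical table of class-point-style features; the feature layout and effort values are assumptions, not the paper's dataset.

```python
# Minimal sketch: Random Forest regression over class-point-style features.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical rows: class points, number of methods, number of attributes.
X = np.array([[35, 12, 4], [60, 20, 7], [22, 8, 3], [48, 15, 6]])
y = np.array([410.0, 760.0, 280.0, 590.0])   # effort (person-hours)

rf = RandomForestRegressor(n_estimators=200, random_state=42)
rf.fit(X, y)
print(rf.predict([[40, 14, 5]]))             # predicted effort for a new project
```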
    Software fault prediction models are employed to optimize testing resource allocation by identifying fault-prone classes before the testing phase. We apply three different ensemble methods to develop a model for predicting fault proneness. We propose a framework to validate the source code metrics and select the right set of metrics with the objective of improving the performance of the fault prediction model. The fault prediction models are then validated using a cost evaluation framework. We conduct a series of experiments on a dataset of 45 open source projects. Key conclusions from our experiments are: (1) the Majority Voting Ensemble (MVE) method outperformed the other methods; (2) using the set of source code metrics selected by the suggested validation framework as input achieves better results compared to using all metrics; (3) the fault prediction method is effective for software projects with a percentage of faulty classes lower than the threshold value (low - 54.82%, mediu...
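    A minimal sketch of a majority-voting ensemble in the spirit of the MVE result, using scikit-learn's VotingClassifier with hard voting over three common base learners on synthetic data; the particular base learners are assumptions, since the abstract does not list them.

```python
# Minimal sketch of a Majority Voting Ensemble over fault-proneness features;
# the synthetic dataset stands in for the open source project metrics.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=10, random_state=7)

mve = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100)),
        ("nb", GaussianNB()),
    ],
    voting="hard",  # each base learner casts one vote per class
)
print(cross_val_score(mve, X, y, cv=5).mean())
```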
    Securing an application based on Service Oriented Architecture provides defenses against a number of threats arising from exposing applications and data to the Internet. A good number of security guidelines are available for applying security in web applications, but these guidelines are sometimes difficult to understand and can generate inconsistencies. Security guidelines are often represented as security patterns for building and testing new security mechanisms. These patterns are essentially design guidelines, but they have certain limitations in terms of consistency and usability; hence, the application of security patterns may even be insecure. To resolve this problem, a suitable modeling and analysis technique is required. In this study, an ontology-based modeling and refinement framework is proposed for web service security. In order to maximize comprehensibility, UML (Unified Modeling Language) notations are used to represent structural and behavioral aspects of a SOA-based system. Subs...
    This study intends to predict price trends for a cryptocurrency, namely Ethereum, based on deep learning techniques, considering its time-series trends in particular. This study analyses how deep learning techniques such as the multi-layer perceptron (MLP) and long short-term memory (LSTM) help in predicting the price trends of Ethereum. These techniques have been applied to historical data recorded per day, per hour and per minute. The dataset is sourced from the CoinDesk repository. The performance of the obtained models is critically assessed using statistical indicators such as mean absolute error (MAE), mean squared error (MSE) and root mean squared error (RMSE).
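    The sketch below shows the general shape of such an LSTM price-trend model in Keras: a sliding window of past prices predicts the next value. The synthetic series, window size and network size are placeholders for the CoinDesk data and the paper's actual architecture.

```python
# Minimal sketch of an LSTM regressor over a sliding window of past prices.
import numpy as np
from tensorflow import keras

series = np.sin(np.linspace(0, 20, 500)) + 100.0   # placeholder price series
window = 24                                        # e.g. 24 hourly closes

X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., None]                                   # (samples, timesteps, features)

model = keras.Sequential([
    keras.layers.Input(shape=(window, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")        # MSE matches the paper's metrics
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.predict(X[-1:], verbose=0))            # next-step price estimate
```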
    Web services, which are language- and platform-independent, self-contained, web-based distributed application components represented by their interfaces, can have different Quality of Service (QoS) characteristics such as performance, reliability and scalability. One of the major objectives of a web service provider and implementer is to be able to estimate and improve the QoS parameters of their web service, as client applications depend on the overall quality of the service. We hypothesize that the QoS parameters are correlated with several source code metrics and hence can be estimated by analyzing the source code. We investigate the predictive power of 37 different software metrics (Chidamber and Kemerer, Harry M. Sneed, Baski & Misra) for estimating 15 QoS attributes. We develop QoS prediction models using Extreme Learning Machines (ELM) with various kernel methods. Since the performance of the classifiers depends on the software metrics that are used to build the pre...
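    For reference, a basic (non-kernelized) Extreme Learning Machine regressor can be written in a few lines: random, untrained hidden weights and a closed-form least-squares output layer. The data here are synthetic, and the paper's kernel variants would replace the random hidden layer with a kernel matrix.

```python
# Minimal sketch of a basic Extreme Learning Machine regressor.
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, y, hidden=50):
    W = rng.normal(size=(X.shape[1], hidden))   # random hidden weights, never trained
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)                      # hidden activations
    beta = np.linalg.pinv(H) @ y                # least-squares output weights
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

X = rng.normal(size=(40, 5))                    # placeholder metric vectors
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=40)
model = elm_fit(X, y)
print(np.abs(elm_predict(model, X) - y).mean()) # mean absolute training error
```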
    Software fault prediction models are employed to optimize testing resource allocation by identifying fault-prone classes before the testing phase. Several researchers have validated the use of different classification techniques to develop predictive models for fault prediction. The performance of such statistical models is known to be influenced by the training and testing datasets. Ensemble learning algorithms have been widely used because they combine the capabilities of their constituent models on a dataset to achieve potentially higher performance than individual models (improved generalizability). In the study presented in this paper, three different ensemble methods have been applied to develop a model for predicting fault proneness. The efficacy and usefulness of a fault prediction model also depend on the source code metrics that are taken as input to the model. In this paper, we propose a framework to validate the source code metrics ...
    Validation of protocols is essential to ensure that a protocol is unambiguous, complete and functionally correct. One approach to ensuring the correctness of an existing protocol is to create a formal model of the protocol and analyze the model to determine whether the protocol indeed provides the defined services correctly. Due to the dynamic nature, concurrency and different levels of abstraction associated with Mobile Ad Hoc Network (MANET) protocols, it is difficult to design a model for them with existing techniques like UML and other modeling languages. Coloured Petri Nets (CPN) are a suitable modeling language for this purpose. They are a promising tool for describing and studying information processing systems that are characterized as concurrent, asynchronous, distributed, parallel, nondeterministic and stochastic. As a graphical tool, CPNs can be used as a visual communication aid similar to flow charts, block diagrams and networks. In addition, tokens are use...
    The job of software effort estimation is a critical one in the early stages of the software development life cycle, when the details of requirements are usually not clearly identified. Various optimization techniques help in improving the accuracy of effort estimation. Support Vector Regression (SVR) is one of several soft-computing techniques that help in obtaining optimal estimated values. The idea of SVR is based on the computation of a linear regression function in a high-dimensional feature space into which the input data are mapped via a nonlinear function. Further, SVR kernel methods can be applied to transform the input data, and based on these transformations an optimal boundary between the possible outputs can be obtained. The main objective of the research work carried out in this paper is to estimate software effort using the use case point approach. The use case point approach relies on the use case diagram to estimate the size and effort of software...
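    The use case point computation the approach builds on can be illustrated with Karner's standard weights and adjustment factors, sketched below; the actor/use-case counts and the TFactor/EFactor scores are hypothetical, and 20 person-hours per UCP is a common default rather than the paper's calibrated value.

```python
# Minimal sketch of Karner's use case point (UCP) sizing formula.
ACTOR_W = {"simple": 1, "average": 2, "complex": 3}
USECASE_W = {"simple": 5, "average": 10, "complex": 15}

actors = {"simple": 2, "average": 2, "complex": 1}          # hypothetical counts
use_cases = {"simple": 4, "average": 6, "complex": 3}

uaw = sum(ACTOR_W[k] * n for k, n in actors.items())        # unadjusted actor weight
uucw = sum(USECASE_W[k] * n for k, n in use_cases.items())  # unadjusted use case weight
tcf = 0.6 + 0.01 * 30   # technical complexity factor, TFactor = 30 assumed
ecf = 1.4 - 0.03 * 17   # environmental factor, EFactor = 17 assumed

ucp = (uaw + uucw) * tcf * ecf
effort = ucp * 20       # 20 person-hours per UCP is a common default
print(f"UCP = {ucp:.1f}, estimated effort = {effort:.0f} person-hours")
```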
    Stock prediction is one of the emerging applications in the field of data science which helps companies devise better decision strategies. Machine learning models play a vital role in the field of prediction. In this paper, we propose various machine learning models that predict the stock price from real-time streaming data. Streaming data is a potential source for real-time prediction; it deals with a continuous flow of data carrying information from various sources such as social networking websites, server logs, mobile phone applications and trading floors. We have adopted the distributed platform Spark to analyze the streaming data collected from two different sources, as represented in the two case studies in this paper. The first case study is based on stock prediction from historical data collected from the Google Finance website through NodeJs, and the second one is based on sentiment analysis of Twitter data collected through the Twitter API available in the Stanford NLP pac...
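    A minimal sketch of consuming such a stream with Spark Structured Streaming is shown below; the socket source, the "timestamp,symbol,price" line format and the per-minute aggregation are assumptions for illustration, not the paper's NodeJs/Twitter pipeline.

```python
# Minimal sketch: ingesting a price stream with Spark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql.functions import avg, col, split, window

spark = SparkSession.builder.appName("stock-stream").getOrCreate()

lines = (spark.readStream.format("socket")
         .option("host", "localhost").option("port", 9999).load())

# Assume each line is "timestamp,symbol,price"; compute a per-minute mean price.
ticks = lines.select(
    split(col("value"), ",").getItem(0).cast("timestamp").alias("ts"),
    split(col("value"), ",").getItem(1).alias("symbol"),
    split(col("value"), ",").getItem(2).cast("double").alias("price"),
)
per_minute = ticks.groupBy(window(col("ts"), "1 minute"), col("symbol")).agg(avg("price"))

query = per_minute.writeStream.outputMode("complete").format("console").start()
query.awaitTermination()
```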
    Analyzing the structure of a social network helps in gaining insights into interactions and relationships among users while revealing the patterns of their online behavior. Network centrality is a measure of the importance of a node in a network, which allows revealing the structural patterns and morphology of networks. We propose a distributed computing approach for calculating the network centrality value of each user using the MapReduce approach on the Hadoop platform, which allows faster and more efficient computation compared to a conventional implementation. A distributed approach is scalable and enables efficient computation on large-scale datasets, such as social network data. The proposed approach improves the calculation performance of degree centrality by 39.8%, closeness centrality by 40.7% and eigenvalue centrality by 41.1% on a Twitter dataset.
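    Degree centrality maps naturally onto MapReduce: the mapper emits a count of 1 for each endpoint of every edge, and the reducer sums the counts per node. The sketch below emulates that mapper/reducer pair locally on a hypothetical edge list; under Hadoop Streaming the grouping by key would be done by the framework.

```python
# Minimal sketch of degree centrality in the MapReduce style, emulated locally.
from collections import defaultdict

def mapper(edges):
    for u, v in edges:              # each edge contributes to both endpoints
        yield u, 1
        yield v, 1

def reducer(pairs):
    degree = defaultdict(int)
    for node, count in pairs:       # Hadoop would group/sort by key for us
        degree[node] += count
    return degree

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]   # hypothetical graph
print(dict(reducer(mapper(edges))))  # {'a': 2, 'b': 2, 'c': 3, 'd': 1}
```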
    Purpose: The purpose of this study is to investigate the impact of human IT capabilities (comprising business functions, interpersonal management and technology management expertise) on organizational agility (in terms of sensing and responding agilities). The moderating influence of IT infrastructure spending on this human IT–agility linkage is also thoroughly investigated. Design/methodology/approach: Primary data collected from 300 IT personnel working in various publicly owned banking groups functioning across India are used for this study, and structural equation modeling (SEM) is used to assess the human IT–agility link. Findings: The two-fold research findings highlight the following: first, human IT capabilities enable both the sensing and responding components of agility; second, firms need to focus on translating their large IT investments into building superior capabilities to effectively shape agility. Originality/value: This study greatly contributes to the...
    Sentiment analysis helps to determine the hidden intention of the author on any topic and provides an evaluation report on the polarity of a document. The polarity may be positive, negative or neutral. It is observed that, very often, the data associated with sentiment analysis consist of feedback given by various specialists on a topic or product. Thus, a review may be properly categorized into a polarity-based class in order to gain good knowledge about the product. This article proposes an approach to classify a review dataset into different polarity groups on the basis of sentiment analysis. Four machine learning algorithms, viz. Naive Bayes (NB), Support Vector Machine (SVM), Random Forest, and Linear Discriminant Analysis (LDA), have been considered in this paper for the classification process. The accuracy values obtained for the algorithms are critically examined using different performance parameters, applied on two differe...
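    A minimal sketch of the comparison: the four classifiers trained on TF-IDF features of a toy review set, which stands in for the paper's datasets; LDA requires a dense matrix, hence the conversion.

```python
# Minimal sketch comparing NB, SVM, Random Forest and LDA on TF-IDF features.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

reviews = ["great product, works well", "terrible, broke in a day",
           "excellent value", "poor quality, do not buy"]   # placeholder reviews
labels = [1, 0, 1, 0]                                       # 1 = positive polarity

X = TfidfVectorizer().fit_transform(reviews).toarray()      # LDA needs dense input
for clf in (MultinomialNB(), LinearSVC(), RandomForestClassifier(),
            LinearDiscriminantAnalysis()):
    clf.fit(X, labels)
    print(type(clf).__name__, clf.score(X, labels))
```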
    Microarray-based gene expression profiling has emerged as an efficient technique for the classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfy both the veracity and velocity properties of big data, as they keep changing with time. Therefore, analyzing microarray datasets in a small amount of time is essential. They often contain a large number of expression values, but only a fraction of them comprises genes that are significantly expressed. The precise identification of the genes of interest that are responsible for causing cancer is imperative in microarray data analysis. Most existing schemes employ a two-phase process: feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) class...
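    A single-machine analogue of the two-phase pipeline can be sketched with scikit-learn: a statistical test (here the ANOVA F-test, one plausible choice) ranks genes, and a K-nearest-neighbour classifier follows; the synthetic matrix stands in for real microarray data, and the MapReduce distribution is omitted.

```python
# Minimal sketch of the two-phase pipeline: statistical feature selection + KNN.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a samples x genes expression matrix.
X, y = make_classification(n_samples=100, n_features=2000, n_informative=20,
                           random_state=1)

pipe = make_pipeline(SelectKBest(f_classif, k=50),   # keep 50 top-ranked genes
                     KNeighborsClassifier(n_neighbors=5))
print(cross_val_score(pipe, X, y, cv=5).mean())
```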
    The success of software development depends very much on proper estimation of the effort required to develop the software. There is no simple way to make an accurate estimate of the parameters required to develop a software system. Several point-based approaches are available for software effort estimation, such as Function Point, Use Case Point, Class Point and Object Point. In this paper, our aim is to estimate the cost of various software projects using the Class Point Approach. The parameters are optimized using various soft computing techniques such as fuzzy logic and adaptive neuro-fuzzy logic so as to achieve better accuracy. A comparative analysis of software effort estimation using Artificial Neural Network (ANN), Fuzzy Logic (FL) and Adaptive Neuro-Fuzzy Inference System (ANFIS) techniques is also provided.
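    To illustrate the fuzzy logic ingredient, the sketch below runs a two-rule Mamdani-style inference by hand: triangular memberships, min-max rule firing and centroid defuzzification. The membership ranges, rules and input value are illustrative assumptions, not the paper's calibrated model.

```python
# Minimal sketch of Mamdani-style fuzzy inference for effort estimation.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership rising from a, peaking at b, falling to c."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

size = 55.0                            # hypothetical class point count
universe = np.linspace(0, 1000, 1001)  # effort universe (person-hours)

# Rule firing strengths: how 'low' / 'high' the input size is.
low_fire = tri(size, 0, 30, 60)
high_fire = tri(size, 40, 70, 100)

# Rule 1: size low -> effort low.  Rule 2: size high -> effort high.
# Clip each consequent set by its firing strength, aggregate with max.
aggregated = np.maximum(np.minimum(low_fire, tri(universe, 0, 200, 500)),
                        np.minimum(high_fire, tri(universe, 300, 700, 1000)))

effort = (universe * aggregated).sum() / aggregated.sum()  # centroid
print(f"defuzzified effort: {effort:.0f} person-hours")
```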
    Software Requirement Specification (SRS) statements form the formal document through which customers share their requirements with the development team. An SRS is usually written in a natural language (NL) convenient to the customer. However, the text written in an SRS is often incomplete and ambiguous for the developer. From these incomplete and ambiguous SRS statements, the requirement analyst team tries to make an intelligent analysis and identify the elements of Object-Oriented Analysis, using Natural Language Processing (NLP) techniques. This paper proposes an approach to help the analysis phase, in particular conducting object-oriented (OO) analysis by generating the class diagram and all its details from SRS statements in an automated manner. A case study of ATM operation in a bank is considered for assessing the NLP technique for conducting OO analysis.
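    The NLP step can be approximated with spaCy: noun phrases suggest candidate classes and verbs suggest candidate operations. The heuristics below are deliberately simplified assumptions, not the paper's full extraction rules (the model requires `python -m spacy download en_core_web_sm`).

```python
# Minimal sketch: extracting candidate class/method names from SRS text.
import spacy

nlp = spacy.load("en_core_web_sm")
srs = ("The customer inserts a card into the ATM. The ATM validates the card "
       "and requests a PIN. The bank verifies the account balance.")

doc = nlp(srs)
candidate_classes = {chunk.root.lemma_.title()
                     for chunk in doc.noun_chunks}                 # nouns -> classes
candidate_methods = {tok.lemma_ for tok in doc if tok.pos_ == "VERB"}  # verbs -> operations
print("classes:", candidate_classes)
print("methods:", candidate_methods)
```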
