The purpose of this paper is to point out the need for performance evaluation measures and techniques suitable for evaluating specialized architectural features in nonnumeric applications. Toward this end, problems associated with the use of database machines are examined at three levels of detail: the user level, the system level, and the device level.
This dissertation focuses on three aspects of computer network security. First, the dissertation introduces several novel visualization approaches that can be used effectively in network anomaly detection research and development. It introduces advanced visualization methods drawn from machine learning, statistics, and artificial intelligence. These methods include Principal Component Analysis, Self-Organizing Maps, k-means clustering, hierarchical clustering, Independent Component Analysis, Bi-plots, Mosaic plots, and star plots. The methods are compared in terms of their computational complexity, visual effectiveness, and inherent ability to detect selected network attacks. One method, which uses Principal Component Analysis for dimensionality reduction of the input data and Bi-plots for visualization, achieved a 100% detection rate given a proposed threshold criterion. Other methods, such as Self-Organizing Maps and k-means, demonstrated the ability to effectively distinguish normal traffic from anomalous traffic using graphical means. Second, the dissertation proposes a unified software environment for developing, testing, and evaluating network anomaly detection systems, aimed at addressing the lack of a homogeneous platform for developing and testing such systems. The proposed environment uses the S Language as a unified platform for developing an anomaly detection system with extensive visualization capabilities. The strength of the proposed system is demonstrated by using several machine learning and statistical methods to detect and visualize selected network attacks. The results of evaluating seven exploratory multivariate analysis methods for anomaly detection and visualization show the effectiveness of the proposed platform in streamlining the processes involved in developing and testing anomaly detection systems. Finally, the low performance of software-based clustering methods is addressed by developing a hardware-based approach to anomaly detection. A circuit that implements the k-means clustering algorithm in hardware is developed using synthesizable Verilog Hardware Description Language. The circuit clusters network packet information into normal and anomalous traffic and generates an interrupt to indicate that the clustering process has finished. The circuit is 300 times faster than a typical software-based version of the algorithm, synthesizes to a total gate count of 50K gates, and can run at a clock frequency of 40 MHz.
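As a point of reference for the hardware version described above, the following is a minimal sketch of the software side of the k-means step: clustering per-packet feature vectors into two groups and flagging the minority cluster as anomalous. The feature layout and data are hypothetical stand-ins, not the dissertation's actual packet encoding.

```python
# Minimal sketch: k-means clustering of packet feature vectors into
# "normal" vs. "anomalous" groups. Feature names are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical features per packet: [bytes, duration, port_entropy]
normal = rng.normal(loc=[500, 0.5, 1.0], scale=[50, 0.1, 0.2], size=(900, 3))
anomalous = rng.normal(loc=[5000, 0.05, 3.0], scale=[500, 0.01, 0.3], size=(100, 3))
X = np.vstack([normal, anomalous])

# Two clusters: the larger one is expected to absorb the bulk (normal) traffic.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_
normal_cluster = np.argmax(np.bincount(labels))  # majority cluster ~ normal
flags = labels != normal_cluster                 # True = flagged anomalous
print(f"flagged {flags.sum()} of {len(X)} packets as anomalous")
```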
By accurately profiling users via their unique attributes, it is possible to view the intrusion detection problem as a classification of authorized users and intruders. This paper demonstrates that artificial neural network (ANN) techniques can be used to solve this classification problem. Furthermore, the paper compares the performance of three neural network methods in classifying authorized users and intruders using synthetically generated data: gradient descent backpropagation (BP) with momentum, conjugate gradient BP, and quasi-Newton BP.
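A rough sketch of this kind of comparison appears below, on synthetic user-profile vectors. Note the assumptions: scikit-learn exposes gradient descent with momentum (solver="sgd") and a quasi-Newton method (solver="lbfgs") but has no conjugate-gradient solver, so only two of the paper's three BP variants are shown, and the feature vectors are invented placeholders.

```python
# Sketch: classifying "authorized user" vs. "intruder" feature vectors
# with two backpropagation training variants. Data is synthetic.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Hypothetical 8-dimensional user-profile features
users = rng.normal(0.0, 1.0, size=(500, 8))
intruders = rng.normal(1.5, 1.0, size=(500, 8))
X = np.vstack([users, intruders])
y = np.array([0] * 500 + [1] * 500)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=1)

for solver, kwargs in [
    ("sgd", {"momentum": 0.9, "learning_rate_init": 0.01}),  # GD with momentum
    ("lbfgs", {}),                                            # quasi-Newton
]:
    clf = MLPClassifier(hidden_layer_sizes=(16,), solver=solver,
                        max_iter=2000, random_state=1, **kwargs)
    clf.fit(Xtr, ytr)
    print(solver, "accuracy:", clf.score(Xte, yte))
```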
In this paper, an analysis of a method proposed for anomaly detection is presented. The method uses a multivariate statistical technique called Principal Component Analysis to detect selected Denial-of-Service and network Probe attacks using the 1998 DARPA Intrusion Detection data set. The Principal Components are calculated for both attack and normal traffic, and the loading values of the various feature vector components are analyzed with respect to the Principal Components. The variance and standard deviation of the Principal Components are calculated and analyzed, and a method for identifying an attack based on the Principal Component Analysis results is proposed. After presenting related work in the field of intrusion detection using multivariate analysis methods, the paper introduces Denial-of-Service and network Probe attacks and describes their nature, then gives a brief introduction to Principal Component Analysis and the merits of using it for detecting intrusions.
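To make the general idea concrete, here is a minimal sketch of PCA-based anomaly scoring: fit components on normal traffic, then flag records whose component scores deviate beyond a threshold expressed in standard deviations. The 3-sigma cutoff and the feature vectors are illustrative assumptions, not the paper's proposed criterion.

```python
# Sketch of PCA-based anomaly detection: fit on normal traffic, flag
# test records with extreme principal-component scores. Threshold and
# data are hypothetical.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
normal = rng.normal(0, 1, size=(1000, 6))          # hypothetical feature vectors
test = np.vstack([rng.normal(0, 1, size=(50, 6)),
                  rng.normal(6, 1, size=(5, 6))])  # last 5 mimic an attack burst

pca = PCA(n_components=3).fit(normal)
scores = pca.transform(test)
sigma = np.sqrt(pca.explained_variance_)           # std. dev. of each component
flagged = (np.abs(scores) > 3 * sigma).any(axis=1)
print("flagged indices:", np.where(flagged)[0])
```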
Presented are issues in designing smart, believable software agents capable of playing strategy games, with particular emphasis on the design of an agent capable of playing Cyberwar XXI, a complex war game. The architecture of a personality-rich, advice-taking game-playing agent that learns to play is described. The suite of computational-intelligence tools used by the advisers includes evolutionary computation and neural nets.

I. CONFLICT SIMULATIONS

Strategy games in general, and conflict simulations in particular, offer fertile ground for studying the power of computational intelligence (CI). Board games like Chess or Checkers are widely studied strategy games because the environment in which the user interacts with the game is not a simulation of the problem domain; it is the problem domain. As a result, many vexing problems, such as imperfect effectors and sensors, incomplete or uncertain data, and ill-defined goal states, can be bypassed. However, games like Chess are only highly stylized...
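As a very loose illustration of the evolutionary-computation side of such an agent, the sketch below evolves an adviser's weight vector with a simple (1+λ) evolution strategy. The fitness function is a hypothetical placeholder for "games won with these adviser weights"; this is not the paper's actual architecture.

```python
# Illustrative (1+lambda) evolution strategy for tuning a hypothetical
# adviser weight vector. fitness() is a stand-in for game performance.
import numpy as np

rng = np.random.default_rng(3)

def fitness(w):
    # Placeholder objective: peak performance at a hidden "ideal" weighting.
    ideal = np.array([0.6, 0.3, 0.1])
    return -np.sum((w - ideal) ** 2)

parent = rng.random(3)
for gen in range(200):
    children = parent + rng.normal(0, 0.05, size=(8, 3))  # lambda = 8 mutants
    parent = max(list(children) + [parent], key=fitness)  # elitist selection
print("evolved adviser weights:", np.round(parent, 3))
```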
Using the 1998 DARPA BSM data set collected at MIT’s Lincoln Labs to study intrusion detection systems, the performance of robust support vector machines (RSVMs) was compared with that of conventional support vector machines and nearest-neighbor classifiers in separating normal usage profiles from intrusive profiles of computer programs. The results indicate the superiority of RSVMs not only in intrusion detection accuracy and false-positive rate but also in generalization ability in the presence of noise and in running time.
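The sketch below illustrates the shape of such a comparison on synthetic, noise-injected profile vectors. Robust SVMs are not available in scikit-learn, so only the two baseline classifiers the paper compares against are shown; the 5% label noise is an invented stand-in for the noisy-data condition.

```python
# Sketch: conventional SVM vs. nearest-neighbor classification of noisy
# "normal" vs. "intrusive" profile vectors. Data and noise rate are
# hypothetical.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
normal = rng.normal(0, 1, size=(400, 10))
intrusive = rng.normal(1.0, 1.0, size=(400, 10))
X = np.vstack([normal, intrusive])
y = np.array([0] * 400 + [1] * 400)
noise = rng.random(len(y)) < 0.05            # flip 5% of labels
y_noisy = np.where(noise, 1 - y, y)

Xtr, Xte, ytr, yte = train_test_split(X, y_noisy, test_size=0.3, random_state=4)
for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("1-NN", KNeighborsClassifier(n_neighbors=1))]:
    clf.fit(Xtr, ytr)
    print(name, "accuracy:", round(clf.score(Xte, yte), 3))
```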
Automated web-based knowledge acquisition can play a useful role in developing systematic methods for solving inverse problems arising in the context of healthcare management. Because inverse problems are ill-posed, they are normally solved by using some regularization procedure: a mathematical strategy that seeks to supply the "missing data." We seek to fill in the missing data through an automated knowledge discovery process that mines the WWW. This novel procedure is applied by first restoring missing information via web mining and then learning the structure and parameters of the unknown system from the restored data. We learn the Bayesian network structure by examining various possible interconnection topologies. The parameters, i.e., the probabilities associated with the causal relationships in the network, are deduced using the knowledge mined from the WWW in conjunction with the data available on hand. Using heart disease data sets from the UC Irvine Machine Learning Repository...
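A toy sketch of the structure-search step follows: two candidate topologies over a pair of binary variables are scored by the log-likelihood of conditional probability tables fit by counting. The variables, data, and Laplace smoothing are assumptions for illustration, not the paper's method on the heart disease data.

```python
# Toy structure comparison: score "A -> B" vs. "A, B independent" by
# log-likelihood of counted (Laplace-smoothed) probability tables.
# Variables and data are hypothetical.
import numpy as np

rng = np.random.default_rng(5)
A = rng.integers(0, 2, 1000)                     # e.g., "risk factor present"
B = (A ^ (rng.random(1000) < 0.1)).astype(int)   # B depends on A, 10% noise

def loglik_pair(child, parent):
    # Log-likelihood of `child` under P(child | parent) fit by counting.
    ll = 0.0
    for p in (0, 1):
        mask = parent == p
        theta = (child[mask].sum() + 1) / (mask.sum() + 2)  # Laplace smoothing
        ll += np.sum(np.where(child[mask] == 1, np.log(theta), np.log(1 - theta)))
    return ll

def loglik_indep(x):
    theta = (x.sum() + 1) / (len(x) + 2)
    return np.sum(np.where(x == 1, np.log(theta), np.log(1 - theta)))

score_dep = loglik_indep(A) + loglik_pair(B, A)   # candidate 1: A -> B
score_ind = loglik_indep(A) + loglik_indep(B)     # candidate 2: independent
print("A->B:", round(score_dep, 1), " independent:", round(score_ind, 1))
```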
When a variety of multimedia images of different types (natural, synthetic, compound, medical, etc.) are compressed using a fixed wavelet filter, it is observed that the peak SNR (PSNR) values for a given compression ratio vary widely, by as much as 30 dB, from image to image. In this letter, it is shown that most of the gray-level histogram statistics...
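For reference, the sketch below computes the two kinds of quantities this letter relates: PSNR between an image and its reconstruction, and gray-level histogram statistics. The image is synthetic and the "compression" is a crude uniform quantizer standing in for a wavelet codec; both are assumptions for illustration.

```python
# Sketch: PSNR of a reconstructed 8-bit image, plus gray-level histogram
# statistics (mean, std, skewness, entropy). Image and codec are stand-ins.
import numpy as np

rng = np.random.default_rng(6)
img = rng.integers(0, 256, size=(64, 64)).astype(float)  # hypothetical image
recon = (img // 16) * 16 + 8                             # crude quantizer

mse = np.mean((img - recon) ** 2)
psnr = 10 * np.log10(255.0 ** 2 / mse)                   # peak SNR in dB

hist, _ = np.histogram(img, bins=256, range=(0, 256))
p = hist / hist.sum()
mean, std = img.mean(), img.std()
skew = np.mean(((img - mean) / std) ** 3)
entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
print(f"PSNR={psnr:.1f} dB  mean={mean:.1f} std={std:.1f} "
      f"skew={skew:.2f} entropy={entropy:.2f} bits")
```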
The determination of the sequence of all nucleotide base-pairs in a DNA molecule, from restriction-fragment data, is a complex task and can be posed as the problem of finding the optima of a multimodal function. A genetic algorithm that uses multiniche crowding permits us to do this. Performance of this algorithm is first tested using a standard suite of test functions. The algorithm is next tested using two data sets obtained from the Human Genome Project at the Lawrence Livermore National Laboratory. The new method holds promise in automating the sequencing computations.
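To illustrate the crowding idea in the abstract, here is a textbook crowding step on a standard multimodal test function: each offspring replaces the most similar individual in a randomly drawn crowd, which preserves several niches at once. This is a generic sketch, not the paper's exact multiniche crowding algorithm or its restriction-fragment encoding.

```python
# Textbook crowding GA step on a multimodal test function with five equal
# peaks; offspring replace their nearest neighbor in a random crowd.
import numpy as np

rng = np.random.default_rng(7)

def f(x):
    return np.sin(5 * np.pi * x) ** 6        # peaks at x = 0.1, 0.3, ..., 0.9

pop = rng.random(60)
for _ in range(3000):
    a, b = rng.choice(len(pop), size=2, replace=False)
    child = np.clip((pop[a] + pop[b]) / 2 + rng.normal(0, 0.02), 0, 1)
    crowd = rng.choice(len(pop), size=5, replace=False)
    nearest = crowd[np.argmin(np.abs(pop[crowd] - child))]
    if f(child) > f(pop[nearest]):           # replace most similar member
        pop[nearest] = child

peaks = np.round(pop[f(pop) > 0.9], 1)
print("niches retained around:", sorted(set(peaks)))
```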
A multivariate statistical method called Principal Component Analysis is used to detect Denial-of-Service and Network Probe attacks using the 1998 DARPA data set. Visualization of network activity and possible intrusions is achieved using Bi-plots, which are used as a graphical means for summarizing the statistics. The principal components are calculated for both attack and normal traffic, and the loading values of the various feature vector components are analyzed with respect to the principal components. The variance and standard deviation of the principal components are calculated and analyzed. A brief introduction to Principal Component Analysis and the merits of using it for detecting the selected intrusions are discussed. A method for identifying an attack based on these results is proposed. The results obtained using the proposed threshold value for detecting the selected intrusions show that a detection rate of 100% can be achieved using this method.
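The sketch below shows how a Bi-plot of the kind described above can be built: sample scores on the first two principal components plotted as points, with feature loadings drawn as arrows. The traffic features and data are hypothetical stand-ins for the paper's feature vector.

```python
# Sketch of a PCA Bi-plot: PC scores as points, feature loadings as
# arrows. Feature names are hypothetical.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(8)
X = np.vstack([rng.normal(0, 1, size=(200, 4)),
               rng.normal(4, 1, size=(20, 4))])  # last 20 mimic attack traffic
features = ["pkts/s", "bytes/s", "syn_ratio", "dst_ports"]

pca = PCA(n_components=2).fit(X)
scores = pca.transform(X)

plt.scatter(scores[:, 0], scores[:, 1], s=10, alpha=0.6)
for i, name in enumerate(features):
    lx, ly = pca.components_[0, i] * 3, pca.components_[1, i] * 3
    plt.arrow(0, 0, lx, ly, color="red", head_width=0.1)  # loading arrow
    plt.annotate(name, (lx, ly))
plt.xlabel("PC1"); plt.ylabel("PC2"); plt.title("Bi-plot of traffic features")
plt.show()
```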
Detection of anomalies in data is one of the fundamental machine learning tasks. Anomaly detection provides the core technology for a broad spectrum of security-centric applications. In this dissertation, we examine various aspects of anomaly-based intrusion detection in computer security. First, we present a new approach to learning program behavior for intrusion detection. Text categorization techniques are adopted to convert each process to a vector and to calculate the similarity between two program activities. The k-Nearest Neighbor classifier is then employed to classify program behavior as normal or intrusive. We demonstrate that our approach effectively detects intrusive program behavior while achieving a low false positive rate. Second, we describe an adaptive anomaly detection framework designed to handle concept drift and online learning in dynamic, changing environments. Through the use of unsupervised evolving connectionist systems, normal behavior changes are efficiently accommodated while anomalous activities can still be recognized. We demonstrate the performance of our adaptive anomaly detection systems and show that the false positive rate can be significantly reduced. Third, we study methods to efficiently estimate the generalization performance of an anomaly detector and its training size requirements. An error bound for support vector machine based anomaly detection is introduced. Inverse power-law learning curves, in turn, are used to estimate how the accuracy of the anomaly detector improves when it is trained with additional samples. Finally, we present a game-theoretic methodology for cost-benefit analysis and design of IDS. We use a simple two-person, nonzero-sum game to model the strategic interdependence between an IDS and an attacker. The solutions based on the game-theoretic analysis integrate the cost-effectiveness and technical-performance tradeoff of the IDS and identify the best defense and attack strategies.
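A minimal sketch of the first idea above follows: treat each process's system-call trace like a document, convert it to a normalized call-frequency vector, and classify new traces with k-nearest neighbors under cosine similarity. The call names and traces are hypothetical, and the similarity/weighting details differ from the dissertation's.

```python
# Sketch: text-categorization-style kNN over system-call frequency
# vectors. Call vocabulary and traces are hypothetical.
from collections import Counter
import numpy as np

CALLS = ["open", "read", "write", "close", "exec", "socket"]

def to_vector(trace):
    # Call-frequency vector, L2-normalized so dot product = cosine similarity.
    counts = Counter(trace)
    v = np.array([counts[c] for c in CALLS], dtype=float)
    return v / (np.linalg.norm(v) or 1.0)

normal_traces = [["open", "read", "read", "close"]] * 5
intrusive_traces = [["exec", "socket", "exec", "write"]] * 5
train = [(to_vector(t), 0) for t in normal_traces] + \
        [(to_vector(t), 1) for t in intrusive_traces]

def knn_classify(trace, k=3):
    v = to_vector(trace)
    neighbors = sorted(train, key=lambda tv: -np.dot(tv[0], v))[:k]
    labels = [lab for _, lab in neighbors]
    return max(set(labels), key=labels.count)   # majority vote

print(knn_classify(["exec", "socket", "write", "write"]))  # -> 1 (intrusive)
```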