The purpose of this paper is to point out the need for performance evaluation measures and techniques suitable for evaluating specialized architectural features in nonnumeric applications. Toward this end, problems associated with the use of database machines are examined at three levels of detail: the user level, the system level, and the device level.
This dissertation focuses on three aspects of computer network security. First, the dissertation introduces several novel visualization approaches that can be used effectively in network anomaly detection research and development. It introduces advanced visualization methods drawn from machine learning, statistics, and artificial intelligence. These methods include Principal Component Analysis, Self-Organizing Maps, k-means clustering, hierarchical clustering, Independent Component Analysis, bi-plots, mosaic plots, and star plots. The methods are compared in terms of their computational complexity, visual effectiveness, and inherent ability to detect selected network attacks. One method, which uses Principal Component Analysis for dimensionality reduction of the input data and bi-plots for visualization, achieved a 100% detection rate given a proposed threshold criterion. Other methods, such as Self-Organizing Maps and k-means, demonstrated the ability to effectively distinguish normal traffic from anomalous traffic using effective graphical means. Second, it introduces a proposed unified software environment for developing, testing, and evaluating network anomaly detection systems, aimed at addressing the lack of a homogeneous platform for developing and testing these systems. The proposed environment uses the S Language as a unified platform for developing an anomaly detection system with extensive visualization capabilities. The strength of the proposed system is demonstrated by using several machine learning and statistical methods to detect and visualize selected network attacks. The results of evaluating seven exploratory multivariate analysis methods for anomaly detection and visualization show the effectiveness of the proposed platform in streamlining the processes involved in developing and testing anomaly detection systems. Finally, performance issues associated with software-based clustering methods are addressed by developing a hardware-based approach to anomaly detection. A circuit is developed, using synthesizable Verilog Hardware Description Language, that implements the k-means clustering algorithm in hardware. The circuit clusters network packet information into normal and anomalous traffic and generates an interrupt to indicate that the clustering process has finished. The circuit is approximately 300 times faster than a typical software-based version of the algorithm. It was synthesized to a total gate count of 50K gates and can run at a clock frequency of 40 MHz.
By accurately profiling the users via their unique attributes, it is possible to view the intrusion detection problem as a classification of authorized users and intruders. This paper demonstrates that artificial neural network (ANN) techniques can be used to solve this classification problem. Furthermore, the paper compares the performance of three neural network methods in classifying authorized users and intruders using synthetically generated data. The three methods are the gradient descent back propagation (BP) with momentum, the conjugate gradient BP, and the quasi-Newton BP.
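As a rough illustration of the comparison described above, the sketch below trains two small feedforward networks on synthetic two-class data with scikit-learn: one with gradient-descent backpropagation plus momentum and one with a quasi-Newton (L-BFGS) optimizer. Conjugate-gradient BP is not available in scikit-learn and is omitted; the data, layer sizes, and learning rates are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch (not the paper's experiment): classify "authorized user" vs
# "intruder" feature vectors with small feedforward networks trained by two of
# the three optimizers discussed above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for user-profile features (purely illustrative).
X, y = make_classification(n_samples=2000, n_features=12, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "BP + momentum": MLPClassifier(hidden_layer_sizes=(16,), solver="sgd",
                                   momentum=0.9, learning_rate_init=0.01,
                                   max_iter=2000, random_state=0),
    "quasi-Newton (L-BFGS)": MLPClassifier(hidden_layer_sizes=(16,),
                                           solver="lbfgs", max_iter=2000,
                                           random_state=0),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    print(f"{name}: test accuracy = {clf.score(X_te, y_te):.3f}")
```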
In this paper, an analysis of a method proposed for anomaly detection is presented. The method uses a multivariate statistical method called Principal Component Analysis to detect selected Denial-of-Service and network Probe attacks using the 1998 DARPA Intrusion Detection data set. The Principal Components are calculated for both attack and normal traffic, and the loading values of the various feature vector components are analyzed with respect to the Principal Components. The variance and standard deviation of the Principal Components are calculated and analyzed. A method for identifying an attack based on the Principal Component Analysis results is proposed. After presenting related work in the field of intrusion detection using multivariate analysis methods, the paper introduces Denial-of-Service and network Probe attacks and describes their nature. A brief introduction to Principal Component Analysis and the merits of using it for detecting intrusions are presented. The paper d...
Presented are issues in designing smart, believable software agents capable of playing strategy games, with particular emphasis on the design of an agent capable of playing Cyberwar XXI, a complex war game. The architecture of a personality-rich, advice-taking game-playing agent that learns to play is described. The suite of computational-intelligence tools used by the advisers includes evolutionary computation and neural nets. I. CONFLICT SIMULATIONS Strategy games in general, and conflict simulations in particular, offer fertile ground for studying the power of computational intelligence (CI). Board games like Chess or Checkers are widely studied strategy games because the environment in which the user interacts with the game is not a simulation of the problem domain; it is the problem domain. As a result, many vexing problems like imperfect effectors and sensors, incomplete or uncertain data, and ill-defined goal states can be bypassed. However, games like Chess are only highly stylize...
Using the 1998 DARPA BSM data set collected at MIT’s Lincoln Labs to study intrusion detection systems, the performance of robust support vector machines (RSVMs) was compared with that of conventional support vector machines and nearest neighbor classifiers in separating normal usage profiles from intrusive profiles of computer programs. The results indicate the superiority of RSVMs not only in terms of high intrusion detection accuracy and low false positives, but also in terms of their generalization ability in the presence of noise and their running time.
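The sketch below illustrates the comparison protocol only: a conventional SVM and a nearest-neighbor classifier are trained on synthetic data with injected label noise and scored on a clean test set. The robust SVM variant evaluated in the paper is not part of scikit-learn and is not shown; the data stands in for the DARPA BSM program profiles.

```python
# Illustrative comparison only: conventional SVM vs. nearest-neighbor classifier
# on synthetic data with 10% flipped training labels (a crude stand-in for noise).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=3000, n_features=20, n_informative=10,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Flip 10% of the training labels to mimic noisy training profiles.
rng = np.random.default_rng(1)
noisy = rng.choice(len(y_tr), size=len(y_tr) // 10, replace=False)
y_tr_noisy = y_tr.copy()
y_tr_noisy[noisy] = 1 - y_tr_noisy[noisy]

for name, clf in [("SVM (RBF)", SVC(kernel="rbf", C=1.0)),
                  ("k-NN (k=5)", KNeighborsClassifier(n_neighbors=5))]:
    clf.fit(X_tr, y_tr_noisy)
    print(f"{name}: accuracy on clean test set = {clf.score(X_te, y_te):.3f}")
```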
Automated web-based knowledge acquisition can play a useful role in developing systematic methods for solving inverse problems arising in the context of healthcare management. As inverse problems are ill-posed, they are normally solved by using some regularization procedure, a mathematical strategy that seeks to supply the "missing data." We seek to fill in the missing data through an automated knowledge discovery process via mining the WWW. This novel procedure is applied by first restoring missing information via web mining and next learning the structure and parameters of the unknown system from the restored data. We learn the Bayesian network structure by looking at various possible interconnection topologies. The parameters, i.e., the probabilities associated with the causal relationships in the network, are deduced using the knowledge mined from the WWW in conjunction with the data available on hand. Using heart disease data sets from the UC Irvine Machine Learning Repositor...
When a variety of multimedia images of different types (natural, synthetic, compound, medical, etc.) are compressed using a fixed wavelet filter, it is observed that the peak SNR (PSNR) values for a given compression ratio vary widely by as much as 30 dB from image to image. In this letter, it is shown that most of the gray-level histogram statistics
The determination of the sequence of all nucleotide base-pairs in a DNA molecule, from restriction-fragment data, is a complex task and can be posed as the problem of finding the optima of a multimodal function. A genetic algorithm that uses multiniche crowding permits us to do this. Performance of this algorithm is first tested using a standard suite of test functions. The algorithm is next tested using two data sets obtained from the Human Genome Project at the Lawrence Livermore National Laboratory. The new method holds promise in automating the sequencing computations.
A multivariate statistical method called Principal Component Analysis is used to detect Denial-of-Service and Network Probe attacks using the 1998 DARPA data set. Visualization of network activity and possible intrusions is achieved using Bi-plots, which are used as a graphical means for summarizing the statistics. The principal components are calculated for both attack and normal traffic, and the loading values of the various feature vector components are analyzed with respect to the principal components. The variance and standard deviation of the principal components are calculated and analyzed. A brief introduction to Principal Component Analysis and the merits of using it for detecting the selected intrusions are discussed. A method for identifying an attack based on these results is proposed. The results obtained using the proposed threshold value for detecting the selected intrusions show that a detection rate of 100% can be achieved using this method.
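To make the PCA-based detection idea concrete, the sketch below fits principal components on normal traffic only and flags records whose reconstruction error from the reduced subspace exceeds a percentile threshold. The threshold criterion, the number of components, and the random data are illustrative assumptions; they are not the specific criterion or the DARPA feature set used in the paper.

```python
# Sketch of the general idea only: fit PCA on normal traffic features, then flag
# records whose reconstruction error exceeds a threshold. The 99th-percentile
# criterion and the synthetic data are illustrative choices.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(5000, 10))          # normal traffic features
attack = rng.normal(0.0, 1.0, size=(200, 10)) + 4.0     # shifted "attack" traffic

pca = PCA(n_components=3).fit(normal)

def recon_error(X):
    """Squared distance between each record and its projection onto the PCs."""
    X_hat = pca.inverse_transform(pca.transform(X))
    return np.sum((X - X_hat) ** 2, axis=1)

threshold = np.percentile(recon_error(normal), 99)       # illustrative criterion
detected = np.mean(recon_error(attack) > threshold)
print(f"fraction of attack records flagged: {detected:.2%}")
```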
Detection of anomalies in data is one of the fundamental machine learning tasks. Anomaly detection provides the core technology for a broad spectrum of security-centric applications. In this dissertation, we examine various aspects of anomaly-based intrusion detection in computer security. First, we present a new approach to learn program behavior for intrusion detection. Text categorization techniques are adopted to convert each process to a vector and calculate the similarity between two program activities. Then the k-Nearest Neighbor classifier is employed to classify program behavior as normal or intrusive. We demonstrate that our approach is able to effectively detect intrusive program behavior while a low false positive rate is achieved. Second, we describe an adaptive anomaly detection framework that is designed to handle concept drift and online learning for dynamic, changing environments. Through the use of unsupervised evolving connectionist systems, normal behavior changes are efficiently accommodated while anomalous activities can still be recognized. We demonstrate the performance of our adaptive anomaly detection systems and show that the false positive rate can be significantly reduced. Third, we study methods to efficiently estimate the generalization performance of an anomaly detector and the training size requirements. An error bound for support vector machine based anomaly detection is introduced. Inverse power-law learning curves, in turn, are used to estimate how the accuracy of the anomaly detector improves when trained with additional samples. Finally, we present a game theoretic methodology for cost-benefit analysis and design of IDS. We use a simple two-person, nonzero-sum game to model the strategic interdependence between an IDS and an attacker. The solutions based on the game theoretic analysis integrate the cost-effectiveness and technical performance tradeoff of the IDS and identify the best defense and attack strategies.
Wavelet-based image coding algorithms, which have become popular in recent years, use a fixed perfect-reconstruction filter bank built into the algorithm for coding and decoding of all multimedia images. Initial results from our experiments, conducted by compressing a variety of such images through wavelet filters and integer wavelet transforms, suggest that the coding performance for both lossy and lossless
This paper demonstrates that neural network (NN) techniques can be used to detect intruders logging onto a computer network when computer users are profiled accurately. Next, the paper compares the performance of five neural network methods in intrusion detection. The NN techniques used are the gradient descent back propagation (BP), the gradient descent BP with momentum, the variable learning-
Numerical methods of solving partial differential equations (PDEs) using analog or hybrid computers fall into three broad categories. Assuming, for concreteness, that one of the independent variables is time and the rest are spatial, the continuous-space and discrete-time (or CSDT) methods aim to keep the space-like variable continuous and discretize the time-like variable. Similarly, the terms discrete-space and continuous time
There is a long-felt need for a general theoretical framework for hydrologic systems research that embeds the dynamical, structural, spatial, and behavioral aspects of modeling with predictive power. Many of these problems can be effectively studied by systems analysis. The process of system modeling for large-scale, nonlinear, time-lag systems can be rationalized by suitably identifying and modeling subsystems. When computers are used as modeling tools, the method of formulating the problem, the choice of computer, and the choice of performance criteria greatly influence the results. This in turn imposes certain limits on the validity of computer-simulated models. These aspects are discussed by studying the nature of hydrologic systems, the nature of the associated inverse problems, and the requirements for validating the models.
Kirchhoff's current law equations for a one-dimensional network of passive resistors yield a coefficient matrix with a tridiagonal structure. If the magnitudes of the 'vertical' resistors in the ladder satisfy the relation in (1), then the entries of the inverse of the resulting Stieltjes matrix can be expressed as Gegenbauer polynomials. Thus, the network could be used
The past few years have witnessed a rise in the application of AI and machine learning techniques to a variety of areas, such as image understanding and autonomous vehicle driving. Wireless and cloud technologies have also made it possible for millions of people to access and use services available via the internet. During the same period, the world has also witnessed a rise in cyber-crime, with criminals continually expanding their methods of attack. Weapons like ransomware and botnets, along with new attack vectors, became popular forms of malware attack. This paper examines the state-of-the-art in computer security and the use of machine learning techniques therein. True, machine learning did make an impact on some narrow application areas such as spam filtering and fraud detection. However – in spite of extensive academic research – it does not seem to have made a visible impact on the problem of intrusion detection in real operational settings. A possible reason for this apparent failure is that computer security is inherently a difficult problem. Difficult because it is not just one problem; it is a group of problems characterized by a diversity of operational settings and a multitude of attack scenarios. This is one reason why machine learning has not yet found its niche in the cyber warfare armory. This paper first summarizes the state-of-the-art in computer security and then examines the process of applying machine learning to solve a sample problem.
Discovery of new knowledge, that is, knowledge that we do not already possess, is the focus of this research. This problem can be formulated as an inverse problem, where the new knowledge can be represented by the parameters of a black box model. The solution can then be viewed as the culmination of a sequence of problem solving steps: search, composition, integration and discovery. A well designed cognitive agent capable of learning, adaptation and optimization can accomplish this task.
An implementation of a network-based Intrusion Detection System using a Self-Organizing Map (SOM) as a clustering tool is described. The system uses a SOM to classify Ethernet data in real time. A graphical tool constantly displays the clustered data to reflect network activities. The impact of using different techniques for data collection, data preprocessing, and classifier design is discussed. The system shows promise in its ability to distinguish regular traffic from the irregular traffic produced by a Denial of Service (DoS) attack on a given host.
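A minimal NumPy Self-Organizing Map, shown below, illustrates the clustering idea behind the system; it is not the real-time Ethernet pipeline itself, and the two synthetic traffic populations and the grid and training parameters are illustrative assumptions.

```python
# Minimal SOM sketch: normal and DoS-like feature vectors should map to
# different regions of the grid after training.
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(0.2, 0.05, size=(500, 4))   # e.g. scaled packet-rate features
dos = rng.normal(0.9, 0.05, size=(50, 4))       # stand-in for DoS flood traffic
data = np.vstack([normal, dos])

grid_w, grid_h, dim = 6, 6, data.shape[1]
weights = rng.random((grid_w, grid_h, dim))
coords = np.stack(np.meshgrid(np.arange(grid_w), np.arange(grid_h),
                              indexing="ij"), axis=-1)

def bmu(x):
    """Best-matching unit: grid cell whose weight vector is closest to x."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)

for t in range(2000):                            # online training
    x = data[rng.integers(len(data))]
    lr = 0.5 * (1 - t / 2000)                    # decaying learning rate
    radius = 3.0 * (1 - t / 2000) + 0.5          # decaying neighbourhood radius
    win = np.array(bmu(x))
    dist2 = np.sum((coords - win) ** 2, axis=-1)
    h = np.exp(-dist2 / (2 * radius ** 2))[..., None]
    weights += lr * h * (x - weights)

print("BMU of a normal sample:", bmu(normal[0]))
print("BMU of a DoS sample:   ", bmu(dos[0]))
```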
Many intrusion detection projects employ a multitude of statistical methods and machine learning techniques to achieve their goal. However, there is a lack of a unified framework for developing, testing, and comparing the results. This study introduces the S Language and its environment as a potential candidate for this unified framework, which can also be used to develop new methods, alter existing ones, or create hybrid solutions. The S Language, originally developed at Bell Laboratories, provides a powerful environment for statistical and graphical analysis of data. This study illustrates the power of S in computer security research. Three clustering algorithms are used to detect selected Denial-of-Service and Network Probe attacks from the 1998 DARPA Intrusion Detection Evaluation data sets. The results of applying agglomerative, hierarchical, and k-means clustering methods to the data sets are presented. Visualization of the clustering results provides a simple, yet powerful, means...
Wireless Mesh Networks hold great promise for the future of network computing. They allow for dynamic reconfiguration in the face of unstable topologies, as well as a decreased reliance on traditional infrastructure models based upon more static networking architectures. This paper conveys how observed nodal characteristics in this chaotic environment can be leveraged to form optimal clustering structures, allowing for more reliable transport and management. We show that by using Reinforcement Learning to observe nodal stability, the average time a particular node is associated with a given cluster is lengthened compared with more traditional approaches such as standard weighted clustering. This result leads to a decrease in the overhead needed to maintain network structure, as well as a 15-20% decrease in the amount of re-clustering for a given deployment.
Today's network control systems have very limited ability to adapt to changing network conditions. The addition of reinforcement learning-based network management agents can improve quality of service by reconfiguring the network layer protocol parameters in response to observed network performance conditions. This paper presents a closed-loop approach to tuning the layer three protocol based upon current and previous network state observations, specifically the Hello Interval and Active Route Timeout parameters of the AODV routing protocol (AODV-Q). Simulation results demonstrate that the proposed self-configuration method improves the performance of the original Ad-Hoc On-Demand Distance Vector (AODV) protocol, reducing protocol overhead by 43% and end-to-end delay by 29% while increasing the packet delivery ratio by up to 11%. Copyright © 2012 John Wiley & Sons, Ltd.
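The toy sketch below illustrates the closed-loop tuning idea in the simplest possible form: an epsilon-greedy learner picks a Hello Interval from a discrete set and is rewarded by an observed performance signal. The simulated reward, the candidate intervals, and the single-state formulation are assumptions made for illustration; they are not the AODV-Q design or its simulation setup.

```python
# Toy closed-loop tuner: a bandit-style Q-learner selects a Hello Interval and
# is rewarded by a made-up "measured" performance signal (a stub, not a network
# simulator).
import numpy as np

rng = np.random.default_rng(0)
hello_intervals = [0.5, 1.0, 2.0, 4.0]      # candidate Hello Interval values (s)

def observe_reward(action):
    """Stub for observed performance: delivery ratio minus an overhead penalty."""
    delivery = 0.95 - 0.1 * abs(hello_intervals[action] - 2.0)
    overhead = 0.05 / hello_intervals[action]
    return delivery - overhead + rng.normal(0, 0.01)

q = np.zeros(len(hello_intervals))          # single-state Q-table
alpha, epsilon = 0.1, 0.1
for step in range(2000):
    a = rng.integers(len(q)) if rng.random() < epsilon else int(np.argmax(q))
    r = observe_reward(a)
    q[a] += alpha * (r - q[a])              # incremental value update

print("learned preference (Hello Interval):", hello_intervals[int(np.argmax(q))])
```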
Multiple-objective optimization problems naturally arise in resource management projects. A chief difficulty with multiple-objective optimization is that it is no longer clear what one means by an optimal solution. A possible remedy to this situation is to refine the concept of ...
... Other constraints addressing perceptual groupings and aesthetic considerations ... shape, size, and placement in a diagram are often used to convey significant software design information. ... At other times, the purpose is better served if the intermodule dependencies are shown with a ...
Summary form only given. When we compress a variety of multimedia images using a fixed wavelet filter, the PSNR values vary widely. Similarly, in lossless image compression using a fixed integer wavelet transform, the bit rates can vary sharply. These large variations can be attributed to the image activity measure (IAM). We define and use a number of IAMs based on image variance, edges, wavelet coefficients, and gradients, and analyse various images to see the effect of such image activity on the coding performance. It is observed that, for both textures and images, a gradient-based activity measure is the most effective in capturing the activity and solely determines the compressibility of an image.
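As one plausible instance of a gradient-based IAM, the sketch below computes the mean gradient magnitude of an image and compares a smooth ramp with a noisy image. The exact formula and the example images are illustrative assumptions, not necessarily the definitions used in the paper.

```python
# Illustrative gradient-based image activity measure: mean gradient magnitude.
import numpy as np

def gradient_iam(img):
    """Mean gradient magnitude of a 2-D grayscale image (float array)."""
    gy, gx = np.gradient(img.astype(float))
    return float(np.mean(np.hypot(gx, gy)))

rng = np.random.default_rng(0)
smooth = np.outer(np.linspace(0, 255, 256), np.ones(256))    # low-activity ramp
busy = rng.integers(0, 256, size=(256, 256))                 # high-activity noise
print("IAM of smooth image:", round(gradient_iam(smooth), 2))
print("IAM of busy image:  ", round(gradient_iam(busy), 2))
```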
This paper focuses on the use of a new image filtering technique, Pulsed Coupled Neural Network factoring, to enhance both the analysis and visual interpretation of noisy sinusoidal time signals, such as those produced by LLNL's Microwave Impulse Radar motion sensor. Separation of a slower carrier wave from faster, finer-detailed signals and from scattered noise is illustrated. The resulting images clearly illustrate the changes over time of simulated heart motion patterns. Such images can potentially assist a field medic in interpreting the extent of combat injuries. These images can also be transmitted, or stored and retrieved for later analysis.
Many optimization techniques work well for unimodal functions. If applied to multimodal functions, they tend to converge to only one of the many peaks. Optimization of multimodal functions becomes even more difficult if the function parameters change dynamically. Genetic algorithms have been successfully applied by several investigators for static optimization of multimodal functions. This modest success is primarily due to the ability of genetic algorithms to locate more than one peak. In this paper we introduce a combination of selection and replacement operators that is suitable for multimodal function optimization in a dynamic environment. The performance of this new operator combination is studied using various test functions, and its utility for multimodal function optimization in a dynamic environment is described.
The effectiveness of a multilayered feedforward neural network for seismic phase identification was investigated. The database consisted of seismograms from 75 earthquakes and 75 underground nuclear explosions. For learning, the conjugate gradient error backpropagation algorithm with a weight-elimination method was used. Results indicate that feedforward neural networks appear to outperform a conventional Bayesian classifier in a problem where the task
Presented are issues in designing smart, believable software agents capable of playing strategy games, with particular emphasis on the design of an agent capable of playing Cyberwar XXI, a complex war game. The architecture of a personality-rich, advice-taking game-playing agent that learns to play is described. The suite of computational-intelligence tools used by the advisers includes evolutionary computation and neural nets.
The recent growth of the Internet has left many users awash in a sea of information. This development has spawned the need for intelligent filtering systems. This paper describes work implemented in the INFOS (Intelligent News Filtering Organizational System) project that is designed to reduce the user's search burden by automatically categorizing data as relevant or irrelevant based upon user interests. These predictions are learned automatically based upon features taken from input articles and collaborative features derived from ...
This book is about enhancing computer security through smart technology. The previous three chapters examined various facets of the security problem. Chapters 5-9 will discuss various methods of introducing the “smarts” into the security domain. This chapter provides some rationale for the need of smart technology and, in a brisk manner, covers some of the relevant ideas from artificial intelligence and machine learning.
We present the design and evaluation of I4, a network infrastructure that enables information exchange and collaboration among different domains. I4 can help with network management in many scenarios, such as eliminating unwanted traffic to improve network performance and diagnosing network problems. We present the Distributed Denial-of-Service (DDoS) attack as an example to demonstrate the advantages of I4. Simulation results show that I4 can significantly reduce the amount of DDoS attack packets and dramatically improve the quality of service received by legitimate users. Our design provides attractive properties, such as incremental deployment and incentives for such deployment.
We investigate topology design in IP backbone networks while exploiting both the advantages of IP routing and optical networking. The emerging WDM and optical switching technologies enable the use of optically switched circuits to interconnect routers, which may not have direct fiber links with each other, in an IP-over-WDM network. All these “light circuits” form the underlying topology of the IP network. In this study, we optimize this WDM-layer topology to minimize the packet-loss rate under dynamic IP traffic. The dimensions of this optimization include finding the right connectivity pattern and allocating limited capacity to each light circuit. Given the fact that the total interface capacity of a router is bounded by its internal packet-switching speed, we found that the problem manifests itself as a trade-off between the interface count and the interface speed. Dense connectivity leads to low average interface speed and utilization, and vice versa for sparse connectivity. Interestingly, low interface utilization alone does not guarantee a low packet-loss rate, since higher-speed pipes carry traffic more efficiently. Using both analytical and experimental methods, we verified that there are two critical loads for any network. When the network load is higher than the high critical load, the best topology is always the clique. When the network load is lower than the low critical load, the best topology is always the ring. When the network load is between these two extremes, which we believe to be the normal operating point of a backbone network, the best topology has a moderate connectivity density.
Traffic grooming is the term used to describe how different traffic streams are packed into higher-speed streams. In a WDM SONET/ring network, each wavelength can carry several lower-rate traffic streams in TDM fashion. The traffic demand, which is an integer multiple of the timeslot capacity, between any two nodes is established on several TDM virtual connections. A virtual connection needs to be added and dropped only at the two end nodes of the connection; as a result, the electronic Add/Drop Multiplexors (ADMs) at intermediate nodes (if there are any) will electronically bypass this timeslot. Instead of having an ADM on every wavelength at every node, it may be possible to have some nodes on some wavelength where no add/drop is needed on any time slot; thus, the total number of ADMs in the network (and hence the network cost) can be reduced. Under a static traffic pattern, the savings can be maximized by carefully packing the virtual connections into wavelengths. In this work, we allow arbitrary (non-uniform) traffic, and we first present a formal mathematical definition of the problem, which turns out to be an integer linear program (ILP). Then, we propose a simulated-annealing-based heuristic algorithm for the case where all the traffic is carried on directly connected virtual connections (referred to as the “single-hop” case). Then, we study the case where a hub node is used to bridge traffic from different wavelengths (referred to as the multihop case). We find the following main results. The simulated-annealing-based approach achieves the best results so far in most cases relative to other comparable approaches proposed in the literature. In general, a multihop approach can achieve better equipment savings when the grooming ratio is large, but it consumes more bandwidth.
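The sketch below shows the flavor of a simulated-annealing search for the single-hop case: each virtual connection is assigned to a wavelength, and moves that reduce the total ADM count are accepted (worse moves with a temperature-dependent probability). Timeslot capacities, ring routing, and the paper's actual cost model and ILP are deliberately omitted, so this is a simplified illustration of the heuristic strategy only.

```python
# Bare-bones simulated annealing: assign connections to wavelengths to reduce
# the ADM count. Capacity and routing constraints are intentionally ignored.
import math
import random

random.seed(0)
n_nodes, n_wavelengths = 8, 4
# random non-uniform traffic: (source, destination) virtual connections
connections = [tuple(random.sample(range(n_nodes), 2)) for _ in range(30)]

def adm_count(assignment):
    """Total ADMs: one per (node, wavelength) pair that terminates traffic."""
    used = set()
    for (s, d), w in zip(connections, assignment):
        used.add((s, w))
        used.add((d, w))
    return len(used)

assign = [random.randrange(n_wavelengths) for _ in connections]
cost, temp = adm_count(assign), 10.0
for step in range(20000):
    i = random.randrange(len(connections))
    old = assign[i]
    assign[i] = random.randrange(n_wavelengths)     # propose a reassignment
    new_cost = adm_count(assign)
    if new_cost <= cost or random.random() < math.exp((cost - new_cost) / temp):
        cost = new_cost                             # accept the move
    else:
        assign[i] = old                             # reject the move
    temp *= 0.9995                                  # geometric cooling

print("ADMs after annealing:", cost)
```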
Automated web-based knowledge acquisition can play a useful role in developing systematic methods for solving inverse problems from sparse and unreliable data sets. As inverse problems are ill-posed, they are normally solved by using some sort of regularization procedure - a mathematical strategy that seeks to supply the "missing data." We seek to fill the missing data entries by a judicious search of the WWW. The next step is to learn the structure and parameters of the unknown system. The task of learning the structure can be accomplished either by an automated evolutionary search or by a user-assisted generate-and-test strategy. In either case, the goal is to learn a Bayesian network structure by looking at various possible node orderings and interconnection topologies. The parameters to be estimated are conditional probabilities associated with the causal relationships represented by the Bayesian net. These conditional probabilities are deduced using the data sets mined from the WWW in conjunction with the data available on hand. Using heart disease data sets available at the UC Irvine Machine Learning Repository, this procedure is tested and some preliminary results are presented.
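The sketch below illustrates only the final parameter-estimation step: after the missing entries have been restored (a simple mode-fill stub stands in for the web-mining step), conditional probabilities for an assumed two-node structure are estimated by counting. The attribute names, the tiny data set, and the structure are hypothetical; they are not the UCI heart-disease schema or the network learned in the paper.

```python
# Sketch of the parameter-estimation step for an assumed, hypothetical structure
# smoking -> heart_disease; the fill_from_web() stub stands in for the
# web-mining restoration described above.
import pandas as pd

records = pd.DataFrame({
    "smoking":       [1, 0, 1, None, 0, 1, 0, 1],
    "heart_disease": [1, 0, 1, 1,    0, 0, 0, 1],
})

def fill_from_web(df):
    """Stub for the web-mining restoration step: here we just fill with the mode."""
    return df.fillna(df.mode().iloc[0])

data = fill_from_web(records)

# Conditional probability table for P(heart_disease | smoking), by counting.
cpt = (data.groupby("smoking")["heart_disease"]
           .value_counts(normalize=True)
           .rename("probability"))
print(cpt)
```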
Machine learning is the science of building predictors from data randomly sampled from an assumed probability distribution while accounting for the computational complexity of the learning algorithm and the predictor's performance on future data. Much of the work in machine learning is empirical and prone to errors, because what the machine learns depends on the adequacy and trustworthiness of the training data. Automated web-based knowledge acquisition can play a useful role in developing systematic methods for solving a class of machine learning problems in which there is insufficient or untrustworthy data. Such problems can be solved by using a regularization procedure, a mathematical strategy that seeks to supply the "missing data." There are several ways of regularizing a problem. Statistical methods, for example, can fill in a few data items. But these methods rely on using the available, and possibly unreliable, data to calculate the missing values. Besides, they perform poorly if the percentage of missing values exceeds a threshold. An alternative is to fill in the missing data by an automated knowledge discovery process via mining the WWW. This novel procedure is applied by first restoring missing information and next learning the structure and parameters of the unknown system from the restored data. Using a Bayesian network as a possible model for the unknown system, the parameters, i.e., the probabilities associated with the causal relationships in the network, are deduced using the knowledge mined from the WWW in conjunction with the data available on hand. The method, when tested against heart disease data sets from the UC Irvine Machine Learning Repository [UCI], gave satisfactory results. Preliminary results of our approach, using the Naive Bayes as the system model, are then compared with the performance of the EM algorithm, a well-understood statistical method. Work is currently in progress to assess the performance of this method in data-poor domains.
The multi-niche crowding genetic algorithm (MNC GA) has demonstrated its ability to maintain population diversity and stable subpopulations while allowing different species to evolve naturally in different niches of the fitness landscape. These properties are a consequence, in part, of the crowding selection and worst-among-most-similar replacement genetic operators. In this paper we take a closer look at these genetic operators and present mathematical results that show their effect on the population when used in the MNC GA. We also present some guidelines about the parameter values to use in these genetic operators to achieve the desired niching pressure during a run. We conclude with a list of unexplored avenues that might be helpful in a future analysis of the behavior of the MNC GA.
Keywords: Genetic algorithms, multimodal functions, niching, speciation.
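The sketch below is a simplified, real-coded illustration of the two operators discussed above, crowding selection and worst-among-most-similar replacement, applied to a one-dimensional multimodal fitness function. The group sizes, population size, and mutation scheme are illustrative choices, not the parameter guidelines derived in the paper.

```python
# Simplified MNC-style operators on a 1-D multimodal landscape.
import numpy as np

rng = np.random.default_rng(0)
fitness = lambda x: np.sin(5 * np.pi * x) ** 2           # peaks at x = 0.1, 0.3, ...
pop = rng.random(50)                                      # 1-D real-coded population

CS_GROUP, WAMS_FACTIONS, FACTION_SIZE = 4, 3, 4           # illustrative sizes

def crowding_selection(idx):
    """Mate = most similar member of a small random group."""
    group = rng.choice(len(pop), CS_GROUP, replace=False)
    return group[np.argmin(np.abs(pop[group] - pop[idx]))]

def wams_replacement(child):
    """Replace the worst among the 'most similar' members of several factions."""
    candidates = []
    for _ in range(WAMS_FACTIONS):
        faction = rng.choice(len(pop), FACTION_SIZE, replace=False)
        candidates.append(faction[np.argmin(np.abs(pop[faction] - child))])
    victim = min(candidates, key=lambda i: fitness(pop[i]))
    pop[victim] = child

for _ in range(5000):
    parent = rng.integers(len(pop))
    mate = crowding_selection(parent)
    child = np.clip((pop[parent] + pop[mate]) / 2 + rng.normal(0, 0.02), 0, 1)
    wams_replacement(child)

# Individuals should now cluster around several peaks rather than a single one.
print(np.round(np.sort(pop), 2))
```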
The determination of the sequence of all nucleotide base-pairs in a DNA molecule, from restriction-fragment data, is a complex task and can be posed as the problem of finding the optima of a multi-modal function. A genetic algorithm that uses multi-niche crowding permits us to do this. Performance of this algorithm is first tested using a standard suite of test functions. The algorithm is next tested using two data sets obtained from the Human Genome Project at the Lawrence Livermore National Laboratory. The new method holds promise in automating the sequencing computations.
Key words: Genetic algorithms, multi-modal functions, DNA restriction fragment assembly, human genome project;
Application of genetic algorithms to problems where the fitness landscape changes dynamically is a challenging problem. Genetic algorithms for such environments must maintain a diverse population that can adapt to the changing landscape and locate better solutions dynamically. A niching genetic algorithm suitable for locating multiple solutions in a multimodal landscape is applied. The results show the suitability of such an approach for locating and maintaining solutions in a dynamic landscape.
A new genetic algorithm based on multi-niche crowding is capable of efficiently locating all the peaks of a multi-modal function. By associating these peaks with the utility accrued from different sets of decision variables, it is possible to extend the use of genetic algorithms to multi-criteria decision-making problems. This concept is applied to address the problems arising in the context of remediation of a contaminated aquifer. The multi-niche crowding genetic algorithm is used to decide the optimal locations of pumping wells. The aquifer dynamics are simulated by repeatedly solving the partial differential equations describing the flow of water using the SUTRA code. The output of this simulation constitutes the input to the genetic algorithm.
Key words: Genetic algorithms, multi-modal functions, multi-objective optimization, aquifer remediation, ground water management.
A new approach, based on the k-Nearest Neighbor (kNN) classifier, is used to classify program behavior as normal or intrusive. Short sequences of system calls have been used by others to characterize a program’s normal behavior before. However, separate databases of short system call sequences have to be built for different programs, and learning program profiles involves time-consuming training and testing processes. With the kNN classifier, the frequencies of system calls are used to describe the program behavior. Text categorization techniques are adopted to convert each process to a vector and calculate the similarity between two program activities. Since there is no need to learn individual program profiles separately, the calculation involved is largely reduced. Preliminary experiments with 1998 DARPA BSM audit data show that the kNN classifier can effectively detect intrusive attacks and achieve a low false positive rate.
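A toy version of the classification step is sketched below: each process's system-call trace is converted to a call-frequency vector, and a new trace is labeled by cosine-similarity kNN against a handful of labeled processes. The invented traces and raw-count weighting are stand-ins for the DARPA BSM audit data and the text-categorization weighting used in the paper.

```python
# Toy kNN classification of system-call frequency vectors (system calls are
# treated as "words" of a document describing the process).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import KNeighborsClassifier

train_traces = [
    "open read read write close",            # normal processes
    "open read write close",
    "open read mmap read close",
    "execve chmod setuid execve execve",      # intrusive processes
    "execve setuid chown execve",
]
labels = ["normal", "normal", "normal", "intrusion", "intrusion"]

vec = CountVectorizer()                        # call frequencies per process
X = vec.fit_transform(train_traces)

knn = KNeighborsClassifier(n_neighbors=3, metric="cosine")
knn.fit(X, labels)

new_trace = vec.transform(["open read write read close"])
print("predicted label:", knn.predict(new_trace)[0])
```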
Intrusion detection complements prevention mechanisms, such as firewalls, cryptography, and authentication, to capture intrusions into an information system while they are acting on the information system. This study presents an analysis of a method proposed for anomaly detection. The method uses a multivariate statistical method called Principal Component Analysis to detect selected Denial-of-Service and Network Probe attacks using the 1998 DARPA Intrusion Detection data set. The Principal Components are calculated for both attack and normal traffic, and the loading values of the various feature vector components are analyzed with respect to the Principal Components. The variance and standard deviation of the Principal Components are calculated and analyzed. A brief introduction to Principal Component Analysis and the merits of using it for detecting the selected intrusions are discussed. A method for identifying an attack based on the Principal Component Analysis results is proposed. The results obtained using a proposed criterion for detecting the selected intrusions show that a detection rate of 100% can be achieved using this method. Bi-plots are used as a graphical means for summarizing the statistics collected from the analyzed data.
Several clustering methods have been developed for clustering network traffic in order to detect traffic anomalies and possible intrusions. Many of these methods are typically implemented in software. As a result, they suffer performance limitations while processing real-time traffic. This study presents a hardware implementation of the k-means clustering algorithm that is used to cluster network traffic. The implementation uses the Verilog hardware description language to build a circuit that reads packet information from system memory and produces the output cluster assignments in a 32-bit register. After reset is applied, the circuit uses a state machine that represents the k-means algorithm to process IP packets for a fixed number of iterations and then generates an interrupt to indicate that it has finished processing the data. The implementation is synthesized into a Field Programmable Gate Array in order to study the number of gates required for the implementation. The maximum achievable clock frequency without applying timing constraints is 40 MHz. To compare the performance of this implementation with a software-based implementation, a C version of the k-means algorithm is compiled, run, and profiled with parameters similar to those of the hardware-based implementation. The results show that the hardware-based implementation is approximately 300 times faster than a software-based implementation.
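For reference, the sketch below is a NumPy analogue of the fixed-iteration k-means procedure the circuit realizes: a fixed number of assignment/update passes over synthetic packet features, with two clusters standing for normal and anomalous traffic. It is neither the Verilog source nor the profiled C version from the study.

```python
# Software analogue of the fixed-iteration k-means the circuit implements.
import numpy as np

rng = np.random.default_rng(0)
packets = np.vstack([rng.normal(10, 2, size=(500, 3)),     # normal traffic
                     rng.normal(80, 5, size=(40, 3))])     # anomalous burst

K, ITERATIONS = 2, 10                                        # fixed, as in hardware
centroids = packets[rng.choice(len(packets), K, replace=False)]

for _ in range(ITERATIONS):
    # Assignment step: nearest centroid per packet.
    d = np.linalg.norm(packets[:, None, :] - centroids[None, :, :], axis=2)
    assign = np.argmin(d, axis=1)
    # Update step: recompute centroids (keep the old one if a cluster empties).
    for k in range(K):
        if np.any(assign == k):
            centroids[k] = packets[assign == k].mean(axis=0)

print("cluster sizes:", np.bincount(assign, minlength=K))
print("centroids:\n", np.round(centroids, 1))
```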
This paper presents results obtained by using a method of profiling a user based on the login host, the login time, the command set, and the command set execution time of the profiled user. It is assumed that the user is logging onto a UNIX host on a computer network.
The paper concentrates on two areas: short-term and long-term profiling. In short-term profiling, the focus is on profiling the user within a given session, where user characteristics do not change much. In long-term profiling, the observation extends over a much longer period of time. The latter is more challenging because of a phenomenon called concept or profile drift. Profile drift occurs when a user logs onto a host over an extended period of time (several sessions), causing his profile to change.