2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2016
This paper presents a software tool developed to assist physicians during an examination process. The tool consists of a number of modules that aim to make the examination process not only quicker but also fault-proof, moving from a simple electronic medical records management system towards an intelligent assistant for the physician. The intelligent component exploits users' inputs as well as well-established standards to line up possible suggestions for filling in the examination report. As the physician continues using it, the tool keeps extracting new knowledge. The architecture of the tool is presented in brief, while the intelligent component, which builds upon the notion of multilabel learning, is presented in more detail. Our preliminary results from a real test case indicate that the intelligent module can reach quite high performance without a large amount of data.
In recent years Internet miscreants have been leveraging the DNS to build malicious network infrastructures for malware command and control. In this paper we propose a novel detection system called Kopis for detecting malware-related domain names. Kopis passively moni- ...
Proceedings of the 21st Usenix Conference on Security Symposium, Aug 8, 2012
Many botnet detection systems employ a blacklist of known command and control (C&C) domains to detect bots and block their traffic. Similar to signature-based virus detection, such a botnet detection approach is static because the blacklist is updated only after running an external (and often manual) process of domain discovery. As a response, botmasters have begun employing domain generation algorithms (DGAs) to dynamically produce a large number of random domain names and select a small subset for actual C&C use. That is, a C&C domain is randomly generated and used for a very short period of time, thus rendering detection approaches that rely on static domain lists ineffective. Naturally, if we know how a domain generation algorithm works, we can generate the domains ahead of time and still identify and block botnet C&C traffic. The existing solutions are largely based on reverse engineering of the bot malware executables, which is not always feasible. In this paper we present a new technique to detect randomly generated domains without reversing. Our insight is that most of the DGA-generated (random) domains that a bot queries would result in Non-Existent Domain (NXDomain) responses, and that bots from the same botnet (with the same DGA algorithm) would generate similar NXDomain traffic. Our approach uses a combination of clustering and classification algorithms. The clustering algorithm clusters domains based on the similarity in the make-ups of domain names as well as the groups of machines that queried these domains. The classification algorithm is used to assign the generated clusters to models of known DGAs. If a cluster cannot be assigned to a known model, then a new model is produced, indicating a new DGA variant or family. We implemented a prototype system and evaluated it on real-world DNS traffic obtained from large ISPs in North America. We report the discovery of twelve DGAs. Half of them are variants of known (botnet) DGAs, and the other half are brand new DGAs that have never been reported before.
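As a rough illustration of the two similarity signals the abstract above mentions (the make-up of a domain name and the set of machines that queried it), the following is a minimal sketch only. Character entropy and Jaccard overlap are stand-ins chosen for illustration; the abstract does not say which features or similarity measures the authors use, and the sample domains, host addresses, function names, and the `min_overlap` threshold are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def char_entropy(label):
    """Shannon entropy (bits per character) of a domain label.

    Illustrative stand-in for a "name make-up" feature, not the paper's feature set.
    """
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def jaccard(hosts_a, hosts_b):
    """Overlap between two sets of querying hosts, in [0, 1]."""
    union = hosts_a | hosts_b
    return len(hosts_a & hosts_b) / len(union) if union else 0.0


def related_pairs(queries, min_overlap=0.5):
    """Return domain pairs whose querying-host sets overlap strongly."""
    pairs = []
    for a, b in combinations(sorted(queries), 2):
        if jaccard(queries[a], queries[b]) >= min_overlap:
            pairs.append((a, b))
    return pairs


# Hypothetical NXDomain observations: domain -> hosts that queried it.
nx_queries = {
    "xkqzjvhw.example": {"10.0.0.1", "10.0.0.2"},
    "qpwlzmxn.example": {"10.0.0.1", "10.0.0.2"},
    "printer.example": {"10.0.0.9"},
}

if __name__ == "__main__":
    for domain in sorted(nx_queries):
        print(domain, round(char_entropy(domain.split(".")[0]), 3))
    print(related_pairs(nx_queries))
```

In this toy data the two random-looking names are queried by the same pair of hosts, so they are the only pair reported; a real system would feed such signals into a proper clustering step rather than a fixed threshold.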
We focus on solving the problem of learning an optimal smoothing kernel for the unsupervised learning problem of kernel density estimation (KDE) by using hyper-kernels. The optimal kernel is the one which minimizes the regularized negative leave-one-out log-likelihood score of the training set. We demonstrate that "fixed bandwidth" and "variable bandwidth" KDE are special cases of our algorithm.
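To make the leave-one-out score concrete, here is a minimal sketch for the simplest special case named in the abstract above: a fixed-bandwidth Gaussian KDE in one dimension. It omits the regularization term and the hyper-kernel machinery entirely, and the function names and the grid search over candidate bandwidths are assumptions made for illustration, not the authors' algorithm.

```python
import math


def loo_neg_log_likelihood(points, bandwidth):
    """Negative leave-one-out log-likelihood of a 1-D Gaussian KDE.

    Each point is scored by the density estimated from all other points.
    No regularization term is included (a simplification for this sketch).
    """
    n = len(points)
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, xi in enumerate(points):
        density = sum(
            norm * math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2)
            for j, xj in enumerate(points)
            if j != i
        ) / (n - 1)
        if density == 0.0:
            # The kernel underflowed: this bandwidth explains the point infinitely badly.
            return math.inf
        total -= math.log(density)
    return total


def select_bandwidth(points, candidates):
    """Pick the candidate bandwidth with the lowest leave-one-out score."""
    return min(candidates, key=lambda h: loo_neg_log_likelihood(points, h))


if __name__ == "__main__":
    sample = [0.0, 0.1, 0.2, 1.0, 1.1]
    grid = [0.01, 0.1, 0.3, 1.0, 5.0]
    print(select_bandwidth(sample, grid))
```

Leaving each point out of its own density estimate is what keeps the score from rewarding an arbitrarily small bandwidth, which would otherwise place a spike on every training point.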
In this paper we compute the Fourier spectrum of the Fractal Interpolation Functions (FIFs) as introduced by Michael Barnsley. We show that there is an analytical way to compute them. We also attempt to solve the inverse problem of FIFs by using the spectrum.
2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop, 2009
LEARNING THE INTRINSIC DIMENSIONS OF THE TIMIT SPEECH DATABASE WITH MAXIMUM VARIANCE UNFOLDING. Nikolaos Vasiloglou, David V. Anderson, Alexander G. Gray, Georgia Institute of Technology ...
Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004., 2004
With rapidly increasing storage and computational capacity, a common PC can store and index hundreds of hours of speech. This suggests that new approaches based on database techniques might be useful in speech recognition and speech indexing. This paper ...
2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, 2006
Manifold learning turns out to be a very useful tool for many applications of machine learning, such as classification. Unfortunately the existing algorithms use ad hoc selection of the parameters that define the geometry of the manifold. The parameter choice affects ...
2008 IEEE Workshop on Machine Learning for Signal Processing, 2008
Maximum Variance Unfolding (MVU) is among the state-of-the-art Manifold Learning (ML) algorithms and has been experimentally proven to be the best method to unfold a manifold to its intrinsic dimension. Unfortunately it doesn't scale for more than a few hundred points. A non ...
Papers by Nik Vasiloglou