Nik Vasiloglou

Papers by Nik Vasiloglou

FASTlib Design and Development Manual

Practical Attacks Against Graph-based Clustering

Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security

Graph modeling allows numerous security problems to be tackled in a general way; however, little work has been done to understand their ability to withstand adversarial attacks. We design and evaluate two novel graph attacks against a state-of-the-art network-level, graph-based detection system. Our work highlights areas in adversarial machine learning that have not yet been addressed, specifically: graph-based clustering techniques, and a global feature space where realistic attackers without perfect knowledge must be accounted for (by the defenders) in order to be practical. Even though less informed attackers can evade graph clustering with low cost, we show that some practical defenses are possible.

Product collection recommendation in online retail

Proceedings of the 13th ACM Conference on Recommender Systems

From the lab to production: A case study of session-based recommendations in the home-improvement domain

Fourteenth ACM Conference on Recommender Systems

ERBlox: Combining matching dependencies with machine learning for entity resolution

International Journal of Approximate Reasoning

Entity resolution (ER), an important and common data cleaning problem, is about detecting duplicate data representations of the same external entities and merging them into single representations. Relatively recently, declarative rules called matching dependencies (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this work we show the process and the benefits of integrating four components of ER: (a) building a classifier for duplicate/non-duplicate record pairs using machine learning (ML) techniques; (b) use of MDs for supporting the blocking phase of ML; (c) record merging on the basis of the classifier results; and (d) the use of the declarative language LogiQL (an extended form of Datalog supported by the LogicBlox platform) for all activities related to data processing, and the specification and enforcement of MDs.

An intelligent assistant for physicians

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2016

This paper presents a software tool developed for assisting physicians during an examination process. The tool consists of a number of modules that aim to make the examination process not only quicker but also fault proof, moving from a simple electronic medical records management system towards an intelligent assistant for the physician. The intelligent component exploits users' inputs as well as well-established standards to line up possible suggestions for filling in the examination report. As the physician continues using it, the tool keeps extracting new knowledge. The architecture of the tool is presented in brief, while the intelligent component, which builds upon the notion of multilabel learning, is presented in more detail. Our preliminary results from a real test case indicate that the intelligent module can reach quite high performance without a large amount of data.

Detecting malware domains at the upper DNS hierarchy

In recent years Internet miscreants have been leveraging the DNS to build malicious network infra...

Systems and Methods for Identifying Sets of Similar Products

From throw-away traffic to bots: detecting the rise of DGA-based malware

Proceedings of the 21st Usenix Conference on Security Symposium, Aug 8, 2012

Many botnet detection systems employ a blacklist of known command and control (C&C) domains to detect bots and block their traffic. Similar to signature-based virus detection, such a botnet detection approach is static because the blacklist is updated only after running an external (and often manual) process of domain discovery. As a response, botmasters have begun employing domain generation algorithms (DGAs) to dynamically produce a large number of random domain names and select a small subset for actual C&C use. That is, a C&C domain is randomly generated and used for a very short period of time, thus rendering detection approaches that rely on static domain lists ineffective. Naturally, if we know how a domain generation algorithm works, we can generate the domains ahead of time and still identify and block botnet C&C traffic. The existing solutions are largely based on reverse engineering of the bot malware executables, which is not always feasible. In this paper we present a new technique to detect randomly generated domains without reversing. Our insight is that most of the DGA-generated (random) domains that a bot queries would result in Non-Existent Domain (NXDomain) responses, and that bots from the same botnet (with the same DGA algorithm) would generate similar NXDomain traffic. Our approach uses a combination of clustering and classification algorithms. The clustering algorithm clusters domains based on the similarity in the make-ups of domain names as well as the groups of machines that queried these domains. The classification algorithm is used to assign the generated clusters to models of known DGAs. If a cluster cannot be assigned to a known model, then a new model is produced, indicating a new DGA variant or family.
We implemented a prototype system and evaluated it on real-world DNS traffic obtained from large ISPs in North America. We report the discovery of twelve DGAs. Half of them are variants of known (botnet) DGAs, and the other half are brand new DGAs that have never been reported before.
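The abstract above mentions grouping NXDomain names by the sets of machines that queried them. The following minimal sketch illustrates only that one idea, under loud assumptions: the Jaccard similarity measure, the 0.5 threshold, the single-link grouping, the function names, and the toy host and domain names are all hypothetical and are not taken from the paper, which also uses domain-name features and a classification stage not shown here.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of querying hosts."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_hosts(queries, threshold=0.5):
    """Single-link grouping of domains whose querying-host sets overlap.

    `queries` maps a domain name to the set of hosts that queried it.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    domains = list(queries)
    parent = {d: d for d in domains}

    def find(d):
        # Union-find lookup with path halving.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for a, b in combinations(domains, 2):
        if jaccard(queries[a], queries[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for d in domains:
        clusters.setdefault(find(d), set()).add(d)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical NXDomain observations: domain -> hosts that queried it.
nx_queries = {
    "qx7vz1k.example": {"host-a", "host-b", "host-c"},
    "p0lmw93.example": {"host-a", "host-b", "host-c"},
    "zz8ry4t.example": {"host-a", "host-b"},
    "typo-site.example": {"host-d"},
}
clusters = cluster_by_hosts(nx_queries)
print(clusters)
```

In this toy input the three random-looking names share most of their querying hosts and end up in one group, while the name queried by a single unrelated host stays alone.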

Method and system for detecting malicious domain names at an upper DNS hierarchy

Method and System for Detecting DGA-Based Malware

ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

Lecture Notes in Computer Science, 2015

Hyperkernel based density estimation

We focus on solving the problem of learning an optimal smoothing kernel for the unsupervised learning problem of kernel density estimation (KDE) by using hyper-kernels. The optimal kernel is the one which minimizes the regularized negative leave-one-out log-likelihood score of the training set. We demonstrate that "fixed bandwidth" and "variable bandwidth" KDE are special cases of our algorithm.
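To illustrate the criterion named in this abstract, here is a minimal sketch of the fixed-bandwidth special case only: it scores a one-dimensional Gaussian KDE by its negative leave-one-out log-likelihood and picks the best bandwidth from a grid. This is not the hyper-kernel method of the paper and it omits the regularization term; the function names, the candidate grid, and the synthetic sample are assumptions made for the example.

```python
import math
import random


def loo_neg_log_likelihood(xs, h):
    """Negative leave-one-out log-likelihood of a 1-D Gaussian KDE.

    Each point is scored under the density built from all other points,
    using a single fixed bandwidth h.
    """
    n = len(xs)
    norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, xi in enumerate(xs):
        dens = sum(
            norm * math.exp(-0.5 * ((xi - xj) / h) ** 2)
            for j, xj in enumerate(xs)
            if j != i
        ) / (n - 1)
        # Floor the density so that isolated points do not produce log(0).
        total -= math.log(max(dens, 1e-300))
    return total


def select_bandwidth(xs, candidates):
    """Return the candidate bandwidth with the lowest leave-one-out score."""
    return min(candidates, key=lambda h: loo_neg_log_likelihood(xs, h))


# Synthetic data and a log-spaced grid from 0.01 to 10 (both assumptions).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
grid = [0.01 * (1000 ** (k / 29)) for k in range(30)]
best_h = select_bandwidth(sample, grid)
print(f"selected bandwidth: {best_h:.3f}")
```

Very small bandwidths are penalized because left-out points receive almost no density, and very large ones because the estimate becomes too flat, so the selected value lies in the interior of the grid.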

Spectrum of Fractal Interpolation Functions

In this paper we compute the Fourier spectrum of the Fractal Interpolation Functions (FIFs) as intr...

Learning the Intrinsic Dimensions of the Timit Speech Database with Maximum Variance Unfolding

2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop, 2009

Isolated word, speaker dependent recognition under the presence of noise, based on an audio retrieval algorithm

Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004

With rapidly increasing storage and computational capacity, a common PC can store and i...

Non-Negative Matrix Factorization, Convexity and Isometry

Proceedings of the 2009 SIAM International Conference on Data Mining, 2009

In this paper we explore avenues for improving the reliability of dimensionality reduction methods such as Non-Negative Matrix Factorization (NMF) as interpretive exploratory data analysis tools. We first explore the difficulties of the optimization problem underlying NMF, showing for the first time that non-trivial NMF solutions always exist and that the optimization problem is actually convex, by using the theory of Completely Positive Factorization. We subsequently explore four novel approaches to finding globally optimal NMF solutions using various ideas from convex optimization. We then develop a new method, isometric NMF (isoNMF), which preserves non-negativity while also providing an isometric embedding, simultaneously achieving two properties which are helpful for interpretation. Though it results in a more difficult optimization problem, we show experimentally that the resulting method is scalable and even achieves more compact spectra than standard NMF.

Learning Isometric Separation Maps

2009 IEEE International Workshop on Machine Learning for Signal Processing, 2009

Maximum Variance Unfolding (MVU) and its variants have been very successful in embedding data manifolds in lower-dimensional spaces, often revealing the true intrinsic dimension. In this paper we show how to also incorporate supervised class information into an MVU-like method without breaking its convexity. We call this method the Isometric Separation Map and we show that the resulting kernel matrix can be used as a binary/multiclass Support Vector Machine-like method in a semi-supervised (transductive) framework. We also show that the method always finds a kernel matrix that linearly separates the training data exactly without projecting them into infinite-dimensional spaces. In traditional SVMs we choose a kernel and hope that the data become linearly separable in the kernel space. In this paper we show how the hyperplane can be chosen ad hoc and the kernel is trained so that data are always linearly separable. Comparisons with Large Margin SVMs show comparable performance.

Parameter Estimation for Manifold Learning, Through Density Estimation

2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, 2006

Manifold learning turns out to be a very useful tool for many applications of machine le...

Scalable semidefinite manifold learning

2008 IEEE Workshop on Machine Learning for Signal Processing, 2008

Maximum Variance Unfolding (MVU) is among the state-of-the-art Manifold Learning (ML) algorithms ...
