ABSTRACT Many academic disciplines have general theories, which apply across the discipline and underlie much of its research. Examples include the Big Bang theory (cosmology), Maxwell's equations (electrodynamics), the theories of the cell and evolution (biology), ...
Across 4. Report Program Generator 6. For beginners... 9. Larry Wall’s Practical extraction and report language. 11. Milner's Language 14. Aspect-oriented extension to Java. 16. OO, dynamically typed, reflective programming language from Xerox PARC 21. A language for processing text-based data, by Aho et al. 24. A functional language by Turner 25. Algorithmic language 26. Liskov’s clusters. 27. Reflective OO language by Yukihiro Matsumoto 28. Information Processing Language 29. More than a blend of coffee... 31. Personal Home Page 33. Original name of PL/I 35. Safer C variant. 38. In 1957, first language for string handling 39. Restructured extended executor 40. Combined programming language 41. Original name for ALGOL-58 42. Common business oriented language 44. A popular object-oriented Pascal dialect (and its IDE).
Review of “Working effectively with legacy code by Michael Feathers”, Prentice Hall PTR, 2004, $44.99 ISBN: 0131177052
Review of “Why Programs Fail: A Guide to Systematic Debugging by Andreas Zeller”, Morgan Kaufmann, 2005, $54.95, ISBN: 1558608664
With brief examples based on the authors’ real-life experiences, the reader sees how “good software practice” in its traditional and literal sense can often lead to something between disaster and frustration. In particular, the authors argue that specifying a software system in its entirety before developing any code leaves no flexibility for changes down the road, when more is known about the problem.
We present JSimil, a code clone detector that uses a novel algorithm to detect similarities in sets of Java programs at the bytecode level. The proposed technique emphasizes scalability and efficiency. It also supports customization through profiles that allow the user to specify matching rules, system behavior, pruning thresholds, and output details. Experimental results reveal that JSimil outperforms existing systems. It is even able to spot similarities when complex code obfuscation techniques have been applied.
Community detection is a fundamental problem in the analysis of complex networks. It is the analogue of clustering in network data mining. Within community detection methods, hierarchical algorithms are popular. However, their iterative nature and the need to recompute the structural properties used to split the network (e.g., edge betweenness in Girvan and Newman's algorithm) make them unsuitable for large network data sets. In this paper, we study how local structural network properties can be used as proxies to improve the efficiency of hierarchical community detection while, at the same time, achieving competitive results in terms of modularity. In particular, we study the potential use of the structural properties commonly used to perform local link prediction, a supervised learning problem where community structure is relevant, as nodes are prone to establish new links with other nodes within their communities. In addition, we check the performance impact of network prunin...
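As background for the modularity criterion mentioned in this abstract, the following is a minimal sketch of the standard Newman-Girvan modularity of a given partition. It is not the paper's algorithm; the function name `modularity` and the example graph are illustrative assumptions, and the graph is assumed to be undirected and unweighted, with no self-loops or duplicate edges.

```python
def modularity(edges, communities):
    """Newman-Girvan modularity of a partition (illustrative helper, not from the paper).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    communities: list of node collections that together cover every node once.
    """
    m = len(edges)
    community_of = {node: i for i, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)  # edges with both endpoints in community i
    degree = [0] * len(communities)    # sum of node degrees in community i
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (fraction of internal edges - expected fraction)
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree))


# Two triangles joined by a single bridge edge (hypothetical example graph).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
q_whole = modularity(edges, [{0, 1, 2, 3, 4, 5}])
```

Splitting the graph at the bridge yields a higher value than keeping all nodes in one community, which is the kind of comparison a modularity-based evaluation relies on.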
Network data mining has become an important area of study due to the large number of problems it can be applied to. This paper presents NOESIS, an open source framework for network data mining that provides a large collection of network analysis techniques, including the analysis of network structural properties, community detection methods, link scoring, and link prediction, as well as network visualization algorithms. It also features a complete stand-alone graphical user interface that facilitates the use of all these techniques. The NOESIS framework has been designed using solid object-oriented design principles and structured parallel programming. As a lightweight library with minimal external dependencies and a permissive software license, NOESIS can be incorporated into other software projects. Released under a BSD license, it is available from this http URL
The machine forms bales of circular cross section by continuously rolling the bale upon a supporting surface within a formation chamber while additional material is supplied to the chamber. The chamber is partially defined by two separate, cooperating sets of flexible belts, one set having an upwardly moving stretch at the rear of the chamber and the other having a forwardly moving stretch defining the top of the chamber such that material entering the chamber at the beginning of the baling cycle is lifted upwardly by the rear stretch and rolled forwardly by the top stretch. The top and rear stretches converge to an upper rear corner of the chamber spaced above the supporting surface for the rolling bale, and such corner may be adjustably shifted vertically and/or horizontally in a fore-and-aft direction as may be necessary or desirable to facilitate bale starting under differing crop conditions. Alternative belt arrangements are disclosed for obtaining the desired adjustability of ...
Determining the quality of the results obtained by clustering techniques is a key issue in unsupervised machine learning. Many authors have discussed the desirable features of good clustering algorithms. However, Jon Kleinberg established an impossibility theorem for clustering. As a consequence, a wealth of studies have proposed techniques to evaluate the quality of clustering results depending on the characteristics of the clustering problem and the algorithmic technique employed to cluster data.
This paper describes EPROP, a novel technique requiring little prior knowledge for word sense disambiguation of semantic relations between pairs of ambiguous concepts in knowledge bases. Our method makes inferences by aggregating evidence from ambiguous word interpretations and propagating the acquired knowledge over a taxonomy to generalize or specialize this knowledge. This propagation process allows the estimation of the degree of belief for each possible word sense assignment given the available evidence. EPROP only requires a sense inventory structured as a taxonomy to disambiguate a knowledge base by combining evidence from the ambiguous facts stored in the knowledge base. Our experiments show that our method achieves good results on the disambiguation of the semantic relations included in WordNet and ConceptNet. We also show how our method can be used to improve the performance of state-of-the-art word sense disambiguation methods.
Role is a fundamental concept in the analysis of the behavior and function of interacting entities in complex networks. Role discovery is the task of uncovering the hidden roles of nodes within a network. Node roles are commonly defined in terms of equivalence classes. Two nodes have the same role if they fall within the same equivalence class. Automorphic equivalence, where two nodes are equivalent when they can swap their labels to form an isomorphic graph, captures this notion of role. The binary concept of equivalence is too restrictive, and nodes in real-world networks rarely belong to the same equivalence class. Instead, a relaxed definition in terms of similarity or distance is commonly used to compute the degree to which two nodes are equivalent. In this paper, we propose a novel distance metric called automorphic distance, which measures how far two nodes are from being automorphically equivalent. We also study its application to node embedding, showing how our metric can b...
Network data mining has attracted a lot of attention since a large number of real-world problems have to deal with complex network data. In this paper, we present NOESIS, an open-source framework for network-based data mining. NOESIS features a large number of techniques and methods for the analysis of structural network properties, network visualization, community detection, link scoring, and link prediction. The proposed framework has been designed following solid design principles and exploits parallel computing using structured parallel programming. NOESIS also provides a stand-alone graphical user interface that allows users without prior programming experience to use advanced software analysis techniques. This framework is available under a BSD open-source software license.
Syntax-directed translation tools require the specification of a language by means of a formal grammar. This grammar must also conform to the specific requirements of the parser generator to be used. Software engineers then annotate the resulting grammar with semantic actions for the resulting system to perform its desired functionality. Whenever the input text format is modified, the grammar has to be updated and the subsequent changes propagate throughout the entire language processing tool chain. Moreover, if several applications use the same language, multiple copies of the same language specification have to be maintained in sync, since language specification (i.e. the grammar) is tightly coupled to language processing (i.e. the semantic actions that annotate that grammar). In this paper, we introduce ModelCC, a model-based parser generator that decouples language specification from language processing, hence avoiding the aforementioned problems that are caused by grammar-drive...
Fuzzy object-oriented database models allow the representation, storage, and retrieval of complex imperfect information according to the object-oriented data paradigm. This chapter describes both a framework and an architecture that can be used to develop fuzzy object-oriented capabilities using the conventional features of the object-oriented data paradigm. We present a framework composed of a set of classical classes, which supports fuzzily described complex objects. We also explain how to deal with fuzzy extensions of object-oriented features using the conventional object-oriented features as a basis. This proposal can be used to build a fuzzy object-oriented database system on top of an existing database system while minimizing the development effort.
Model-based language specification has applications in the implementation of language processors, the design of domain-specific languages, model-driven software development, data integration, text mining, natural language processing, and corpus-based induction of models. Model-based language specification decouples language design from language processing and, unlike traditional grammar-driven approaches, which constrain language designers to specific kinds of grammars, it needs general parser generators able to deal with ambiguities. In this paper, we propose Fence, an efficient bottom-up parsing algorithm with lexical and syntactic ambiguity support that enables the use of model-based language specification in practice.
Computing with words (CWW) techniques have been shown to be useful in the management of imperfect information. From the programmer's standpoint, new tools are necessary to ease the use of these techniques within current programming platforms. This paper presents a step in this direction by describing a general framework that supports the implementation of applications dealing with fuzzy objects. We pay special attention to the study of the object comparison problem by offering both a theoretical analysis and a simple and transparent way to use our theoretical results in practice.
Networks have become increasingly important to model complex systems composed of interacting elements. Network data mining has a large number of applications in many disciplines, including protein-protein interaction networks, social networks, transportation networks, and telecommunication networks. Different empirical studies have shown that it is possible to predict new relationships between elements based on the topology of the network and the properties of its elements. The problem of predicting new relationships in networks is called link prediction. Link prediction aims to infer the behavior of the network link formation process by predicting missing or future relationships based on currently observed connections. It has become an attractive area of study since it allows us to predict how networks will evolve. In this survey, we will review the general-purpose techniques at the heart of the link prediction problem, which can be complemented by domain-specific heuristic metho...
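To make the idea of predicting links from currently observed connections concrete, here is a minimal sketch of two widely used local similarity scores, the common neighbors count and the Jaccard index. It is an illustration only and is not claimed to reproduce any specific method from the survey; the function name `local_link_scores` and the toy edge list are assumptions.

```python
from itertools import combinations


def local_link_scores(edges):
    """Score each unlinked node pair as (common neighbors, Jaccard index).

    Illustrative helper; edges is a list of (u, v) pairs of an undirected graph.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # the pair is already linked, nothing to predict
        common = neighbors[u] & neighbors[v]
        union = neighbors[u] | neighbors[v]  # never empty: every node has a neighbor
        scores[(u, v)] = (len(common), len(common) / len(union))
    return scores


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
scores = local_link_scores(edges)
```

Pairs with higher scores are treated as more likely candidates for missing or future links; ranking the unlinked pairs by either score gives a simple baseline predictor.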
Association rules have become an important paradigm in knowledge discovery. Nevertheless, the huge number of rules that are usually obtained from standard datasets limits their applicability. In order to solve this problem, several solutions have been proposed, such as the definition of subjective measures of interest for the rules or the use of more restrictive accuracy measures. Other approaches try to obtain different kinds of knowledge, referred to as peculiarities, infrequent rules, or exceptions. In general, the latter approaches are able to reduce the number of rules derived from the input dataset. This paper is focused on this topic. We introduce a new kind of rule, namely, anomalous rules, which can be viewed as association rules hidden by a dominant rule. We also develop an efficient algorithm to find all the anomalous rules existing in a database.
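As background, the standard support and confidence measures that underlie association rule mining can be sketched as follows. This is not the anomalous rule algorithm described in the abstract; the function names and the toy transactions are illustrative assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent (0.0 if the antecedent never occurs)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
bread_support = support(transactions, {"bread"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

Even small datasets produce many rules that pass minimum support and confidence thresholds, which is the applicability problem the abstract refers to.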