Abstract—Probabilistic ontologies incorporate uncertain and incomplete information into domain ontologies, allowing uncertainty in attributes of and relationships among domain entities to be represented in a consistent and coherent manner. The probabilistic ontology language PR-OWL provides OWL constructs for representing multi-entity Bayesian network (MEBN) theories. Although compatibility with OWL was a major design goal of PR-OWL, the initial version fell short in several important respects. These shortcomings are addressed by the latest version, PR-OWL 2. This paper provides an overview of the new features of PR-OWL 2 and presents a case study of a probabilistic ontology in the maritime domain. The case study describes the process of constructing a PR-OWL 2 ontology using an existing OWL ontology as a starting point.
Abstract- The change of focus in modern warfare from individual platforms to the network has caused a concomitant shift in supporting concepts and technologies. Greater emphasis is placed on interoperability and composeability. New technologies such as SOA and semantically aware systems have come into the spotlight. This paper argues that just as the problem space demands interoperability of diverse technologies, so must the solution space. In other words, not only are new approaches needed, but they must also come together as a seamlessly interoperable technological tool set. This can be accomplished only via a consistent multi-disciplinary approach. In this paper, we present some of the major requirements of today’s Predictive Situation Awareness Systems (PSAW), propose our approach as a coordinated mix between state-of-the-art research efforts, and present the architecture for enabling our approach.
Abstract. As stated in [5], a major design goal for PR-OWL was to attain compatibility with OWL. However, this goal has been only partially achieved as yet, primarily due to several key issues not fully addressed in the original work. This paper describes several important issues of compatibility between PR-OWL and OWL, and suggests approaches to deal with them. To illustrate the issues and how they can be addressed, we use procurement fraud as an example application domain [2]. First, we describe the lack of mapping between PR-OWL random variables (RVs) and the concepts defined in OWL, and then show how this mapping can be done. Second, we describe PR-OWL’s lack of compatibility with existing types already present in OWL, and then show how every type defined in PR-OWL can be directly mapped to concepts already present in OWL.
Multi-Sensor Fusion is founded on the principle that combining information from different sensors will enable a better understanding of the surroundings. However, it would be desirable to evaluate how much one gains by combining different sensors in a fusion system, even before implementing it. This paper presents a methodology and tool that allow a user to evaluate the classification performance of a multi-sensor fusion system modeled by a Bayesian network. Specifically, we first define a generic global confusion matrix (GCM) to represent classification performance in a multi-sensor environment; we then develop a methodology with analytical convergence bounds to estimate the performance. The resulting system is designed to answer questions such as: (i) What is the probability of correct classification of a given target using a specific sensor individually? (ii) What if a specific set of sensors combined together are used instead? (iii) What is the performance gain by adding anothe...
Abstract—High-level fusion of hard and soft information from diverse sensor types still depends heavily on human cognition. This results in a scalability conundrum that current technologies are incapable of solving. Although there is widespread acknowledgement that an HLF framework must support automated knowledge representation and reasoning with uncertainty, there is no consensus on the most appropriate technology to satisfy this requirement. Further, the debate among proponents of the various approaches is laden with miscommunication and ill-supported assumptions, which inhibits advancement of HLF research as a whole. A clearly defined, scientifically rigorous evaluation framework is needed to help information fusion researchers assess the suitability of various approaches and tools to their applications. This paper describes requirements for such a framework and describes a use case in HLF evaluation.
Each Brazilian Deputy receives, in addition to a salary, a quota of money to cover expenses from political activity. The amount reserved for that quota can total almost 1 billion in Brazilian currency (approximately 300 million US dollars) in a 4-year legislature. Civil society is using that data to perform independent auditing to identify expenses that are against the rules. This article presents the application of deep autoencoders to identify anomalies in that data. The anomalies found indicate new suspicious expenses and several data quality problems in the data opened to society.
Tax administrations in most countries have more corporate and personal information than any other government office. Data mining techniques can be applied to many different problems because of the large number of tax returns received every year. In the present work we describe a trial by the Brazilian Tax Administration of using Bayesian networks to predict taxpayer behavior based on historical analysis of income tax compliance. More specifically, we tried to improve a previous risk-based audit selection, which flags a large number of taxpayers as high risk and, in its current form, identifies many more cases than the tax auditors can handle. Our first results are promising, considerably improving tax audit performance.
This study proposes a predictive model to detect delays in bank teller queues. Since penalties and fines are applied to branches that leave their clients waiting for a long time, detecting these cases as early as possible is essential. Four models were tested: one using a queuing theory formula and the other three using data mining algorithms -- Deep Learning (DL), Gradient Boosting Machine (GBM), and Random Forest (RF). The results indicated the GBM model as the most efficient, with an accuracy of 97% and an F1-measure of 75%.
The Uncertainty Modeling Process for Semantic Technologies (UMP-ST) is an incremental and iterative approach that addresses the difficulty of maintaining and evolving existing probabilistic ontologies (POs) [5]. It is a general methodology for the majority of the existing semantic technologies that support uncertainty. One of them is Probabilistic OWL (PR-OWL), a language for representing Multi-Entity Bayesian Networks (MEBN). The modeling of a PO using the UMP-ST methodology and MEBN/PR-OWL representation is supported by UnBBayes, a framework for building probabilistic graphical models and performing plausible reasoning. Although UMP-ST provides guidance for modeling a PO, implementing a PO is painful and repetitive. Currently, the user needs to build the ontology from scratch in a specific technology, even if the user has modeled the PO in UMP-ST. A proper integration that helps the user to implement the PO, such as an intermediate structure, makes implementation easier than build the PO...
Teachers use e-learning systems to develop course notes and web-based activities to communicate with learners on one side and monitor and classify their progress on the other. Learners use them for learning, communication, and collaboration. Adaptive e-learning systems often employ learner models, and the behavior of an adaptive system varies depending on the data from the learner model and the learner’s profile. Without knowing anything about the learner who uses the system, a system would behave in exactly the same way for all learners.
Modeling the learner in adaptive systems involves different kinds of information. There are several methods to manage the learner model, but they do not handle the uncertainty in the dynamic modeling of the learner. The main hypothesis of this chapter is that the learner model can be managed with multi-entity Bayesian networks. This chapter focuses on modeling the learner in a dynamic and probabilistic way. The authors propose the use of the notion of fragments and m-theory to build a multi-entity Bayesian network. This Bayesian method can handle the whole course of a learner as well as all of the learner's actions in an adaptive educational hypermedia.
This article presents a study using the financial statement footnotes of Brazilian publicly listed companies, in order to verify whether these documents can provide information to predict variations in the debt of the corresponding firms. From a text mining perspective, we built classification models by assigning a class to each company in a period of time based on the variation of the debt-to-equity ratio. We conducted experiments using two different classifiers: random forest and support vector machine. The caret package in the R language was used for this work. When we employed the random forest classifier, we obtained accuracies 4 percentage points greater than the baseline accuracy, suggesting that the textual content of financial statements has some potential use for debt prediction.
Procurement is one of the main forms of acquisition used by the Federal Government and is subject to fraud and corruption. In this context, many institutions, such as the Brazilian Office of the Controller General (CGU), try to identify evidence of fraud, e.g., by querying databases for information about partnerships among companies owned by the same group of people to simulate competition. However, many of these queries are manual or limited by the use of information systems databases. This paper explores the benefits of using graph NoSQL databases to identify the relationships between companies to detect fraud in procurement. A case study was carried out to validate the use of NoSQL databases using companies' partnership information based on queries defined by the CGU's auditors.
The ubiquity of uncertainty across application domains generates a need for principled support for uncertainty management in semantically aware systems. A probabilistic ontology provides constructs for representing uncertainty in domain ontologies. While the literature has been growing on formalisms for representing uncertainty in ontologies, there remains little guidance in the knowledge engineering literature for how to design probabilistic ontologies. To address the gap, this paper presents the Uncertainty Modeling Process for Semantic Technology (UMP-ST), a new methodology for modeling probabilistic ontologies. To explain how the methodology works and to verify that it can be applied to different scenarios, this paper describes step-by-step the construction of a proof-of-concept probabilistic ontology. The resulting domain model can be used to support identification of fraud in public procurements in Brazil. While the case study illustrates the development of a probabilistic ont...
One of the main goals of every tax administration is safeguarding tax justice. To that end, accurate selection of taxpayers for auditing plays an important role. The current scenario of economic recession, budget cuts, and difficulty hiring tax professionals, combined with growth in both population and number of enterprises, calls for a more efficient approach from the tax administration in order to meet its objectives. The present work shows how data mining techniques help to better understand the profile of non-compliant taxpayers who claim tax refunds. Moreover, we present results on the adoption of predictive models to improve the selection of claims that are more likely to be rejected by the Federal Revenue of Brazil (RFB). Preliminary results show that this approach selects taxpayers more efficiently than selection without predictive models.
This paper presents a case study of machine learning applied to measure the risk of corruption of civil servants using political party affiliation data. Initially, a statistical hypothesis test verified the dependency between corruption and political party affiliation. Then, we constructed datasets with standardization and three different discrimination techniques. Using the Weka environment, this work shows the application and statistical evaluation of four classification algorithms to build models for predicting risk of corruption: Bayesian Networks, Support Vector Machines, Random Forest, and Artificial Neural Networks with backpropagation. To evaluate the models we used data mining metrics such as precision, recall, kappa statistic, and percent correct. Lastly, the case study compares the learned model with the best performance to the experts' model. The comparison not only confirms previous experts' affirmations, but also provides new assertions on the affiliation-corruptibility relation.
This volume contains the papers presented at the 7th International Workshop on Uncertainty Reasoning for the Semantic Web (URSW 2011), held as a part of the 10th International Semantic Web Conference (ISWC 2011) at Bonn, Germany, October 23, 2011. It contains 8 technical papers and 3 position papers, which were selected in a rigorous reviewing process, where each paper was reviewed by at least four program committee members. The International Semantic Web Conference is a major international forum for presenting ...
Foreword This volume contains the papers presented at the 5th International Workshop on Uncertainty Reasoning for the Semantic Web (URSW 2009), held as a part of the 8th International Semantic Web Conference (ISWC 2009) at the Westfields Conference Center near Washington, DC, USA, October 26, 2009. It contains 6 technical papers and 3 position papers, which were selected in a rigorous reviewing process, where each paper was reviewed by at least four program committee members.
A GUI Tool for Plausible Reasoning in the Semantic Web using MEBN. Rommel N. Carvalho, Laécio L. Santos, Marcelo Ladeira. Computer Science Department – Exact Sciences School, University of Brasília – Brasília ...
As the work with semantics and services grows more ambitious in the Semantic Web community, there is an increasing appreciation of the need for principled approaches for representing and reasoning under uncertainty. Reacting to this trend, the World Wide Web Consortium (W3C) has recently created the Uncertainty Reasoning for the World Wide Web Incubator Group (URW3-XG) to better define the challenge of reasoning with and representing uncertain information available through the World Wide Web and related WWW technologies. In line with the URW3-XG effort, this chapter presents the implementation of a graphical user interface (GUI) for building probabilistic ontologies, an application programming interface (API) for saving and loading these ontologies, and a grammar proposal to specify formulas for creating conditional probability tables dynamically. The language used for building probabilistic ontologies is Probabilistic OWL (PR-OWL), an extension of OWL based on Multi-Entity Bayesian Networks (MEBN). The GUI, API, and the compiler for the proposed grammar were implemented in UnBBayes-MEBN, an open source, Java-based application that provides an easy way of building probabilistic ontologies and reasoning based on the PR-OWL/MEBN framework.
ABSTRACT The quest for principled approaches to represent and reason under uncertainty in the Semantic Web (SW) is a very active research subject. Recently, the World Wide Web Consortium (W3C) created the Uncertainty Reasoning for the World Wide Web Incubator Group - URW3-XG [Laskey, K.J. et al., 2007] to better define the challenge of reasoning with and representing uncertain information available through the World Wide Web and related WWW technologies. One of the most promising approaches is the use of a Bayesian framework to handle uncertainty in SW ontologies. Working within this approach, Costa [2005] proposed a probabilistic ontology language, denoted PR-OWL, to represent and to reason with probabilistic ontologies. PR-OWL language is based on MEBN – Multi-Entity Bayesian Network [Laskey & Mahoney, 1997; Laskey & Costa, 2005; Laskey, 2007], a formalism that brings together the expressiveness of first-order logic (FOL) and the inferential power of Bayesian Networks (BN) to support probabilistic reasoning. Since both MEBN and PR-OWL are still under development, there is no tool that implements MEBN/PR-OWL as a knowledge representation formalism and probabilistic reasoner. This paper discusses the technical problems encountered, as well as how they were addressed in such an implementation that is currently under development at the University of Brasilia, with technical support from the C4I Center at George Mason University. Keywords: Multi-Entity Bayesian Network, Bayesian networks, probabilistic ontology Web, probabilistic reasoning, Semantic Web.
