
Information, Volume 10, Issue 9 (September 2019) – 24 articles

Cover Story (view full-size image): A polysemous term has many potential translation equivalents in a target language. The translation could lose its meaning if the term translation and domain knowledge are not taken into account. The evaluation of terminology translation has been one of the least-explored areas in machine translation (MT) research. To the best of our knowledge, as of now, no one has proposed any effective way to evaluate terminology translation in MT automatically. This work presents a semi-automatic terminology annotation strategy from which a gold standard for evaluating terminology translation in automatic translation can be created. The paper also introduces a classification framework that can automatically classify term translation-related errors and expose specific problems in relation to terminology translation in MT.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
17 pages, 3598 KiB  
Article
Clustering Algorithms and Validation Indices for a Wide mmWave Spectrum
by Bogdan Antonescu, Miead Tehrani Moayyed and Stefano Basagni
Information 2019, 10(9), 287; https://doi.org/10.3390/info10090287 - 19 Sep 2019
Cited by 4 | Viewed by 3015
Abstract
Radio channel propagation models for the millimeter wave (mmWave) spectrum are extremely important for planning future 5G wireless communication systems. Transmitted radio signals are received as clusters of multipath rays. Identifying these clusters provides better spatial and temporal characteristics of the mmWave channel. This paper deals with the clustering process and its validation across a wide range of frequencies in the mmWave spectrum below 100 GHz. By way of simulations, we show that in outdoor communication scenarios clustering of received rays is influenced by the frequency of the transmitted signal. This demonstrates the sparse characteristic of the mmWave spectrum (i.e., we obtain a lower number of rays at the receiver for the same urban scenario). We use the well-known k-means clustering algorithm to group arriving rays at the receiver. The accuracy of this partitioning is studied with both cluster validity indices (CVIs) and score fusion techniques. Finally, we analyze how the clustering solution changes with narrower-beam antennas, and we provide a comparison of the cluster characteristics for different types of antennas. Full article
(This article belongs to the Special Issue Emerging Topics in Wireless Communications for Future Smart Cities)
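The clustering-and-validation loop described in the abstract can be illustrated with a minimal, self-contained sketch: run k-means on synthetic (ToA, AoA) ray parameters and score the partition with the Davies–Bouldin (DB) index. The two Gaussian ray groups, the fixed initial centroids, and k = 2 are illustrative assumptions, not the paper's ray-tracing setup.

```python
import math
import random

def kmeans(points, centroids, iters=50):
    """Plain k-means on 2-D samples; returns (clusters, centroids)."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters, centroids

def davies_bouldin(clusters, centroids):
    """Davies-Bouldin index: lower means more compact, better-separated clusters."""
    scatter = [sum(math.dist(p, c) for p in cl) / len(cl)
               for cl, c in zip(clusters, centroids)]
    k = len(centroids)
    return sum(
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ) / k

# Two synthetic "clusters of rays" in (ToA [ns], AoA [deg]) space.
random.seed(0)
rays = ([(random.gauss(50, 2), random.gauss(30, 3)) for _ in range(40)]
        + [(random.gauss(120, 2), random.gauss(-60, 3)) for _ in range(40)])
clusters, centroids = kmeans(rays, centroids=[rays[0], rays[-1]])
db = davies_bouldin(clusters, centroids)
```

In the paper's workflow, several CVIs (CH, DB, GD, XB, PBM) are computed over a range of k and fused into a single score; this sketch shows only the scoring step for a fixed k.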
Figures:
Figure 1: The 44 MPCs received at the Rx#9 location.
Figure 2: MATLAB environment for controlling Wireless InSite simulations.
Figure 3: Clustered CIR: average received power and ToA for each cluster.
Figure 4: Clustering with the k-means algorithm: ToA vs. AoA and AoD.
Figure 5: CH and DB indices applied to clustering results for the Rx#9 location.
Figure 6: GD index applied to clustering results for the Rx#9 location.
Figure 7: XB and PBM indices applied to clustering results for the Rx#9 location.
Figure 8: CDF of the cluster AoA and ToA for all four mmWave frequencies and all 14 receivers.
Figure 9: CDF of the cluster AoA and ToA for 22° and 7° HPBW antennas at 28 GHz.
Figure 10: CDF of the cluster AoA and ToA for 22° and 7° HPBW antennas at 73 GHz.
Figure 11: CDF of the RMS delay spread for 22° and 7° HPBW antennas at 28 GHz and 73 GHz.
26 pages, 6453 KiB  
Article
Copy-Move Forgery Detection and Localization Using a Generative Adversarial Network and Convolutional Neural-Network
by Younis Abdalla, M. Tariq Iqbal and Mohamed Shehata
Information 2019, 10(9), 286; https://doi.org/10.3390/info10090286 - 16 Sep 2019
Cited by 40 | Viewed by 8022
Abstract
The problem of forged images has become a global phenomenon that is spreading mainly through social media. New technologies have provided both the means and the support for this phenomenon, but they are also enabling a targeted response to overcome it. Deep convolution learning algorithms are one such solution. These have been shown to be highly effective in dealing with image forgery derived from generative adversarial networks (GANs). In this type of algorithm, the image is altered such that it appears identical to the original image and is nearly undetectable to the unaided human eye as a forgery. The present paper investigates copy-move forgery detection using a fusion processing model comprising a deep convolutional model and an adversarial model. Four datasets are used. Our results indicate a significantly high detection accuracy performance (~95%) exhibited by the deep learning CNN and discriminator forgery detectors. Consequently, an end-to-end trainable deep neural network approach to forgery detection appears to be the optimal strategy. The network is developed based on two-branch architecture and a fusion module. The two branches are used to localize and identify copy-move forgery regions through CNN and GAN. Full article
(This article belongs to the Section Information and Communications Technology)
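The paper's similarity branch learns to localize duplicated regions with a CNN; the underlying copy-move idea can be sketched without any network by hashing fixed-size patches and flagging any patch that occurs at two different offsets. The tiny integer "image", the 2×2 patch size, and exact matching are illustrative assumptions; the actual method relies on learned features, which also tolerate post-processing that exact hashing does not.

```python
def find_duplicate_patches(img, size=2):
    """Map each size x size patch to the top-left offsets where it occurs.
    A patch appearing at more than one offset is a copy-move candidate."""
    h, w = len(img), len(img[0])
    seen = {}
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            patch = tuple(tuple(img[y + dy][x + dx] for dx in range(size))
                          for dy in range(size))
            seen.setdefault(patch, []).append((y, x))
    return {p: locs for p, locs in seen.items() if len(locs) > 1}

# A 6x6 "image" with unique pixel values, then the 2x2 block at (0, 0)
# is pasted at (4, 4) to forge a copy-move.
img = [[10 + 6 * y + x for x in range(6)] for y in range(6)]
for dy in range(2):
    for dx in range(2):
        img[4 + dy][4 + dx] = img[dy][dx]

dups = find_duplicate_patches(img)
```

The detector reports exactly one duplicated patch, at offsets (0, 0) and (4, 4), i.e., the source and destination of the forgery.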
Figures:
Figure 1: The GAN training cycle for fake image translation (i.e., fake image creation based on a real image).
Figure 2: Layout of the proposed model.
Figure 3: The GAN module used in the proposed model.
Figure 4: The branch used for similarity detection based on the CNN.
Figure 5: Overview of the proposed two-branched, DNN-based model (GAN and CNN), ending with a merge of the networks to provide a CMFD solution.
Figure 6: From left to right, samples from random results showing training progressively improving the transition results of the model on the CIFAR-10 dataset, with a maximum of 10,000 iterations.
Figure 7: The loss function of the pretraining model.
Figure 8: Training results on the MNIST dataset with 1000 epochs.
Figure 9: Training results on the CIFAR-10 dataset with 1000 epochs.
Figure 10: Training results on the CIFAR-10 dataset with 10,000 epochs.
Figure 11: The output after the initial training stage is of low quality.
Figure 12: Output at an advanced stage of training, closer to the real input image.
Figure 13: From left to right, samples from random results showing training progressively improving the transition result of the model on the local dataset.
Figure 14: Loss functions (D loss, G loss) of the training model on a custom local dataset, using the pretrained discriminator and generator, respectively.
Figure 15: Detecting copy-move forgery using the GAN model.
Figure 16: Random results of similarity detection with F1 score > 0.25.
Figure 17: Random results of similarity detection with F1 score and threshold T > 0.75.
Figure 18: ROC for F-score comparison with state-of-the-art models.
Figure 19: (a) Masking the forged area (GAN) (M). (b) Masking the similar areas in the forged image frame (CNN) (M′). (c) Forgery location.
Figure 20: ROC for F scores.
Figure 21: Area under the curve (AUC).
Figure 22: The final output, combining the three main objectives of the model.
16 pages, 416 KiB  
Article
Low-Cost, Low-Power FPGA Implementation of ED25519 and CURVE25519 Point Multiplication
by Mohamad Ali Mehrabi and Christophe Doche
Information 2019, 10(9), 285; https://doi.org/10.3390/info10090285 - 14 Sep 2019
Cited by 24 | Viewed by 6227
Abstract
Twisted Edwards curves have been at the center of attention since their introduction by Bernstein et al. in 2007. The curve ED25519, used for the Edwards-curve Digital Signature Algorithm (EdDSA), provides faster digital signatures than existing schemes without sacrificing security. CURVE25519 is a Montgomery curve that is closely related to ED25519. It provides simple, constant-time, and fast point multiplication, which is used by the key exchange protocol X25519. Software implementations of EdDSA and X25519 are used in many web-based PC and mobile applications. In this paper, we introduce a low-power, low-area FPGA implementation of ED25519 and CURVE25519 scalar multiplication that is particularly relevant for Internet of Things (IoT) applications. The efficiency of the arithmetic modulo the prime number 2^255 − 19, in particular the modular reduction and modular multiplication, is key to the efficiency of both EdDSA and X25519. To reduce the complexity of the hardware implementation, we propose a high-radix interleaved modular multiplication algorithm. One benefit of this architecture is that it avoids the use of large-integer multipliers relying on FPGA DSP modules. Full article
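The high-radix interleaved multiplication at the heart of the design can be sketched behaviourally: the multiplier is consumed three bits (radix 8) at a time, and the accumulator is reduced after every step using the special form of the modulus (2^255 ≡ 19 mod p), so no full 510-bit product is ever formed. This is a behavioural sketch of the arithmetic only, not the FPGA datapath; the digit width and the software reduction loop are simplifications of the hardware shift-and-add units.

```python
P = 2**255 - 19  # the prime underlying ED25519 and CURVE25519

def reduce_p(x):
    """Partial reduction exploiting 2^255 = P + 19:
    fold the high part back in as hi * 19 + lo until x < P."""
    while x >= P:
        hi, lo = x >> 255, x & (2**255 - 1)
        x = hi * 19 + lo if hi else x - P
    return x

def interleaved_modmul(a, b, radix_bits=3):
    """High-radix interleaved modular multiplication: scan b in radix-2^3
    digits from the most significant end, keeping the accumulator reduced
    mod P at every step (Horner's scheme interleaved with reduction)."""
    digits = []
    while b:
        digits.append(b & ((1 << radix_bits) - 1))
        b >>= radix_bits
    acc = 0
    for d in reversed(digits):
        acc = reduce_p((acc << radix_bits) + a * d)  # intermediate stays ~258 bits
    return acc

a = 2**200 + 12345
b = 2**190 + 67890
result = interleaved_modmul(a, b)
```

Because the accumulator never exceeds roughly 15·P before reduction, the hardware only needs narrow adders and a small ×19 correction rather than a full-width multiplier, which is the point of the interleaved architecture.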
Figures:
Figure 1: ED25519 point doubling flow diagram.
Figure 2: ED25519 point addition flow diagram.
Figure 3: CURVE25519 differential point addition and point doubling flow diagram.
Figure 4: Legend for Figures 1–3.
Figure 5: Basic interleaved modular multiplication unit proposed by [17].
Figure 6: Radix-8 interleaved modular multiplication unit.
Figure 7: Hardware implementation of modular addition (A + B mod p).
Figure 8: Hardware implementation of modular subtraction (A − B mod p).
Figure 9: ALU unit configuration.
Figure 10: Point multiplication core.
8 pages, 511 KiB  
Article
Another Step in the Ladder of DNS-Based Covert Channels: Hiding Ill-Disposed Information in DNSKEY RRs
by Marios Anagnostopoulos and John André Seem
Information 2019, 10(9), 284; https://doi.org/10.3390/info10090284 - 12 Sep 2019
Cited by 1 | Viewed by 3516
Abstract
Covert channel communications are of vital importance for the ill-motivated purposes of cyber-crooks. Through these channels, they are capable of communicating in a stealthy way, unnoticed by the defenders and bypassing the security mechanisms of protected networks. Covert channels facilitate the hidden distribution of data to internal agents. For instance, a stealthy covert channel could be beneficial for the purposes of a botmaster who wishes to send commands to their bot army, or for exfiltrating corporate and sensitive private data from the internal network of an organization. During the evolution of the Internet, a plethora of network protocols has been exploited as covert channels. The DNS protocol, however, has a prominent position in this exploitation race, as it is one of the few protocols that is rarely restricted by security policies or filtered by firewalls, and it thus perfectly fulfills a covert channel's requirements. Consequently, there are more than a few cases where the DNS protocol and infrastructure have been exploited in well-known security incidents. In this context, the work at hand investigates the feasibility of exploiting the DNS Security Extensions (DNSSEC) as a covert channel. We demonstrate that it is beneficial and quite straightforward to embed arbitrary data of an aggressor's choice within the DNSKEY resource record, which normally provides the public key of a DNSSEC-enabled domain zone. Since DNSKEY contains the public key encoded in base64 format, it can easily be exploited for the dissemination of an encrypted or stego message, or even for the distribution of a malware's binary encoded as a base64 string. To this end, we implement a proof of concept based on two prominent nameserver software packages, namely BIND and NSD, and we publish in the DNS hierarchy custom data of our choice concealed as the public key of the DNS zone under our jurisdiction, in order to demonstrate the effectiveness of the proposed covert channel. Full article
(This article belongs to the Special Issue Botnets)
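The embedding trick the abstract describes can be sketched in a few lines: any byte string, once base64-encoded, is indistinguishable in form from a public key in a DNSKEY record's presentation format. The record fields used here (TTL 3600, flags 256 for a zone key, protocol 3, algorithm 8) are illustrative assumptions about a plausible record, not the exact records from the paper's proof of concept, which published such data through BIND and NSD zones.

```python
import base64

def encode_covert_dnskey(payload: bytes, owner: str = "example.com.") -> str:
    """Pack an arbitrary payload into the base64 'public key' field of a
    DNSKEY-style record. To a casual observer the record looks like a
    syntactically normal zone key."""
    key_b64 = base64.b64encode(payload).decode()
    return f"{owner} 3600 IN DNSKEY 256 3 8 {key_b64}"

def decode_covert_dnskey(record: str) -> bytes:
    """The receiving agent recovers the payload from an ordinary DNSKEY
    lookup: the last whitespace-separated field is the base64 blob."""
    return base64.b64decode(record.split()[-1])

record = encode_covert_dnskey(b"cmd: update #42")
secret = decode_covert_dnskey(record)
```

The covertness comes from the carrier, not the encoding: resolvers and firewalls rarely inspect or restrict DNSKEY payloads, so the "key" round-trips through standard DNS infrastructure.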
Figures:
Figure 1: Proposed covert channel transactions.
38 pages, 16344 KiB  
Article
Modelling and Resolution of Dynamic Reliability Problems by the Coupling of Simulink and the Stochastic Hybrid Fault Tree Object Oriented (SHyFTOO) Library
by Ferdinando Chiacchio, Jose Ignacio Aizpurua, Lucio Compagno, Soheyl Moheb Khodayee and Diego D’Urso
Information 2019, 10(9), 283; https://doi.org/10.3390/info10090283 - 11 Sep 2019
Cited by 17 | Viewed by 5289
Abstract
Dependability assessment is one of the most important activities for the analysis of complex systems. Classical analysis techniques of safety, risk, and dependability, like Fault Tree Analysis or Reliability Block Diagrams, are easy to implement, but they estimate inaccurate dependability results due to their simplified hypotheses that assume the components’ malfunctions to be independent from each other and from the system working conditions. Recent contributions within the umbrella of Dynamic Probabilistic Risk Assessment have shown the potential to improve the accuracy of classical dependability analysis methods. Among them, Stochastic Hybrid Fault Tree Automaton (SHyFTA) is a promising methodology because it can combine a Dynamic Fault Tree model with the physics-based deterministic model of a system process, and it can generate dependability metrics along with performance indicators of the physical variables. This paper presents the Stochastic Hybrid Fault Tree Object Oriented (SHyFTOO), a Matlab® software library for the modelling and the resolution of a SHyFTA model. One of the novel features discussed in this contribution is the ease of coupling with a Matlab® Simulink model that facilitates the design of complex system dynamics. To demonstrate the utilization of this software library and the augmented capability of generating further dependability indicators, three different case studies are discussed and solved with a thorough description for the implementation of the corresponding SHyFTA models. Full article
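The coupling of a stochastic fault model with a deterministic physical model, which is SHyFTA's central idea, can be caricatured in a few lines: at each simulation step the physical state (here, a toy daily temperature profile) modulates a component's failure rate, and Monte Carlo replications estimate unreliability over mission time. The temperature law, base failure rate, and exponential temperature acceleration are illustrative assumptions, not the paper's Simulink models.

```python
import math
import random

def simulate_mission(hours, base_rate=1e-4, dt=1.0, rng=random):
    """One Monte Carlo replication of a hybrid basic event: the failure
    rate at each step depends on the deterministic physical state, which
    is how the fault model couples to the process model."""
    for t in range(int(hours / dt)):
        temp = 25 + 15 * math.sin(2 * math.pi * t / 24)   # daily temperature cycle
        rate = base_rate * math.exp(0.03 * (temp - 25))   # hotter -> faster ageing
        if rng.random() < rate * dt:
            return t * dt          # failure time
    return None                   # component survived the mission

def unreliability(hours, runs=2000, seed=7):
    """Fraction of replications that fail within the mission time."""
    rng = random.Random(seed)
    failures = sum(simulate_mission(hours, rng=rng) is not None for _ in range(runs))
    return failures / runs

u_short, u_long = unreliability(100), unreliability(1000)
```

A static fault tree would assign this component a constant failure rate; the hybrid simulation instead lets the physical variables drive the hazard, which is what produces the differences the paper reports against classical models.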
Figures:
Figure 1: Breakdown of the categories of stochastic modelling methods for dependability assessment.
Figure 2: Representation of the Stochastic Hybrid Fault Tree Automaton (SHyFTA) model of the distillation column case study.
Figure 3: Simulink model of the physical process of the case study.
Figure 4: Simulink model of HBE1 and HBE5 of the case study.
Figure 5: Simulink model of HBE2 of the case study.
Figure 6: Simulink model of HBE6 of the case study.
Figure 7: The Simulink implementation of the hybrid basic event HBE6.
Figure 8: The Simulink implementation of the physical process of the case study.
Figure 9: The SHyFTA model of an electric motor.
Figure 10: Unreliability of the electric motor and comparison with the static fault tree model result.
Figure 11: Ambient temperature used as input to the SHyFTA model.
Figure 12: Failure rate of the bearings of the electric motor case study.
Figure 13: Schema of the deterministic process of the PV system, including the battery.
Figure 14: Dynamic fault tree of the domestic photovoltaic power plant equipped with a storage system.
Figure 15: Expected progressive energy production injected into the grid.
Figure 16: Expected progressive energy required from the grid.
Figure 17: The return on investment.
Figure 18: Payback time sensitivity analysis depending on the variation of the battery cost.
Figure 19: Volume of the mixture processed correctly (OK), dumped (NOK), or not processed at all (Lost).
Figure 20: Unsafety of the distillation process.
Figure A1: Customization of a Simulink block of type ToWorkspace.
Figure A2: Dynamic fault tree gate behaviour.
Figure A3: Simulink blocks of the SHyFTA_TEMPLATE.slx file, to use in the Simulink model of a SHyFTA.
Figure A4: Customization of a Simulink block of type Constant.
Figure A5: The Simulink implementation of the "Generic Hybrid Basic Event" block.
Figure A6: Block parameters for the variables (a) QI and (b) % particle.
15 pages, 4519 KiB  
Article
A Novel Approach to Component Assembly Inspection Based on Mask R-CNN and Support Vector Machines
by Haisong Huang, Zhongyu Wei and Liguo Yao
Information 2019, 10(9), 282; https://doi.org/10.3390/info10090282 - 11 Sep 2019
Cited by 21 | Viewed by 3845
Abstract
Assembly is a very important manufacturing process in the age of Industry 4.0. Aimed at the problems of part identification and assembly inspection in industrial production, this paper proposes a method of assembly inspection based on machine vision and a deep neural network. First, the image acquisition platform is built to collect the part and assembly images. We use the Mask R-CNN model to identify and segment the shape from each part image, and to obtain the part category and position coordinates in the image. Then, according to the image segmentation results, the area, perimeter, circularity, and Hu invariant moment of the contour are extracted to form the feature vector. Finally, the SVM classification model is constructed to identify the assembly defects, with a classification accuracy rate of over 86.5%. The accuracy of the method is verified by constructing an experimental platform. The results show that the method effectively completes the identification of missing and misaligned parts in the assembly, and has good robustness. Full article
(This article belongs to the Special Issue IoT Applications and Industry 4.0)
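The hand-crafted feature step between segmentation and the SVM can be illustrated directly: from a binary part mask, compute area, perimeter, and circularity (4πA/P²), the kind of contour descriptors the abstract lists. The two toy masks below are illustrative assumptions; the Hu invariant moments also used by the paper are omitted for brevity, and perimeter is approximated by counting exposed cell edges rather than tracing a sub-pixel contour.

```python
import math

def contour_features(mask):
    """Area, perimeter, and circularity of a binary mask. Circularity is
    4*pi*A/P^2: close to 1 for a disc, small for elongated shapes."""
    h, w = len(mask), len(mask[0])
    area = perimeter = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            area += 1
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    perimeter += 1  # this edge is exposed to background
    circularity = 4 * math.pi * area / perimeter**2
    return area, perimeter, circularity

square = [[1] * 8 for _ in range(8)]    # a compact part
strip = [[1] * 16 for _ in range(1)]    # an elongated sliver
_, _, c_square = contour_features(square)
_, _, c_strip = contour_features(strip)
```

Stacking such descriptors into a feature vector per segmented part, and labelling examples as correct/missing/misaligned, yields exactly the kind of training set an SVM classifier consumes in the paper's pipeline.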
Figures:
Figure 1: Algorithm flow chart. SVM: Support Vector Machine.
Figure 2: Mask R-CNN structure chart.
Figure 3: The region proposal network (RPN) structure.
Figure 4: The concept of Intersection over Union (IoU).
Figure 5: The structure of the fully convolutional network (FCN).
Figure 6: The principle of the Support Vector Machine.
Figure 7: Image acquisition platform.
Figure 8: Examples of data augmentation. (a) Original image; (b) random crop; (c) changed image contrast; (d) added noise.
Figure 9: Examples of annotated training images.
Figure 10: Parameter fine-tuning process.
Figure 11: The entire model training process.
Figure 12: The entire model training process.
Figure 13: The entire model training process. (a) The value of the regression loss function; (b) the value of the classification loss function; (c) the value of the regression mask loss function; (d) the value of the total loss function.
Figure 14: Multi-category test result.
Figure 15: Segmentation results of assembly. (a) Example of an original test image; (b) example of detection and segmentation results.
18 pages, 1492 KiB  
Article
Factors Influencing Online Hotel Booking: Extending UTAUT2 with Age, Gender, and Experience as Moderators
by Chia-Ming Chang, Li-Wei Liu, Hsiu-Chin Huang and Huey-Hong Hsieh
Information 2019, 10(9), 281; https://doi.org/10.3390/info10090281 - 9 Sep 2019
Cited by 72 | Viewed by 21448
Abstract
As people feel more comfortable using the Internet, online hotel booking has become popular in recent years. Understanding the drivers of online booking intention and behavior can help hotel managers apply corresponding strategies to increase hotel booking rates. Thus, the purpose of this study is to investigate the factors influencing the use intention and behavioral intention of online hotel booking. The proposed model assimilates factors from the extended Unified Theory of Acceptance and Use of Technology (UTAUT2), along with age, gender, and experience as moderators. Data were collected through a field survey questionnaire completed by 488 participants. The results showed that behavioral intention is significantly and positively influenced by performance expectancy, social influence, facilitating condition, hedonic motivation, price value, and habit behavior. Use behavior is positively influenced by facilitating condition and hedonic motivation. As for moderators, gender moderates the relationships between performance expectancy, social influence, and behavioral intention. Age moderates the relationships between effort expectancy, social influence, hedonic motivation, and behavioral intention. Experience moderates the relationships between social influence, price value, and behavioral intention, and between habit behavior and use behavior. Based on the results, recommendations for hotel managers are proposed. Furthermore, research limitations and future directions are discussed. Full article
Figures:
Figure 1: The hypothesized conceptual model of the study.
Figure 2: Structural equation modelling (SEM) results of the standardized model parameter estimation. (Note: "--": path coefficient not significant; "-": path coefficient significant.)
Figure 3: Direct effect of effort expectancy on behavioral intention.
Figure 4: Effect of the seven independent variables on behavioral intention. * p < 0.05.
Figure 5: Direct effect of behavioral intention on use behavior.
Figure 6: Effect of the seven independent variables and behavioral intention on use behavior. * p < 0.05.
Figure 7: The seven variables and the experience moderator, with the effect of behavioral intention on use behavior removed from the model.
22 pages, 10442 KiB  
Article
Constructing and Visualizing High-Quality Classifier Decision Boundary Maps
by Francisco C. M. Rodrigues, Mateus Espadoto, Roberto Hirata, Jr. and Alexandru C. Telea
Information 2019, 10(9), 280; https://doi.org/10.3390/info10090280 - 9 Sep 2019
Cited by 32 | Viewed by 5972
Abstract
Visualizing decision boundaries of machine learning classifiers can help in classifier design, testing and fine-tuning. Decision maps are visualization techniques that overcome the key sparsity-related limitation of scatterplots for this task. To increase the trustworthiness of decision map use, we perform an extensive evaluation considering the dimensionality-reduction (DR) projection techniques underlying decision map construction. We extend the visual accuracy of decision maps by proposing additional techniques to suppress errors caused by projection distortions. Additionally, we propose ways to estimate and visually encode the distance-to-decision-boundary in decision maps, thereby enriching the conveyed information. We demonstrate our improvements and the insights that decision maps convey on several real-world datasets. Full article
(This article belongs to the Special Issue Information Visualization Theory and Applications (IVAPP 2019))
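The dense-map construction the abstract refers to reduces to one core loop: rasterize the 2-D map space, send each pixel back to data space with the inverse projection, and color it with the classifier's label. In the sketch below the "projection" is the identity on already-2-D data and the classifier is nearest-centroid; both are stand-in assumptions so the loop stays self-contained, whereas the paper uses real DR projections (e.g. t-SNE, UMAP) with learned inverses (iLAMP, NNInv) and trained classifiers.

```python
import math

# Two labelled blobs in the unit square; nearest-centroid stands in
# for a trained classifier.
CENTROIDS = {"A": (0.2, 0.2), "B": (0.8, 0.8)}

def classify(point):
    return min(CENTROIDS, key=lambda lbl: math.dist(point, CENTROIDS[lbl]))

def dense_map(resolution=20, inverse_project=lambda u, v: (u, v)):
    """Label every pixel of the 2-D map: pixel centre -> inverse
    projection -> classifier label. With a real DR method,
    inverse_project maps map coordinates back to the nD data space."""
    grid = []
    for row in range(resolution):
        v = (row + 0.5) / resolution
        grid.append([classify(inverse_project((col + 0.5) / resolution, v))
                     for col in range(resolution)])
    return grid

grid = dense_map()
```

The resulting label grid makes the decision boundary (here the diagonal u + v = 1) visible as the frontier between the two colored regions, with no sparsity gaps between samples, which is precisely the advantage of dense maps over scatterplots.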
Figures:
Figure 1: (a) Dense map construction algorithm; (b) two-phase experiment set-up.
Figure 2: Dense maps for Logistic Regression (a) and Random Forest (b) classifiers on the 2-class S_2 dataset, all 28 tested projections.
Figure 3: Dense maps for k-Nearest Neighbor (k-NN) (a) and Convolutional Neural Network (CNN) (b) classifiers on the 2-class S_2 dataset, all 28 tested projections.
Figure 4: Dense maps for all classifiers, 10-class dataset, five best-performing projections.
Figure 5: Classification errors (white dots) shown atop the dense maps, logistic regression (LR) and CNN classifiers.
Figure 6: Histogram of JD_k rank for varying values of k for the MNIST dataset, t-SNE projection.
Figure 7: Removing poorly projected points with low JD_k ranks to filter dense-map artifacts for the MNIST dataset, projected by t-SNE and inversely projected by iLAMP.
Figure 8: Dense map (a) and various distance-to-boundary maps (b–d) for the Blobs dataset, computed using UMAP for P and NNInv for P^(−1).
Figure 9: Estimation of distance-to-boundary d_nD^img (a) and d_nD^nn (b). See Sections 6.1 and 6.2.
Figure 10: Dense map and distance maps for MNIST (top row) and FashionMNIST (bottom row), with projection P set to UMAP and P^(−1) to NNInv, respectively.
Figure 11: (a) Dense map for the MNIST (top row) and FashionMNIST (bottom row) datasets. (b–d) Combined dense map and distance-to-boundary maps for different k_1 and k_2 values.
Figure 12: Misclassifications with opacity coding distance-to-boundary for the (a) MNIST and (b) FashionMNIST datasets.
Figure 13: Enridged distance maps for the MNIST (top row) and FashionMNIST (bottom row) datasets. Images (b–e) show the progressive noise-smoothing effect of the filter radius K.
3 pages, 149 KiB  
Editorial
Editorial for the Special Issue on “Natural Language Processing and Text Mining”
by Pablo Gamallo and Marcos Garcia
Information 2019, 10(9), 279; https://doi.org/10.3390/info10090279 - 6 Sep 2019
Cited by 2 | Viewed by 2897
Abstract
Natural language processing (NLP) and Text Mining (TM) are a set of overlapping strategies working on unstructured text [...] Full article
(This article belongs to the Special Issue Natural Language Processing and Text Mining)
15 pages, 925 KiB  
Article
An Efficient Dummy-Based Location Privacy-Preserving Scheme for Internet of Things Services
by Yongwen Du, Gang Cai, Xuejun Zhang, Ting Liu and Jinghua Jiang
Information 2019, 10(9), 278; https://doi.org/10.3390/info10090278 - 5 Sep 2019
Cited by 11 | Viewed by 4650
Abstract
With the rapid development of GPS-equipped smart mobile devices and mobile computing, location-based services (LBS) are increasing in popularity in the Internet of Things (IoT). Although LBS provide enormous benefits to users, they inevitably introduce some significant privacy concerns. To protect user privacy, [...] Read more.
With the rapid development of GPS-equipped smart mobile devices and mobile computing, location-based services (LBS) are increasing in popularity in the Internet of Things (IoT). Although LBS provide enormous benefits to users, they inevitably introduce some significant privacy concerns. To protect user privacy, a variety of location privacy-preserving schemes have recently been proposed. Among these schemes, the dummy-based location privacy-preserving (DLP) scheme is a widely used approach to achieve location privacy for mobile users. However, the computation cost of existing dummy-based location privacy-preserving schemes is too high to meet the practical requirements of resource-constrained IoT devices. Moreover, the DLP scheme is inadequate to resist an adversary with side information. Thus, how to effectively select a dummy location is still a challenge. In this paper, we propose a novel lightweight dummy-based location privacy-preserving scheme, named enhanced dummy-based location privacy-preserving (Enhanced-DLP), to address this challenge by considering both computational costs and side information. Specifically, the Enhanced-DLP adopts an improved greedy scheme to efficiently select dummy locations to form a k-anonymous set. A thorough security analysis demonstrated that our proposed Enhanced-DLP can protect user privacy against attacks. We performed a series of experiments to verify the effectiveness of our Enhanced-DLP. Compared with the existing scheme, the Enhanced-DLP obtains lower computational costs for the selection of a dummy location and can resist side-information attacks. The experimental results illustrate that the Enhanced-DLP scheme can effectively be applied to protect the user’s location privacy in IoT applications and services. Full article
(This article belongs to the Special Issue The End of Privacy?)
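The abstract describes greedily selecting dummy locations to form a k-anonymous set while accounting for side information (an adversary's prior knowledge of query probabilities per location). The paper's own greedy algorithm is not reproduced here; the following is a minimal illustrative sketch of the underlying idea only: pick the k−1 dummy cells whose historical query probabilities are closest to the real location's, so the entropy of the anonymity set stays near its maximum. The function names and the probability table are hypothetical.

```python
import math

def select_dummies(real_cell, query_prob, k):
    """Side-information-aware dummy selection (illustrative sketch).

    Picks the k-1 candidate cells whose historical query probabilities
    are closest to the real cell's, so all k members look equally likely."""
    p_real = query_prob[real_cell]
    candidates = sorted(
        (c for c in query_prob if c != real_cell),
        key=lambda c: abs(query_prob[c] - p_real),
    )
    return [real_cell] + candidates[:k - 1]

def anonymity_entropy(cells, query_prob):
    """Entropy of the adversary's posterior over the k-anonymous set."""
    total = sum(query_prob[c] for c in cells)
    probs = [query_prob[c] / total for c in cells]
    return -sum(p * math.log2(p) for p in probs)

# Toy grid: cell id -> historical query probability (the side information).
query_prob = {0: 0.30, 1: 0.05, 2: 0.28, 3: 0.01, 4: 0.29, 5: 0.07}
aset = select_dummies(0, query_prob, k=3)
```

With similar query probabilities in the set, the resulting entropy approaches log2(k), the hardest case for an adversary ranking members by prior probability.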
Show Figures

Figure 1
<p>Location services and applications in the Internet of Things (IoT).</p>
Figure 2
<p>LBS privacy-protection model in IoT.</p>
Figure 3
<p>Scenario 1: Entropy and run time.</p>
Figure 4
<p>Scenario 2: Entropy and run time.</p>
Figure 5
<p>Scenario 2: The probability of query recognition versus <span class="html-italic">k</span>.</p>
14 pages, 2624 KiB  
Article
Network Model for Online News Media Landscape in Twitter
by Ford Lumban Gaol, Tokuro Matsuo and Ardian Maulana
Information 2019, 10(9), 277; https://doi.org/10.3390/info10090277 - 5 Sep 2019
Cited by 7 | Viewed by 3520
Abstract
Today, most studies of audience networks analyze the landscape of the news media on the web. However, media ecology has been drastically reconfigured by the emergence of social media. In this study, we use Twitter follower data to build an online news media [...] Read more.
Today, most studies of audience networks analyze the landscape of the news media on the web. However, media ecology has been drastically reconfigured by the emergence of social media. In this study, we use Twitter follower data to build an online news media network that represents the pattern of news consumption in Twitter. This study adopted a weighted network model proposed by Mukerjee et al. and implemented the Filter Disparity Method suggested by Majó-Vázquez et al. to identify the most significant overlaps in the network. The implementation result on news media outlets data in three countries, namely Indonesia, Malaysia, and Singapore, shows that network analysis of follower overlap data can offer relevant insights about media diet and the way readers navigate various news sources available on social media. Full article
(This article belongs to the Section Information and Communications Technology)
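The filter disparity method the study applies builds on the disparity filter of weighted networks: an edge survives only if its weight is statistically significant against a null model that spreads a node's strength uniformly over its edges. For a node of degree k, an edge carrying fraction p of that node's strength is kept when (1 − p)^(k−1) < α. A minimal sketch of that test (the exact variant used in the paper may differ):

```python
def disparity_backbone(edges, alpha=0.05):
    """Keep edges significant under the disparity filter.

    edges: dict mapping (u, v) -> weight (undirected, u != v).
    An edge survives if it is significant from either endpoint's view:
    (1 - p_uv) ** (k_u - 1) < alpha, with p_uv = w_uv / strength(u)."""
    strength, degree = {}, {}
    for (u, v), w in edges.items():
        for node in (u, v):
            strength[node] = strength.get(node, 0.0) + w
            degree[node] = degree.get(node, 0) + 1

    def significant(node, w):
        k = degree[node]
        if k <= 1:                      # degree-1 nodes keep their only edge
            return True
        p = w / strength[node]
        return (1 - p) ** (k - 1) < alpha

    return {e: w for e, w in edges.items()
            if significant(e[0], w) or significant(e[1], w)}

# Toy audience-overlap network: weights are shared-follower counts.
edges = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 0.1}
backbone = disparity_backbone(edges, alpha=0.05)
```

In the toy network the weak A–C overlap is pruned while the two strong overlaps survive, which is exactly how the method isolates the most significant co-consumption ties.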
Show Figures

Figure 1
<p>Research methodology.</p>
Figure 2
<p>Histogram of Twitter followers for online news media outlets in Indonesia, Malaysia, and Singapore.</p>
Figure 3
<p>The top five online news media based on the degree value and the number of Twitter followers.</p>
Figure 4
<p>News media networks in Indonesia (V<sub>indonesia</sub> = 162, E<sub>indonesia</sub> = 754). Nodes represent online news media outlets and edges represent shared followers between any two outlets. The size and the label of a node are proportional to the degree centrality of the node.</p>
Figure 5
<p>News media networks in Malaysia (V<sub>malaysia</sub> = 86, E<sub>malaysia</sub> = 227). Nodes represent online news media outlets and edges represent shared followers between any two outlets. The size and the label of a node are proportional to the degree centrality of the node.</p>
Figure 6
<p>News media networks in Singapore (V<sub>singapore</sub> = 30, E<sub>singapore</sub> = 46). Nodes represent online news media outlets and edges represent shared followers between any two outlets. The size and the label of a node are proportional to the degree centrality of the node.</p>
13 pages, 1479 KiB  
Article
Adverse Drug Event Detection Using a Weakly Supervised Convolutional Neural Network and Recurrent Neural Network Model
by Min Zhang and Guohua Geng
Information 2019, 10(9), 276; https://doi.org/10.3390/info10090276 - 4 Sep 2019
Cited by 19 | Viewed by 3364
Abstract
Social media and health-related forums, including the expression of customer reviews, have recently provided data sources for adverse drug reaction (ADR) identification research. However, in the existing methods, the neglect of noise data and the need for manually labeled data reduce the accuracy [...] Read more.
Social media and health-related forums, including the expression of customer reviews, have recently provided data sources for adverse drug reaction (ADR) identification research. However, in the existing methods, the neglect of noise data and the need for manually labeled data reduce the accuracy of the prediction results and greatly increase manual labor. We propose a novel architecture named the weakly supervised mechanism (WSM) convolutional neural network (CNN) long-short-term memory (WSM-CNN-LSTM), which combines the strength of CNN and bi-directional long short-term memory (Bi-LSTM). The WSM applies the weakly labeled data to pre-train the parameters of the model and then uses the labeled data to fine-tune the initialized network parameters. The CNN employs a convolutional layer to study the characteristics of the drug reviews and active features at different scales, and then the feed-forward and feed-back neural networks of the Bi-LSTM utilize these salient features to output the regression results. The experimental results effectively demonstrate that our model marginally outperforms the comparison models in ADR identification and that a small quantity of labeled samples results in an optimal performance, which decreases the influence of noise and reduces the manual data-labeling requirements. Full article
(This article belongs to the Section Information Applications)
Show Figures

Figure 1
<p>Architecture of the WSM-CNN-LSTM model.</p>
Figure 2
<p>Architecture of the LSTM memory.</p>
Figure 3
<p>Sizes of the weakly labeled and manually labeled datasets.</p>
Figure 4
<p>Impact of labeled training data size on each model.</p>
15 pages, 779 KiB  
Article
Least Squares Consensus for Matching Local Features
by Qingming Zhang, Buhai Shi and Haibo Xu
Information 2019, 10(9), 275; https://doi.org/10.3390/info10090275 - 2 Sep 2019
Cited by 1 | Viewed by 2556
Abstract
This paper presents a new approach to estimate the consensus in a data set. Under the framework of RANSAC, the perturbation on data has not been considered sufficiently. We analyze the computation of homography in RANSAC and find that the variance of its [...] Read more.
This paper presents a new approach to estimate the consensus in a data set. Under the framework of RANSAC, the perturbation on data has not been considered sufficiently. We analyze the computation of homography in RANSAC and find that the variance of its estimate decreases monotonically as the sample size increases. From this result, we develop an approach that can suppress the perturbation and estimate the consensus set simultaneously. Different from other consensus estimators based on random sampling, our approach builds on the least squares method and order statistics and is therefore an alternative scheme for consensus estimation. Combined with the nearest neighbour-based method, our approach reaches higher matching precision than plain RANSAC and MSAC, as shown in our simulations. Full article
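The combination of least squares and order statistics that the abstract describes can be sketched in miniature (with a line fit standing in for the homography, and hypothetical parameter choices): fit all points by least squares, keep the m points with the smallest residuals as the consensus set, and refit on that set.

```python
import numpy as np

def ls_consensus(x, y, m, n_iter=5):
    """Least-squares consensus sketch: alternate a least-squares line fit
    with an order-statistics selection of the m smallest residuals."""
    idx = np.arange(len(x))
    coef = np.zeros(2)
    for _ in range(n_iter):
        A = np.column_stack([x[idx], np.ones(len(idx))])
        coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
        resid = np.abs(x * coef[0] + coef[1] - y)
        idx = np.argsort(resid)[:m]      # order statistics: keep m best
    return np.sort(idx), coef

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = 2 * x + 1 + rng.normal(0, 0.01, 20)  # inliers on y = 2x + 1, small noise
y[3] += 5.0                               # two gross outliers
y[11] -= 4.0
inliers, coef = ls_consensus(x, y, m=18)
```

Unlike RANSAC, no random minimal samples are drawn; averaging over the full (then trimmed) set is what suppresses the per-point perturbation.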
Show Figures

Figure 1
<p>Scale change for the textured scene by the Bark sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
Figure 2
<p>Blur for the structured scene by the Bikes sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
Figure 3
<p>Scale change for the structured scene by the Boat sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
Figure 4
<p>Blur for the textured scene by the Trees sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
Figure 5
<p>JPEG compression by the UBC sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
Figure 6
<p>Illumination change by the Leuven sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
Figure 7
<p>Viewpoint change for the textured scene by the Wall sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
Figure 8
<p>Viewpoint change for the structured scene by the Graffiti sequence. From (<b>a</b>) to (<b>e</b>) the degree of change ascends.</p>
21 pages, 942 KiB  
Article
Encrypting and Preserving Sensitive Attributes in Customer Churn Data Using Novel Dragonfly Based Pseudonymizer Approach
by Kalyan Nagaraj, Sharvani GS and Amulyashree Sridhar
Information 2019, 10(9), 274; https://doi.org/10.3390/info10090274 - 31 Aug 2019
Cited by 6 | Viewed by 4441
Abstract
With miscellaneous information accessible in public depositories, consumer data is the knowledgebase for anticipating client preferences. For instance, subscriber details are inspected in telecommunication sector to ascertain growth, customer engagement and imminent opportunity for advancement of services. Amongst such parameters, churn rate is [...] Read more.
With miscellaneous information accessible in public repositories, consumer data form the knowledge base for anticipating client preferences. For instance, subscriber details are inspected in the telecommunication sector to ascertain growth, customer engagement, and imminent opportunities for advancement of services. Among such parameters, churn rate is essential for scrutinizing migrating consumers. However, predicting churn often carries the risk of exposing subscribers' sensitive information. Hence, it is worth safeguarding sensitive details prior to customer-churn assessment. A dual approach based on dragonfly and pseudonymizer algorithms is adopted to secure customer data. This twofold approach ensures that sensitive attributes are protected prior to churn analysis. The accuracy of this method is investigated by comparing the performance of conventional privacy-preserving models against the current model. Furthermore, churn detection is assessed before and after data preservation to detect information loss. It was found that the privacy-based feature selection method secured sensitive attributes more effectively than traditional approaches. Moreover, information loss estimated before and after security concealment identified the random forest classifier as the best churn detection model, with an accuracy of 94.3% and minimal data loss of 0.32%. Likewise, this approach can be adopted in several domains to shield vulnerable information prior to data modeling. Full article
(This article belongs to the Special Issue The End of Privacy?)
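The dragonfly-based feature selection is not reproduced here; as a minimal illustration of the pseudonymization half only, sensitive attributes can be replaced with keyed, irreversible pseudonyms so that records stay linkable for churn modeling without exposing raw values. The field names and key below are hypothetical.

```python
import hmac
import hashlib

SECRET_KEY = b"replace-with-a-vault-managed-key"  # hypothetical key

def pseudonymize(value: str) -> str:
    """Deterministic keyed pseudonym: same input -> same token, but
    infeasible to invert without the key (HMAC-SHA256, truncated)."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"phone": "555-0100", "plan": "prepaid", "churn": 0}
safe = {k: (pseudonymize(v) if k == "phone" else v) for k, v in record.items()}
```

Determinism is the design point: the same subscriber maps to the same token across records, so churn models can still group by customer, while the raw identifier never reaches the analyst.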
Show Figures

Figure 1
<p>Overview of privacy imposed churn detection approach.</p>
Figure 2
<p>The distribution of attributes in the churn dataset; here red indicates churn customers while blue indicates non-churn customers. The figure is derived from Weka software [<a href="#B72-information-10-00274" class="html-bibr">72</a>].</p>
Figure 3
<p>The distribution of churn features based on importance score in the Boruta algorithm.</p>
28 pages, 657 KiB  
Article
Terminology Translation in Low-Resource Scenarios
by Rejwanul Haque, Mohammed Hasanuzzaman and Andy Way
Information 2019, 10(9), 273; https://doi.org/10.3390/info10090273 - 30 Aug 2019
Cited by 3 | Viewed by 5096
Abstract
Term translation quality in machine translation (MT), which is usually measured by domain experts, is a time-consuming and expensive task. In fact, this is unimaginable in an industrial setting where customised MT systems often need to be updated for many reasons (e.g., availability [...] Read more.
Term translation quality in machine translation (MT), which is usually measured by domain experts, is a time-consuming and expensive task. In fact, this is unimaginable in an industrial setting where customised MT systems often need to be updated for many reasons (e.g., availability of new training data, leading MT techniques). To the best of our knowledge, as of yet, there is no publicly-available solution to evaluate terminology translation in MT automatically. Hence, there is a genuine need to have a faster and less-expensive solution to this problem, which could help end-users to identify term translation problems in MT instantly. This study presents a faster and less expensive strategy for evaluating terminology translation in MT. High correlations of our evaluation results with human judgements demonstrate the effectiveness of the proposed solution. The paper also introduces a classification framework, TermCat, that can automatically classify term translation-related errors and expose specific problems in relation to terminology translation in MT. We carried out our experiments with a low resource language pair, English–Hindi, and found that our classifier, whose accuracy varies across the translation directions, error classes, the morphological nature of the languages, and MT models, generally performs competently in the terminology translation classification task. Full article
(This article belongs to the Special Issue Computational Linguistics for Low-Resource Languages)
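TermCat's actual taxonomy and classifier are described in the paper itself; as an illustrative sketch only, a gold-standard-based term evaluation can be reduced to checking, per segment, whether the annotated target term appears in the MT output, and bucketing misses into coarse classes. The class names and the English–Hindi examples below are invented for illustration, not the paper's taxonomy or data.

```python
def evaluate_terms(gold, outputs):
    """gold: list of (source_term, expected_target_term) pairs; outputs:
    MT output sentences aligned with gold. Returns per-term verdicts."""
    verdicts = []
    for (src, tgt), hyp in zip(gold, outputs):
        hyp_lower = hyp.lower()
        if tgt.lower() in hyp_lower:
            verdicts.append((src, "correct"))
        elif src.lower() in hyp_lower:
            verdicts.append((src, "copied-source"))  # term left untranslated
        else:
            verdicts.append((src, "incorrect"))
    return verdicts

# Invented examples for an English -> Hindi direction.
gold = [("terminal", "टर्मिनल"), ("cell", "सेल")]
outputs = ["यह टर्मिनल बंद है", "the cell is locked"]
verdicts = evaluate_terms(gold, outputs)
```

A real evaluator would also need lemmatization and inflection handling, which matters for a morphologically rich target language like Hindi; exact substring matching is only the baseline.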
Show Figures

Figure 1
<p>TermMarker.</p>
Figure 2
<p>Curves for acceptance ratio of suggestions.</p>
Figure 3
<p>Flowchart of TermCat.</p>
19 pages, 320 KiB  
Essay
Correlations and How to Interpret Them
by Harald Atmanspacher and Mike Martin
Information 2019, 10(9), 272; https://doi.org/10.3390/info10090272 - 29 Aug 2019
Cited by 8 | Viewed by 5523
Abstract
Correlations between observed data are at the heart of all empirical research that strives for establishing lawful regularities. However, there are numerous ways to assess these correlations, and there are numerous ways to make sense of them. This essay presents a bird’s eye [...] Read more.
Correlations between observed data are at the heart of all empirical research that strives for establishing lawful regularities. However, there are numerous ways to assess these correlations, and there are numerous ways to make sense of them. This essay presents a bird’s eye perspective on different interpretive schemes to understand correlations. It is designed as a comparative survey of the basic concepts. Many important details to back it up can be found in the relevant technical literature. Correlations can (1) extend over time (diachronic correlations) or they can (2) relate data in an atemporal way (synchronic correlations). Within class (1), the standard interpretive accounts are based on causal models or on predictive models that are not necessarily causal. Examples within class (2) are (mainly unsupervised) data mining approaches, relations between domains (multiscale systems), nonlocal quantum correlations, and eventually correlations between the mental and the physical. Full article
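The essay's diachronic/synchronic distinction can be made concrete with a small numeric example: a synchronic correlation relates two variables measured at the same time (e.g., Pearson's r over paired samples), while a diachronic one relates a series to itself across a time lag. A minimal sketch:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def lagged_corr(series, lag):
    """Diachronic: correlate the series with itself shifted by `lag`."""
    return pearson(series[:-lag], series[lag:])

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]      # roughly 2x: synchronic r near 1
season = [0, 1, 0, 1, 0, 1, 0, 1]   # period-2 signal: strong lag-2 correlation
```

Neither number by itself says anything about causation or prediction; that is precisely the interpretive gap the essay surveys.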
16 pages, 6117 KiB  
Article
Waveform Optimization of Compressed Sensing Radar without Signal Recovery
by Quanhui Wang and Ying Sun
Information 2019, 10(9), 271; https://doi.org/10.3390/info10090271 - 29 Aug 2019
Cited by 1 | Viewed by 3246
Abstract
Radar signal processing mainly focuses on target detection, classification, estimation, filtering, and so on. Compressed sensing radar (CSR) technology can potentially provide additional tools to simultaneously reduce computational complexity and effectively solve inference problems. CSR allows direct compressive signal processing without the need [...] Read more.
Radar signal processing mainly focuses on target detection, classification, estimation, filtering, and so on. Compressed sensing radar (CSR) technology can potentially provide additional tools to simultaneously reduce computational complexity and effectively solve inference problems. CSR allows direct compressive signal processing without the need to reconstruct the signal. This study aimed to solve the problem of CSR detection without signal recovery by optimizing the transmit waveform. Therefore, a waveform optimization method was introduced to improve the output signal-to-interference-plus-noise ratio (SINR) in the case where the target signal is corrupted by colored interference and noise having known statistical characteristics. Two different target models are discussed: deterministic and random. In the case of a deterministic target, the optimum transmit waveform is derived by maximizing the SINR and a suboptimum solution is also presented. In the case of random target, an iterative waveform optimization method is proposed to maximize the output SINR. This approach ensures that SINR performance is improved in each iteration step. The performance of these methods is illustrated by computer simulation. Full article
(This article belongs to the Section Information Processes)
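For the deterministic-target case the abstract mentions, a standard textbook result (sketched here with hypothetical toy matrices, not the paper's exact formulation) is that maximizing output SINR of the form s^H A s / s^H R s over unit-energy waveforms s is a generalized Rayleigh-quotient problem, solved by the principal generalized eigenvector:

```python
import numpy as np

def sinr(s, A, R):
    """Output SINR of waveform s: signal power over interference+noise power."""
    return (s.conj() @ A @ s).real / (s.conj() @ R @ s).real

def optimal_waveform(A, R):
    """Maximize s^H A s / s^H R s: principal eigenvector of R^{-1} A."""
    vals, vecs = np.linalg.eig(np.linalg.solve(R, A))
    s = vecs[:, np.argmax(vals.real)].real
    return s / np.linalg.norm(s)

rng = np.random.default_rng(1)
t = rng.normal(size=(4, 1))
A = t @ t.T                        # rank-1 "target" quadratic form (toy)
N = rng.normal(size=(4, 4))
R = N @ N.T + np.eye(4)            # interference-plus-noise covariance (SPD)
s_opt = optimal_waveform(A, R)
```

For the random-target case the paper instead iterates, since the objective depends on the target's statistics rather than a fixed impulse response.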
Show Figures

Figure 1
<p>Illustration of (<b>a</b>) the signal model of the compressed sensing radar (CSR) and (<b>b</b>) the discrete baseband equivalent model. (<b>b</b>) illustrates the discrete baseband equivalent model, where the target transfer function <span class="html-italic">T</span>(<span class="html-italic">z</span>) can be assumed to be a finite impulse response (FIR) filter of the form. (<b>c</b>) illustrates the <span class="html-italic">K</span> sampling waveforms <math display="inline"><semantics> <mrow> <msubsup> <mrow> <mrow> <mo>{</mo> <mrow> <msub> <mi>φ</mi> <mi>k</mi> </msub> <mrow> <mo>(</mo> <mi>n</mi> <mo>)</mo> </mrow> </mrow> <mo>}</mo> </mrow> </mrow> <mrow> <mi>k</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>K</mi> </msubsup> </mrow> </semantics></math>, where <span class="html-italic">r</span>(<span class="html-italic">t</span>) is the analog input signal.</p>
Figure 2
<p>The equivalent signal model.</p>
Figure 3
<p>Comparison of the signal-to-interference-plus-noise ratio (SINR) versus the compressive ratio <math display="inline"><semantics> <mrow> <mi>γ</mi> <mo>.</mo> </mrow> </semantics></math></p>
Figure 4
<p>Comparison of the SINR versus <math display="inline"><semantics> <mrow> <msup> <mi>σ</mi> <mn>2</mn> </msup> <mo>/</mo> <msubsup> <mi>σ</mi> <mn>0</mn> <mn>2</mn> </msubsup> </mrow> </semantics></math> under various compressed ratios <math display="inline"><semantics> <mi>γ</mi> </semantics></math>.</p>
Figure 5
<p>The spectra of (<b>a</b>) optimal and (<b>b</b>) suboptimal waveforms when <math display="inline"><semantics> <mrow> <mi>γ</mi> <mo>=</mo> <mn>1</mn> </mrow> </semantics></math>.</p>
Figure 6
<p>Corresponding autocorrelation functions of (<b>a</b>) optimal and (<b>b</b>) suboptimal waveforms when <math display="inline"><semantics> <mrow> <mi>γ</mi> <mo>=</mo> <mn>1</mn> </mrow> </semantics></math>.</p>
Figure 7
<p>The spectra of (<b>a</b>) optimal and (<b>b</b>) suboptimal waveforms when <math display="inline"><semantics> <mrow> <mi>γ</mi> <mo>=</mo> <mn>0.2</mn> </mrow> </semantics></math>.</p>
Figure 8
<p>Corresponding autocorrelation functions of (<b>a</b>) optimal and (<b>b</b>) suboptimal waveforms when <math display="inline"><semantics> <mrow> <mi>γ</mi> <mo>=</mo> <mn>0.2</mn> </mrow> </semantics></math>.</p>
Figure 9
<p>Comparison of the SINR versus the compressive ratio <math display="inline"><semantics> <mi>γ</mi> </semantics></math> for extended target.</p>
Figure 10
<p>Comparison of the SINR versus <math display="inline"><semantics> <mrow> <msup> <mi>σ</mi> <mn>2</mn> </msup> <mo>/</mo> <msubsup> <mi>σ</mi> <mn>0</mn> <mn>2</mn> </msubsup> </mrow> </semantics></math> under various compression ratios for extended target.</p>
Figure 11
<p>Comparison of the SINR versus the number of iterations for random target impulse response <math display="inline"><semantics> <mrow> <mi>γ</mi> <mo>=</mo> <mn>0.5</mn> </mrow> </semantics></math>.</p>
Figure 12
<p>Comparison of the SINR versus compressive ratio for random target impulse response.</p>
Figure 13
<p>Comparison of the SINR versus <math display="inline"><semantics> <mrow> <msup> <mi>σ</mi> <mn>2</mn> </msup> <mo>/</mo> <msubsup> <mi>σ</mi> <mn>0</mn> <mn>2</mn> </msubsup> </mrow> </semantics></math> under various compression ratios <math display="inline"><semantics> <mi>γ</mi> </semantics></math> for random target impulse response.</p>
16 pages, 2307 KiB  
Article
Process Discovery in Business Process Management Optimization
by Paweł Dymora, Maciej Koryl and Mirosław Mazurek
Information 2019, 10(9), 270; https://doi.org/10.3390/info10090270 - 29 Aug 2019
Cited by 9 | Viewed by 5679
Abstract
Appropriate business process management (BPM) within an organization can help attain organizational goals. It is particularly important to effectively manage the lifecycle of these processes for organizational effectiveness in improving ever-growing performance and competitiveness across the company. This paper presents a process discovery [...] Read more.
Appropriate business process management (BPM) within an organization can help attain organizational goals. It is particularly important to effectively manage the lifecycle of these processes for organizational effectiveness in improving ever-growing performance and competitiveness across the company. This paper presents process discovery and how it can be used in a broader framework supporting self-organization in BPM. Process discovery is intrinsically associated with the process lifecycle. We made a pre-evaluation of the usefulness of our approach using a generated log file. We also compared visualizations of the outcomes of our approach in different cases and showed performance characteristics of the cash loan sales process. Full article
(This article belongs to the Section Information Systems)
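Process discovery algorithms (the α-miner family behind WF-net discovery, for example) start from the directly-follows relation of an event log: which activity immediately follows which, and how often. A minimal sketch of extracting it, with hypothetical cash-loan traces:

```python
from collections import Counter

def directly_follows(log):
    """log: list of traces, each a list of activity names.
    Returns a Counter of (a, b) pairs where b directly follows a."""
    df = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            df[(a, b)] += 1
    return df

# Hypothetical traces: register -> check -> (approve | reject)
log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
]
df = directly_follows(log)
```

From this relation a discovery algorithm infers sequence, choice, and concurrency constructs and assembles the Petri net or BPMN model shown in the paper's figures.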
Show Figures

Figure 1
<p>Proposed omnichannel business model.</p>
Figure 2
<p>Use cases (UC) in the process of a cash loan.</p>
Figure 3
<p>WF-net discovered for <math display="inline"><semantics> <mrow> <msub> <mi>L</mi> <mn>1</mn> </msub> <mo>,</mo> <mo> </mo> <msub> <mi>L</mi> <mn>2</mn> </msub> <mo>,</mo> <mo> </mo> <msub> <mi>L</mi> <mn>3</mn> </msub> </mrow> </semantics></math>.</p>
Figure 4
<p>Cash loan process in terms of BPMN.</p>
Figure 5
<p>Part of the event log of a cash loan process.</p>
Figure 6
<p>Petri net for a cash loan process recorded in the event log.</p>
Figure 7
<p>BPMN graph for a cash loan process recorded in the event log.</p>
Figure 8
<p>Subcontracting social network.</p>
Figure 9
<p>Resource service time statistics.</p>
17 pages, 854 KiB  
Concept Paper
Computer Vision-Based Unobtrusive Physical Activity Monitoring in School by Room-Level Physical Activity Estimation: A Method Proposition
by Hans Hõrak
Information 2019, 10(9), 269; https://doi.org/10.3390/info10090269 - 28 Aug 2019
Cited by 13 | Viewed by 4607
Abstract
As sedentary lifestyles and childhood obesity are becoming more prevalent, research in the field of physical activity (PA) has gained much momentum. Monitoring the PA of children and adolescents is crucial for ascertaining and understanding the phenomena that facilitate and hinder PA in [...] Read more.
As sedentary lifestyles and childhood obesity are becoming more prevalent, research in the field of physical activity (PA) has gained much momentum. Monitoring the PA of children and adolescents is crucial for ascertaining and understanding the phenomena that facilitate and hinder PA in order to develop effective interventions for promoting physically active habits. Popular individual-level measures are sensitive to social desirability bias and subject reactivity. Intrusiveness of these methods, especially when studying children, also limits the possible duration of monitoring and assumes strict submission to human research ethics requirements and vigilance in personal data protection. Meanwhile, growth in computational capacity has enabled computer vision researchers to successfully use deep learning algorithms for real-time behaviour analysis such as action recognition. This work analyzes the weaknesses of existing methods used in PA research; gives an overview of relevant advances in video-based action recognition methods; and proposes the outline of a novel action intensity classifier utilizing sensor-supervised learning for estimating ambient PA. The proposed method, if applied as a distributed privacy-preserving sensor system, is argued to be useful for monitoring the spatio-temporal distribution of PA in schools over long periods and assessing the efficiency of school-based PA interventions. Full article
Show Figures

Figure 1
<p>Classifying the intensity of ambient physical activity at a constant frequency (30 frames/~1 Hz).</p>
16 pages, 441 KiB  
Article
The Usefulness of Imperfect Speech Data for ASR Development in Low-Resource Languages
by Jaco Badenhorst and Febe de Wet
Information 2019, 10(9), 268; https://doi.org/10.3390/info10090268 - 28 Aug 2019
Cited by 6 | Viewed by 3962
Abstract
When the National Centre for Human Language Technology (NCHLT) Speech corpus was released, it created various opportunities for speech technology development in the 11 official, but critically under-resourced, languages of South Africa. Since then, the substantial improvements in acoustic modeling that deep architectures [...] Read more.
When the National Centre for Human Language Technology (NCHLT) Speech corpus was released, it created various opportunities for speech technology development in the 11 official, but critically under-resourced, languages of South Africa. Since then, the substantial improvements in acoustic modeling that deep architectures achieved for well-resourced languages ushered in a new data requirement: their development requires hundreds of hours of speech. A suitable strategy for the enlargement of speech resources for the South African languages is therefore required. The first possibility was to look for data that has already been collected but has not been included in an existing corpus. Additional data was collected during the NCHLT project that was not included in the official corpus: it only contains a curated, but limited subset of the data. In this paper, we first analyze the additional resources that could be harvested from the auxiliary NCHLT data. We also measure the effect of this data on acoustic modeling. The analysis incorporates recent factorized time-delay neural networks (TDNN-F). These models significantly reduce phone error rates for all languages. In addition, data augmentation and cross-corpus validation experiments for a number of the datasets illustrate the utility of the auxiliary NCHLT data. Full article
(This article belongs to the Special Issue Computational Linguistics for Low-Resource Languages)
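The phone error rate reported for the TDNN-F systems is the edit (Levenshtein) distance between the reference and hypothesis phone sequences, divided by the reference length. A standard single-row implementation sketch (the phone symbols below are invented):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance over phone sequences (lists of symbols)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution / match
    return d[-1]

def phone_error_rate(ref, hyp):
    """PER = edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)
```

Like word error rate, PER can exceed 1.0 when the hypothesis contains many insertions, which is why it is reported as a rate rather than an accuracy.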
Show Figures

Figure 1
<p>Local phone error rates (PERs) for 400 utterance subsets of the <span class="html-italic">Aux1</span> data.</p>
12 pages, 582 KiB  
Article
Study on Unknown Term Translation Mining from Google Snippets
by Bin Li and Jianmin Yao
Information 2019, 10(9), 267; https://doi.org/10.3390/info10090267 - 28 Aug 2019
Cited by 2 | Viewed by 3177
Abstract
Bilingual web pages are widely used to mine translations of unknown terms. This study focused on an effective solution for obtaining relevant web pages, extracting translations with correct lexical boundaries, and ranking the translation candidates. This research adopted co-occurrence information to obtain the [...] Read more.
Bilingual web pages are widely used to mine translations of unknown terms. This study focused on an effective solution for obtaining relevant web pages, extracting translations with correct lexical boundaries, and ranking the translation candidates. This research adopted co-occurrence information to obtain the subject terms and then expanded the source query with the translation of the subject terms to collect effective bilingual search engine snippets. Afterwards, valid candidates were extracted from small-sized, noisy bilingual corpora using an improved frequency change measurement that combines adjacent information. This research developed a method that considers surface patterns, frequency–distance, and phonetic features to elect an appropriate translation. The experimental results revealed that the proposed method performed remarkably well for mining translations of unknown terms. Full article
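Of the three candidate-ranking features the abstract lists, the frequency–distance score is the simplest to sketch: a candidate that appears often in the snippets, and close to the source term, ranks higher. The scoring function and data below are hypothetical illustrations, not the paper's exact formula:

```python
def freq_distance_score(occurrences):
    """occurrences: list of character distances between the source term
    and one candidate translation across collected snippets. Frequent
    and nearby occurrences yield a higher score."""
    return sum(1.0 / (1 + d) for d in occurrences)

# Hypothetical snippet evidence for two candidates of one source term.
candidates = {
    "candidate_a": [2, 5, 3],   # frequent and close to the term
    "candidate_b": [40],        # rare and far away
}
ranked = sorted(candidates,
                key=lambda c: freq_distance_score(candidates[c]),
                reverse=True)
```

In the paper this signal is combined with surface patterns (e.g., parenthesized pairs) and phonetic similarity before the final candidate is elected.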
Show Figures

Figure 1
<p>The architecture of the mining system for translations of unknown terms based on the web.</p>
23 pages, 15211 KiB  
Article
Enhanced Grid-Based Visual Analysis of Retinal Layer Thickness with Optical Coherence Tomography
by Martin Röhlig, Ruby Kala Prakasam, Jörg Stüwe, Christoph Schmidt, Oliver Stachs and Heidrun Schumann
Information 2019, 10(9), 266; https://doi.org/10.3390/info10090266 - 23 Aug 2019
Cited by 12 | Viewed by 12999
Abstract
Optical coherence tomography enables high-resolution 3D imaging of retinal layers in the human eye. The thickness of the layers is commonly assessed to understand a variety of retinal and systemic disorders. Yet, the thickness data are complex and currently need to be considerably [...] Read more.
Optical coherence tomography enables high-resolution 3D imaging of retinal layers in the human eye. The thickness of the layers is commonly assessed to understand a variety of retinal and systemic disorders. Yet, the thickness data are complex and currently need to be considerably reduced prior to further processing and analysis. This leads to a loss of information on localized variations in thickness, which is important for early detection of certain retinal diseases. We propose an enhanced grid-based reduction and exploration of retinal thickness data. Alternative grids are computed, their representation quality is rated, and best fitting grids for given thickness data are suggested. Selected grids are then visualized, adapted, and compared at different levels of granularity. A visual analysis tool bundles all computational, visual, and interactive means in a flexible user interface. We demonstrate the utility of our tool in a complementary analysis procedure, which eases the evaluation of ophthalmic study data. Ophthalmologists successfully applied our solution to study localized variations in thickness of retinal layers in patients with diabetes mellitus. Full article
(This article belongs to the Special Issue Information Visualization Theory and Applications (IVAPP 2019))
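The abstract states that alternative grids are computed and their representation quality is rated; the rating described in the paper's figures measures the standard deviation of thickness values within each grid cell. A minimal sketch of such a rating, assuming the thickness map and the grid's cell assignment are given as NumPy arrays (the function name and the averaging over cells are illustrative assumptions; the paper's exact rating may differ):

```python
import numpy as np

def rate_grid(thickness_map, cell_labels):
    """Rate how well a grid represents a thickness map: take the standard
    deviation of thickness values inside each grid cell and average over
    cells. Lower ratings mean aggregation loses less localized variation.
    Illustrative sketch, not the paper's exact formula."""
    ratings = []
    for cell in np.unique(cell_labels):
        values = thickness_map[cell_labels == cell]
        if values.size:
            ratings.append(values.std())
    return float(np.mean(ratings))
```

Under this rating, a finer subdivision whose cells capture localized thickening would score lower (better) than a coarse ETDRS cell that averages the thickening away.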
Graphical abstract
Figure 1
<p>The layout of the standard ETDRS grid. The ETDRS grid cells divide the retina into nine regions defined by three rings and four quadrants (<b>a</b>). An overview shows the grid on top of a fundus image of a healthy left eye (<b>b</b>). The location of a B-scan is marked (central green line). A detail view shows the B-scan (<b>c</b>), segmented layer boundaries of total retina, and the layer thickness as a line chart (<b>d</b>). The dotted lines denote anatomically distinct areas along the image axis.</p>
Figure 2
<p>Grid representation of retinal layer thickness. An OCT scan captures the area around the macula and the optic disk (<b>a</b>). Multiple B-scans are acquired (<b>b</b>). Retinal layers are segmented per B-scan (<b>c</b>) and thickness values are computed for every point along the horizontal image axes (<b>d</b>). The thickness values are combined per layer (<b>e</b>) and aggregated into thickness grids (<b>f</b>).</p>
Figure 3
<p>Problems with an ETDRS grid-based data representation. The thickness of a retinal layer is shown via a thickness map (<b>a</b>) and an ETDRS grid (<b>b</b>). A cell (<b>c</b>) with localized regions of high thickness, visible in the map (dark red; <b>d</b>), has almost the same aggregated value in the grid as a cell (<b>e</b>) without such regions in the map (<b>f</b>). In the second map (<b>g</b>), localized regions of positive and negative deviations in thickness (dark red and dark blue; <b>h</b>) are nearly nullified due to data aggregation in corresponding cells (<b>i</b>) of the grid (<b>j</b>).</p>
Figure 4
<p>Subdivision, mapping, aggregation, and rating of grids. The standard ETDRS grid layout is subdivided via radial (dashed line) or sector-wise (dotted line) partitions (<b>a</b>). A coarse grid is mapped to a fine grid or vice versa by subdividing or merging corresponding grid cells (<b>b</b>). Multiple source grids are compiled into a single aggregated grid (<b>c</b>). The representation quality of grids is rated by measuring the standard deviation of thickness values within each grid cell (<b>d</b>).</p>
Figure 5
<p>Overview of our visualization design. In the left top-down view (<b>a</b>), a retinal layer (<b>b</b>) is selected in the layer overview (<b>c</b>) and an associated thickness grid is shown on top of a fundus image. Cell color encodes aggregated thickness values, labels indicate cell values close to specified thresholds, and borders of cells with low ratings are highlighted (purple). Details of a selected cell (<b>d</b>) are shown in a linked measurement view (<b>e</b>). In the right top-down view (<b>f</b>), a grid is mapped and compared to a control grid and deviations are color-coded. Details of a selected cell (<b>g</b>) are shown in a second measurement view (<b>h</b>) in relation to the control distribution.</p>
Figure 6
<p>Interactive labeling and adaption of grids. The labeling of grids shows either numerical cell values or location-oriented cell names (<b>a</b>). White cell borders mark the layout of the standard ETDRS grid on top of a subdivided grid (<b>b</b>). The adaption of grids involves specifying initial grids, browsing through them, and adjusting cells of selected grids (purple borders) on demand (<b>c</b>).</p>
Figure 7
<p>Comparison of grids. The top-down view (<b>a</b>) and linked measurement view (<b>b</b>) show a comparison between patients and controls via aggregated grids of both groups. The juxtaposed small multiple views (<b>c</b>) show an overview of grids of all individual patients in relation to controls. Patient values or group means are compared to the distribution of controls (<b>d</b>). A diverging color palette encodes differences per cell (<b>e</b>) and borders of cells with significant differences are highlighted (orange).</p>
Figure 8
<p>A visual analysis tool for retinal layer thickness. A top-down view shows either grids (<b>a</b>) or maps (<b>b</b>). A 3D view presents a volume visualization of raw OCT data (<b>c</b>), and a 2D cross-sectional view displays individual B-scans (<b>d</b>) and thickness profiles of selected layers (<b>e</b>). All views are linked, selections in grids or maps are highlighted, and details are shown in a measurement view (<b>f</b>).</p>
Figure 9
<p>Exemplary procedures for evaluating cross-sectional ophthalmic study data using an ETDRS-based analysis approach (CA) and an enhanced grid-based analysis approach (VA). After a common data preparation stage, CA comprises four analysis steps (<b>CA1–CA4</b>) with three software tools and VA involves three analysis steps (<b>VA1–VA3</b>) with our visual analysis tool.</p>
Figure 10
<p>The experimental study setup and results in AMD patients and matched controls. Drusen are indicated by localized thickening in lower retinal layers of patient eyes (<b>a</b>). Examples of evaluated grids show increasingly finer subdivision of cells (<b>b</b>). The line plot (<b>c</b>) illustrates ratings of grids with respect to cell counts for thickness data of patients (red) and controls (green) in relation to three partitioning strategies: radial (<b>d</b>,<b>g</b>), sector-wise (<b>e</b>,<b>h</b>), and radial and sector-wise (<b>f</b>,<b>i</b>).</p>
Figure 11
<p>Exemplary results of two cross-sectional studies. The grids represent TR in pediatric T1DM patients (<b>a</b>,<b>b</b>) and RNFL in a subgroup of adult T2DM patients (<b>c</b>,<b>d</b>) compared to matched controls. Cell color encodes thickness deviation, statistical significance, or effect size. Highlighted cell borders mark significant differences (<span class="html-italic">p</span> &lt; 0.05). In T1DM patients, the ETDRS grids (<b>a</b>) depict significant thinning limited to a single cell. The subdivided grids (<b>b</b>) show additional cells of significant localized thinning and measurements of multiple small cells (<b>e</b>). In T2DM patients, the ETDRS grids (<b>c</b>) present a general overview of thickness deviation. The subdivided grids (<b>d</b>) show more details of the spatial distribution and degree of thinning. Some ETDRS grid cells underestimate the thinning (<b>f</b>) compared to corresponding subdivided grid cells (<b>g</b>).</p>
Figure 12
<p>Asymmetry analysis of retinal layer thickness using rectangular grids. A standard 8 × 8 base grid layout (<b>a</b>) shows the mean difference between corresponding cells of opposite hemispheres (<b>b</b>). The fovea-to-disc axis (green line) marks the symmetry line and cell color encodes the degree of asymmetry (darker colors represent higher differences). In the upper grid half, negative differences between superior and inferior hemisphere are displayed and vice versa in the lower grid half. The subdivided grid (<b>c</b>) shows additional information on the spatial distribution and degree of differences.</p>
16 pages, 331 KiB  
Article
Breaking the MDS-PIR Capacity Barrier via Joint Storage Coding
by Hua Sun and Chao Tian
Information 2019, 10(9), 265; https://doi.org/10.3390/info10090265 - 22 Aug 2019
Cited by 17 | Viewed by 3528
Abstract
The capacity of private information retrieval (PIR) from databases coded using maximum distance separable (MDS) codes was previously characterized by Banawan and Ulukus, where it was assumed that the messages are encoded and stored separately in the databases. This assumption is also usually made in other related works in the literature, and this capacity is colloquially referred to as the MDS-PIR capacity. In this work, we considered the question of whether and when this capacity barrier can be broken through joint encoding and storing of the messages. Our main results are two classes of novel code constructions, which allow joint encoding, as well as the corresponding PIR protocols, which indeed outperform the separate MDS-coded systems. Moreover, we show that a simple but novel expansion technique allows us to generalize these two classes of codes, resulting in a wider range of cases where this capacity barrier can be broken. Full article
(This article belongs to the Special Issue Private Information Retrieval: Techniques and Applications)
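For reference, the capacity barrier in question is the Banawan–Ulukus MDS-PIR capacity: for M messages stored across N databases with an [N, K] MDS code, C = (1 + K/N + (K/N)^2 + … + (K/N)^(M-1))^(-1) = (1 − K/N)/(1 − (K/N)^M). A small helper to compute this value (the function name is ours):

```python
def mds_pir_capacity(n, k, m):
    """MDS-PIR capacity (Banawan-Ulukus) for m messages stored across
    n databases using an [n, k] MDS code. This is the download-rate
    'barrier' that joint storage coding can break."""
    r = k / n  # code rate of the storage code
    if r == 1.0:
        # Geometric sum degenerates: 1 + 1 + ... + 1 = m terms
        return 1.0 / m
    return (1 - r) / (1 - r ** m)
```

Setting k = 1 recovers the replicated-storage PIR capacity (1 − 1/N)/(1 − (1/N)^M) as a special case.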
31 pages, 861 KiB  
Review
A Systematic Mapping Study of MMOG Backend Architectures
by Nicos Kasenides and Nearchos Paspallis
Information 2019, 10(9), 264; https://doi.org/10.3390/info10090264 - 21 Aug 2019
Cited by 4 | Viewed by 4927
Abstract
The advent of utility computing has revolutionized almost every sector of traditional software development. In particular, commercial cloud computing services, pioneered by the likes of Amazon, Google and Microsoft, have provided an unprecedented opportunity for the fast and sustainable development of complex distributed systems. Nevertheless, existing models and tools aim primarily for systems where resource usage—by humans and bots alike—is logically and physically quite dispersed, resulting in a low likelihood of conflicting resource access. However, a number of resource-intensive applications, such as Massively Multiplayer Online Games (MMOGs) and large-scale simulations, introduce a requirement for a very large common state with many actors accessing it simultaneously, and thus a high likelihood of conflicting resource access. This paper presents a systematic mapping study of the state of the art in software technology aiming explicitly to support the development of MMOGs, a class of large-scale, resource-intensive software systems. By examining the main focus of a diverse set of related publications, we identify a list of criteria that are important for MMOG development. We then categorize the selected studies based on the inferred criteria in order to compare their approaches, unveil the challenges each of them faces, and reveal any research trends that may be present. Finally, we attempt to identify research directions that appear promising for enabling the use of standardized technology for this class of systems. Full article
(This article belongs to the Section Review)
Figure 1
<p>Choice of infrastructure over time—as derived from the studied works.</p>
Figure 2
<p>Choice of software architecture over time—as derived from the studied works.</p>
Figure 3
<p>Scalability over time—as determined in the studied works.</p>