Real-Time Semantic Web Data Stream Processing Using Storm
Vol 02, No. ICCIT- 1441, Page No.: 79 - 83, 9th & 10th Sep. 2020
978-1-7281-2680-7/20/$31.00 ©2020 IEEE
Authorized licensed use limited to: R V College of Engineering. Downloaded on January 23,2024 at 18:42:02 UTC from IEEE Xplore. Restrictions apply.
Banane, Web Data Stream Processing ...
flows. But for our use case, the problem related to multiple treatments, requiring several compressions and decompressions, persists.

Several approaches exist for processing big data in distributed systems. Recently, many applications have emerged that use data streams from distributed and heterogeneous sources. The realization of such a system remains a scientific challenge that must take into account the volume of data, their velocity, and their variety. Some prototypes have been proposed to define a system architecture that ensures the management of massive data flows in real time. On the other hand, the Semantic Web offers, through a common format (RDF), a way to combine several heterogeneous systems and thus compensate for the variety of data. In what follows, we describe some existing systems adapted to the processing of raw data flows on a distributed platform.

Apache Hadoop [5] is one of these distributed systems widely used to analyze big data. It manages a distributed file system, HDFS (Hadoop Distributed File System), which supports storage on a very large number of machines. The advantage of HDFS is to limit transfer time by assigning to each entity of the cluster the task of processing the data it contains. Hadoop is based on the MapReduce parallel computation model [6], where computation time is normally divided by the number of entities performing the task. This parallel processing is based on batch mode, where each computation lasts a certain time. It is very efficient for analyzing large volumes of data. However, it was not designed to meet the needs of analyses with strong time constraints, for example real-time detection of anomalies or bank fraud. To get around the batch nature of this mode, other solutions have appeared in the big data ecosystem, the most popular of which are Apache Storm and Spark Streaming.

Apache Storm [7] is a real-time oriented solution based on complex event processing (CEP) and the concept of topology. Concretely, it is a fault-tolerant distributed computing system that guarantees at-least-once data processing. Storm revolves around four concepts. Tuple: a message in the "Storm" sense, namely a list of dynamically typed named values. Stream: a collection of tuples with the same schema. Spout: a component that consumes data from a source and emits one or more streams to the bolts. Bolt: a tuple-processing node that can generate streams to be transmitted to other bolts; it can also write output data to external storage platforms. Storm also supports an additional level of abstraction through the Trident API [8]. This API provides functions over a data set such as join, aggregation, and grouping, and allows ordered processing by mini-batches of N tuples. On the other hand, Storm does not provide any big data storage medium as Hadoop does [13].

Spark Streaming [9] is another real-time processing system based on the MapReduce programming paradigm. It is an extension of Apache Spark, an analysis engine that accelerates the processing of data on a Hadoop platform. Spark is 10 to 100 times faster than Hadoop thanks to a reduced number of reads and writes on disk. For this, it uses an abstraction called RDD (Resilient Distributed Dataset) which, transparently, mounts in memory data distributed on HDFS and persists it to disk if necessary. RDDs have the advantage of providing fault tolerance without resorting to often costly replication mechanisms. They make it possible to explicitly persist intermediate data in memory, to control its partitioning in order to optimize placement, and to manipulate it with a set of operations (map, filter, join). Spark Streaming extends Spark with micro-batch operation: it accumulates data over a certain period to produce a micro-RDD on which it performs the desired computation. For this reason, unlike Storm [15], which processes items one by one, Spark Streaming adds a delay between the arrival of a message and its processing. Its API is identical to the classic Spark API, so data streams can be processed in the same way as static data.

Systems have also been proposed to define hybrid architectures that manage both batch and real-time processing, such as the Lambda and Storm-on-YARN architectures. The idea of the Lambda architecture [8] is to simultaneously use batch processing on all data to provide complete views, and real-time processing of data flows to provide dynamic views. The outputs of the two treatments can be combined at the presentation level. This architecture attempts to balance throughput, latency, and fault tolerance. It is made up of three layers. The batch layer manages the storage of the data set, as well as the computation of complete views over all or part of the data; these views are updated infrequently since the computation time can be long (a few hours). The real-time layer processes recent data (not yet taken into account by the batch layer) in order to compensate for the batch layer's high latency; it continuously computes real-time views incrementally, based on a stream processing system (e.g. Storm) and random read/write databases, with processing latency on the order of a few milliseconds. The service layer manages the merging of results from the batch and real-time layers; the fusion logic is the responsibility of the developer, who has to define how the data will be exploited. The advantage of the Lambda architecture is its ability to process and maintain data flows while large historical data sets are also processed by a batch pipeline. However, the duality of the batch and real-time layers requires producing the same result through two different paths. This means maintaining code in two complex distributed systems, designed differently, while ensuring that each event is processed exactly once. Storm-on-YARN [10] is another solution, developed by Yahoo!, to co-locate real-time processing with batch processing. The idea is to make it possible to run Hadoop and CEP technologies in the same cluster instead of two separate clusters. The load used by Storm often varies depending on the speed and volume of the data to be processed. Storm-on-YARN makes it possible to manage peak loads and to dynamically allocate resources normally used by Hadoop to Storm when necessary. Besides, Yahoo! added mechanisms that allow Storm applications to access data stored in HDFS and HBase [18].

The originality of our work is the management of RDF data in real time via the use of a real-time big data processing tool called Storm.

III. Semantic Web, Data Flow, and Storm

A. Semantic Web

The Semantic Web aims to organize and structure the enormous amount of information presented on the Web. It relies on semi-structured languages based on XML. Figure 1 shows one of the versions of the layered organization proposed by the W3C. Each layer is built on the layers below; thus, all of the layers use XML syntax. This allows one to take advantage of all the technologies developed around XML, such as XML Schema.
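The micro-batch behaviour attributed to Spark Streaming above (accumulate items, then compute on the whole batch, at the cost of extra latency) can be sketched in a few lines of plain Java. This is an illustration of the mechanism, not Spark code; the class name and batch-by-count trigger are our simplifying assumptions (Spark triggers by time interval).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch: items arrive one by one (Storm-style input),
// but processing is deferred until a full "micro-batch" has
// accumulated, which is the source of the added latency discussed above.
public class MicroBatcher<T> {
    private final int batchSize; // stands in for Spark's batch interval
    private final List<T> buffer = new ArrayList<>();
    private final Consumer<List<T>> computation;

    public MicroBatcher(int batchSize, Consumer<List<T>> computation) {
        this.batchSize = batchSize;
        this.computation = computation;
    }

    public void onItem(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            computation.accept(new ArrayList<>(buffer)); // run on the micro-batch
            buffer.clear();
        }
    }
}
```

A Storm-style system would instead invoke the computation once per item, trading throughput for per-item latency.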
B. Data Flow

We can define a data flow as a continuous, ordered sequence of items (ordered implicitly by time of arrival in the data flow management system, or explicitly by production timestamp at the source), arriving in real time. The adoption of Semantic Web technologies in the world of dynamic data and sensors gave rise to the concept of the RDF data flow. Thus, RDF flows were introduced as a natural extension of the RDF model to the flow environment.

IV. Approach Description

In the architecture of our system, event data is processed and managed by distributed systems like Redis [12] and Storm [13]; Redis is used as an in-memory processing component.

Fig. 5. Translation of data formats and models in RDF (from the heterogeneity of different formats and data models to a homogeneous representation carrying more knowledge).
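An element of an RDF flow is commonly modelled as an RDF triple paired with a timestamp. As a minimal sketch of that idea (the class, the example URIs, and the choice of N-Triples serialization are ours, not taken from any particular system):

```java
// Sketch: an RDF stream element = (subject, predicate, object) + timestamp.
// The payload is serialized in N-Triples syntax; the timestamp is kept as
// stream metadata, as in the RDF-stream extensions discussed above.
public class RdfStreamItem {
    final String subject, predicate, object;
    final long timestampMillis;

    public RdfStreamItem(String s, String p, String o, long t) {
        this.subject = s;
        this.predicate = p;
        this.object = o;
        this.timestampMillis = t;
    }

    // One N-Triples line for the payload (object rendered as a literal).
    public String toNTriples() {
        return "<" + subject + "> <" + predicate + "> \"" + object + "\" .";
    }
}
```

A continuous query engine would then evaluate windows over such timestamped items rather than over a static graph.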
Several aspects have to be considered, such as the dynamic distribution of data and tasks, the scheduling and parallelization of processing, and the optimization of network traffic and workload.

A. Continuous SPARQL

A continuous query engine should be able to reason not only on data flows but also on static data, and even on the set of Linked Open Data (LOD) cloud datasets. Queries must adapt to the incoming speed of the data flows and be evaluated continuously in order to take account of the evolving nature of the flow. The semantics of SPARQL queries must allow processing based on time or on the order of arrival of the data. Standard SPARQL is therefore extended by introducing the concept of an adaptable sliding window (a defined portion of a flow).

Some prototypes have been proposed recently in the literature, drawing inspiration from the work done by the conventional database community. For example, C-SPARQL [14] is one of the first extensions of SPARQL intended to support continuous queries. Other projects extending SPARQL have been launched. SPARQL-Stream [15] extends SPARQL so that it can manage window operators without worrying about query performance. CQELS [16], the most recent language, operates natively on RDF flows and continuous queries without going through intermediary tools. These projects take into account the temporal aspect of flows and implement windowing operators. However, none of these examples is suited to the large volume of distributed data flows. Queries on this data must be able to run in a dynamic environment with strong time constraints. The distribution of these queries, as well as of the data, plays an important role in ensuring a certain level of scalability and latency. This distribution should take into account the optimization of network traffic and workload. Also, the distribution of data over several RDF storage platforms requires the establishment of a SPARQL federation [17]. This raises the question of the best strategy for optimally executing the federation of continuous SPARQL queries. To our knowledge, there are two works [11, 12] that propose to execute continuous SPARQL queries in a distributed way. However, their performance has not been evaluated in a context of complex reasoning, and there is no consideration of the federation aspect.

B. Data storage

Two approaches to data storage are used in our system. The first approach stores intermediate processing data in memory to obtain fast and inexpensive input-output access. The second approach persists static data and relevant summaries of the data flow to disk. There are several in-memory NoSQL storage solutions, such as Memcached [9] and Redis [10]. The data is stored in RAM in a key-value format and can be represented in several structures such as strings, lists, hashes, and sets. A comparative study [11] shows almost similar performance between Memcached and Redis in terms of execution time. As part of our system, we decided to use Redis because it supports more functionality for manipulating data. Unlike Memcached, Redis can periodically persist data to disk, which helps prevent data loss in the event of a failure. It also supports a Lua-based scripting language for writing stored procedures, whose atomicity is guaranteed by its single-threaded architecture. Besides, Redis' Sorted Set structure provides a practical implementation of sliding windows: it makes it possible to manage sampling automatically by operating aggregations over a time interval, but eviction must be programmed manually.

V. Validation

This section assesses the quality and relevance of our extension. To do this, we looked at the performance obtained in terms of execution time and the preservation of the semantics of the data. We consider the processing of a set of tweets.

Twitter allows free retrieval of streaming data; we take advantage of this with a streaming tool like Storm, which is well suited to processing this data in real time. In this paper, we read and analyze Twitter messages in real time with our Storm-based system. We create an application which retrieves tweets from the Twitter API, using Java under Eclipse.

After adding the necessary twitter4j libraries, we first create a Java class CreateSpout.java. The processing of the tweets could be done with a single bolt, but we created two bolts, BoltExtractor and RetweetBoltExtractor, to demonstrate the join capability of our system. To join the tuples of these two bolts, we need another bolt, BoltRDFWriter, which stores this data in RDF format. Now let us create a topology that will perform the processing in real time.
45156 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.server.NIOServerCnxnFactory - Ignoring exception
java.nio.channels.ClosedChannelException: null
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:137) ~[na:1.6.0_29]
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:188) ~[zookeeper-3.4.5.jar:3.4.5-1392090]
    at java.lang.Thread.run(Thread.java:662) [na:1.6.0_29]
45156 [main] INFO backtype.storm.testing - Done shutting down in process zookeeper
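The time-based sliding window that section IV.B delegates to Redis' Sorted Set (event timestamp as the score, with eviction programmed manually) can be sketched in plain Java with an ordered map. This is an illustration of the mechanism, not Redis client code, and it assumes distinct timestamps for simplicity:

```java
import java.util.Collection;
import java.util.TreeMap;

// Sketch of a time-based sliding window mimicking a Redis Sorted Set:
// add() plays the role of ZADD with the timestamp as score, and the
// headMap(...).clear() call plays the role of the manual eviction
// (ZREMRANGEBYSCORE) mentioned in section IV.B.
public class SlidingWindow {
    private final long windowMillis;
    private final TreeMap<Long, String> entries = new TreeMap<>();

    public SlidingWindow(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    public void add(long timestamp, String value) {
        entries.put(timestamp, value);
        // Evict everything strictly older than (latest - window).
        entries.headMap(entries.lastKey() - windowMillis, false).clear();
    }

    public Collection<String> current() {
        return entries.values();
    }
}
```

Aggregations over the window (counts, averages) would then be computed on `current()` at each evaluation step.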
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.topology.TopologyBuilder;
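The actual topology wiring with `TopologyBuilder` and `LocalCluster` requires the Storm jars and a running (local) cluster, as the log above shows. As a dependency-free sketch of the dataflow just described (a spout emitting tweets, two extractor bolts, and a join bolt writing RDF), the per-bolt logic can be pictured like this; the class is ours and only mirrors the bolt names from the text (BoltExtractor, RetweetBoltExtractor, BoltRDFWriter), and the vocabulary URIs are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Dependency-free sketch of the topology described above: a "spout"
// emits raw tweets, two "bolts" extract the text and the retweet flag,
// and a final bolt joins both streams by tweet id and writes
// N-Triples-style RDF lines. Illustrative only; not Storm API code.
public class MiniTopology {
    static class Tweet {
        final long id;
        final String text;
        final boolean retweet;
        Tweet(long id, String text, boolean retweet) {
            this.id = id;
            this.text = text;
            this.retweet = retweet;
        }
    }

    // BoltExtractor: emits (id, text)
    static String extractText(Tweet t) { return t.id + "\t" + t.text; }

    // RetweetBoltExtractor: emits (id, retweet flag)
    static String extractRetweet(Tweet t) { return t.id + "\t" + t.retweet; }

    // BoltRDFWriter: joins the two streams on the tweet id and emits RDF.
    static List<String> writeRdf(List<Tweet> tweets) {
        List<String> out = new ArrayList<>();
        for (Tweet t : tweets) {
            String s = "<http://example.org/tweet/" + t.id + ">";
            out.add(s + " <http://example.org/text> \"" + t.text + "\" .");
            out.add(s + " <http://example.org/isRetweet> \"" + t.retweet + "\" .");
        }
        return out;
    }
}
```

In the real system, each of these methods corresponds to the `execute()` of a Storm bolt, and the join bolt subscribes to the output streams of the two extractors.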