Wikidata:SPARQL query service/WDQS backend update/WDQS backend alternatives
As you know, making Wikidata Query Service (WDQS) scalable is a top priority for the Search Team. To accomplish this goal, moving off the Blazegraph backend was deemed a critical step, and over the last few months we evaluated several solutions that could help us get there.
Finding the alternatives
Finding an alternative backend was not hard; in fact, there is a large number of alternatives. The hardest part was narrowing down the possibilities. To do so, we defined specific criteria that the backend must meet, also drawing on the feedback received in our February 2022 scaling community meetings, and evaluated the candidates against our needs.
At this moment, four potential candidates are short-listed for replacing Blazegraph (listed in alphabetical order):
- Apache Jena with the Fuseki SPARQL Server component;
- QLever (some aspects, such as update support, are still in development);
- RDF4J V4 (still in development);
- Virtuoso Open-Source.
You can find the full evaluation process and results in our paper, “WDQS Backend Alternatives”. The paper addresses the technical and user requirements for the WDQS backend, gathered over the last seven years of operation, as well as the implications for the system architecture. It also discusses the evaluation process and the resulting detailed assessments of the possible alternatives.
Technical and community criteria assessments
The following tables present the results of the assessments of the candidate alternatives. The first table holds the overall technical assessments, scored 0-5 (where 0 indicates no support and 5 indicates exceptional support). Note that the table also includes a column evaluating the current Blazegraph solution.
All of the following criteria are discussed and explained in full in the document.
Criteria | Blazegraph | Jena | QLever | RDF4J | Virtuoso |
---|---|---|---|---|---|
Scalability to 10B+ triples | 5 | 5 | 5 | 3* | 5 |
Scalability to 25B+ triples | 0 | 0 | 5 | 1 | 5 |
Full SPARQL 1.1 capabilities | 5 | 5 | 3* | 5 | 3 |
Federated query | 5 | 5 | 0* | 5 | 5 |
Ability to define custom SPARQL functions | 3 | 5 | 2 | 5 | 4 |
Ability to tune/define indexes and perform range lookups | 2 | 5 | 5 | 5 | 4 |
Support for read and write at high frequency | 5 | 5 | 0* | 3* | 3 |
Active open-source community | 0 | 5 | 4 | 5 | 3 |
Well-designed and documented code base | 5 | 5 | 5 | 5 | 2 |
Instrumentation for data store and query management | 2 | 5 | 2 | 4 | 4 |
Query plan explanation | 5 | 3 | 5 | 3 | 2 |
Query plan tuning/hints within SPARQL statement | 5 | 4 | 2 | 3 | 3 |
Query without authentication | 5 | 5 | 5 | 5 | 5 |
Ability to prevent write access | 0 | 5 | 0 | 5 | 5 |
Data store reload in 2-3 days (worst case) | 0 | 2 | 5 | 3 | 1 |
Query timeout and resource recovery | 2 | 5 | 4 | 4 | 3 |
Support for geospatial data (e.g., POINT) | 5 | 5 | 5 | 5 | 5 |
Support for GeoSPARQL | 2 | 5 | 2* | 4 | 3 |
Support for named graphs (quads) | 5 | 5 | 0 | 5 | 5 |
Query builder interface (ease of use) | 3 | 4 | 5 | 3 | 5 |
Dataset evaluation (SHACL, ShEX) | 0 | 5 | 0 | 5 | 0 |
NB: a * indicates that the score could be improved after testing/evaluation of work in progress.
The second table describes the implications for users with respect to query times, complexity, and data freshness for each of the alternatives.
Criteria | Jena | QLever | RDF4J | Virtuoso |
---|---|---|---|---|
Permit long(er) and configurable query timeouts (which translates to additional query load) | Longer timeouts will likely be required due to federation; Timeouts configurable at global and query levels | Timeouts configurable at query level | Longer timeouts will likely be required due to federation; Timeouts configurable at query level | Timeout implications need investigation/evaluation based on query load; An anytime query is not a possibility (since it is not deterministic); Timeouts are global and not configurable at query level |
Query full set of triples | Slower performance on some queries due to need for federation | Full capability | Slower performance on some queries due to need for federation | Full capability |
Reflect most current data (Requires ability to handle frequent writes) | Needs investigation/evaluation; There are capabilities for streamed update | Proposed solution for update needs investigation/evaluation; In theory, supports real-time updates | Needs investigation/evaluation; The LMDB store should be performant | Updates may necessitate index rebuild and affect performance and correctness; Needs investigation/evaluation |
Good query response time (Requires performant indexing and join operations) | Some slower performance due to federation; Possible to tune indexes and configurations | Performant query demonstrated on sample endpoint; Queries that timeout on Blazegraph likely to succeed; All index permutations are supported | Some slower performance due to federation; Possible to tune indexes and configurations | Needs investigation / evaluation since column-wise data store may not be compatible with frequent writes; Complex queries may timeout or take a long time to complete |
Ease of use/easier to use (All solutions support SPARQL 1.1; federation introduces additional complexity) | Queries will be more complex since they will reference different endpoints due to federation; Can evaluate HyperGraphQL for simple queries | Excellent UI (as demonstrated on sample endpoint) with autocomplete and graphical display of query plans; Need to test full SPARQL 1.1 compliance | With FedX support, queries should not have to change (changes would be due to splitting Wikidata into sub-graphs to reduce database size) | Queries will reference new prefixes (bif: and sql:) and use non-standard terminology; Query plans explained in SQL which could be confusing; Need to test full SPARQL 1.1 compliance |
Our next steps ahead
First of all, we encourage all community members to review and comment on this paper by:
- posting feedback on the discussion page
- participating in an online feedback session (link) on Wednesday April 13, 2022 at 19:00 UTC (Etherpad notes)
Clearly, the work to replace Blazegraph with a suitable alternative backend is just beginning. We have already defined a number of steps we need to take, such as:
- determine how to support the current local SERVICE functions (labelling, geospatial calculations, etc.) in a more SPARQL-compliant manner;
- define a set of updates and query tests and workloads that exercise the engines and SPARQL endpoints;
- define and test different algorithms for splitting the Wikidata graph, understand how the update and query workloads would change, and the implications for the RDF stream updater;
- begin testing and tuning the selected alternative offerings using the specific SPARQL tests and workloads defined above;
- investigate creation of a middleware layer (between the RDF store/SPARQL endpoint and users/applications) to remove dependencies on a specific implementation and reduce churn in potential, future migrations.
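To make the middleware idea concrete, here is a minimal sketch of what such a layer might look like, assuming a hypothetical router interface (all class and method names here are illustrative, not part of any existing WDQS code):

```python
# Hypothetical sketch of a thin middleware layer that hides the concrete
# SPARQL backend from callers. All names are illustrative.
from abc import ABC, abstractmethod


class SparqlBackend(ABC):
    """Minimal interface a concrete backend adapter would implement."""

    @abstractmethod
    def run(self, query: str, timeout_s: float) -> list[dict]:
        ...


class BackendRouter:
    """Routes queries to a named backend so that applications never
    depend on a specific engine (Blazegraph, QLever, Jena, ...)."""

    def __init__(self) -> None:
        self._backends: dict[str, SparqlBackend] = {}

    def register(self, name: str, backend: SparqlBackend) -> None:
        self._backends[name] = backend

    def query(self, query: str, backend: str = "default",
              timeout_s: float = 60.0) -> list[dict]:
        # Dispatch to the registered adapter; swapping engines is then
        # a configuration change, not an application change.
        return self._backends[backend].run(query, timeout_s)
```

An application would register one adapter per engine and call `router.query(...)`; a later migration would only replace the adapter, reducing churn for users and tools.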
As we progress on these tasks, we remain committed to publishing our work and keeping the community updated. -- Sannita (WMF) (talk) (on behalf of the Search Team) 08:20, 29 March 2022 (UTC)
A Report on Using QLever
TL;DR: WMDE should set up a Wikidata query service based on QLever so that users can run queries that time out in the current WDQS.
Here are some comments on my experience using QLever vs. the WDQS. I have been running a lot of Wikidata queries on both QLever and the WDQS, and I also loaded Wikidata into QLever on an old desktop machine I had available. Peter F. Patel-Schneider (talk) 18:16, 8 January 2024 (UTC)
Speed:
QLever can be *much* faster. There are queries that run in less than one second on the QLever public Wikidata query service (https://qlever.cs.uni-freiburg.de/wikidata) but time out in the WDQS (https://query.wikidata.org). The QLever Wikidata query service is using a recent RDF dump of Wikidata.
The following query runs in QLever in under 2 seconds with 142,349 results but times out in the WDQS.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?item WHERE {
  ?item p:P31/ps:P31/(p:P279/ps:P279)* wd:Q11446 .
}
The following query runs in QLever in 0.5 second with 361 results but times out in the WDQS.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?item WHERE {
  ?item p:P31/ps:P31/(p:P279/ps:P279)* wd:Q17205 .
}
The following query runs in QLever in 281 milliseconds with 142,244 results and runs in 833 milliseconds in the WDQS.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?item WHERE {
  ?item wdt:P31/wdt:P279* wd:Q11446 .
}
The following query runs in QLever in 383 milliseconds with 142,293 results and runs in 3101 milliseconds in the WDQS.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?item WHERE {
  ?item p:P31/ps:P31/wdt:P279* wd:Q11446 .
}
The following query runs in QLever in 3.5 seconds with 147,605 results and times out in the WDQS.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?item WHERE {
  ?i wdt:P1647* wd:P31 .
  ?i wikibase:directClaim ?ic .
  ?item ?ic ?class .
  ?class wdt:P279* wd:Q11446 .
}
The following query runs in 29 seconds in QLever with 147,703 results and times out in WDQS.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?item WHERE {
  ?i wdt:P1647* wd:P31 .
  ?i wikibase:claim ?ic .
  ?i wikibase:statementProperty ?ip .
  ?item ?ic ?itemst .
  ?itemst ?ip ?class .
  ?class wdt:P279* wd:Q11446 .
}
QLever does considerable caching, apparently even of parts of queries. If you run a query and later ask for it with names, the name query runs in only a few milliseconds. Because of the caching, you have to be careful when timing QLever.
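To make the caching caveat concrete, here is a small timing-harness sketch in Python. `run_query` is a hypothetical stand-in for whatever callable executes a query against an endpoint; the point is simply to time the first (cold) run separately from repeats that may be answered from cache:

```python
# Illustrative timing harness for benchmarking a caching SPARQL engine:
# report the cold (first) run separately from the warm (repeat) runs,
# since repeats may be served from cache in milliseconds.
import time
from statistics import median


def time_query(run_query, query: str, warm_runs: int = 3):
    """run_query is any callable executing `query` against an endpoint.
    Returns (cold_seconds, median_warm_seconds)."""
    start = time.perf_counter()
    run_query(query)
    cold = time.perf_counter() - start

    warm_times = []
    for _ in range(warm_runs):
        start = time.perf_counter()
        run_query(query)
        warm_times.append(time.perf_counter() - start)
    return cold, median(warm_times)
```

Reporting only the warm median would overstate the engine's speed on first-time queries; reporting only the cold time hides the benefit of the cache.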
QLever is not always faster, probably because it tries hard to optimize query execution. This is most notable in complex queries that produce few results. For example
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?ind ?indLabel ( SAMPLE (?i) AS ?iSample ) ( SAMPLE (?iLabel) AS ?iSampleLabel ) WHERE {
  ?ind wdt:P31/wdt:P279* wd:Q105939340 .
  { SELECT DISTINCT ?ind ?i WHERE {
      { ?i wdt:P31 ?ind . } UNION { ?ind wdt:P279 ?i } UNION { ?i wdt:P279 ?ind }
  } }
  OPTIONAL { ?ind rdfs:label ?indLabel . FILTER ( lang(?indLabel) = 'en' ) }
  OPTIONAL { ?i rdfs:label ?iLabel . FILTER ( lang(?iLabel) = 'en' ) }
} GROUP BY ?ind ?indLabel
produces no results and takes 1.6 seconds in QLever versus 60ms in the WDQS.
Names:
QLever has its own way of adding names to results. In the web interface you just click a button, and name columns are added to the result by enclosing the query in another query that looks up the names. This is currently not as general as the WDQS and requires that there be a name in English. (A fix for requiring names is easy.) But it is fast and actually easier to use than the Blazegraph service. Adding the name part to queries can cause a timeout in the WDQS even though the base query only takes a few seconds.
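As a rough illustration of the enclosing-query trick, here is a simplified Python sketch, not QLever's actual implementation; the variable handling, the English-only filter, and the function name are assumptions for illustration:

```python
# Hypothetical sketch of the label-wrapping trick: enclose a base query
# in an outer query that joins each requested variable with its English
# rdfs:label. Real implementations handle prefixes and variable
# detection more carefully; this is illustrative only.
def add_labels(base_query: str, variables: list[str]) -> str:
    label_patterns = "\n".join(
        f"  OPTIONAL {{ ?{v} rdfs:label ?{v}Label . "
        f"FILTER (LANG(?{v}Label) = \"en\") }}"
        for v in variables
    )
    selected = " ".join(f"?{v} ?{v}Label" for v in variables)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"SELECT {selected} WHERE {{\n"
        f"  {{ {base_query} }}\n"         # base query as a subquery
        f"{label_patterns}\n"             # one OPTIONAL label join per variable
        "}"
    )
```

Because the label join is an OPTIONAL pattern in an outer query, the base query's results are unchanged; rows without an English label simply get an empty label column.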
User Amenities:
The QLever web interface has similar but different facilities for turning item labels into Wikidata identifiers and for showing the label of a Wikidata identifier. I'm not as good at using them, but I think this is just a matter of familiarity.
Incompleteness and memory usage:
There are a few parts of SPARQL 1.1 query that QLever does not implement. Most of them have easy workarounds.
I found a deficiency in the QLever query optimizer. Query components like `?v1 wdt:P31/wdt:P279* ?v2` can cause QLever to request a large amount of storage if both variables are relatively unconstrained. The partial workaround is to rewrite as `?v1 wdt:P31 ?tmp . ?tmp wdt:P279* ?v2`. Even then, QLever can request large amounts of storage. These queries tend to time out in the WDQS.
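The partial workaround above can be mechanized. Here is a hypothetical helper (the regex, function name, and fresh-variable naming are illustrative assumptions) that splits a `pred1/pred2*` path into two triple patterns with an intermediate variable:

```python
# Illustrative helper for the workaround described above: rewrite
# `?a pred1/pred2* ?b .` into `?a pred1 ?tmp0 . ?tmp0 pred2* ?b .`
# Patterns that do not match this simple shape are left untouched.
import re


def split_path(pattern: str, tmp_var: str = "?tmp0") -> str:
    m = re.fullmatch(r"(\?\w+) (\S+)/(\S+\*) (\?\w+) \.", pattern.strip())
    if m is None:
        return pattern  # not a simple pred1/pred2* path; leave as-is
    subj, first, rest, obj = m.groups()
    return f"{subj} {first} {tmp_var} . {tmp_var} {rest} {obj} ."
```

For example, `split_path("?v1 wdt:P31/wdt:P279* ?v2 .")` yields the rewritten pattern with `?tmp0` as the intermediate variable.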
QLever has some other glitches. If QLever runs out of some resource, its service may become unavailable for some time, and the error message produced by the web interface in this situation is not useful.
Resource consumption and speed loading Wikidata:
QLever is surprisingly fast at loading Wikidata and can run on a surprisingly small machine.
I downloaded QLever from GitHub and compiled it into a Docker image. The process was easy and took about 15 minutes. QLever worked fine on several small- and medium-sized RDF datasets. I then loaded Wikidata into QLever on my machine: an old desktop with an i5-3570K (4 cores, 4 threads), 32GB of not-very-fast DDR3 memory, and a 2TB SSD. I had to change several settings to reduce QLever's memory usage, but after I did so, QLever loaded all of Wikidata (as of mid-December 2023) in 14.5 hours.
I then ran the QLever SPARQL query engine on the machine. The queries I tried run at about half the speed they do on the QLever Wikidata query service hosted at Freiburg.
Updates and Summary:
QLever does not yet have in-place update facilities. When update facilities are added, I think QLever would dominate Blazegraph and the current WDQS in essentially every way. Even without update facilities, QLever is a very useful resource for users who want to run complex queries against Wikidata. I suggest that WMDE set up a QLever Wikidata service based on the Wikidata dumps to support these users. I'm willing to help in this effort.
- Support @Peter F. Patel-Schneider: Thanks for this - very convincing. As a first step, do you think Wikidata should provide a link to the Freiburg service from WDQS, or the Wikidata navigation bar, or somewhere else like that? Obviously it's not under Wikimedia control, but it's clearly very useful. On formatting here, note that your indented lines show differently than the non-indented ones, probably not what you wanted. You might want to try the SPARQL template here? (Example) ArthurPSmith (talk) 21:10, 8 January 2024 (UTC)
SELECT DISTINCT ?ind ?indLabel ( SAMPLE (?i) AS ?iSample ) ( SAMPLE (?iLabel) AS ?iSampleLabel ) WHERE {
  ?ind wdt:P31/wdt:P279* wd:Q105939340 .
  { SELECT DISTINCT ?ind ?i WHERE {
      { ?i wdt:P31 ?ind . } UNION { ?ind wdt:P279 ?i } UNION { ?i wdt:P279 ?ind }
  } }
  OPTIONAL { ?ind rdfs:label ?indLabel . FILTER ( lang(?indLabel) = 'en' ) }
  OPTIONAL { ?i rdfs:label ?iLabel . FILTER ( lang(?iLabel) = 'en' ) }
} GROUP BY ?ind ?indLabel
- Thanks. I "fixed" the SPARQL examples by simply indenting all the lines (and changing double braces to single braces).
- I do think that there could be some sort of link to the Freiburg service, provided that the Freiburg people agree. There should be a disclaimer, however, that QLever is not a complete SPARQL implementation and does not have the BlazeGraph naming service. I think that this would be a preliminary step, as I expect that the Freiburg service goes down more often than the WDQS does. Peter F. Patel-Schneider (talk) 21:38, 8 January 2024 (UTC)
- Support See https://phabricator.wikimedia.org/T291903, which is now closed but could be reopened if WMDE/WMF want to provide QLever before SPARQL Update development has finished. --So9q (talk) 20:35, 17 February 2024 (UTC)
A performance evaluation of QLever, Virtuoso, Blazegraph, GraphDB, Stardog, Jena, and Oxigraph
Here are the results of a performance evaluation and comparison of QLever, Virtuoso, Blazegraph, GraphDB, Stardog, Apache Jena, and Oxigraph on a moderately sized dataset (DBLP, 390 million triples), all on the same machine. The following table provides a summary; the details can be found here. That page also provides a performance comparison of four SPARQL endpoints (based on Blazegraph, QLever, Virtuoso, and MillenniumDB) for the complete Wikidata, on 298 example queries from the Wikidata query service. --Hannah Bast (talk) 02:54, 11 April 2024 (UTC)
SPARQL engine | Code | Loading time | Loading speed (M triples/s) | Index size | Avg. query time | Ease of setup |
Oxigraph | Rust | 640s | 0.6 M/s | 67 GB | 93s | very easy |
Apache Jena | Java | 2392s | 0.2 M/s | 42 GB | 69s | very easy |
Stardog | Java | 724s | 0.5 M/s | 28 GB | 17s | many hurdles |
GraphDB | Java | 1066s | 0.4 M/s | 28 GB | 16s | some hurdles |
Blazegraph | Java | 6326s | <0.1 M/s | 67 GB | 4.3s | some hurdles |
Virtuoso | C | 561s | 0.7 M/s | 13 GB | 2.2s | many hurdles |
QLever | C++ | 231s | 1.7 M/s | 8 GB | 0.7s | very easy |
WDBENCH+? Wikidata performance benchmarks
It would be helpful to formalize some of the recent tests, benchmarks, and rubrics for evaluating backends for WD and other comparable or larger agglutinate datasets.
- WDBENCH (Aranda + Rojas, et al.): uses data from Wikidata Truthy (Markus Krötzsch, et al.)
- Wikidata Graph Pattern Benchmark [WGPB]: 50 instances of 17 different abstract query patterns = 850 SPARQL queries
- Peter's complex SPARQL queries (not yet published; benched on his local machine + WD)
- Recent rubric [above, by Hannah Bast]: loading time, loading speed, index size, average query time
- Wikidata Scaling rubric, 2021 (used for Wikidata:SPARQL query service/WDQS backend update)
Perhaps this could all be combined into a new WDBENCH update? And we could ask the leading contenders to self-eval and folks could spot check on a VM... Sj (talk) 21:12, 10 September 2024 (UTC)