Shortcuts: WD:PP/COMP, WD:PP/Computing

Wikidata:Property proposal/Computing




See also


This page is for the proposal of new properties.

Before proposing a property

  1. Search if the property already exists.
  2. Search if the property has already been proposed.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically.
  4. Select the right datatype for the property.
  5. Read Wikidata:Creating a property proposal for guidelines you should follow when proposing a new property.
  6. Start writing the documentation based on the preload form below by editing the two templates at the top of the page to add proposal details.

Creating the property

  1. Once consensus is reached, change status=ready on the template, to attract the attention of a property creator.
  2. Creation can be done 1 week after the creation of the proposal, by a property creator or an administrator.
  3. See property creation policy.

General


serves resource

Status: On hold
Description: resource served by the subject; unless otherwise qualified with protocol (P2700) the protocol is derived from instance of (P31) (e.g. HTTP for instances of website (Q35127))
Data type: Item
Domain: data item instance of website (Q35127), web application (Q189210), web user interface (Q1981057) or web API (Q557770) (other classes can be added as needed, this constraint is just to allow for efficient inferring of the default protocol)
Example 1
Example 2
Example 3

Motivation


This proposal seeks to introduce a way to model the HTTP routes of web interfaces (be they web user interfaces or web APIs).

The semantics of HTTP routes are very much of interest for:

  • web search engines which want to understand the function of a particular web page
  • web crawlers which want to crawl the web for resources of a specific type, or want to avoid crawling resources of a specific type, or want to crawl resources of a specific type more/less frequently than resources of another specific type
  • hyperlinking, creating a hyperlink to a specific resource within a specific context
    • this enables other useful properties e.g. adding references to git commits only via their commit hash, see Wikidata:Property proposal/changeset
    • this enables very useful userscripts e.g. you could have a button that automatically takes you to the API endpoint for the specific resource you're currently viewing
  • evaluating compatibility between web interfaces
  • providing compatibility: e.g. if you self-host a cgit (Q28974765) instance but decide to switch to a self-hosted GitLab (Q16639197) instance, and the semantics of their most important routes were modeled in Wikidata, a script could automatically generate an nginx (Q306144) config that provides server-side redirects from the old cgit routes to the GitLab routes (or vice versa if you were to switch in the opposite direction)

Description


To achieve that, this proposal introduces one property for data items and three accompanying qualifier-only properties.

The core idea is that URL suffix formatter can be used to qualify under which URL suffix the resource is served; as with formatter URL (P1630), $1 can be replaced with the identifier of the resource.

Since resources are often hierarchical we also introduce a parent resource qualifier with the following semantics:

  • X → serves resource: apple (URL suffix formatter: /apples/$1)
  • X → serves resource: orange (URL suffix formatter: /oranges/$1; parent resource: apple)

means that oranges are served at /apples/$appleId/oranges/$orangeId.
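
The apple/orange example can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not tooling that exists: the `ServedResource` class, the `build_route` function and the sample identifiers are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServedResource:
    """One hypothetical "serves resource" statement with its qualifiers."""
    resource: str                 # value of "serves resource", e.g. "orange"
    suffix_formatter: str         # "URL suffix formatter", e.g. "/oranges/$1"
    parent: Optional[str] = None  # "parent resource", e.g. "apple"


def build_route(statements: dict[str, ServedResource],
                resource: str, ids: dict[str, str]) -> str:
    """Concatenate the suffixes from the outermost parent down to `resource`."""
    parts = []
    seen = set()
    current: Optional[str] = resource
    while current is not None:
        if current in seen:
            raise ValueError(f"cyclic parent resource chain at {current!r}")
        seen.add(current)
        statement = statements[current]
        # "$1" is replaced with the identifier of the resource itself.
        parts.append(statement.suffix_formatter.replace("$1", ids[current]))
        current = statement.parent
    # Parents were collected last, but their suffixes come first in the URL.
    return "".join(reversed(parts))


statements = {
    "apple": ServedResource("apple", "/apples/$1"),
    "orange": ServedResource("orange", "/oranges/$1", parent="apple"),
}
print(build_route(statements, "orange", {"apple": "42", "orange": "7"}))
# prints /apples/42/oranges/7
```

Walking the parent chain upwards and reversing the collected suffixes keeps each statement self-contained: a subresource only has to name its parent, not repeat the parent's suffix.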

This scheme is already powerful enough to model the majority of HTTP routes of many web interfaces; however, there are two things it does not yet cover:

  • some HTTP routes cannot be modeled in this way, e.g. imagine that the route instead was /apples/$orangeId/$appleId
  • some resources have multiple kinds of identifiers (e.g. a Phabricator project has a slug (Q99601940) as well as a numeric identifier (Q93868746))

We can address both concerns by introducing a fourth and last qualifying property "URL parameter", with the following semantics:

The names of the placeholders do not matter; what matters is that they are in the same order as the "URL parameter" qualifiers.
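
A minimal sketch of that positional matching, assuming the order of the "URL parameter" qualifiers can be relied upon (which the discussion below calls into question). The placeholder pattern is the regular expression given in the "URL suffix formatter" description; the function names and sample values are hypothetical.

```python
import re

# Named placeholders as described for "URL suffix formatter": \$[a-z]+
PLACEHOLDER = re.compile(r"\$[a-z]+")


def bind_parameters(suffix_formatter: str,
                    url_parameters: list[str]) -> list[tuple[str, str]]:
    """Pair each named placeholder with the URL parameter at the same position."""
    placeholders = PLACEHOLDER.findall(suffix_formatter)
    if len(placeholders) != len(url_parameters):
        raise ValueError("number of placeholders and URL parameters differ")
    return list(zip(placeholders, url_parameters))


def fill(suffix_formatter: str, values: list[str]) -> str:
    """Replace the named placeholders, left to right, with `values`."""
    if len(PLACEHOLDER.findall(suffix_formatter)) != len(values):
        raise ValueError("number of placeholders and values differ")
    remaining = iter(values)
    return PLACEHOLDER.sub(lambda _match: next(remaining), suffix_formatter)


# The placeholder names are irrelevant; only their order is used.
print(bind_parameters("/apples/$orange/$apple", ["orange", "apple"]))
print(bind_parameters("/apples/$x/$y", ["orange", "apple"]))
print(fill("/apples/$x/$y", ["7", "42"]))  # prints /apples/7/42
```

Both calls to `bind_parameters` bind the first placeholder to orange and the second to apple, which is the point of the rule: renaming placeholders does not change the meaning of the route.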

Cheers, Push-f (talk) 00:21, 21 November 2022 (UTC)

Discussion

  • WikiProject Informatics has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Notified participants of WikiProject Websites. --Push-f (talk) 00:33, 21 November 2022 (UTC)
  •  Support, great proposal. I do wonder if we're stretching the limits of what Wikidata can represent, especially when it comes to the ability of human editors to build the knowledge subgraphs when modeled to such detail and complexity. This should not be a blocker for adopting this proposal, mind you, but we should probably consider having detailed examples and maybe even a diagram or two to help editors understand how these properties are meant to be used. Otherwise we may end up with a situation similar to the modeling of books which is famously inconsistent in Wikidata as it clashes with people's intuition about what a book is.  – The preceding unsigned comment was added by Waldyrious (talk • contribs).
    Thanks! Yes, I agree that these properties need thorough documentation; on top of that I will probably develop a user script to view the routes implied by the proposed properties :) --Push-f (talk) 09:40, 21 November 2022 (UTC)
  •  Comment Very solid and thoughtful proposal. However, your examples show that storing this information in Wikidata and consuming it can be kinda awkward. Consider creating your own specialized database that we can link to from Wikidata ;) Dexxor (talk) 17:52, 22 November 2022 (UTC)
    Thanks :) Wikibase isn't easy to self-host[1] and this is without getting into setting up a Blazegraph (Q20127748) server so that you can still query the data with SPARQL and setting up some third-party login to lower the barrier to entry ... but even then I think people are more likely to contribute to Wikidata than some niche database.
    Besides with the right tooling I think this proposal could really address a fundamental problem of Wikidata, which is that we don't want to create a bunch of properties for every website/common HTTP route out there. Have you looked at my changeset proposal? There @Dexxor: argued that "changeset formatter URL suffix" is too specific, so I withdrew that idea in favor of this more reusable and powerful approach. I guess another example is also my Wikimedia Phabricator project proposal where @Arlo Barnes: has argued that a more general property would be better.
    So I think the solution is to develop specialized tooling for these properties in Wikidata (e.g. user scripts for contributors and libraries/bots for consumers), rather than outsourcing that data somewhere else entirely.
    --Push-f (talk) 10:00, 23 November 2022 (UTC)
  • On hold Apparently qualifiers of data type item are unordered, so the "URL parameter" idea doesn't work. The best alternative I can think of would be encoding the parameter class IDs directly into the URL suffix formatter, e.g. ${Q6} ... however, encoding item IDs into a string obviously isn't ideal. --Push-f (talk) 02:30, 27 November 2022 (UTC)
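
The fallback mentioned in the last comment, placeholders that carry the item ID of the parameter class (e.g. ${Q6}), could be read back out of the formatter string roughly as follows. This is only a sketch: the ${Q…} syntax is the one floated above, while the regular expression, the function names and the example suffix are assumptions made for illustration, and the Q-IDs stand in for arbitrary classes.

```python
import re

# Assumed placeholder syntax: "${" + item ID + "}", e.g. "${Q6}".
ITEM_PLACEHOLDER = re.compile(r"\$\{(Q[1-9][0-9]*)\}")


def parameter_classes(suffix_formatter: str) -> list[str]:
    """Return the item IDs of the parameter classes in order of appearance."""
    return ITEM_PLACEHOLDER.findall(suffix_formatter)


def fill(suffix_formatter: str, values: list[str]) -> str:
    """Replace the item-ID placeholders, left to right, with `values`."""
    if len(parameter_classes(suffix_formatter)) != len(values):
        raise ValueError("number of placeholders and values differ")
    remaining = iter(values)
    return ITEM_PLACEHOLDER.sub(lambda _match: next(remaining), suffix_formatter)


suffix = "/things/${Q6}/parts/${Q7}"  # hypothetical formatter value
print(parameter_classes(suffix))      # prints ['Q6', 'Q7']
print(fill(suffix, ["a", "b"]))       # prints /things/a/parts/b
```

Because the order is carried by the string itself, this variant does not depend on qualifier ordering, at the cost of hiding item references inside a string value.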

URL suffix formatter

Status: On hold
Description: (qualifier only) under which URL suffix the resource is served; "$1" can be automatically replaced with the identifier of the resource; if the template requires additional parameters they may be specified as other placeholders matching the regular expression \$[a-z]+, in this case the data types of the parameters must be qualified with "URL parameter"
Data type: String
Domain: may only be used as a qualifier for "serves resource"
Example 1: see #serves resource
Example 2: see #serves resource
Example 3: see #serves resource
See also: formatter URL (P1630)

See #serves resource for the motivation and discussion.

parent resource

Status: On hold
Description: (qualifier only) indicates that this resource is a subresource of the given resource (and that the "URL suffix formatter" of the given resource comes before the "URL suffix formatter" of this resource)
Data type: Item
Domain: may only be used as a qualifier for "serves resource"
Example 1: see #serves resource
Example 2: see #serves resource
Example 3: see #serves resource

See #serves resource for the motivation and discussion.

URL parameter

Status: On hold
Description: (qualifier only) meant to qualify the data types of placeholders in "URL suffix formatter" values; if there are several URL parameters this qualifier must be specified multiple times in the same order that the URL parameter placeholders appear in the "URL suffix formatter" value
Data type: Item
Domain: may only be used as a qualifier for "serves resource"
Example 1: see #serves resource
Example 2: see #serves resource
Example 3: see #serves resource

See #serves resource for the motivation and discussion.

Wikimedia Phabricator project PHID

Status: On hold
Description: PHID of the Wikimedia Phabricator project for the subject
Data type: External identifier
Example 1: Wikibase Repository (Q21679301) → vumw5jyyw4r3fv52k34y
Example 2: Extension:Wikibase Client (Q21679293) → 46yqqwzqvnxmbabmz3tc
Example 3: Pywikibot (Q15169668) → orw42whe2lepxc7gghdq
Formatter URL: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-$1&statuses=open()&order=newest#R
See also: issue tracker URL (P1401), Wikimedia Incubator URL (P9748)

Motivation


Most MediaWiki software uses https://phabricator.wikimedia.org/ as its issue tracker. Currently the issue trackers are linked via issue tracker URL (P1401), which is however a bit messy since there are many different ways to link to a project on Phabricator:

While project slugs are the most human-friendly of these, project slugs can change and one project can have multiple slugs, making them suboptimal for a Wikidata identifier property. So the choice remains between the numeric ids and the PHIDs. I think PHIDs are the clear winner because maniphest.query, the Phabricator API method for searching tasks, only accepts PHIDs for projects, and we don't want to force data consumers to do a project.query lookup to translate the id to the PHID every time they want to query the tasks of a project.

How to find the PHID? If you are at a project page, click on Open Tasks and then you can find the PHID in the URL.
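
Reading the PHID out of such a URL, and expanding the proposed formatter URL again, might look like the sketch below. It assumes the Open Tasks URL carries the PHID in a `project` query parameter, as the formatter URL above does; the function names are hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Formatter URL from the proposal; "$1" stands for the identifier.
FORMATTER_URL = ("https://phabricator.wikimedia.org/maniphest/"
                 "?project=PHID-PROJ-$1&statuses=open()&order=newest#R")
PHID_PREFIX = "PHID-PROJ-"


def extract_identifier(open_tasks_url: str) -> str:
    """Return the part of the project PHID that follows "PHID-PROJ-"."""
    query = parse_qs(urlparse(open_tasks_url).query)
    for value in query.get("project", []):
        if value.startswith(PHID_PREFIX):
            return value[len(PHID_PREFIX):]
    raise ValueError("no project PHID found in URL")


def open_tasks_url(identifier: str) -> str:
    """Expand the formatter URL for a stored identifier."""
    return FORMATTER_URL.replace("$1", identifier)


url = open_tasks_url("vumw5jyyw4r3fv52k34y")  # Example 1 from the proposal
print(url)
print(extract_identifier(url))  # prints vumw5jyyw4r3fv52k34y
```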

A bot could be written to import these identifiers from https://www.mediawiki.org/, because it has the project slugs in various templates:

Cheers, --Push-f (talk) 16:09, 21 November 2022 (UTC)

Discussion


has forks

Status: Under discussion
Description: Notable software forks of this software
Represents: fork (Q332903)
Data type: Item
Example 1: youtube-dl (Q28401317) has forks: yt-dlp (Q108454371)
Example 2: Visual Studio Code (Q19841877) has forks: VSCodium (Q111967621)
Example 3: OpenOffice.org (Q511977) has forks: LibreOffice (Q10135)
Example 4: Firefox (Q698) has forks: LibreWolf (Q105623664)

Motivation


Currently, which notable or widely used forks a piece of software has, and which software it was originally forked from (and when), is either not recorded at all or is expressed with the based on property. A bot or script could populate this standardized property. I think it would be useful, although items for many notable forks are still missing (there wasn't even one for VSCodium). For example, one could query for all notable forks created in a given year, or list a program's forks on a page about that program, which can be useful e.g. to people using that software.

based on (Property:P144) is ambiguous, and using instance of with an of qualifier is even less usable. I don't know if I can propose two properties at once, but if so I'd also like to propose the property is fork of; if not, I would like to propose it later on if nobody else does.

Previous discussion (1 reply).

--Prototyperspective (talk) 17:38, 1 October 2024 (UTC)

Discussion

  •  Conditional support I agree with @Dexxor: from the previous discussion that the inverse "is fork of" would be better for this data. -wd-Ryan (Talk/Edits) 00:26, 2 October 2024 (UTC)
    The thing is, I think both are needed and useful. On a page about some software (it doesn't have to be a Wikipedia page), a way to view all forks of this software would often be useful, and querying that data via the has forks property seems like the best way; it would then also be included in the respective Wikidata item. This is similar to e.g. the has parts and part of properties: both are needed unless the second item somehow gets this info added dynamically when it is added to the other (currently bots may do this for such dual properties). Prototyperspective (talk) 10:50, 2 October 2024 (UTC)
  •  Oppose as "has fork"; would support as "is fork of" per Wd-Ryan. Mahir256 (talk) 17:03, 2 October 2024 (UTC)
    Why not both? Forks are useful information on the item of the respective software, just as much as the information about which software a given program was forked from. Running a new query for every program on a page to find its forks may not be possible, but one could instead fetch the has forks values. Why is there an inverse pair, has part (Property:P527) and part of (Property:P361)? The same reasons apply here too. Prototyperspective (talk) 21:04, 2 October 2024 (UTC)
     Oppose for "has fork" per Mahir256's reasoning. If you look at the actual usages of has part(s) (P527) and part of (P361), there are many cases where we want to list claims in one direction and not in the other. If a software is a fork of another software, however, we would always want to list that information on the item for the software. The two are not used in a way where you can just add the reverse claims with a bot.
The number of triples that a triple store can hold is limited, so we would waste resources if we allowed inverse properties in cases like this. If you copy over properties with a bot, dealing with data changes can get messy. ChristianKl 21:23, 7 October 2024 (UTC)
Well, I thought it was the other way around – that inverse properties are useful mainly when the data is relevant at both items. Here are examples for illustration: if one wanted to create a Wikidata list with Listeria of art production software (or whatever), each with a list of its forks, then this data could not be retrieved if there's only the is fork of property in the respective forked software item. Likewise, if there is some Software-Wiki powered by Wikidata (if it ever becomes more complete on software), then it would need a separate query to fetch the forks of the given software (or of several) instead of just pulling them from the Wikidata item. If Wikipedia shows the forks of the software an article is about in some template (maybe the infobox, but probably some other one), then it couldn't get that data unless querying things like that is enabled, instead of just reading properties of the article's WD item. etc. Prototyperspective (talk) 23:33, 7 October 2024 (UTC)
Hopefully people here see these examples and change their vote if they can't think of a plausible rebuttal. There are many props that are used in both directions. Prototyperspective (talk) 09:47, 25 October 2024 (UTC)
P527 and P361 are needed in both ways. Bouzinac💬✒️💛 06:31, 13 October 2024 (UTC)
Yes, and has forks / fork of are also needed in both ways. Prototyperspective (talk) 09:28, 13 October 2024 (UTC)