Send Api-User-Agent header from MediaWiki client-side code
Open, Low, Public

Description

In the User-Agent Policy, we encourage clients to set the Api-User-Agent header when making requests from a browser, where the User-Agent header cannot be set.

At the moment, we are not using this information; we don't even know whether people send it, and our own client code doesn't send it.

Sending this header ourselves will allow us to distinguish requests from our own code from requests coming from third parties and user scripts.

Ideally, we'd be able to set this header to a value that includes not only the MediaWiki version, but also the component/extension, and even the gadget or script that is making the call.
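To make the idea concrete, here is a minimal sketch of how such a value could be assembled. The helper name `buildApiUserAgent` and the token layout (`MediaWiki-JS/<version> component/<name> script/<name>`) are assumptions for illustration only; no format has been agreed on in this task.

```javascript
// Hypothetical helper: the token layout is an assumption, not an agreed format.
// In a browser, the version could plausibly come from mw.config.get( 'wgVersion' ).
function buildApiUserAgent({ mwVersion, component, script }) {
  if (!mwVersion) {
    throw new Error('mwVersion is required');
  }
  const parts = [`MediaWiki-JS/${mwVersion}`];
  if (component) {
    // Component or extension issuing the request, e.g. a skin or extension name.
    parts.push(`component/${component}`);
  }
  if (script) {
    // Gadget or script name, when known.
    parts.push(`script/${script}`);
  }
  return parts.join(' ');
}

const exampleUserAgent = buildApiUserAgent({
  mwVersion: '1.43.0',
  component: 'Vector',
  script: 'search'
});
console.log(exampleUserAgent); // MediaWiki-JS/1.43.0 component/Vector script/search
```

Keeping the component and script parts optional means callers that only know the core version can still send something useful.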

Event Timeline

@daniel is this meant to be a discussion prompt or is there a team that should actually own moving this forward? Asking mostly because I don't think this should be a DST/Codex responsibility.

I tagged DST/Codex for awareness. It would have to be done by whoever owns the API client code - I suppose that would be the web team.

As to urgency: this would be useful for the Interfaces team, because we'd get more meaningful signals. It's not super urgent, especially since we need T373871: Log Api-User-Agent header in Turnilo first.

Hi @daniel, the API client code lacks an owner right now ( https://www.mediawiki.org/wiki/Developers/Maintainers#MediaWiki_core ).
The Web team doesn't currently own the API client code, only skin code in MediaWiki core, so we'd need to work this out.

@daniel - I assume you are referring to mw.Api client code, but there are other libraries that hit the API too (for example Vector's search uses the native fetch function - https://gerrit.wikimedia.org/g/mediawiki/skins/Vector/+/5f944947e470629172ac17149d456f607c3b87b8/resources/skins.vector.search/fetch.js#32) so applying an API user agent header across the entire MediaWiki product will likely require multiple patches in multiple places.
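For callers that use the native fetch function, one possible shape for such a patch is a small wrapper around the request options. This is only a sketch: `withApiUserAgent` is a hypothetical helper, not existing MediaWiki or Vector code, and the header value shown is made up.

```javascript
// Hypothetical helper: returns a copy of the fetch options with an
// Api-User-Agent header added, unless the caller already set one.
function withApiUserAgent(userAgent, init = {}) {
  const headers = new Headers(init.headers || {});
  if (!headers.has('Api-User-Agent')) {
    headers.set('Api-User-Agent', userAgent);
  }
  return { ...init, headers };
}

// Usage sketch (not executed here, the URL is a placeholder):
// fetch( 'https://example.org/w/api.php?action=query&format=json',
//     withApiUserAgent( 'MediaWiki-JS/1.43.0 component/Vector' ) );
const sketchOptions = withApiUserAgent('MediaWiki-JS/1.43.0 component/Vector', {
  headers: { Accept: 'application/json' }
});
console.log(sketchOptions.headers.get('Api-User-Agent'));
```

Each non-mw.Api call site would still need to opt in, which is why this remains a multi-patch effort.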

Yea, I was afraid you'd say that :)

Do you have an idea how to find the relevant callers? And why they are not going through mw.Api?

Do you have an idea how to find the relevant callers?

You'd want to do an audit of the different APIs. I assume some clients might be using $.ajax or even XMLHttpRequest directly, for example. Are gadgets in scope for this header? If so, you'd need to consider other APIs.
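A first pass at such an audit could be a rough text scan over the JavaScript sources. The sketch below is an assumption about how one might start, not an existing tool: `countApiCallers` and the patterns are hypothetical, and regex matching will produce false positives (for example, any method that happens to be named `fetch`) and miss indirect calls.

```javascript
// Hypothetical patterns for the request mechanisms mentioned in this thread.
const CALLER_PATTERNS = {
  'mw.Api': /\bnew\s+mw\.Api\b/g,
  'fetch': /\bfetch\s*\(/g,
  '$.ajax': /(?:\$|\bjQuery)\.ajax\s*\(/g,
  'XMLHttpRequest': /\bnew\s+XMLHttpRequest\b/g
};

// Count how often each mechanism appears in a piece of source text.
function countApiCallers(source) {
  const counts = {};
  for (const [name, pattern] of Object.entries(CALLER_PATTERNS)) {
    counts[name] = (source.match(pattern) || []).length;
  }
  return counts;
}

const sampleSource = [
  'const api = new mw.Api();',
  'fetch( url ).then( ( r ) => r.json() );',
  '$.ajax( { url: url } );',
  'const xhr = new XMLHttpRequest();',
  'prefetch( url );'
].join('\n');

console.log(countApiCallers(sampleSource));
```

Running something like this across core, skins, and extensions would give a rough list of files to review by hand; gadgets and user scripts live on-wiki and would need a separate search.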

Sending this header ourselves will allow us to distinguish requests from our own code from requests coming from third parties and user scripts.

And why they are not going through mw.Api?

It's kinda like asking "why are you not using X library to make API requests?". There's no requirement to use mw.Api and usually where it is used, it is because the plus sides outweigh the downsides.

As someone who uses fetch, one reason I sometimes don't use mw.Api is that there is no npm library and often I'm writing code that I want to run inside and outside MediaWiki.

In this case it might be to allow the request to be abortable. I'm not sure off the top of my head whether mw.Api supports that, but the fact I don't know without consulting documentation should say something :-) There's nothing wrong with using fetch IMO, and it sidesteps having to learn how to use the bespoke mw.Api if you don't use it regularly (I often use fetch these days over mw.Api for most of my use cases; note mw.Api is not available on npm, for example).
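For reference, an abortable fetch request that also carries the header could look roughly like this. It is a sketch only: `abortableApiRequest` is a hypothetical name, the URL and header value are placeholders, and a stub stands in for fetch so the example runs without network access.

```javascript
// Hypothetical helper: starts a request that sends Api-User-Agent and can be
// cancelled through the standard AbortController API.
function abortableApiRequest(url, userAgent, fetchImpl = globalThis.fetch) {
  const controller = new AbortController();
  const promise = fetchImpl(url, {
    headers: { 'Api-User-Agent': userAgent },
    signal: controller.signal
  });
  return { promise, abort: () => controller.abort() };
}

// Stub in place of the real fetch: records the options it was given and
// returns a promise that never settles.
let lastInit = null;
const stubFetch = (url, init) => {
  lastInit = init;
  return new Promise(() => {});
};

const pendingRequest = abortableApiRequest(
  'https://example.org/w/api.php?action=query&format=json',
  'MediaWiki-JS/1.43.0 component/Vector',
  stubFetch
);
pendingRequest.abort(); // e.g. when the user types another search character
console.log(lastInit.signal.aborted);
```

With the real fetch, aborting makes the returned promise reject with an `AbortError`, which callers typically ignore.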

I think we'll be seeing fetch more over time - it's an API which is universally understood by new JavaScript developers. Generally newer developers haven't needed to support older browsers and typically gravitate towards newer tech. There are developers now who have never used jQuery for example!

Sending this header ourselves will allow us to distinguish requests from our own code from requests coming from third parties and user scripts.

What is the intention here? Could you provide a little background on the problem you are trying to solve, rather than what you are trying to achieve? I suspect there may be other solutions or we might be able to reframe the problem in a way which doesn't require finding every API request and updating it to send the header (or at least limits the scope!).

What's the use case? Where do we want to see it used, and why? I suggest re-titling this task to describe a specific problem. Right now it's not a proposed solution since there's no area of use specified. What would qualify as "use"?

Note that @aaron is using this in the ApiFeatureUsage extension in patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ApiFeatureUsage/+/1058726.

I imagine that SRE also sometimes use it already, when throttling external traffic in relation to https://foundation.wikimedia.org/wiki/Policy:User-Agent_policy. As with regular UA strings, it isn't used much by default. But when we reach for it, it is available in raw requests and used in the same circumstances. E.g. at the traffic level in HAProxy and VCL, both headers are available to requestctl filters, and in ad hoc varnishlog queries.

I have used it in the past when analysing traffic in varnishlog. Codesearch shows several first-party clients also set it.

It's not currently copied from Api-User-Agent to User-Agent, and it is not stored in Hadoop, Logstash, or indeed numerous other places.