
Investigate why Toolforge www is slow
Closed, Resolved · Public

Description

Investigate why HTTPS requests from outside Toolforge are more than ten times slower than requests made from inside Toolforge. Since even small static files take more than 0.5 s, something is most likely not working as it should.

Timing from Flask web-service
When the imagehash search request is made locally from Toolforge, it is fast: 0.05 s. The response content is just JSON with body `[]`, so almost no data is transferred.

from Toolforge

zache-tool@tools-sgebastion-10:~$ time curl -o - "https://imagehash.toolforge.org/search?dhash=8173317847711556465&phash=17187026160256130104"
[]
real	0m0.050s
user	0m0.022s
sys	0m0.005s

However, if I try the same request from remote locations, the speed difference is 1:10, i.e. it takes more than 0.5 s.

from local OS X laptop (Helsinki, Finland)

time curl -o - "https://imagehash.toolforge.org/search?dhash=8173317847711556465&phash=17187026160256130104" 
[]

0.01s user 0.01s system 4% cpu 0.498 total

from ajapaik.ee Linux server (Estonia)

zache@ajapaik:~$ time curl -o - "https://imagehash.toolforge.org/search?dhash=8173317847711556465&phash=17187026160256130104"
[]

real	0m0.513s
user	0m0.027s
sys	0m0.004s

Timing from PHP web-service/static files

from Toolforge

zache-tool@tools-sgebastion-10:~$ time curl -o - "https://fiwiki-tools.toolforge.org/helloworld.txt"  
Hello World

real	0m0.035s
user	0m0.015s
sys	0m0.005s

from local OS X laptop (Helsinki, Finland)

time curl -o - "https://fiwiki-tools.toolforge.org/helloworld.txt"                                           
Hello World
curl -o - "https://fiwiki-tools.toolforge.org/helloworld.txt"  0.02s user 0.01s system 5% cpu 0.485 total

from ajapaik.ee Linux server (Estonia)

time curl -o - "https://fiwiki-tools.toolforge.org/helloworld.txt"  
Hello World

real	0m0.654s
user	0m0.022s
sys	0m0.013s

Event Timeline

Hi!

The main factor here is network delay. Toolforge is currently hosted at the WMF's eqiad data center in Ashburn, Virginia. That's starting to be far enough away from Europe that the speed of light will have a "noticeable" effect on network delays. A traceroute shows that the round-trip time from my Finnish connection to Toolforge is about 150ms:

taavi@runko:~ $ mtr -w imagehash.toolforge.org
Start: 2023-10-11T11:42:07+0300
HOST: runko                                         Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- [redacted]                                     0.0%    10    0.8   0.5   0.3   0.8   0.2
  2.|-- [redacted]                                     0.0%    10   12.4  11.3   8.6  14.0   1.9
  3.|-- van3-sr4.dnaip.fi                              0.0%    10    9.3  12.0   9.1  19.2   3.1
  4.|-- ???                                           100.0    10    0.0   0.0   0.0   0.0   0.0
  5.|-- 87-94-160-0.rev.dnaip.fi                       0.0%    10    9.8  11.6   9.8  16.8   2.1
  6.|-- hel7-sr22.net.dnaip.fi                         0.0%    10   10.9  10.8   9.7  12.3   1.0
  7.|-- 213.249.107.90                                 0.0%    10   36.0  19.7  16.5  36.0   5.9
  8.|-- ae64.edge1.Stockholm1.Level3.net               0.0%    10   16.6  17.0  15.5  19.0   1.3
  9.|-- ???                                           100.0    10    0.0   0.0   0.0   0.0   0.0
 10.|-- WIKIMEDIA-F.bear2.Washington111.Level3.net     0.0%    10  144.4 143.4 141.5 144.5   1.0
 11.|-- xe-0-0-0-1102.cloudsw1-c8-eqiad.wikimedia.org  0.0%    10  161.5 210.0 152.7 567.2 130.2
 12.|-- cloudgw1001.eqiad1.wikimediacloud.org          0.0%    10  143.6 142.6 140.8 145.3   1.4
 13.|-- toolforge.org                                  0.0%    10  142.8 142.9 141.0 145.8   1.4

So combine that with the couple of round-trips needed for the TCP and TLS handshakes, etc., and times of about 0.5 s for a single request start to seem reasonable.

Similarly, on a server in New York the network delay is much smaller, and requests take less time to complete:

taavi@puppet-03 ~ $ mtr -w imagehash.toolforge.org
Start: 2023-10-11T08:45:59+0000
HOST: puppet-03                                     Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- ???                                           100.0    10    0.0   0.0   0.0   0.0   0.0
  2.|-- 10.74.194.210                                  0.0%    10    0.5   0.6   0.4   1.3   0.2
  3.|-- 138.197.251.2                                  0.0%    10    0.9   1.5   0.8   3.6   1.0
  4.|-- 138.197.248.16                                 0.0%    10    3.3   1.4   0.5   5.2   1.6
  5.|-- nyk-b11-link.ip.twelve99.net                  30.0%    10    1.0   1.4   1.0   2.9   0.7
  6.|-- nyk-bb1-link.ip.twelve99.net                   0.0%    10    2.5   2.5   2.3   2.8   0.1
  7.|-- rest-bb1-link.ip.twelve99.net                  0.0%    10    7.7   7.8   7.6   8.3   0.2
  8.|-- ash-b2-link.ip.twelve99.net                    0.0%    10    8.7   8.6   8.4   9.1   0.2
  9.|-- wikimedia-ic-308845.ip.twelve99-cust.net       0.0%    10    7.2   7.7   7.1  11.2   1.3
 10.|-- xe-0-0-0-1103.cloudsw1-d5-eqiad.wikimedia.org  0.0%    10   24.2  54.1  21.2 313.6  91.3
 11.|-- cloudgw1001.eqiad1.wikimediacloud.org          0.0%    10    7.1   7.2   7.0   7.9   0.3
 12.|-- toolforge.org                                  0.0%    10    7.1   7.3   7.1   7.7   0.2
taavi@puppet-03 ~ $ time curl -o - "https://fiwiki-tools.toolforge.org/helloworld.txt"
Hello World

real	0m0.080s
user	0m0.012s
sys	0m0.012s

I investigated this a bit further, trying to find some potential solutions.

To confirm that there are 3 round-trips, I had a look at the numbers in Firefox under Console/Network/Timing: ~150ms connecting, ~150ms TLS setup, plus ~150ms waiting (the actual request). Hence ~450ms in total.

Would one improvement be to support HTTP/3 (T238034)? I think it can make a request in one round-trip when it knows enough about the server. Thus, the total would be ~150ms.

Meanwhile, one can send multiple requests over a single HTTP connection. This way, 10 requests can be done in 2+10 round-trips instead of 30 round-trips, so in ~1800ms instead of ~4500ms. (In Python's requests, you just create a session object and make your requests through it.)

$ time curl -o - "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt" "https://fiwiki-tools.toolforge.org/helloworld.txt"
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World

real	0m1.813s
user	0m0.046s
sys	0m0.008s
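
For reference, the session reuse mentioned above might look roughly like this with Python's requests (a minimal sketch, assuming the same helloworld.txt test URL as above; timings not measured here):

#!/usr/bin/env python3
# Minimal sketch: reuse one HTTP connection for several requests with
# Python's requests. Only the first request pays the TCP/TLS handshake
# round-trips; the rest reuse the open connection held by the Session.

import requests, time

start = time.time()
with requests.Session() as session:
    for i in range(10):
        response = session.get("https://fiwiki-tools.toolforge.org/helloworld.txt")
        print("[{:.3f}] {}".format(time.time() - start, response.text.strip()))

This mirrors what the curl invocation above does: curl also reuses the same connection for consecutive URLs on the same host.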

Yet another improvement would be to multiplex or pipeline the requests to get multiple responses in the time one request takes, e.g. 10 requests in ~450ms. For some reason, my curl above doesn't seem to multiplex the requests, but e.g. wget2 does:

$ time wget2 "https://fiwiki-tools.toolforge.org/helloworld1.txt" "https://fiwiki-tools.toolforge.org/helloworld2.txt" "https://fiwiki-tools.toolforge.org/helloworld3.txt" "https://fiwiki-tools.toolforge.org/helloworld4.txt" "https://fiwiki-tools.toolforge.org/helloworld5.txt" "https://fiwiki-tools.toolforge.org/helloworld6.txt" "https://fiwiki-tools.toolforge.org/helloworld7.txt" "https://fiwiki-tools.toolforge.org/helloworld8.txt" "https://fiwiki-tools.toolforge.org/helloworld9.txt" "https://fiwiki-tools.toolforge.org/helloworld10.txt"
[4] Downloading 'https://fiwiki-tools.toolforge.org/helloworld4.txt' ...
[4] Downloading 'https://fiwiki-tools.toolforge.org/helloworld6.txt' ...
[4] Downloading 'https://fiwiki-tools.toolforge.org/helloworld7.txt' ...
[4] Downloading 'https://fiwiki-tools.toolforge.org/helloworld8.txt' ...
[4] Downloading 'https://fiwiki-tools.toolforge.org/helloworld9.txt' ...
[4] Downloading 'https://fiwiki-tools.toolforge.org/helloworld10.txt' ...
[3] Downloading 'https://fiwiki-tools.toolforge.org/helloworld5.txt' ...
[1] Downloading 'https://fiwiki-tools.toolforge.org/helloworld3.txt' ...
[2] Downloading 'https://fiwiki-tools.toolforge.org/helloworld2.txt' ...
[0] Downloading 'https://fiwiki-tools.toolforge.org/helloworld1.txt' ...
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld4.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld10.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld7.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld9.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld8.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld6.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld5.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld3.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld2.txt]
HTTP ERROR response 404  [https://fiwiki-tools.toolforge.org/helloworld1.txt]

real	0m0.484s
user	0m0.051s
sys	0m0.014s

I don't think the requests library can pipeline or multiplex requests, but e.g. httpx can when using async Python and HTTP/2 (if the code is structured suitably):

#!/usr/bin/env python3

import asyncio, httpx, time

start = time.time()

async def log_request(request):
    print("[{:.3f}] Sending request".format(time.time()-start))

async def log_response(response):
    print("[{:.3f}] Response received".format(time.time()-start))

async def main():
    # A single HTTP/2 client, so all requests share one connection.
    client = httpx.AsyncClient(http2=True, event_hooks={
        "request": [log_request],
        "response": [log_response]
    })
    requests = [
        client.get("https://fiwiki-tools.toolforge.org/helloworld.txt")
        for i in range(10)
    ]
    # The ten GETs run concurrently and are multiplexed over that one connection.
    responses = await asyncio.gather(*requests)
    await client.aclose()
    for response in responses:
        print(response.text, end="")
    print("[{:.3f}] Done".format(time.time()-start))

asyncio.run(main())

Output:

$ time python3 multiplexing.py 
[0.028] Sending request
[0.032] Sending request
[0.032] Sending request
[0.033] Sending request
[0.033] Sending request
[0.033] Sending request
[0.033] Sending request
[0.034] Sending request
[0.035] Sending request
[0.035] Sending request
[0.501] Response received
[0.504] Response received
[0.506] Response received
[0.507] Response received
[0.508] Response received
[0.508] Response received
[0.509] Response received
[0.624] Response received
[0.627] Response received
[0.627] Response received
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
[0.628] Done

real	0m0.967s
user	0m0.372s
sys	0m0.040s

(I haven't been able to find out why, but usually some responses arrive ~500ms from the start, as here, while some (or all) always arrive at ~620ms.)

@taavi: Thanks for the latency explanation. Do you know if Cloud VPS instances are also physically running in the US, so they would be affected by the same round-trip delay? Also, would it be possible to use HTTP/3 in a Cloud VPS instance? (i.e. can UDP traffic reach a virtual machine from the outside world?)

@TuukkaH Thanks, I think I will change the code so that it uses sessions and see how much difference it makes. Thank you also for the HTTP/3 suggestion.

Cloud VPS instances are all in Ashburn, Virginia, together with Toolforge (which is itself a bunch of such instances), yes.

I closed the ticket as I got an answer to why www was slow. Thank you, everybody.