
Continued attacks on HTTP/2

By Daroc Alden
April 10, 2024

On April 3 security researcher Bartek Nowotarski published the details of a new denial-of-service (DoS) attack, called a "continuation flood", against many HTTP/2-capable web servers. While the attack is not terribly complex, it affects many independent implementations of the HTTP/2 protocol, even though multiple similar vulnerabilities over the years have given implementers plenty of warning.

The attack itself involves sending an unending stream of HTTP headers to the target server. This is nothing new — the Slowloris attack against web servers using HTTP/1.1 from 2009 worked in the same way. In Slowloris, the attacker makes many simultaneous requests to a web server. Each request has an unending stream of headers, so that the request never completes and continues tying up the server's resources. The trick is to make these requests extremely slowly, so that the attacker has to send relatively little traffic to keep all the requests alive.

In the wake of the Slowloris attack, most web servers were updated to place limits on the number of simultaneous connections from a single IP address, the overall size of headers, and on how long the software would wait for request headers to complete before dropping the connection. In some web servers, however, these limits were not carried forward to HTTP/2.

In 2019, there were eight CVEs reported for vulnerabilities exploitable in similar ways. These vulnerabilities share two characteristics — they involve the attacker doing unusual things that are not explicitly forbidden by the HTTP/2 specification, and they affect a wide variety of different servers. Web servers frequently have to tolerate clients that misbehave in a variety of ways, but the fact that these vulnerabilities went so long before being reported is perhaps an indication that there are few truly broken clients in use.

Two of these vulnerabilities in particular, CVE-2019-9516 and CVE-2019-9518, can involve sending streams of empty headers, which take a disproportionate amount of CPU and memory for the receiving server to process compared to the effort required to generate them. Nowotarski's attack seems like an obvious variation — sending a stream of headers with actual content in them. The attack is perhaps less obvious than it seems, given that it took five years for anyone to notice the possibility.

Continuation flooding

HTTP/2 is a binary protocol that divides communications between the client and the server into frames. Headers are carried in two kinds of frame: an initial HEADERS frame, followed by some number of CONTINUATION frames. The continuation flood attack involves sending a never-ending string of continuation frames, with random header values packed inside them. Because they are random, these headers are certainly not meaningful to the receiving server. Despite this, some servers still allocate space for them, slowly filling the server's available memory. Even servers which do place a limit on the size of headers they will accept usually choose a large limit, making it relatively straightforward to consume their memory using multiple connections.
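
For illustration, here is a rough sketch of the framing involved, with field widths taken from RFC 9113 rather than from any particular implementation. Every frame begins with a nine-byte header, and a header block only ends when a frame carries the END_HEADERS flag, which a continuation flood never sets:

    #include <stdint.h>

    #define FRAME_TYPE_HEADERS      0x1
    #define FRAME_TYPE_CONTINUATION 0x9
    #define FLAG_END_HEADERS        0x4   /* terminates the header block */

    /* Fill in the nine-byte HTTP/2 frame header: 24-bit payload length,
     * 8-bit type, 8-bit flags, and a 31-bit stream identifier. */
    static void write_frame_header(uint8_t out[9], uint32_t payload_len,
                                   uint8_t type, uint8_t flags,
                                   uint32_t stream_id)
    {
        out[0] = (payload_len >> 16) & 0xff;
        out[1] = (payload_len >> 8) & 0xff;
        out[2] = payload_len & 0xff;
        out[3] = type;
        out[4] = flags;   /* a flooding client never sets FLAG_END_HEADERS */
        out[5] = (stream_id >> 24) & 0x7f;
        out[6] = (stream_id >> 16) & 0xff;
        out[7] = (stream_id >> 8) & 0xff;
        out[8] = stream_id & 0xff;
    }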

Another wrinkle is that HTTP/2 requests are not considered complete until the last continuation frame is received. Several servers that are vulnerable to this attack don't log requests — failed or otherwise — until they are complete, meaning that the server can die before any indication of what happened makes it into the logs. The fact that continuation flooding requires only one request also means that traditional abuse-prevention tools, which rely on noticing a large number of connections or traffic from one source, are unlikely to automatically detect the attack.

Nowotarski listed eleven servers that are confirmed to be vulnerable to the attack, including the Apache HTTP Server, Node.js, and the server from the Go standard library. The same announcement stated that other popular servers, including NGINX and HAProxy, were not affected.

It is tempting to say that all these attacks — the eight from 2019, and now continuation flooding — are possible because HTTP/2 is a complex, binary protocol. It is true that HTTP/2 is substantially more complicated than HTTP/1.1, but every version of HTTP has had its share of vulnerabilities. The unfortunate truth is that implementing any protocol with as many divergent use cases as HTTP is difficult — especially when context is lost between designers and implementers.

The designers of HTTP/2 were well aware of the potential danger of DoS attacks. In July 2014, Roberto Peon sent a message to the ietf-http-wg mailing list talking about the potential for headers to be used in an attack:

There are three modes of DoS attack using headers:
1) Stalling a connection by never finishing the sending of a full set of headers.
2) Resource exhaustion of CPU.
3) Resource exhaustion of memory.

[...]

I think #3 is the interesting attack vector.

The HTTP/2 standard does not set a limit on the size of headers, but it does permit servers to set their own limits: "A server that receives a larger header block than it is willing to handle can send an HTTP 431 (Request Header Fields Too Large) status code." Yet despite this awareness on the part of the protocol designers, many implementers had not chosen to include such a limit.

In this case, fixing the vulnerability is relatively straightforward. For example, nghttp2, the HTTP/2 library used by the Apache HTTP Server and Node.js, imposed a maximum of eight continuation frames on any one request. However, this vulnerability still raises questions about the security and robustness of the web-server software we rely on.
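
The shape of such a fix is easy to sketch. The code below is only an illustration of the idea (the structure, names, and byte limit are invented, not nghttp2's actual code): count CONTINUATION frames and accumulated header bytes per stream, and give up on the peer once either bound is crossed.

    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_CONTINUATION_FRAMES 8            /* the cap described above */
    #define MAX_HEADER_BLOCK_BYTES  (64 * 1024)  /* illustrative value */

    struct header_block_state {
        unsigned continuation_frames;
        size_t   header_bytes;
    };

    /* Returns false once the peer exceeds either limit, at which point the
     * server can answer with 431 or simply drop the connection. */
    static bool on_continuation_frame(struct header_block_state *st,
                                      size_t frame_len)
    {
        if (++st->continuation_frames > MAX_CONTINUATION_FRAMES)
            return false;
        st->header_bytes += frame_len;
        return st->header_bytes <= MAX_HEADER_BLOCK_BYTES;
    }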

HTTP/2 is a critical piece of the internet. It accounts for somewhere between 35.5% and 64% of web sites, depending on how the measurement is conducted. There are several tools to help implementers produce correct clients and servers. There is a publicly available conformance testing tool — h2spec — to supplement each individual project's unit and integration tests. nghttp2 ships its own load-testing tool, and Google's OSS-fuzz provides fuzz testing for several servers. These tools hardly seem sufficient, however, in light of the ongoing discovery of vulnerabilities based on slight deviations from the protocol.

The continuation flood attack is not particularly dangerous or difficult to fix, but the fact that it affects so many independent implementations nearly nine years after the introduction of HTTP/2 is a stark wakeup call. Hopefully we will see not only fixes for continuation flooding, but also increased attention on web server reliability, and the tests to ensure the next issue of this kind does not catch us by surprise.



Continued attacks on HTTP/2

Posted Apr 10, 2024 14:21 UTC (Wed) by josh (subscriber, #17465) [Link] (10 responses)

HTTP/2 has less and less of a use case justifying it. I wouldn't be surprised if some new frameworks start only supporting HTTP/1.1 (for compatibility) and HTTP/3 (for performance).

Continued attacks on HTTP/2

Posted Apr 10, 2024 20:04 UTC (Wed) by barryascott (subscriber, #80640) [Link] (9 responses)

HTTP/3 would be attackable in the same way as HTTP/2, I assume, as the difference between them is that UDP is used instead of TCP.

Both will need memory limits and suitable timeouts to defend against attacks or client bugs.

Continued attacks on HTTP/2

Posted Apr 10, 2024 21:09 UTC (Wed) by Heretic_Blacksheep (subscriber, #169992) [Link]

Probably. Josh's argument would fit nearly any HTTP/1.x replacement. "No need for HTTP/4, HTTP/1.x for compatibility & HTTP/5 for 'performance'." ... Oops, forgot to set resource limits for all HTTP/5 sessions... HTTP/6 for "performance"... Pretty soon it's XKCD territory again.

Continued attacks on HTTP/2

Posted Apr 10, 2024 21:38 UTC (Wed) by josh (subscriber, #17465) [Link]

I'm not suggesting otherwise. I'm observing that it's easier to secure and optimize two versions of HTTP than three versions of HTTP, in terms of attack surface area.

(And no, the differences between HTTP/2 and HTTP/3 are more substantial than just TCP vs UDP.)

Continued attacks on HTTP/2

Posted Apr 10, 2024 22:19 UTC (Wed) by flussence (guest, #85566) [Link] (4 responses)

If you install a mainstream webserver today, HTTP/3 practically doesn't even exist. It's unlikely to exist in that realm for a year or two, and when it finally becomes an option it'll be an invasive change for system administrators. I'm sure we'll start seeing the usual CVE per month when that happens. It's going to be fun to discover the consequences of throwing out the last 15 years of TCP congestion control work for userspace rate-control code of varying quality too.

Maybe the reason we haven't heard much of security problems with H3 yet is that it's more or less the exclusive domain of global domination CDNs with enough money to afford top-to-bottom control of their stack; people who only concern themselves with situations that show up as an outlier on a slick, javascript-animated metrics dashboard graph, perhaps hosted on a vanity gTLD. At its core H3 currently suffers from the same phenomenon that saddled btrfs with a decade-long reputation for losing your data: Facebook has a million-server failover setup and does not care about “Earth dweller” matters like RAID5/6 being broken.

Not that this is to imply HTTP/1.1 is secure, mind you. The push for H2 adoption was in part a collective panic from finding out how many ways a simple line-based text protocol can be screwed up.

Continued attacks on HTTP/2

Posted Apr 11, 2024 9:31 UTC (Thu) by paulj (subscriber, #341) [Link]

HTTP/3 runs over QUIC, and QUIC requires a congestion controller to be implemented. That's usually either BBRv1 and/or CUBIC in implementations I'm familiar with. I.e., they're doing the same congestion control as TCP, and they should work fairly with existing TCP flows.

HTTP/3 packaging problems: is OpenSSL at fault?

Posted Apr 13, 2024 13:57 UTC (Sat) by DemiMarie (subscriber, #164188) [Link] (2 responses)

Is this because of OpenSSL not supporting third-party QUIC implementations?

HTTP/3 packaging problems: is OpenSSL at fault?

Posted Apr 13, 2024 17:24 UTC (Sat) by mbunkus (subscriber, #87248) [Link] (1 responses)

Somewhat related: I saw a toot[1] by Daniel Stenberg (the main curl developer/maintainer) the other day about the still bad performance of the QUIC implementation of even the latest & greatest OpenSSL release 3.3. It linked to a mailing list post[2] which included some more details about both the API being insufficient (having to fall back to inefficient pulling) & the performance severely lacking.

Make of that what you will.

[1] https://mastodon.social/@bagder/112243310605678729
[2] https://curl.se/mail/distros-2024-04/0001.html

HTTP/3 packaging problems: is OpenSSL at fault?

Posted Apr 14, 2024 5:42 UTC (Sun) by wtarreau (subscriber, #51152) [Link]

The problem is that the openssl devs are stubborn and decided that it suddenly became their job to develop a transport layer that would work everywhere. There are plenty of different QUIC stacks and each of them is different precisely because the constraints are not the same in each implementation. What was needed from them was just to accept the 28 patches from quictls to expose the minimally needed internals, nothing more. But apparently this was not giving a good image of their work to the foundation so they preferred to pretend they would develop the ultimate QUIC stack to keep some funding. They'd rather work on fixing their monstrous locking issues that plague their code on modern systems instead. That's where everyone is expecting them to spend their time, in order to address the mess they created and that only them can fix.

Continued attacks on HTTP/2

Posted Apr 11, 2024 9:29 UTC (Thu) by paulj (subscriber, #341) [Link]

I don't know if there's any significant difference between HTTP/3 and 2 at the HTTP application layer. However, HTTP/3 runs over QUIC streams. QUIC streams are individually flow-controlled. The receiver will advertise a window - its buffer size for that stream - and the sender cannot send more until the receiver has processed some or all of the buffer.

tl;dr: The described DoS does not apply to at least the QUIC streams portion of HTTP/3.
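
A toy sketch of that window accounting, with invented names rather than any real QUIC stack's API:

    #include <stdint.h>

    struct stream_fc {
        uint64_t max_stream_data;   /* highest offset the peer lets us send to */
        uint64_t sent_offset;       /* how far we have sent on this stream */
    };

    /* Sender side: how many more bytes may be sent right now? Zero means
     * the stream is blocked until the receiver raises the limit. */
    static uint64_t fc_send_allowance(const struct stream_fc *fc)
    {
        return fc->max_stream_data - fc->sent_offset;
    }

    /* Called when a MAX_STREAM_DATA frame arrives; the receiver only raises
     * the limit as it drains its buffer, so a flood of header data stalls
     * instead of exhausting the receiver's memory. */
    static void fc_on_max_stream_data(struct stream_fc *fc, uint64_t new_limit)
    {
        if (new_limit > fc->max_stream_data)
            fc->max_stream_data = new_limit;
    }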

Continued attacks on HTTP/2

Posted Apr 11, 2024 18:09 UTC (Thu) by danielkza (subscriber, #66161) [Link]

I also thought that was the case, but QPACK, used in HTTP/3 for header encoding (replacing HPACK from HTTP/2), works differently to avoid requiring blocks of header data to be processed in order, and has strict limits on delays and memory usage that need to be observed by the sender, otherwise risking complete rejection of the request.

Though I found QPACK quite complex (having been able to implement HPACK myself without much difficulty in the past), it does seem to be safer against Slowloris-style attacks by design. Implementing the RFC as-is already gets you quite close to avoiding these attacks altogether.

Continued attacks on HTTP/2

Posted Apr 10, 2024 22:59 UTC (Wed) by Curan (subscriber, #66186) [Link] (2 responses)

> It is tempting to say that all these attacks — the eight from 2019, and now continuation flooding — are possible because HTTP/2 is a complex, binary protocol.

It is, and it is most likely true, isn't it? What was the real benefit of HTTP/2 over HTTP/1.1? Was it the compressed headers? Then it might just be easier to get rid of all the pointless ones and save space. Are there real issues we could not have solved by mandating pipelining for HTTP/2 or a hypothetical HTTP/1.3 over 1.1? Scheduling of responses on the server is also an issue thanks to the wonderful idea of multiplexing. And many more issues.

And hey, there is always https://queue.acm.org/detail.cfm?id=2716278 to point to.

Continued attacks on HTTP/2

Posted Apr 11, 2024 17:13 UTC (Thu) by barryascott (subscriber, #80640) [Link] (1 responses)

HTTP/2 only needs one TCP connection and then multiplexes lots of HTTP transactions over that one connection. This speeds up loading many resources in parallel.

I cannot remember if there is header compression, I don't recall seeing that, but I've not been keeping up.

HTTP/3 uses UDP so that the big problem of HTTP/2, head-of-line blocking, could be solved.

Continued attacks on HTTP/2

Posted Apr 11, 2024 19:07 UTC (Thu) by danielkza (subscriber, #66161) [Link]

HTTP/2 uses a custom encoding/compression for headers named HPACK; HTTP/3 uses a successor based on the same principles named QPACK.

Continued attacks on HTTP/2

Posted Apr 11, 2024 12:26 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (60 responses)

What's really annoying with the security theater these days is that the same vulnerabilities are "rediscovered" in loops every 10 years because neither implementers nor security researchers read specs. The risks of DoS around CONTINUATION were discussed to death 10 years ago during the design phase between those of us who just didn't want it and those who wanted infinite headers, and led to tons of proposals and "I will never ever implement this dangerous crap" for every proposal. In the end the current status on CONTINUATION was the least rejected, was clearly documented as dangerous in section 10.5 of the spec by then, and guess what? 10 years later you find that the stacks that are finally completely unaffected are the ones from those who were saying "warning, DoS in sight". Just found the archives, they're here (most of the longest threads such as "striving for compromise", "jumbo != fragmentation", "large frames", "continuation" etc):
https://lists.w3.org/Archives/Public/ietf-http-wg/2014Jul...

While I can hardly imagine why someone would be {cr,l}azy enough to want to allocate RAM for an incomplete in-order frame instead of naturally memcpy() it at the end of the previous one (except maybe when dealing with problematic languages that make it hard to simply store data in anything but an object), I find it ironical that in the end it's the Go, Rust, JS and C++ stacks that are vulnerable and that the 3 pure-C ones are not :-) It's at least a hint to fuel my suspicion that not being constrained by a language to do things only following certain (possibly wrong) ways avoids certain classes of control-based bugs.

Continued attacks on HTTP/2

Posted Apr 11, 2024 15:07 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (47 responses)

I would suggest that what you've seen is that in a language (C) where even easy things are difficult, people refrained from trying to do hard things entirely, and so you won't find any bugs in the hard things only because they weren't implemented at all.

If we're sure hard things are just always a bad idea then I agree that's a win for C. But, maybe hard things are sometimes a good idea, and then in C you're going to really struggle.

Continued attacks on HTTP/2

Posted Apr 11, 2024 17:53 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (44 responses)

> you won't find any bugs in the hard things only because they weren't implemented at all

I'm definitely certain that it's not the case since this stuff was designed (based on the working group feedback) to just be of moderate difficulty, and was implemented as necessary by everyone. And when I did it on my side, I did it with DoS in mind. I.e. the vast majority of the time the code will be used will not be because Chrome wants to grease the connection but because an attack is sending 10 million frames a second to the front machine and that one must resist.

> If we're sure hard things are just always a bad idea then I agree that's a win for C. But, maybe hard things are sometimes a good idea, and then in C you're going to really struggle.

For some complex features that's generally true. I.e. I wouldn't write a browser. But when dealing with network protocols and wire parsers in general, C is really well suited because operations are generally simple, and you want to be able to use every opportunity to save every single possible nanosecond since it's just a matter of life or death for your program when facing the brutality of the net.

Continued attacks on HTTP/2

Posted Apr 13, 2024 7:42 UTC (Sat) by epa (subscriber, #39769) [Link] (43 responses)

What you say is probably true for expert programmers. But for anyone less than perfect C is a dangerous choice for wire parsers or for decoding any binary format. The past thirty years have seen a steady stream of vulnerabilities from file format parsers written in C. A safer language (with checked array access, error on overflow, and less need for pointer manipulation) would be a better choice for most fallible humans when faced with a skilled attacker who only needs to be lucky once.

Continued attacks on HTTP/2

Posted Apr 15, 2024 12:22 UTC (Mon) by wtarreau (subscriber, #51152) [Link] (5 responses)

The problem is more about the practice and the way the language was taught rather than the language itself (others have the same limitations, starting from assembly). The problem is that when you're first taught programming, it's explained to you that if you go out of bounds, it will segfault and crash. And particularly since systems have been starting to ship with ulimit -c 0 by default, this has been perceived as the cheap error handling: why write a length check and go through error handling if the result is the same ?

What we need is computer teachers explaining first how to exploit a parser bug via a config file, a log line, a large HTTP header, etc. so that early developers keep this in mind and never ever start with a bad habit. Laziness should not be an option. In parallel, compilers should make it easier to write correct code. For now they're doing the opposite; they're making it ultra-difficult to write correct code, forcing users to cast, put attributes and whatnot to shut up stupid warnings, so the simplest path to valid code is the least safe.

Continued attacks on HTTP/2

Posted Apr 15, 2024 12:47 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> And particularly since systems have been starting to ship with ulimit -c 0 by default, this has been perceived as the cheap error handling: why write a length check and go through error handling if the result is the same ?

When I was studying at a university about 25 years ago, we actually had a mandatory course where we wrote an exploit, with simple shell code, for a deliberately vulnerable server written in C. So people certainly know about the danger of OOB access.

Continued attacks on HTTP/2

Posted Apr 16, 2024 3:23 UTC (Tue) by wtarreau (subscriber, #51152) [Link]

That's great that you had this opportunity. The first time a person taught me about the ability to overflow a buffer and execute code, 30 years ago, I almost laughed and said "you'd be lucky if that would surprisingly work", and he told me "it works more often than you think". That's when I started experimenting with it, and found it so hard to achieve on sparc (due to switched register banks) that I wrote a generic exploitation tool for it and finally managed to get root on some systems :-) I just felt sad that it was so much ignored by teachers themselves.

Continued attacks on HTTP/2

Posted Apr 15, 2024 13:30 UTC (Mon) by farnz (subscriber, #17727) [Link]

The person I worked with who was worst for writing code with security bugs was taught in exactly the way you describe; his attitude after graduating was that this was "just theory", and therefore he didn't have to care about secure handling of predictable errors since "it crashes, so we'll know if it's got a bug because we'll get bug reports". He was great at exploiting bugs, but useless at preventing them.

IME, the thing that helps you learn is to work in languages where you simply cannot write certain classes of bug without a compiler error; writing code that compiles in Agda is a lot harder to learn to do than writing code that C compilers will accept, but if you're used to thinking in terms of "how do I write this in a way that the compiler can see is bug-free?", you're better at writing code that is genuinely bug free, even when you then learn how to write C (albeit that you're also more likely to write a program extractor that takes your Agda, removes the proof-related bits, and outputs C).

Continued attacks on HTTP/2

Posted Apr 15, 2024 19:52 UTC (Mon) by epa (subscriber, #39769) [Link] (1 responses)

Even if it did reliably segfault and crash, that’s still not a good choice for a format parser run in-process as part of a larger application. Heck, even a command line tool would be considered buggy if it crashed on bad input instead of giving a helpful error.

Continued attacks on HTTP/2

Posted Apr 16, 2024 11:05 UTC (Tue) by paulj (subscriber, #341) [Link]

This was kind of the experience with the "wise" C network code I maintained. The parsers - as per other comments - were structured to use a simple, checking abstraction layer to read/write atoms. If the higher-level parser made a logical mistake and issued an out of bounds read or write, the checking layer would abort().

This solved the problem of buffer overflows. However, we would still generally have a DoS security bug from the process ending itself. Obviously, an abort() and possible DoS is still /way/ better than an overflow and RCE security bug, but also still not ideal.

The next challenge was to make the parsers logically robust. Memory safe languages do not solve this.

Explicit state machines, in languages (programming or DSLs) that can verify error events are always handled (e.g., by proceeding to some common error state), can help. And even C these days can trivially ensure that all events are handled in a state machine. Requires programmer discipline though.
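
For instance, one plain-C version of that discipline (states and events here are hypothetical) is to switch over the event enum with no default case, so that GCC/Clang's -Wswitch warning (part of -Wall) flags any event that is added later without a handler:

    enum pstate { ST_IDLE, ST_HEADERS, ST_BODY, ST_ERROR };
    enum pevent { EV_HEADERS, EV_CONTINUATION, EV_DATA, EV_BAD_FRAME };

    /* Every (state, event) pair resolves to an explicit next state; anything
     * unexpected lands in ST_ERROR rather than being silently ignored. */
    static enum pstate step(enum pstate st, enum pevent ev)
    {
        switch (ev) {            /* no default: -Wswitch keeps it exhaustive */
        case EV_HEADERS:
            return st == ST_IDLE ? ST_HEADERS : ST_ERROR;
        case EV_CONTINUATION:
            return st == ST_HEADERS ? ST_HEADERS : ST_ERROR;
        case EV_DATA:
            return (st == ST_HEADERS || st == ST_BODY) ? ST_BODY : ST_ERROR;
        case EV_BAD_FRAME:
            return ST_ERROR;
        }
        return ST_ERROR;         /* not reached while the switch stays complete */
    }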

It's worth noting in this story (IIUC) that the implementations in "memory safe" languages were vulnerable to the bug, and implementation in the "unsafe" language was not. ;)

Continued attacks on HTTP/2

Posted Apr 15, 2024 14:07 UTC (Mon) by paulj (subscriber, #341) [Link] (36 responses)

It is possible to write parsers safely in C. All that is needed is to separate the parser from the data manipulation. E.g., by providing a small, simple buffer abstraction that provides methods to safely read and write the various atoms which the parser should deal in. The methods do the bounds checking. Often such abstractions provide a cursor, from which to next read or write (or separate read and write cursors), but that is not necessary.

Such an abstraction is pretty trivial to write. It's pretty simple, and low in branches - trivial to make provably correct (by inspection, by 100% branch coverage, etc.), and it's still going to be fast in C. The parser can be guaranteed to be immune to buffer overflows, and guaranteed that it will terminate cleanly if it makes a syntactical error. It's also good practice in writing parsers to structure code this way, regardless of how safe the language is.
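
A minimal sketch of the read side of such an abstraction (the names are invented, not taken from any particular project): the parser only ever consumes atoms through these helpers, so an out-of-range read becomes a clean parse failure instead of a memory access.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    struct cursor {
        const uint8_t *buf;
        size_t len;   /* total bytes available in buf */
        size_t pos;   /* next byte to read; invariant: pos <= len */
    };

    static bool cur_read_u8(struct cursor *c, uint8_t *out)
    {
        if (c->len - c->pos < 1)
            return false;          /* clean parse failure, never an overflow */
        *out = c->buf[c->pos++];
        return true;
    }

    static bool cur_read_u16be(struct cursor *c, uint16_t *out)
    {
        if (c->len - c->pos < 2)
            return false;
        *out = (uint16_t)((c->buf[c->pos] << 8) | c->buf[c->pos + 1]);
        c->pos += 2;
        return true;
    }

    static bool cur_read_bytes(struct cursor *c, uint8_t *dst, size_t n)
    {
        if (c->len - c->pos < n)
            return false;
        memcpy(dst, c->buf + c->pos, n);
        c->pos += n;
        return true;
    }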

Doesn't handle logical errors in state, but then... neither do "safe" languages.

I've worked on network code using such, with significant internet exposed parsers, and never had any buffer overflow bugs or undefined behaviour, in... decades. (Thank you Kunihiro Ishiguro for your wisdom on this - back in the 90s, an era full of C network code that lacked your wisdom, consequences of which we still pay for today).

That said... There is the problem that other programmers will sometimes come along, the kind who are sure they can write parsers that directly twiddle pointers and who think such abstractions are overhead, and they will... ignore those abstractions and write direct-pointer-twiddling parsing code. And such programmers - of course - are never as infallible and great as they thought, and such code does eventually end up having overflow and attacker-controlled, undefined memory write bugs. And even if a more disciplined maintainer/reviewer type tries to say no to these programmers, they will take umbrage - causing other problems - and they may find ways to get around it and get their code in anyway.

So, if you have the discipline to correctly layer your parsing code into a low-level, simple, checked lexer (of whatever atoms - binary or otherwise) and a higher-level logical parser, you can do this safely and trivially in C. However, I also understand the argument that - through ignorance and/or ego - some programmers simply cannot be trusted to exercise that discipline, and a programming language that enforces it on them is required.

Continued attacks on HTTP/2

Posted Apr 15, 2024 14:46 UTC (Mon) by Wol (subscriber, #4433) [Link] (1 responses)

> So, if you have the discipline to correctly layer your parsing code, into a low-level, simple, checked, lexer (of whatever atoms - binary or otherwise) a higher-level logical parser, you can do this safely and trivially in C.

Again, this is what I bang on about always having a state table - in your head if nowhere else! If you have two input state variables, and three output states, then you have a massive logic hole that someone can exploit! I take Pizza's point about that 1024x1024 state table, but at the end of the day dealing with that is (mathematically) trivial. Especially if you can say "these are the state bits I am dealing with, everything else is EXPLICITLY a no-op".

Talking of parsers, my supervisor wrote a lexer/parser he was very proud of years ago. A couple of months later I was having a load of trouble with it (it was meant to process simple calculations in financial reporting sheets, so getting the right answer was rather important...). So I got out my brother's book on compilers, replaced EIGHT pages of printout that was the lexer with one line - a statement call that was built into the language - and rewrote the parser to output an RPN state table. In the process, I discovered that my supervisor had messed up his lexing rules and all the bugs were down to partially-lexed output being fed into the parser. Whoops. My resulting engine was much more robust, and much more powerful (in that it had fully-functional precedence rules).

ALWAYS handle ALL possible states, even if it's a no-op, but it has to be a conscious no-op! Anything else is an error, a bug, a potential vulnerability.

Cheers,
Wol

Continued attacks on HTTP/2

Posted Apr 15, 2024 15:25 UTC (Mon) by paulj (subscriber, #341) [Link]

Dealing with state machines in a structured way is also good in parsing, yeah.

There are tools out there for this, though, most seem geared for FSMs for text. So you probably have to roll your own. But that isn't difficult either in most languages.

Network protocol state machines are highly unlikely to have a distinct space of 1024 states by 1024 events though. It's usually much more tractable. And if a network protocol state machine did have that, I'd start looking at backing away from that protocol. (E.g., by supporting only a simplified subset in some compat mode, and designing my own simpler version for my needs). ;)

Continued attacks on HTTP/2

Posted Apr 15, 2024 15:06 UTC (Mon) by mb (subscriber, #50428) [Link] (14 responses)

Yep, simply don't write bugs and you'll end up having completely bug free code.
It's that simple.

Except that it apparently isn't. Reality proves that.

We have been trying that for decades. But still about half of the security relevant bugs are simple memory corruption bugs.

It's great that you know how to write a safe parser. You are an experienced developer and you made your parser implementation mistakes probably decades ago. You learnt from them.

Most people are not on that level, though.
But still, they will write parsers. For example because their employer demands it. Or because they think they are able to correctly do it.

And that is exactly where memory safe languages come into play. They detect bugs that beginners make and they give an assurance to professionals that their code is indeed correct. Both are a win.

Continued attacks on HTTP/2

Posted Apr 15, 2024 15:18 UTC (Mon) by paulj (subscriber, #341) [Link] (1 responses)

I'm old enough to have the scars, yeah.

I guess my point is that:
1. I agree that it's better to use safe languages by default.
2. However, I would /disagree/ with anyone who said we should /never/ use C anymore for network parsers. If done correctly, with the right layering of checked abstractions, it can be done safely. It's not automatically bad. And performance does still matter.

I'll echo Willy's point about CPU and energy use mattering in DC contexts. Every % of extra CPU in lower-level network code costs significant amounts of money per year in both OpEx and CapEx when it is underpinning the majority of connections in all the applications you're running on X thousands of servers.

Continued attacks on HTTP/2

Posted Apr 15, 2024 16:09 UTC (Mon) by mb (subscriber, #50428) [Link]

>And performance does still matter.

I don't think that a Rust implementation would be any slower than your hand written verify-every-raw-tokenization C implementation.
In fact, the Rust implementation could potentially even be faster, if it is able to eliminate certain down-stream checks by proving that they can't happen. (e.g. token enum can't have an invalid value).
But at the very least it will be as fast as the C code, in this case. The verification checks of each token access in the raw stream are required in both cases.

Yes, there's probably no reason to rush and rewrite all existing parsers in memory safe languages.
But there's also hardly any reason to still write new parsers in unsafe languages.

Continued attacks on HTTP/2

Posted Apr 15, 2024 15:37 UTC (Mon) by paulj (subscriber, #341) [Link] (2 responses)

Oh, and my other point there was that there is a very simple way to make parsing in C safe - just a very simple checking abstraction. It's really not much code, nor complex.

You don't have to write it. You can find battle-tested ones. They are very easy to use.

For some reason, for decades, C programmers just - by and large - have ignored this option. It goes beyond ignorance or hubris of individual programmers. It speaks to a failure in Software Engineering as a profession. That we have so many engineers who simply don't understand how to do safe parsing in a low-level language.

Cause it isn't hard. It really is not.

It's actually a trivial problem. That /specific/ problem should have been fixed /decades/ ago simply by educating young engineers correctly. It should /not/ have needed a) the invention of new languages with either complex runtimes and/or complex type systems; b) the wide-spread adoption of such languages; c) the rewriting of old code into such languages. As a), b) and c) obviously imply a very significant amount of time to solve the problem.

I am not arguing against a, b and c - because they can solve many other issues besides just the specific problem of preventing buffer overflows in network parsers due to syntactical or logical errors. However, that specific problem is trivial and should have been addressed decades ago.

There is a deeper problem here in the profession of software engineering. We suck at being "professional engineers".

Continued attacks on HTTP/2

Posted Apr 16, 2024 8:47 UTC (Tue) by paulj (subscriber, #341) [Link] (1 responses)

And just to illustrate how. Consider how many times in professional publications (inc. LWN) you have read articles of the following form:

1. Here's a shiny new language, which can fix the regularly occurring problem of buffer overflows in network exposed software, leading to serious or catastrophic security issues, which has been a long standing issue. Here are the shiny features which can prevent such issues, and make programmers more productive. The language is almost stable, and more and more people are using it, and we're seeing more serious software being written in it. The more it's used the better! Shiny!

2. Buffer overflows in network exposed software, leading to serious or catastrophic security issues, are a long standing issue in our profession. The language of 1 or 2 (or ...) has a lot of promise in systematically solving this problem. However, existing languages are certain to remain in use until language X is stable and widely used. Further, it will take a long time before much software in existing languages is rewritten in language 1. This article covers simple techniques that can be used to make network exposed parsers at least secure against buffer overflows, which every programmer writing network code in existing languages should know. It includes (pointers to) proven, simple library code that can be used.

Think how many times over the last few *decades* you have seen articles of the first form, and how often of the latter form. I am sure you have seen the first form many times. I will wager it is rare you have seen the latter, indeed, possibly never. Yet, a _serious_ profession of engineering would _ensure_ (via the conscientiousness of the engineers practicing it) that articles of the latter form were regularly printed, to drum it into other engineers.

Worse, there is a culture of even putting down ad-hoc comments making the points of form 2, (and I'm _not_ trying to make a dig at commenters here!) with "Sure, yeah, just write bug-free software!" usually with something like "Seriously, shiny language is the only way.". And yes, there is a lot of *truth* to it that $SHINY_LANGUAGE is the way forward for the future. However that is on a _long_ term basis. In the here and now, techniques that apply to current languages, and current software, and do /not/ need to wait for shiny language to mature, stabilise and be widely deployed, understood and with a body of software to build on (library code, learning from, etc.), are _required_ to solve problems _until then_. Frowning on attempts to distribute that knowledge has _not_ helped this engineering profession avoid many security bugs over the last decade+ while we have waited for (e.g.) Rust.

A serious engineering profession would be able to _both_ look forward to the next-generation shiny stuff, _and_ be able to disseminate critical information about best-practice techniques for _existing_ widely-used tooling. It's not an either-or! :)

Fellow engineers, let's do better. :)

Continued attacks on HTTP/2

Posted Apr 16, 2024 14:08 UTC (Tue) by wtarreau (subscriber, #51152) [Link]

100% agreed.

But there's also another factor which does not help, and I'm sure many of us have known such people: the lower level the language you practise, the more expert you look to some people. And there are a lot of people using C and ASM who regularly brag, saying "ah, you didn't know this? I genuinely thought everyone did". That hurts a lot because it puts a barrier in the transmission of knowledge on how to do better. In addition, with the C spec not being freely available (OK, the latest draft is, but that's not a clean approach), you figure that a lot of corner cases have long been ignored by a lot of practitioners (by far the vast majority, starting with teachers at school). For example I learned that signed ints were not supposed to wrap only after 25 years of routinely using them that way, when the GCC folks suddenly decided to abuse that UB. Having done ASM long before C, what a shock it was to discover that the compiler was no longer compatible with the hardware and that the language supported this decision! Other communities are probably more welcoming to newbies and do not try to impose their tricks saying "look, that's how experienced people do it". As such I think that some languages develop a culture but that there's still little culture around C or ASM, outside of a few important projects using these languages like our preferred operating system.

Continued attacks on HTTP/2

Posted Apr 15, 2024 17:12 UTC (Mon) by farnz (subscriber, #17727) [Link] (8 responses)

Yep, simply don't write bugs and you'll end up having completely bug free code. It's that simple.

Except that it apparently isn't. Reality proves that.

No; what reality proves is that most of us can't dependably do the "don't write bugs" part of that, and that the temptation of 1% better on a metric is enough that you will almost always find someone who succumbs to temptation to move the metric, even at the risk of introducing serious bugs.

For example, if you design your code such that the parser attempts to consume as many bytes from the buffer as it can, returning a request that the I/O layer grow the buffer if it can't produce another item, then you get a great pattern for building reliable network parsers in any language, and the only thing that makes it easier in newer languages is that they've got more expressive type systems.

But it's always tempting to do some "pre-parsing" in the I/O layer: for example, checking a length field to see if the buffer is big enough to parse another item. And then, once you're doing that, and eliding the call to the "full" parser when it will clearly fail, it becomes tempting to inline "simple" parsing in the I/O function, which will perform slightly better on some benchmarks (not those compiled with both PGO and LTO, but not everyone takes care to first get the build setup right then benchmark).

And, of course, it also gets tempting to do just a little I/O in the parser - why not see if there's one more byte buffered if that's all you need to complete your parse. If one byte is OK, why not 2 or 3? Or maybe a few more?

That's where tooling comes in really handy - if you use tokio_util::codec to write your parser, it's now really hard to make either of those mistakes without forking tokio_util, and it's thus something that's less likely to happen than if you wrote both parts as separate modules in the same project. But there's nothing about this that "needs" Rust; you can do it all in assembly language, C, FORTRAN 77, or any other language you care to name - it's "just" a tradeoff between personal discipline of everyone on the project and tool enforcement.
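
As a contrived sketch of the parser half of that split, assuming a simple length-prefixed record format (everything here is invented for illustration):

    #include <stddef.h>
    #include <stdint.h>

    enum parse_status { PARSE_ITEM, PARSE_NEED_MORE, PARSE_ERROR };

    /* One record is a 2-byte big-endian length followed by that many bytes.
     * The parser never reads past len and never touches the socket; when the
     * buffer is short it returns PARSE_NEED_MORE and the I/O layer grows the
     * buffer (up to its own hard cap) and calls again. */
    static enum parse_status parse_record(const uint8_t *buf, size_t len,
                                          size_t *consumed)
    {
        if (len < 2)
            return PARSE_NEED_MORE;
        size_t body = ((size_t)buf[0] << 8) | buf[1];
        if (body > 16384)
            return PARSE_ERROR;         /* hard bound on any single record */
        if (len < 2 + body)
            return PARSE_NEED_MORE;
        *consumed = 2 + body;
        return PARSE_ITEM;
    }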

Continued attacks on HTTP/2

Posted Apr 15, 2024 17:41 UTC (Mon) by mb (subscriber, #50428) [Link] (7 responses)

>personal discipline

Yeah. We have been told that for decades.
"Just do it right."
"Adjust your personal discipline."
"Educate yourself before writing code."

But it doesn't work. Otherwise we would not still have massive amounts of memory corruption bugs in C programs.

Rust takes the required "personal discipline" part and throws it right back onto the developer, if she does not behave.
Bad programs won't compile or will panic.

Forcing the developer to explicitly mark code with "I know what I'm doing, I educated myself" (a.k.a. unsafe) is really the only thing that successfully actually reduced memory corruption bugs in a systems programming language, so far.

So, if you *really* know what you are doing, feel free to use the unchecked variants. You just need to mark it with "unsafe".
It's fine.

Continued attacks on HTTP/2

Posted Apr 16, 2024 9:08 UTC (Tue) by farnz (subscriber, #17727) [Link] (6 responses)

It does work - if it didn't work, then Rust would fail because people would just use unsafe without thinking. Rust lowers the needed level of personal discipline by making it clearer when you're "cheating", because you have to write unsafe to indicate that you're going into the high-discipline subset, but otherwise it's the same as C.

So, by your assertions, I should expect to see a large amount of Rust code with undisciplined uses of unsafe, since all developers cheat on the required programming discipline that way. In practice, I don't see this when I look at crates.io, which leads me to think that your assertion that no programmer is capable of remaining disciplined in the face of temptation is false.

What we do know is that population-wide, we have insufficient programmers capable of upholding the required level of discipline to use C or C++ safely even under pressure to deliver results. It's entirely consistent to say that wtarreau and paulj are capable of sustaining this level of discipline in a C codebase, while also saying that the majority of developers will be tempted to move a metric by 1% at the risk of unsafety in a C codebase.

Continued attacks on HTTP/2

Posted Apr 16, 2024 9:52 UTC (Tue) by mb (subscriber, #50428) [Link] (1 responses)

Nope, you mis-interpreted what I was trying to say.
Keep in mind that in C everything is unsafe.

Continued attacks on HTTP/2

Posted Apr 16, 2024 10:40 UTC (Tue) by farnz (subscriber, #17727) [Link]

Nope, you misinterpreted what I was trying to say.

Keep in mind that things being unsafe is only a problem if you fail to maintain sufficient discipline - and that individuals often can maintain that discipline, even when a wider group can't.

Continued attacks on HTTP/2

Posted Apr 16, 2024 10:10 UTC (Tue) by Wol (subscriber, #4433) [Link] (1 responses)

> What we do know is that population-wide, we have insufficient programmers capable of upholding the required level of discipline to use C or C++ safely even under pressure to deliver results. It's entirely consistent to say that wtarreau and paulj are capable of sustaining this level of discipline in a C codebase, while also saying that the majority of developers will be tempted to move a metric by 1% at the risk of unsafety in a C codebase.

The point of "unsafe" is the programmer has to EXPLICITLY opt in to it. The problem with C, and assembler, and languages like that, is that the programmer can use unsafe without even realising it.

That's also my point about state tables. EVERY combination should be EXPLICITLY acknowledged. The problem is that here the programmer has to actively opt in to safe behaviour. Very few do. I certainly try, but doubt I succeed. But if somebody reported a bug against my code and said "this is a state you haven't addressed" I'd certainly acknowledge it as a bug - whether I have time to fix it or not. It might get filed as a Round Tuit, but it would almost certainly be commented, in the code, as "this needs dealing with".

Cheers,
Wol

Continued attacks on HTTP/2

Posted Apr 16, 2024 11:27 UTC (Tue) by farnz (subscriber, #17727) [Link]

It is entirely possible to have sufficient discipline (as an individual) to not use unsafe code without realising it, and to have a comment in place that acknowledges every use of a partially defined operation (which is the issue with unsafe code - there are operations in unsafe code that are partially defined) and justifies how you're sticking to just the defined subset of the operation.

However, this takes discipline to resist the temptation to do something partially defined that works in testing; and the big lesson of the last 200 years is that there's only two cases where people can resist the temptation:

  1. Liability for latent faults lies with an individual, who is thus incentivized to ensure that there are no latent faults. This is how civil engineering today handles it - and it took us decades to get the process for this liability handling correct such that either a structure has sign-off from a qualified individual who takes the blame if there are latent faults, or the appropriate people get penalised for starting construction without sign-off.
  2. It's simple and easy to verify that no latent faults exist; this is a capability introduced from mathematics into engineering, where proving something is hard, but verifying an existing proof is simple. This is also used in civil engineering - actually proving that a structure built with certain materials will stand up is hard, but it's trivial to confirm that the proof that it will stand up is valid under the assumptions we care about.

And that's where modern languages help; we know from mathematics that it's possible to construct a system with a small set of axioms, and to then have proofs of correctness that are easy to verify (even if some are hard to construct). Modern languages apply this knowledge to cases where we know that, as a profession, we make problematic mistakes frequently, and thus make it easier to catch cases where the human programmer has made a mistake.

Continued attacks on HTTP/2

Posted Apr 16, 2024 11:11 UTC (Tue) by paulj (subscriber, #341) [Link] (1 responses)

> the majority of developers will be tempted to move a metric by 1% at the risk of unsafety in a C codebase.

And then those developers will write features ignoring the checking abstraction that makes the parsers safe (at least from overflows), and submit patches with the typical, hairy, dangerous, C parser-directly-twiddling-memory code. And they'll get annoyed when the maintainer objects and asks them to rewrite using the safe abstraction that has protected the project well for years.

Sigh.

Continued attacks on HTTP/2

Posted Apr 16, 2024 11:22 UTC (Tue) by farnz (subscriber, #17727) [Link]

Exactly, and the direction of travel from hand-writing machine code, through assembly, then macro assemblers, and into BLISS, C, Rust and other languages is to move from "this works just fine, it's just hard to verify" to "this is easy to verify, since the machine helps a lot, and the human can easily do the rest".

But it's not that Rust makes it possible to avoid mistakes that you cannot avoid in C; it's that Rust makes it easier to avoid certain mistakes than C, just as C makes it easier to avoid certain mistakes than assembly does, and assembly is easier to get right than hand-crafted machine code.

Continued attacks on HTTP/2

Posted Apr 15, 2024 15:07 UTC (Mon) by paulj (subscriber, #341) [Link]

Oh, I mostly have network protocols in mind where there is a priori knowledge of the length of an atom, before parsing that atom. E.g., integers of fixed sizes, or an array or structure with an up-front length field (TLVs). Parsing other formats, without an up-front bound (text e.g.), you want an additional buffer abstraction to handle the look ahead, and not complicate the lexer too much with those details. DoS bugs like this CONTINUATION bug can then be bounded at this lowest layer - by imposing an absolute bound on any possible look-ahead, regardless of higher layers. (Higher layers may well have their own, tighter, atom or parsing-context specific bounds - but if they mess up, that lower layer can still catch).

All standard parsing stuff, but... the kinds of programmers who get into network coding often differ from the kinds of programmers who learn about how to do well-structured parsing. ;)

I am possibly projecting my own tendencies here, as the former type of programmer who long was - and still mostly is - ignorant of the latter ;). However my experience of others in networking is that there might be a wider truth to it.

Continued attacks on HTTP/2

Posted Apr 16, 2024 3:34 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (17 responses)

I generally agree with the points you make. Having written some parsers using an FSM in a switch/case loop with direct gotos between all cases (hidden in macros), that code ended up as one of the most auditable, reliable and unchanged pieces of code over two decades; it could trivially be improved to support evolving protocols and its performance remained unbeaten by a large margin compared to many other approaches I've seen used. It takes some time but not that much; it mostly requires a less common approach to the problem and a particular mindset of course.

One reason that it's rarely used is probably that before it's complete, it does nothing, and it's only testable once complete, contrary to stuffing strncmp() everywhere to get a progressive (but asymptotic) approach to the problem.

Continued attacks on HTTP/2

Posted Apr 16, 2024 8:52 UTC (Tue) by paulj (subscriber, #341) [Link] (16 responses)

Nice.

With direct jumps between the different handlers? I'd be curious how you structured that to be readable? Functions conforming to some "interface" function pointer are a common way to give handlers some structure. But I don't know of any reasonable way to avoid the indirect jump.

Continued attacks on HTTP/2

Posted Apr 16, 2024 9:11 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (15 responses)

The handlers were simple and all inlined in the switch/case so that was easy. Nowadays the code looks like this:
https://github.com/haproxy/haproxy/blob/master/src/h1.c#L503

It does a little bit more than it used to but it remains quite maintainable (it has never been an issue to adapt protocol processing there). The few macros like EAT_AND_JUMP_OR_RETURN() perform the boundary checks and decide to continue or stop here so that most of the checks are hidden from the visible code.
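
A stripped-down sketch of the general shape, not the actual haproxy macro or states:

    enum h1_state { H1_START, H1_DONE };

    /* Advance one byte; if input runs out, return the state to resume in
     * later, otherwise jump straight to the handler for the next state. */
    #define EAT_AND_JUMP_OR_RETURN(ptr, end, lbl, resume_state)  \
        do {                                                     \
            if (++(ptr) >= (end))                                \
                return (resume_state);                           \
            goto lbl;                                            \
        } while (0)

    /* Skip bytes until a newline; resumable when fed data in pieces. */
    static enum h1_state parse_fragment(const char *ptr, const char *end)
    {
        if (ptr >= end)
            return H1_START;
     h1_start:
        if (*ptr == '\n')
            return H1_DONE;
        EAT_AND_JUMP_OR_RETURN(ptr, end, h1_start, H1_START);
        return H1_DONE;   /* not reached */
    }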

Continued attacks on HTTP/2

Posted Apr 16, 2024 14:56 UTC (Tue) by foom (subscriber, #14868) [Link] (14 responses)

A few lines down there is a nice example of a performance hack which introduces UB into the code.

https://github.com/haproxy/haproxy/blob/50d8c187423d6b7e9...

That's a great illustration of the impossibility of remembering all the rules you must follow in C to avoid undefined behavior, and/or the widespread culture in the C/C++ development community of believing it's okay to "cheat" those rules. "Surely it is harmless to cheat, if you know what your CPU architecture guarantees?", even though the compiler makes no such guarantee.

In this case, building with -fsanitize=undefined (or more specifically -fsanitize=alignment) could catch this intentional mistake.

Continued attacks on HTTP/2

Posted Apr 16, 2024 16:44 UTC (Tue) by adobriyan (subscriber, #30858) [Link]

manual shift is probably unnecessary too

> if (likely((unsigned char)(*ptr - 33) <= 93)) { /* 33 to 126 included */

Continued attacks on HTTP/2

Posted Apr 16, 2024 16:48 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (12 responses)

Not at all, it's not cheating nor a mistake. It's perfectly defined thanks to the ifdef above it, which guarantees that we *do* support unaligned accesses on this arch. Rest assured that such code runs fine on sparc, mips, riscv64, armbe/le/64, x86 of course, and used to on PARISC, Alpha and VAX though I haven't tested it there for at least 10 years so I wouldn't promise anything on that front :-)

There are other places where we need to read possibly unaligned data (in protocol essentially) and it's done reliably instead.

Continued attacks on HTTP/2

Posted Apr 16, 2024 17:41 UTC (Tue) by farnz (subscriber, #17727) [Link]

If I understand correctly, it breaks the C rules for type punning. Note that I'm not completely clear on the rules myself (I used to do C++, I now do Rust, both of which have different rules to C), but my understanding is that in standard C, you can cast any type to char * and dereference, but you cannot cast char * to anything and dereference it. You can do this via a union, and (at least GCC and Clang) compilers have -fno-strict-aliasing to change the rules such that this is OK (and I've not looked at your build system to know if you unconditionally set -fno-strict-aliasing).

Continued attacks on HTTP/2

Posted Apr 16, 2024 20:13 UTC (Tue) by foom (subscriber, #14868) [Link] (10 responses)

It is cheating.

You have an ifdef testing for architectures that have unaligned memory access instructions at the machine level. But it is undefined behavior to read misaligned pointers at the C-language level, even so. No compiler I'm aware of has made any guarantee that'll work, even on your list of architectures.

Yes, empirically this code generates a working program on common compilers for those architectures today, despite the intentional bug. But that does NOT make the code correct. It may well stop working at any compiler upgrade, because you are breaking the rules.

But I don't mean to pick on just this code: this sort of thinking is ubiquitous in the C culture. And it's a serious cultural problem.

(And, btw, there's not a good reason to break the rules here: a memcpy from the buffer into a local int could generate the exact same machine instructions, without the UB.)
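
For reference, the memcpy() idiom being referred to looks something like this:

    #include <stdint.h>
    #include <string.h>

    /* Read four possibly-unaligned bytes without a misaligned dereference at
     * the C level; GCC and Clang at -O2 typically turn this into one load on
     * architectures that support unaligned access. */
    static inline uint32_t read_u32(const void *p)
    {
        uint32_t v;
        memcpy(&v, p, sizeof(v));
        return v;
    }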

Continued attacks on HTTP/2

Posted Apr 17, 2024 9:21 UTC (Wed) by farnz (subscriber, #17727) [Link] (9 responses)

There's a significant bit of history behind this sort of thinking, though. It's only relatively recently (late 1990s) that C compilers became more sophisticated than the combination of a macro assembler with peephole optimization and register allocation.

If you learnt C "back in the day", then you almost certainly built a mental model of how C works based on this compilation model: each piece of C code turns into a predictable sequence of instructions (so c = a + b always turns into the sequence "load a into virtual register for a, load b into virtual register for b, set virtual register for c to the sum of virtual register for a plus virtual register for b, store virtual register for c to c's memory location"), then the compiler goes through and removes surplus instructions (e.g. if a and b are already loaded into registers, no need to reload), then it does register allocation to turn all the virtual registers into real registers with spilling to the stack as needed.

That's not the modern compilation model at all, and as a result intuition about C that dates from that era is often wrong.

Continued attacks on HTTP/2

Posted Apr 17, 2024 12:37 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (8 responses)

"This thinking" is just what everyone educated about processors does when they care about performance. CPU vendors go through great difficulties to make sure that these extremely common patterns work efficiently because they're in critical paths, to the point that old CPUs which didn't support them are now dead (armv5, mips, sparc).

One would be completely crazy or ignorant to ruin the performance of their program doing one byte at a time on a machine which purposely dedicates silicon to address such common patterns. Look at compressors, hashing functions etc. Everyone uses this. A quick grep in the kernel shows me 8828 occurrences. Originally "undefined behavior" meant "not portable, will depend on the underlying hardware", just like the problem with signed addition overflow, shifts by more than the size of the word, etc. It's only recently that within the clang vs gcc battle it was found funny to purposely break programs relying on trustable and reliable implementations after carefully detecting them.

And even for the unaligned access I'm not even sure it's UB, I seem to remember it was mentioned as implementation specific, because that's just a constraint that's 100% hardware-specific, then should be relevant to the psABI only. And if you had programmed under MS-DOS, you'd know that type-based alignment was just not a thing by then, it was perfectly normal *not* to align types. All holes were filled. Yet C did already exist. It was only the 386 that brought this alignment flag whose purpose was not much understood by then and that nobody used. In any case the compiler has no reason to insert specific code to make your aligned accesses work and detect the unaligned ones and make them fail. Such casts exist because they're both needed and useful.

Actually, I'm a bit annoyed by that new culture of denying everything that exists, trying to invent hypothetical problems that would require tremendous effort to address, just for the sake of denigrating the facilities offered by hardware that others naturally make use of after reading the specs. But it makes people talk and comment, and that's great already. It would be better if they talked about stuff they know and that is useful to others, of course.

Continued attacks on HTTP/2

Posted Apr 17, 2024 13:03 UTC (Wed) by farnz (subscriber, #17727) [Link] (6 responses)

"This thinking" is just what everyone educated about processors does when they care about performance. CPU vendors go through great difficulties to make sure that these extremely common patterns work efficiently because they're in critical paths, to the point that old CPUs which didn't support them are now dead (armv5, mips, sparc).

No, it's not. And one of the reasons that C is such a mess is that people who treat C like a macro assembler assume that everyone else thinks about code the way they do, and make sweeping false statements about the ways other people think.

Everything else in your comment flows from "if the compiler acts like a macro assembler, then this is all true, and nothing else can possibly be true as a result"; however, a well-written language specification and compiler is perfectly capable of detecting that you're doing a memcpy of 4 unaligned bytes into a 32 bit unsigned integer, and converting that into a load instruction.

The compiler should also then notice that your operations have exactly the same effect regardless of endianness, and will therefore not bother with byteswapping, since the effect of byteswapping is a no-op, and will optimize still further on that basis.
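As a concrete, hypothetical example of that style: the byte-wise read below produces the same value on any host, so there is nothing to byteswap away, and recent GCC and Clang commonly recognize the idiom and emit a single load (plus a bswap on big-endian targets); checking the generated assembly is still the way to be sure.

    #include <stdint.h>

    /* Read a 32-bit little-endian value one byte at a time; well defined
     * regardless of the host's alignment rules or endianness. */
    static inline uint32_t read_le32(const unsigned char *p)
    {
        return (uint32_t)p[0]
             | (uint32_t)p[1] << 8
             | (uint32_t)p[2] << 16
             | (uint32_t)p[3] << 24;
    }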

You'll notice that this is a very different style of performance reasoning, since the compiler is now part of your reasoning. But it is what most people educated about processors have done when I've been hand-optimizing code with them: they've stuck within the defined semantics of the language, known what optimizations the compiler can do, and cross-checked the assembly output to be confident that they're getting what they want.

And you'll note that I didn't talk about MS-DOS; I talked about all old compilers; compilers of that era simply didn't do any complex analysis of the codebase to optimize it, and would pessimise the code that a modern expert on processor performance would write.

I'm very annoyed that there's a bunch of old-timers who have a culture of denying everything that's improved since the 1970s, trying to invent hypothetical reasons why compilers can't do the stuff that they do day-in, day-out, just for the sake of denigrating the facilities offered by hardware that others make use of after reading the specs. It would be better if they talked about stuff they know and that is useful to others, rather than decrying change because it's made things different to the ancient era.

Continued attacks on HTTP/2

Posted Apr 17, 2024 14:52 UTC (Wed) by mb (subscriber, #50428) [Link]

Well said.

Continued attacks on HTTP/2

Posted Apr 17, 2024 14:59 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (2 responses)

I think we'll never agree anyway. You seem to consider that the compiler always knows best, and I'm among those who spend 20% of their time fighting the massive de-optimization caused by compilers that think they know what you're trying to do. And I'll always disagree with the memcpy() hack, because it's a hack. The compiler is free to call an external function for this (there's a reason memcpy() exists as a function) and you have no control over what is done. I can assure you that on quite a few compilers I've seen real calls, which just defeated the whole purpose of the access. Everything you explain is fine for a GUI, where nanoseconds do not exist; it's totally unrealistic for low-level programming. It has nothing to do with being "old-timers" or whatever. Pretending that people introduce bugs by doing wrong things, when in practice they're just using what the language offers to support what the architecture supports, is nonsense. It would be another matter if the code were unprotected, but when the case is specifically handled for the supported platforms, the language obviously supports this, since it's a well-known and much-used case.

In this specific case, what would be wrong would be to use memcpy(), because you'd just give up the ifdef and always use that, and doing that on a strict-alignment CPU would cost a lot. In this case you precisely want to work one byte at a time and certainly not memcpy()!

Since you called me names ("old-timers"), I could equally say that this new way of thinking comes from junkies, but I won't; you'll probably have plenty of time left in your life to learn about all the nice stuff computers can do when they're not forced to burn cycles, coal, or gas just to make the code look nice.

Fortunately there's room for everyone, those who prefer code that is highly readable and flexible and supports rapid evolution, and those who prefer to work in areas that cannot change as fast because time is important. What's sure is that they are rarely the same people. Different preferences and so on come into play. But both are needed to build the computers and systems we're using today.

Continued attacks on HTTP/2

Posted Apr 17, 2024 19:30 UTC (Wed) by foom (subscriber, #14868) [Link]

> In this specific case, what would be wrong would be to use memcpy(), because you'd just give up the ifdef and always use that, and doing that on a strict-alignment CPU would cost a lot.

Why would you delete the ifdef just because you fixed the UB?

There's nothing wrong with keeping an ifdef to choose between two different correct implementations, one of which has better performance on certain architectures...
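A hypothetical sketch of that arrangement (the macro name is made up, not from any real project); both branches are well-defined C, and the ifdef merely selects the faster one:

    #include <stdint.h>
    #include <string.h>

    /* Both branches read a 32-bit little-endian value. */
    static inline uint32_t get_le32(const unsigned char *p)
    {
    #ifdef HAVE_FAST_UNALIGNED_LE   /* little-endian targets with cheap
                                       unaligned loads, e.g. x86-64 */
        uint32_t v;
        memcpy(&v, p, sizeof(v));   /* compiles to one load where legal */
        return v;
    #else                           /* strict-alignment or big-endian */
        return (uint32_t)p[0]
             | (uint32_t)p[1] << 8
             | (uint32_t)p[2] << 16
             | (uint32_t)p[3] << 24;
    #endif
    }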

Continued attacks on HTTP/2

Posted Apr 18, 2024 11:39 UTC (Thu) by kleptog (subscriber, #1183) [Link]

> Fortunately there's room for everyone, those who prefer code that is highly readable and flexible

Well, that code is readable in the sense that you can see what it does. However, I spent a good five minutes looking at it to see if I could prove to myself that it actually does what the comment claims. I'm still not really sure, but I guess it must, since everyone else says it's fine. Using memcpy() won't fix that. There's this niggling worry that the borrowing during the subtractions opens a hole somewhere. There's probably an appropriate way to look at it that does make sense.

But it is an interesting question: what are the chances that the compiler will see that code and figure out that, if it unrolls the loop a few times, it can use SSE instructions to make it even faster, instructions which *do* require alignment? It probably won't happen, for the same reason I can't prove to myself that the code works now.

Would it make a difference if you used unsigned int instead? At least then subtraction underflow is well defined. I don't know.
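(For what it's worth, unsigned wraparound is indeed fully defined; a trivial illustration, unrelated to the actual code under discussion:)

    /* Unsigned arithmetic wraps modulo 2^N by definition, so this cannot
     * invoke undefined behaviour; the same expression with signed int
     * would be UB on overflow. */
    unsigned int wrap_sub(unsigned int a, unsigned int b)
    {
        return a - b;   /* wraps past zero when b > a */
    }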

Continued attacks on HTTP/2

Posted Apr 17, 2024 16:22 UTC (Wed) by Wol (subscriber, #4433) [Link] (1 responses)

> No, it's not. And one of the reasons that C is such a mess is that people who treat C like a macro assembler assume that everyone else thinks about code the way they do, and make sweeping false statements about the ways other people think.

Thing is, that's the way Kernighan and Ritchie thought. So everyone who thinks that way, thinks the way the designers of C thought.

Maybe times should move on. But if you're going to rewrite the entire philosophy of the language, DON'T. It will inevitably lead to the current religious mess we've now got! It's worse than The Judean People's Front vs. the People's Front of Judea!

Cheers,
Wol

Continued attacks on HTTP/2

Posted Apr 17, 2024 17:07 UTC (Wed) by farnz (subscriber, #17727) [Link]

That ship sailed in the 1980s (before Standard C existed), when C was ported to Cray systems. It's been a problem for a long time; it would be less of a problem if C90 had been "the language as a macro assembler", and C99 had changed the model, but the issue remains that Standard C has never worked this way, since by the time we standardised C, this wasn't the model used by several significant compiler vendors.

And the underlying tension is that people who want C compilers to work as macro assemblers with knobs on aren't willing to switch to C compilers that work that way; if people used TinyCC instead of GCC and Clang, for example, they'd get what they want. They'd just have to admit that C the way they think of it is not C the way MSVC, GCC or Clang implements it, but instead C the way TinyCC and similar projects implement it.

Continued attacks on HTTP/2

Posted Apr 17, 2024 16:16 UTC (Wed) by foom (subscriber, #14868) [Link]

> under MS-DOS, you'd know that type-based alignment was just not a thing back then; it was perfectly normal *not* to align types. All holes were filled. Yet C already existed.

An implementation may set the alignment of every type to 1, if it so desires. In such a case, you'd never need to worry about misaligned pointers.

> hypothetical problems

Okay, a non-hypothetical case of this cheating causing problems. "Modern" 32bit Arm hardware accepts unaligned pointers for 1, 2, and 4-byte load/store operations. People saw this hardware specification, and broke the rules in their C code. That code worked great in practice despite breaking the rules. Yay.

But, on that same hardware, 8-byte load/store operations require 4-byte-aligned pointers. And at some point, the compiler learned how to coalesce two sequential known-4-byte-aligned 4-byte loads into a single 8-byte load. This is good for performance. And it broke real code that had broken the rules by accessing a misaligned "int*".
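A hypothetical sketch of the kind of code that gets caught out this way (not taken from any real project):

    #include <stdint.h>

    uint64_t sum_two_words(const unsigned char *buf)
    {
        /* UB if buf isn't 4-byte aligned -- but it "worked" on Armv7,
         * where an ordinary 32-bit load tolerates misalignment. */
        const uint32_t *p = (const uint32_t *)buf;

        /* The compiler may assume p is aligned and fuse these two loads
         * into a single 8-byte LDRD, which does require alignment and
         * faults on a misaligned address. */
        return (uint64_t)p[0] + p[1];
    }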

Compiler writers of course say: Your fault. We followed the spec, those accesses were already prohibited, what do you want? Don't break the rules if you want your program to work.

Users say: But I followed the unwritten spec of "I'm allowed to cheat the rules and do whatever the heck I want, as long as it seems to work right now." And besides, who can even understand all the rules? There are too many and they're too complex to understand.

Continued attacks on HTTP/2

Posted Apr 11, 2024 22:30 UTC (Thu) by ballombe (subscriber, #9523) [Link] (1 responses)

> If we're sure hard things are just always a bad idea then I agree that's a win for C. But, maybe hard things are sometimes a good idea, and then in C you're going to really struggle.

Agreed. But the tendency is to do hard things when they are not needed. Security requires you to pick the simplest possible solution instead.

Continued attacks on HTTP/2

Posted Apr 12, 2024 3:25 UTC (Fri) by wtarreau (subscriber, #51152) [Link]

> Security requires you to pick the simplest possible solution instead.

It's rarely that simple, otherwise all products would be secure by design. The reality is that it depends a lot on how the product is used and the risks it faces. The biggest security risk for edge HTTP servers is DoS, and dealing with DoS requires doing complex things to remain efficient in all situations. What makes some of them vulnerable to DoS is precisely applying the principle of the simplest solution. Simple is easy to audit, but often also easy to DoS.

Continued attacks on HTTP/2

Posted Apr 11, 2024 16:07 UTC (Thu) by nomaxx117 (subscriber, #169603) [Link] (11 responses)

At least in the case of Rust's hyper crate (of which I am one of the maintainers), our impact had more to do with the fact that parsing frames like this is CPU usage that users wouldn't normally be able to catch by looking at logs; we never actually had the memory issues, because we already limit the amount of header data we will buffer and store.
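(As a rough, made-up C-style sketch of that kind of cap; hyper's actual code is Rust and structured differently, and the names and limit below are illustrative only:)

    #include <stddef.h>

    #define MAX_HEADER_BLOCK_BYTES (16 * 1024)   /* illustrative limit */

    struct stream_hdr_state {
        size_t bytes_so_far;   /* header bytes accepted for this request */
    };

    /* Account for one HEADERS or CONTINUATION frame; returns 0 to keep
     * going, -1 to tell the caller to reset the stream. */
    static int account_header_frame(struct stream_hdr_state *st, size_t frame_len)
    {
        if (frame_len > MAX_HEADER_BLOCK_BYTES - st->bytes_so_far)
            return -1;
        st->bytes_so_far += frame_len;
        return 0;
    }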

Rust is honestly a pretty good language for this stuff specifically because it's a lot more flexible than C is and we can express much simpler and easier to follow logic around asynchronous patterns than in C. It's a lot easier to review and audit code using futures rather than a complex state machine in an epoll callback table.

Also, a lot of the folks using Rust to build these sorts of things are pretty experienced in those useful patterns that come from C ;)

Continued attacks on HTTP/2

Posted Apr 11, 2024 17:59 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (10 responses)

> At least in the case of Rust's hyper crate (of which I am one of the maintainers), our impact had more to do with the fact that parsing frames like this is CPU usage that users wouldn't normally be able to catch by looking at logs; we never actually had the memory issues, because we already limit the amount of header data we will buffer and store.

OK that's good to know.

> Rust is honestly a pretty good language for this stuff specifically because it's a lot more flexible than C is and we can express much simpler and easier to follow logic around asynchronous patterns than in C.

It's a matter of taste; I still find it impossible to understand. :-/ Too many unpronounceable characters per line for me.

> It's a lot easier to review and audit code using futures rather than a complex state machine in an epoll callback table.

Well, I *want* my epoll loop so that I can optimize every single bit of it to save whatever expensive syscalls can be saved, and so that I can handle the FDs that are shared between threads differently from those that are local to each thread, etc. One thing I wouldn't deny is that it's time consuming, but the returns on investments are worth the time spent.
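(For readers unfamiliar with the pattern, a bare-bones sketch of such a loop; a real server adds error handling, timeouts, non-blocking I/O, per-thread epoll instances and the FD-sharing distinctions mentioned above:)

    #include <sys/epoll.h>

    static void event_loop(int listen_fd)
    {
        int ep = epoll_create1(0);   /* error checks omitted */
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

        for (;;) {
            struct epoll_event events[64];
            int n = epoll_wait(ep, events, 64, -1);
            if (n < 0)
                continue;   /* e.g. interrupted by a signal */
            for (int i = 0; i < n; i++) {
                if (events[i].data.fd == listen_fd) {
                    /* accept() the new connection, add it to the set */
                } else {
                    /* read, parse and respond on the ready connection */
                }
            }
        }
    }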

Continued attacks on HTTP/2

Posted Apr 12, 2024 10:34 UTC (Fri) by tialaramex (subscriber, #21167) [Link] (9 responses)

> One thing I wouldn't deny is that it's time consuming, but the returns on investments are worth the time spent.

That's the part I don't believe, as somebody who spent many years programming in C. Or rather, I never experienced for myself, and never saw anybody else experience, having sufficient time capital available to make every net-profitable investment. At work we could definitely do every single card on the board: they're all a good idea, they all improve the software, and the users want those improvements. But that would cost far more time than we have, so it won't happen, some cards won't get done this month, or this year, or ever.

Continued attacks on HTTP/2

Posted Apr 13, 2024 16:18 UTC (Sat) by wtarreau (subscriber, #51152) [Link] (7 responses)

I can assure you that for some of our users and customers the savings are huge, especially when it allows them to halve the number of servers compared to the closest competitor, and they're even willing to pay a lot for that, because it can literally save them millions per year when the saved servers are counted in thousands. In the end, everyone wins: the devs are happy to do interesting work, the company can pay them to optimize their work, the customers are happy to save money and energy, and other users are happy to use even fewer resources.

Paradoxically the cloud platforms which encourage a total waste of resources are also the ones that favor optimizations to cut costs!

Continued attacks on HTTP/2

Posted Apr 15, 2024 21:06 UTC (Mon) by tialaramex (subscriber, #21167) [Link] (6 responses)

> the savings are huge, especially when it allows them to halve the number of servers compared to the closest competitor

But that's a very *specific* improvement. So now it's not just "it's better", which I don't deny, but that this specific choice is buying you a 50% cost saving, which is huge. By Amdahl's law this means *at least half* of the resource was apparently wasted by whatever it is the closest competitor is doing that you're not. In your hypothetical that's... using language sugar rather than hand-writing a C epoll state machine. Does that feel plausible? That the machine sugar cost the same as the actual work?

My guess is that the competitor just isn't very good, which is great for your ego, but it might promise a future battering if you can't "literally save them millions per year" over a competitor who shows up some day with a cheaper product that doesn't hand-write C epoll state machines. Or indeed if somebody decides that for this much money they'll hand write the machine code to scrape a few extra cycles you can't reach from C and perform *better* than you.

Continued attacks on HTTP/2

Posted Apr 16, 2024 3:19 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (2 responses)

> if somebody decides that for this much money they'll hand write the machine code to scrape a few extra cycles you can't reach from C and perform *better* than you.

Absolutely, it's a matter of invested effort for diminishing returns. But if we can improve our own design ourselves, others can also do better. It's just that having a fully-controllable base offers way more optimization opportunities than being forced to rely on something already done and coming pre-made as a lib. In the distant past, when threads were not a thing and we had to squeeze every CPU cycle, I even remember reimplementing some of the critical syscalls in asm to bypass glibc's overhead, then doing them using the VDSO, then experimenting with KML (kernel-mode linux) which was an interesting trick reducing the cost of syscalls. All of these things used to provide measurable savings, and that's the reward of having full control over your code.

Continued attacks on HTTP/2

Posted Apr 16, 2024 8:50 UTC (Tue) by tialaramex (subscriber, #21167) [Link] (1 responses)

No, no, "invested effort for diminishing returns" was exactly the point I tried to make above and you contradicted it, insisting that instead you can afford to invest in absolutely *everything* because it shows such massive returns - the thing I'd never seen, and, from the sounds of it, still haven't.

The "fully-controllable base" doesn't really mean anything in software, others too could make such a choice to use things or not use them as they wish. Swap to OpenBSD or Windows, write everything in Pascal or in PowerPC machine code, implement it on an FPGA or custom design silicon, whatever. The skill set matters of course, you struggle with all the non-alphabetic symbols in Rust, whereas say I never really "got" indefinite integrals, probably neither of us is an EDA wizard. But for a corporate entity they can hire in skills to fill those gaps.

If you've decided that the diminishing returns actually do matter, then we're precisely back to this: hand-writing the epoll loop probably isn't the best way to expend those limited resources for almost anybody, and you've got a singular anecdote where it worked out OK for you, which is good news for you of course.

Continued attacks on HTTP/2

Posted Apr 16, 2024 9:21 UTC (Tue) by wtarreau (subscriber, #51152) [Link]

> If you've decided that the diminishing returns actually do matter, then we're precisely back to this: hand-writing the epoll loop probably isn't the best way to expend those limited resources for almost anybody, and you've got a singular anecdote where it worked out OK for you, which is good news for you of course.

Often that's how you can prioritize your long-term todo list. But sometimes relying on 3rd-party blocks (e.g. libev+ssl) gets you closer to your goal faster, then backs you into a corner, because the day you say "enough is enough, I'm going to rewrite that part now", you have an immense amount of work to do to convert all the existing stuff to the new preferred approach. Instead, when you started small (possibly even with select() or poll()) and progressively grew your system based on production feedback, it's easier to take more frequent baby steps in the right direction, and to adjust that direction based on feedback.

Typically, one lib I can't see myself replacing with a home-grown one is openssl, and god, it's the one causing me the most grief! But for the rest, I'm glad I haven't remained stuck with generic implementations.

Continued attacks on HTTP/2

Posted Apr 16, 2024 11:19 UTC (Tue) by paulj (subscriber, #341) [Link] (2 responses)

Have you ever considered that, in some cases, code open-sourced by a giant tech company might have _deliberately_ bad performance? If a competitor of theirs uses that code and ends up with bad performance, that doesn't hurt said giant tech company.

Even if not outright deliberate, the giant tech company at least has no motivation to make the open sourced code perform well.

As an example, the Google QUIC code-base - from Chromium - has _terrible_ performance server side. It's kind of the reference implementation for QUIC. And I'm pretty sure it's /not/ what GOOG are using internally on their servers. Based on presentations they've given on how they've improved server performance, they've clearly worked on it and not released the improvements.

It's in C++.

One of the quickest implementations I know of (though I haven't tested Willy's version ;) ) is LSQUIC. C.

I think a lot of it is more philosophy of the programmer than the actual language. However, a performance philosophy is more frequently found in C programmers. There is a cultural element, which correlates to languages.

Continued attacks on HTTP/2

Posted Apr 16, 2024 13:55 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (1 responses)

> One of the quickest implementations I know of (though I haven't tested Willy's version ;) ) is LSQUIC. C.

I haven't benchmarked it, but we pulled 260 Gbps out of a server with ours a year ago; it didn't seem bad at the time, but we should improve on that soon ;-)

> I think a lot of it is more philosophy of the programmer than the actual language. However, a performance philosophy is more frequently found in C programmers. There is a cultural element, which correlates to languages.

I totally agree with this. The number of times I've heard "you're spending your time for a millisecond?", to which I replied "but if this millisecond is wasted 1000 times a second, we're spending our whole lives in it". And it's true that the culture varies with languages; in fact, languages tend to attract people sharing the same culture and/or main concerns (performance, safety, ease of writing code, abundance of libraries, wide community, etc).

Continued attacks on HTTP/2

Posted Apr 16, 2024 14:08 UTC (Tue) by Wol (subscriber, #4433) [Link]

> > I think a lot of it is more philosophy of the programmer than the actual language. However, a performance philosophy is more frequently found in C programmers. There is a cultural element, which correlates to languages.

I dunno about C. I have the same philosophy and I started with FORTRAN. I think it's partly age, and partly field of programming.

The first computer I bought was a Jupiter Ace: 3 KILObytes of RAM. My contemporaries had ZX80s, which I think had only 1 KB. The first computer I worked on was a Pr1me 25/30, which had 256KB for 20 users. That's the age thing: we didn't have resource to squander.

And one job I always mention on my CV: I was asked to automate a job and told I had 6 weeks to do it; they needed the results for a customer. Four weeks in, I'd completed the basic program, then estimated the *run* time to complete the job (given sole use of the mini available to me) as "about 5 weeks". Two days of hard work optimising the program to speed it up, and I handed it over to the team who were actually running the job for the customer. We made the deadline (although, immediately on handing the program over, I went sick and said "ring me if you need me"; had a lovely week off :-). And that's the field-of-programming thing: if you're short of resource for the job in hand (still true for many microcontrollers?), you don't have resource to squander.

Cheers,
Wol

Continued attacks on HTTP/2

Posted Apr 15, 2024 20:39 UTC (Mon) by Wol (subscriber, #4433) [Link]

> But that would cost far more time than we have, so it won't happen, some cards won't get done this month, or this year, or ever.

You're asking the wrong question (or rather, it sounds like your bosses are). I overheard some of our bot engineers making the same mistake: "It costs several hundred pounds per bot. Over all our bots that's millions of quid. It's too expensive, it won't happen." WHAT'S THE PAYBACK TIME?

Who cares if it costs a couple hundred quid a time. If it saves a couple of hundred quid over a year or so, you find the money! (Or you SHOULD.)

I'm not very good at putting a cost figure on the work I'm trying to do. I justify it in time: if I can make life easier downstream for our hard-pressed planners and enable them to do more work (we spend FAR too much time watching the spinning hourglass...), then I can justify it that way. Time is money :-)

Bosses should be asking "is this going to pay for itself", and if the answer is yes, you find a way of doing it. (Yes I know ... :-)

Cheers,
Wol


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds