The hidden cost of QUIC and TOU (2016) (snellman.net)
102 points by luu on Nov 2, 2018 | 60 comments


Here's the problem: other companies ruined it for the good ISPs. They destroyed users' trust by screwing with end-user transfers, tampering with TLS and with connections, modifying resources in transit, MITMing in horrific fashion, invading privacy, and breaking protocols for profit.

Arguably there would be a lot less insane push to make everything a black box if companies didn't keep doing dumb crap for an extra buck. See also the TLSv1.3 middlebox arguments.

I certainly enjoy the fact that when using an AT&T hotspot I can't reach certain websites over TLS; that Verizon would append advertising tracking headers to requests on their end; that on Sprint, if I wget the same image multiple times, it comes back at a different quality and doesn't match md5-wise each time; that all of the above do DPI and degrade web traffic no matter what port it runs on; that TCP connections randomly fail if I try to run non-web traffic over port 80; and that long-lived TCP connections are randomly killed.


Exactly. Would the Internet be way easier to design and use if we could assume that everybody is a good actor? Heck yeah! It'd be great. There are some really useful things you can do when you're allowed to intercept the conversations between two people. Caching, debugging, fixing spelling errors, etc.

But we can't assume that the folks between me and you are good, and sharing anything at all with them is a bad idea. Does that mean a whole bunch of really useful techniques are harder or straight up impossible now? Yes, absolutely, and we should plan ahead and make sure to come up with ways to debug issues like this.

But stepping back on security and privacy because it makes it harder to benevolently spy? Screw that noise.


The problem I have with the TLS 1.3 proposal is that it’s gonna stop me having control over the things that go out of my own network :(

I realise that this is a multifaceted debate, but it’s going to destroy the utility of anything I can’t install a certificate on, and also force me to install certs on everything I own before I can comfortably let those things on the network. It’s gonna screw a lot of enterprise use cases as well.

I’m not sure people fully get that the privacy extended to, say, “dissidents in Syria” is also going to apply to HP printers on their own networks trying to figure out whether to show “toner low” dialogs.

Personally I don’t think the “hey, the dissidents” value is worth it, since those people are pretty screwed anyway - filtering at scale can still work out what you’re connecting to (IPs, latency, response size patterns, blah blah) - but it really messes up anyone (person or enterprise) who wants to use stuff but also know what’s going on.


The threat you're worried about doesn't require that an IETF Working Group spend years defining a new protocol, whether that's QUIC or TLS 1.3 itself. Any bozo could roll their own Noise-based encrypted protocol and it wouldn't be decrypted by whatever edge "security" you think is protecting you.

Worse, chances are that if you believe you presently have "control over the things that go out of my own network" but that TLS 1.3 would hurt that, you're relying on "Next Generation Firewall" type technologies, which are hopelessly broken.

If you go stare at the TLS 1.3 "compatibility" changes in later drafts (particularly Draft 28 IIRC) you'll see that it's basically the equivalent of wearing a boiler suit with an embroidered "OTIS Lifts" logo to get waved through the gate check without needing a pass. Except the boiler suit says "TLS 1.2 Session Resumption". It didn't require the IETF to do this, presumably Bad Guys have been doing it for years without writing a document explaining how.

The recurring theme in people's TLS 1.3 horror stories is that they were being eaten by cannibals all along, but TLS 1.3 asked them why they can't feel their legs...

Example: Palo Alto and Cisco both shipped products that trip the TLS 1.3 downgrade detection feature. They were told about this months ago, but of course they waited until the last moment (indeed for PAN they still haven't shipped a fix for some supported versions) because it's just a compatibility problem...

Except, it's not - the only way to trip that downgrade detection "by mistake" is to not choose random numbers where the TLS 1.0, 1.1 and 1.2 standards all say that it's imperative to use random numbers. If those numbers are instead copied from somewhere predictable (which they are in affected Cisco and Palo Alto systems) then much of the security of your TLS connections through these "security" devices was illusory.
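
For reference, the detection works off a sentinel value that RFC 8446 tells servers to stamp into the last 8 bytes of the otherwise-random ServerHello.random whenever they negotiate an older version; a TLS 1.3-capable client aborts if it sees that sentinel while speaking TLS 1.2 or below. A rough Python sketch of the client-side check (the function name and error handling are just for illustration):

    # Sentinels from RFC 8446 section 4.1.3 ("DOWNGRD" plus 0x01 or 0x00).
    DOWNGRADE_TLS12 = bytes.fromhex("444f574e47524401")
    DOWNGRADE_TLS11_OR_BELOW = bytes.fromhex("444f574e47524400")

    def check_downgrade(server_random: bytes, negotiated_version: int) -> None:
        # A TLS 1.3-capable client that ends up negotiating TLS 1.2 (0x0303)
        # or below must verify the last 8 bytes of ServerHello.random are not
        # one of the sentinels. A genuinely random value collides with them
        # with probability ~2^-63, so only non-random values trip this.
        if negotiated_version <= 0x0303:
            if server_random[-8:] in (DOWNGRADE_TLS12, DOWNGRADE_TLS11_OR_BELOW):
                raise ConnectionError("illegal_parameter: downgrade detected")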


>also force me to install certs on everything I own before I can comfortably let those things on the network

I'm curious: How is TLS 1.3 different from TLS 1.2 in that regard? How would you implement an intercepting proxy with TLS 1.2 and without installing a CA cert in the clients?


I guess the encrypted SNI draft is what's meant here.


Thanks, sorry, you are totally right and I should be more specific.

Yes. It’s a metadata leak and generally I am pro the end-to-end principle but encrypted SNI actually forces everyone to MITM. Whether that is good or bad is a value judgement but for people who have been doing “light touch” egress filtering it is a huge PITA. It is actually going to force more invasive surveillance in basically any regulated workplace.


Personally, I think it's a good thing it forces MITM. Either you monitor your users' browsing habits or you don't. If you do, it's only fair that they have a chance to know that you do, and seeing an SSL connection be "protected" by a company-internal cert makes that totally clear.


The thing is, most users won't look for that. I like what Android does, where if you switch on a VPN or install an extra CA, you get a "your network use may be being monitored" notice. That should appear in the browser.


How could you rely on SNI for filtering? How did you know they weren't just domain fronting? Or was it about blocking access to regular sites?


As another commenter noted, you already don't have control over what leaves your network in that case. The new standards may make it more likely, but there's nothing stopping a device manufacturer from doing something similar now. Open a VPN tunnel back to the mothership and have fun analyzing that.

The real problem here is that you can't trust the devices on your network. Devices should give the user control, not work against them, which sadly seems to be uncommon these days.


Opening a VPN tunnel back to the mothership can be blocked. I can vouch that plenty of genuinely security-conscious sites take steps that would prevent that from working for you. There are plenty of sites that seem to run a full whitelist-only connection list for their external network, with things in between checking protocol internals to the extent they can, too.

You can push a VPN connection over any port, but honestly, given some of the scrutiny I've been put through for some of the network stuff I've put out, I still wouldn't care to guarantee that some high-security customer out there wouldn't notice that your "HTTP connection" is actually a VPN connection. By the time you're writing something deceptive enough to get through that, you're running the risk of some very nasty stories being run in the security press about your practices.

It is not the case that everybody in the world just throws all their devices on to the network and then lets everything on it have unfettered outbound access.


And of course it's fine to block connections you don't recognize, or to whitelist connections in the first place. But I maintain that within a network of devices you own, the solution to untrustworthy devices on your network is to use more trustworthy devices, not to weaken internet standards for everyone else.


the solution to untrustworthy devices on your network is to use more trustworthy devices

This kind of "oh, only buy perfect end devices" advice is just as worthless as "oh, only buy service from perfect ISPs that don't make you want to encrypt traffic."


It doesn't have to be perfect. It should, however, not actively work against its owner, and the manufacturer should provide enough information and access that the device's owner can be reasonably confident that the device is acting in their interests.


Here's the problem: other companies ruined it for the good ISPs.

There's a phrase for that situation: "Poisoning the Well".


What's it called when some companies are run by good humans and others by evil reptilians, and the reptilians poison the well with something they can drink but humans can't?


terraforming


> for the good ISPs...

and those are? Do they actually exist?


Some do. Monkeybrains in the SF Bay Area, and AAISP and Bogons in the UK, are a few possible examples?


The other fun bit is the “encryption” of WebSockets (outside of TLS, etc).

Basically (hand waving here) each client-to-server websocket frame carries a random 32-bit masking value, and the frame’s payload essentially just gets XORed with it.

This isn’t needed for the security of the user or the server. It’s because so many middleware boxes are so poorly built that you could make them crash and/or get code execution if you had sufficient control over enough of the right payload bytes. Java applets exposed the exact same problem; the fact that they existed a decade before websockets, and yet the middleware boxes were still broken enough that this nonsense was required in the websocket spec, should tell you everything you need to know.
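
For reference, the masking RFC 6455 actually specifies is a fresh 32-bit key per client-to-server frame, XORed byte-by-byte over the payload. A minimal sketch of that (the function name is just for illustration):

    import os

    def mask_payload(payload: bytes, masking_key: bytes) -> bytes:
        # XOR-mask (or unmask; the operation is symmetric) a frame payload
        # with the 4-byte masking key carried in the frame header. The point
        # is not confidentiality: it makes the bytes on the wire unpredictable
        # to the script that sent them, so it can't steer what a broken
        # intermediary parses.
        assert len(masking_key) == 4
        return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))

    frame_key = os.urandom(4)  # a new key is chosen for every frame
    wire_bytes = mask_payload(b"hello middlebox", frame_key)
    assert mask_payload(wire_bytes, frame_key) == b"hello middlebox"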


That's not only about Middleboxes (which mostly act up to L4). Most likely it's more about preventing JS from attacking arbitrary servers (e.g. broken mail servers, file servers, etc) that might have an open TCP port. The "encryption" at least prevents JS attacks from being deterministic.


No, it was specifically due to broken middleware that had existing, well-known, yet unfixed flaws.


It was about application-layer middleboxes (specifically broken HTTP proxies).


And now, even WebSockets over TLS need servers to waste CPU on XOR masking.


XOR masking is ridiculously cheap compared to doing anything over the network or running js in general.


Everything is slow because of a million "don't worry, it's cheap compared to the rest" decisions accumulating over time.


I think "why is everything slow" is actually a really important question, but I'm not sure it's primarily lots of "this is cheap compared to X" decisions. A big driver here, I think, is layers of abstractions and inner platforms. I check my email in my browser now; I used to use mutt. Webmail on top of JS on top of browser on top of OS (with several of those systems having their own layers and places that add latency) is great from the perspective of checking email from anywhere, but it's a lot slower than C on the OS directly.

I recently got into doing a lot of live music for dances on my mac, and I've ended up writing everything directly against CoreMIDI in C. Sure, it would be more convenient to write in Python, but latency in music is even more painful than elsewhere.


And yet the network is still horribly slower, so I guess those assumptions are correct?


This may be because it's common for load balancers / reverse proxies to strip the TLS off before handing it to the back end (which may not even support WS over TLS).

If TLS-encapsulated WS dropped the XOR mask, such a reverse proxy would have to add it back on rather than blindly forwarding the connection.


Correct, the problem is edge routers unwrapping and inspecting TLS packets before forwarding to the internal network. So it’s not wasted due to TLS.

I mean, it’s still a dumb waste of CPU cycles to support unfixed/unfixable routers, but there isn’t really a choice. Womp womp.


Nah, it’s still needed - see @gnode’s answer


The author is a middlebox employee (IPS, IDS, firewall, NAT, WAN optimizers, LBs). Middlebox people want unencrypted transport headers, because they literally profit from unencrypted headers. :) Everyone else, including users, site operators, and software engineers writing network software, prefers that middleboxes not be able to see or tamper with transport headers (for privacy, avoiding bugs, and being able to evolve software).

From the original article: "What's wrong with encrypted transport headers? One possible argument is that middleboxes actually serve a critical function in the network, and crippling them isn't a great idea. Do you really want a world where firewalls are unviable? But I work on middleboxes, so of course I'd say that. (Disclaimer: these are my own opinions, not my employer's)."

(Credit for this observation goes to my friend NC.)


Transparent proxying was the wrong way to implement IPS, IDS, FW, NAT, LB, and WAN optimization. For the cases where you have a reason to be in the middle, these services should have been explicit proxies from the start.


There's been an attempt to fix this during the last two years, as the QUIC standardization talks really got going. A bunch of operator people expressed a need for some sort of in-path measurability, while the privacy people have expressed the need to avoid any sort of session linkage or other forms of information leakage. The most viable compromise proposal seems to be the spin bit[0] which gives RTT measurements, but it's not agreed whether a version of that will make it to the first release.

[0] https://quicwg.org/base-drafts/draft-ietf-quic-spin-exp.html
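
For anyone curious how a single bit gives you RTT: the endpoints cooperate to toggle it roughly once per round trip, so the spin value seen at any point on the path looks like a square wave, and a passive observer just times the transitions in one direction. A toy sketch (class and method names made up for the example):

    import time

    class SpinBitObserver:
        """Passive on-path RTT estimation from the QUIC spin bit (sketch)."""

        def __init__(self):
            self._last_spin = None
            self._last_edge_time = None

        def observe(self, spin_bit, now=None):
            # Feed in the spin bit of each packet seen in one direction;
            # returns an RTT estimate in seconds whenever the bit flips.
            now = time.monotonic() if now is None else now
            rtt = None
            if self._last_spin is not None and spin_bit != self._last_spin:
                if self._last_edge_time is not None:
                    rtt = now - self._last_edge_time
                self._last_edge_time = now
            self._last_spin = spin_bit
            return rtt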


> The most viable compromise proposal seems to be the spin bit

That's utterly useless. If the "spin bit" becomes widely used, I intend to write a trivial patch that sets it randomly on each packet.

If packets could be dropped unless the bit is set to specific values at specific times, too much session information is leaking to middleboxen. More likely, the state of the bit doesn't matter so setting it randomly will discourage wasting a bit in the protocol with this kind of nonsense in the future.


When advertisement companies push for an L4 redesign... This is going to be a fun one to watch play out.


This argument is broken. An ISP engineer debugging need only look at IP packet drops in their part of the network. If their end clients are asking them to debug an issue then they should debug it at the end points, not in the middle. This attitude is what led them to stick more and more buffers in the middle and add other shaping devices in the name of problem solving or adding value, and it ended up ossifying protocols. I think endpoints would evolve better protocols to solve their own problems if ISPs stopped putting hacks (like deep buffers) in their networks.


> This argument is broken. An ISP engineer debugging need only look at IP packet drops in their part of the network.

That's just not true in practice, pretty much on any level.

First, you need to look at a lot more things than just packet drops (e.g. reordering, queue buildup, corruption).

Second, even getting full visibility into your own network is highly non-trivial since nobody has active probes on every link. My experience is that arranging for packet captures from an arbitrary point in the network could take a week. And if you guessed wrong about which node was at fault, you'd need to do it again in a binary search pattern.

Third, you absolutely do need to know about things other than your own network. Otherwise you don't even know whether there is a problem you can solve. If the bottleneck is in the server, or the client, or the transit links, there is no point in debugging the core or the access network.

> If their end clients are asking them to debug an issue then they should debug it at the end points, not in the middle.

The endpoints are not going to be available. Do you think that Youtube is going to give an operator some kind of server access, or even insight into the traffic? Do you really want to see a world where a customer with a complaint needs to first root their phone and install packet capturing software?

What you're really saying here is that no problem should ever be debugged, and we should just hope that the network doesn't break. And hope is not much of a strategy.


Alright, so existing debugging tools don't work with QUIC. We will need to make some new tools that can expose the information we need. If the hops between the source and destination are willing to expose the info (and we can assume they are, as the author has), then we can figure out which packets go in but never come back out.

Instead of saying "this won't work because ____", why don't people say "it would work if we could ____"? Someone (or some company) needs to improve the Internet, and it seems like the world just harangues them for their effort.


When you tunnel your protocol in UDP, and control both ends, you can get overwhelmingly better flow control than TCP, which cannot trust the other end, so must rely on packet drops to get a reliable signal.

When you can trust the other end, rate of change of packet transfer time (delta packet delay) reveals congestion exactly -- i.e., increasing time means you had better slow down, decreasing, you can go faster.

Only problem is, the receiver has the signal, but the sender needs it, and the useful lifetime is less than the packet delay. So, you need a predictor on the sender, fed by corrections from the receiver. This is control theory applied to network flow.

This is how all of Hollywood sends reels around the basin to effects houses, and completed movies to digital projection theaters.
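
A toy sketch of that control idea, just to make it concrete (the numbers, names and thresholds are made up; this is not any particular product's algorithm):

    class DelayGradientSender:
        """Adjust send rate from receiver-reported delay samples (sketch)."""

        def __init__(self, rate_bps=1_000_000, gain=0.05, threshold_s=0.001):
            self.rate_bps = rate_bps
            self.gain = gain                 # how aggressively to react
            self.threshold_s = threshold_s   # ignore jitter below this
            self._prev_delay = None

        def on_delay_report(self, one_way_delay_s):
            # Rising delay means queues are building somewhere: back off.
            # Falling or flat delay means there is headroom: probe upward.
            # No packet loss is needed to get the signal.
            if self._prev_delay is not None:
                gradient = one_way_delay_s - self._prev_delay
                if gradient > self.threshold_s:
                    self.rate_bps *= (1.0 - self.gain)
                elif gradient < -self.threshold_s:
                    self.rate_bps *= (1.0 + self.gain)
                else:
                    self.rate_bps *= (1.0 + self.gain / 4)
            self._prev_delay = one_way_delay_s
            return self.rate_bps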


It would be nice if these protocols could be easily decrypted using a key available on the client. Other than that, tough luck for the ‘transparent’ proxies and friends.


Of necessity both client and server have the session keys.

In principle it would be possible for the client to lack keys needed for the server to read data sent by the client, and vice versa, but in practice this is never done.

Under Forward Secrecy a middlebox must learn fresh session keys for every connection or it can't decrypt it. Both clients (e.g. Firefox) and servers (e.g. using Java or OpenSSL) have facilities to dump the session keys out somewhere, and this is adequate for debugging, although obviously you will need to acquire new skills if you're used to being able to get stuff done with a paperclip and a copy of tcpdump. At scale this gets hard; arguably that's fine, because a minute ago we said we wanted this for "debugging", but people who got their foot in the door with a "debugging" argument often actually want to decrypt everything, always, and so they're unhappy about this.
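
For concreteness, this is roughly what the key dumping looks like with Python's ssl module (3.8+), which writes the same NSS-style key log that Firefox and Chrome produce via the SSLKEYLOGFILE environment variable, and which Wireshark can load in its TLS protocol preferences; the host and file path below are just placeholders:

    import socket, ssl

    ctx = ssl.create_default_context()
    ctx.keylog_filename = "/tmp/tls-keys.log"   # treat this file as sensitive

    with socket.create_connection(("example.com", 443)) as raw:
        with ctx.wrap_socket(raw, server_hostname="example.com") as tls:
            # Any traffic on this connection can now be decrypted by anyone
            # holding the key log, e.g. for debugging with a packet capture.
            tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n"
                        b"Connection: close\r\n\r\n")
            tls.recv(4096)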

If you don't want Forward Secrecy you have two options. Firstly, when the specification says to think of a random number for the key exchange protocol, you can always pick the same number, or a number chosen in some predictable fashion; the middlebox can know this number (or the method for predicting it) and then it can snoop as normal. This works in TLS 1.3; obviously it weakens your security (if bad guys learn how to predict the numbers you are screwed) but that's your choice.

Secondly you could use a key exchange process that doesn't have any Forward Secrecy by design, such as the RSA key exchange from SSL that's grandfathered into TLS 1.0 through 1.2. In this case you just give the server's private RSA key to the middleboxes and they can decrypt everything.

As you may notice in all the above scenarios, this is very bad for your security. But if "debugging" is really the problem that's almost certainly acceptable to you.


Dumping out those session keys comes with major logistical and security problems. All of a sudden, debugging requires

1) Updates on every client you might want to debug

2) Securely transporting the session keys from those clients to the person debugging.

Those are some massive challenges. An alternative is to always MitM all your devices. This comes with obvious downsides. Moreover, I could see providers doing cert-pinning that isn't overridden by user-installed certificates. That would make it literally impossible to MitM your own devices.

This kind of cert-pinning really scares me, because it takes away any possibility to inspect your own network communications.


I know that for HTTPS in NSS browsers (Chrome and FF) you can dump the secrets to a file that Wireshark can read. Given that Chrome is currently the biggest user, I would be surprised if that was easy to update.


I believe Chrome currently uses BoringSSL, not NSS, but it still has some facilities for writing the same keylog file.


Arguably you wouldn't have to debug bad connections if the middleboxes just did what they were supposed to.


Actually it raises an important point I hadn't thought about. By moving the transport protocol from layer 4 to layer 5, Google is taking it out of the hands of the OS. I understand why they do it (easier than getting all 3 or 4 major OSes to implement it), but there is another cost in terms of interoperability. It means that every piece of software consuming QUIC needs to have its own implementation of the protocol. It means every language in which you write that software needs to have libraries available implementing QUIC, or that the libraries you consume must themselves have a QUIC implementation. That's introducing a not-insignificant inefficiency if QUIC becomes prevalent.


The operating system can offer QUIC on top of UDP the same way it offers UDP on top of IP. BSD sockets aren't necessarily the ideal way to use QUIC but there's no reason you couldn't use them.

Researchers have even repeatedly built Linux protocol modules that do TLS, either all of it, or the encrypted record layer (so the bulk but not the tricky negotiation decisions at the start). There's just a new TLS protocol you ask for instead of TCP and then the kernel handles encryption and so on.


There is no combination between them.


You know what a lot of middleboxes do? They block ads/malware. Shocking that the two largest ad companies are trying to push standards which break things that block their ads.


This isn’t so much about stopping middleboxes on your perimeter that you fully control, but rather about middleboxes between you and the destination that are outside of your or the destination’s control.

Those are often used to add ads rather than block them.

If you yourself want to block ads, do so on your machine (where traffic has already been decrypted) or on your router (which will then decrypt traffic for you and re-encrypt it with its certificate that you have added to your machine).


That's news to me; I've never heard of widespread use of middleboxes for blocking ads. Which boxes are these, and how do you know that there are a lot of them deployed?


So, an example of a middlebox that's exceedingly common is a "web security gateway", which is your average web filter and logger in a corporate environment. Obviously it logs employee web activity, blocks access to adult websites, and maintains its own malware definitions to try and block malicious content as well. It's quite common for these to also block domains used by ads and popups by default. When these sorts of devices are configured to inspect HTTPS, this adds significant additional complexity: network PCs need to be configured to trust a certificate from the box for all domains, and the box intercepts, decrypts, and re-encrypts all traffic.

Of course, the same type of technology a corporation might use to manage their network could be used by a state actor or a hostile ISP.


Oh, alright. I'm behind one of those right now, but I've never used any that blocked ads. I wonder how prevalent that is, such that Google would devote significant effort to developing a whole replacement for TCP just to get people behind them to watch ads.

Plus it seems like those companies will be able to block QUIC for the foreseeable future; disabling HTTP(S) fallback will probably take as long as disabling IPv4.


Yeah, I've got a policy configured that disables QUIC in Chrome, as it also makes it harder for the firewall to do its job. Firewalls track TCP connections in order to determine if inbound traffic is a response to a legitimate request; that doesn't work on UDP-based traffic.


If you’re only using it to block domains you can do that at the DNS level with the added bonus of it being more efficient.


This goes out of the window with apps doing their own host resolution with DoH.


I'm aware, I use a Pi-hole at home. :) Web filtering hardware in corporate environments is a fair bit more sophisticated and contains a lot of other features.


Web security gateways at enterprises should have been explicit proxies from the start; this whole transparent proxy business has been a disaster for protocol development.



