Google’s QUIC protocol: moving the web from TCP to UDP (ma.ttias.be)
186 points by Mojah on Aug 1, 2016 | 49 comments


If you're looking for a proven UDP-based protocol that's better suited to transferring vast amounts of data across the globe, and you've been using rsync or BitTorrent, here's something less known that offers a good speed advantage: https://en.wikipedia.org/wiki/UDP-based_Data_Transfer_Protoc...

Please don't complain about Sourceforge, UDT is an old project: http://udt.sourceforge.net/

And here's their distributed filesystem built on top of UDT: http://sector.sourceforge.net/

Edit: Protocol-level security is being worked on for UDT5, but in the meantime there's an experimental Rust-based attempt at replacing SCP with a UDT-based solution: https://github.com/mcginty/shoop


Re: SourceForge, my understanding is that the previous owners sold SF in the past few months. The new owners have apparently taken down the malware and are working to restore its former reputation, despite how steep that hill is.

https://www.reddit.com/r/sysadmin/comments/4n3e1s/the_state_...


Given that GForge is a fork of the original SourceForge code, it's probably possible to host a mirror so the loss of projects that happened with Google Code isn't repeated. There's just too much abandoned code on there, and most of it is still valuable. Even stuff that isn't useful today is worth keeping for archival purposes. Is there already a Neocities-like attempt at doing that?


I found UDT-rsync (the only file copying program that seems to incorporate it) to reliably segfault on files >1GB.

I use uftp[1] now for bulk transfers across oceans; it's a little confusing to use, but it generally accomplishes the goal of pushing packets at 70 Mbps from Europe to Seattle.

[1]: http://uftp-multicast.sourceforge.net/


Sounds like a bug/limitation in the code that goes against what UDT was designed for. Is there a ticket open?

Thanks for the link to uftp, I didn't know about it.

Also, UDT works on a LAN too, of course. Before finding UDT, I had been wondering how Weta Digital shares huge media files with Universal Studios in LA. It seems UDT is built into a lot of commercial products.


The "industry standard" software for this would be Aspera, now owned by IBM. They use their own UDP-based protocol called FASP.


For the interested: A working group to standardize QUIC was accepted into the Transport and Services area of the IETF at the last meeting in Berlin a couple of weeks ago [1].

The WG charter is this one [2]:

Define a new standards track IETF transport protocol based on deployment experience with QUIC. Four focus areas:

* Core transport work: wire format, basic mechanisms

* Security: TLS 1.3 to protect QUIC header and payload

* Application semantic mapping: initial focus on HTTP/2

* Extension to multipath for migration and load sharing

[1]: http://etherpad.tools.ietf.org:9000/p/notes-ietf-96-quic?use...

[2]: https://www.ietf.org/proceedings/96/slides/slides-96-quic-0....


> * Security: TLS 1.3 to protect QUIC header and payload

That seems odd. QUIC's crypto is much nicer than any TLS version or draft last time I looked.


Interesting stuff and a good technical overview. I remember hearing something about this a while ago but I hadn't seen the details. I found the time spent on the need to unblock 443/udp on server firewalls amusing because that's the absolute least concern I would have about the protocol.

I know Google has the most genius geniuses working for them, but it's important to be wary of the risks of this sort of thing. Not that TLS/SSL itself has a fantastic track record, but replacing the TLS handshake with something new and shorter and mixing it with a protocol that can accept incoming packets from multiple source IPs sounds like a recipe for a thousand new security vulnerabilities. If not in Google's code, then in the other implementations. Researchers, take note.


The post didn't discuss the backoff behavior of QUIC. For TCP, a dropped packet results in a halving of the data rate to help with the assumed congestion of the network.

Does QUIC act as a good network citizen? Are they experimenting with different approaches?


QUIC uses a TCP-friendly backoff algorithm called CUBIC; the Linux TCP implementation currently uses CUBIC too. One difference between TCP and QUIC is the value of CUBIC's beta parameter.

One QUIC connection is treated as equivalent to two TCP connections in that regard, so QUIC backs off only half as much as TCP. The design docs say this is okay, since one QUIC connection replaces the multiple TCP connections a browser would otherwise open.
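
For a feel for the numbers, here's a rough sketch of the multiplicative decrease under CUBIC's N-connection emulation, assuming the standard CUBIC beta of 0.7 (the exact constants in Chromium's implementation may differ):

    def effective_beta(beta: float, n_connections: int) -> float:
        # Emulating N flows: beta_eff = (N - 1 + beta) / N, so a single
        # "connection" gives back a smaller share of its window on loss.
        return (n_connections - 1 + beta) / n_connections

    cwnd = 100.0  # congestion window, in packets

    tcp_like = cwnd * effective_beta(0.7, 1)   # 70 packets: a 30% reduction
    quic_like = cwnd * effective_beta(0.7, 2)  # 85 packets: a 15% reduction

    print(tcp_like, quic_like)  # QUIC's cut is half the size of TCP's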


My thoughts exactly. It's easy to get better performance than TCP if you're the only one using a different protocol and don't care about others. People have been doing that for years, using UDP and forward error correction (transform your data, embed error correcting codes, stuff the pipe with your data as quickly as you can, and hope that the other end will recover enough to reconstruct the original).
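
A minimal sketch of that trick, using a single XOR parity packet per group (real schemes like Reed-Solomon are far more capable; this one can rebuild at most one lost packet per group):

    from functools import reduce

    def xor_bytes(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def make_parity(packets):
        # The parity packet is the XOR of all data packets (equal lengths).
        return reduce(xor_bytes, packets)

    def recover_one(received, parity):
        # XORing the parity with every surviving packet rebuilds the single
        # missing packet; two or more losses in a group need retransmission.
        return reduce(xor_bytes, received.values(), parity)

    group = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
    parity = make_parity(group)
    survivors = {0: group[0], 1: group[1], 3: group[3]}  # packet 2 was lost
    assert recover_one(survivors, parity) == b"cccc"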

What happens when we all move to QUIC or whatever else?


TCP congestion control is one of the main offenders holding back high-speed broadband. It can take minutes for TCP to discover the link bandwidth, a single lost packet can cut performance in half, and the ramp back up to full speed may never happen.

Is congestion collapse still a risk on today's internet? Do we need such aggressive congestion control?


Congestion control is as important as ever. Your packets can traverse links of just a few Mbps then 1Gbps then back down to a few Mbps, all within your own house on the way from your computer to your ISP's network. Common consumer networking equipment will do extremely stupid things in the face of congestion, such as storing packets in a FIFO buffer more than one second long. Initial window sizes still need to be small because there will always be users at the fringes of a wireless network where speeds are low enough that a burst of a few dozen full-size packets is a major call-dropping problem.

The best solution is for AQM and ECN to be deployed widely, so that congestion can be identified and dealt with before it gets bad enough to require drastic rate decreases. QUIC currently cannot use ECN because those bits of the IP header typically aren't accessible from the APIs for UDP. Modern TCPs operating on networks that keep buffering delays low and signal congestion without dropping packets don't have trouble determining link bandwidth quickly.
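
To illustrate the API gap: marking outgoing datagrams is the easy half; it's reading the ECN bits on received datagrams that standard UDP APIs make hard. A Linux-flavored Python sketch (the ECN field is the two low bits of the old TOS byte):

    import socket

    ECT0 = 0x02  # ECN codepoints: 00 Not-ECT, 10 ECT(0), 01 ECT(1), 11 CE

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, ECT0)   # mark outgoing packets
    print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))  # -> 2

    # The receive side is the problem: seeing whether a router rewrote
    # ECT(0) to CE means enabling IP_RECVTOS and digging the TOS byte out
    # of recvmsg() ancillary data, which portable UDP APIs rarely expose.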


Yeah, also: MTU discovery?

Last time I did UDP, this was a huge pain to get right and required quite a few round-trips. FEC isn't going to save you here, as none of your oversized packets are going to make it to the host.

Or do they just fix it at 576 bytes and not try to get any better efficiency from larger packets?
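
For what it's worth, on Linux a UDP sender can at least lean on the kernel's path-MTU cache instead of probing blind. A hedged sketch, with constants from <linux/in.h> defined by hand since Python's socket module doesn't export them:

    import socket

    IP_MTU_DISCOVER = 10  # Linux-specific, from <linux/in.h>
    IP_PMTUDISC_DO = 2    # set DF; oversized sends fail with EMSGSIZE
    IP_MTU = 14           # read the kernel's current path-MTU estimate

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    sock.connect(("192.0.2.1", 9))  # example address; connect() pins a route

    try:
        sock.send(b"\x00" * 1472)  # 1472 bytes + 28 of headers = 1500 on the wire
    except OSError:
        pass  # EMSGSIZE: the kernel already learned the path MTU is smaller

    print(sock.getsockopt(socket.IPPROTO_IP, IP_MTU))  # current estimate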


Somewhat of a meta observation - we seem to be entering an era where mantras like "Those who write their own networking protocol are doomed to re-create TCP, badly" are becoming less true, at least for a small subset of the programming population.

It's fascinating to watch some of the foundations upon which we do our work being shaken up a bit. I just hope they settle into more stable and more secure foundations, not just "better".


Virtually no one has access to a network they own end to end that is big enough to create and test a credible alternative. Google (and the tiny number of people inside it who work on this stuff) is one of probably ten orgs in the world qualified and well placed to do it. Other contenders would be MS, the US Navy, Apple... who else?


Is packet loss in the real world random or correlated?

QUIC's advantage over TCP's head-of-line blocking goes away if packet loss turns out to be correlated: lose a packet belonging to one stream within the connection, and you're going to lose packets belonging to other streams as well, so QUIC doesn't help.
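
As a toy simulation of the argument, round-robin packets across several streams and compare how many streams a scattered loss pattern touches versus an equal-sized burst:

    import random

    def streams_hit(lost, n_streams):
        # Packet i belongs to stream i % n_streams (round-robin muxing).
        return len({p % n_streams for p in lost})

    random.seed(1)
    n, losses, streams = 1000, 10, 8

    scattered = set(random.sample(range(n), losses))
    start = random.randrange(n - losses)
    burst = set(range(start, start + losses))

    print(streams_hit(scattered, streams))  # random loss: stalls some streams
    print(streams_hit(burst, streams))      # burst loss: stalls all 8 streams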



With Forward Error Correction the packet loss issue is somewhat mitigated.


I think the question is: if I have 1000 packets and I lose 10, is the packet loss spread out or is it clustered? If it's clustered you'll have to retransmit anyway.


Interleaving the FEC can also absorb bursts. The tradeoff is that the more you interleave, the higher the latency on reconstruction. They could do something like using the round-trip time to calculate the number of packets in the air and interleave based on that.
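
Roughly, a block interleaver writes row-wise and reads column-wise, so wire neighbours land in different FEC groups; a toy sketch:

    def interleave(packets, depth):
        # Read every depth-th packet: wire neighbours are `depth` apart in
        # the original order, so a short burst touches each group once.
        return [p for i in range(depth) for p in packets[i::depth]]

    pkts = list(range(12))      # four FEC groups of three: {0,1,2}, {3,4,5}, ...
    wire = interleave(pkts, 3)  # [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
    print(wire[4:7])            # a 3-packet burst -> [1, 4, 7]: three groups,
                                # one loss each, recoverable by single-parity FEC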

I was working on this type of stuff 15 years ago and now might be the time to do it. Bandwidth keeps increasing but latency stays the same, so it makes sense to waste some bandwidth to improve latency.


UDP-based and encrypted from the get-go? Sounds like Daniel Bernstein, ~6 years ago...

[1] https://curvecp.org/index.html

[2] https://www.youtube.com/watch?v=K8EGA834Nok


Wow. Any reason why it hasn't caught on?


Total networking layman here. Am I understanding correctly that the author uses a somewhat outdated example to show how QUIC can theoretically be faster for the initial connection, while at the same time admitting that TCP Fast Open will reduce the RTT?

Also, with QUIC you effectively get 10% less useful data (the FEC overhead), and the claim that a packet retransmit would take longer is not convincing to me. It should depend on how many packets are lost on average (say, 1 RTT = 20 UDP packets × 10%), so using QUIC/FEC on a stable network would actually decrease performance and drive up data plan costs.
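
To make the back-of-envelope concrete (using the 20-packets-per-RTT and 10% figures above, and the simplifying assumption that each loss event costs about one extra RTT):

    p_loss = 0.01  # per-packet loss rate; pick your network
    n = 20         # packets in flight per RTT

    p_any_loss = 1 - (1 - p_loss) ** n  # chance the window suffers a loss
    print(round(p_any_loss, 2))         # ~0.18 extra RTTs per window at 1% loss

    fec_overhead = 0.10  # FEC pays 10% bandwidth always, even with zero loss
    # Retransmission pays latency only when losses actually occur, so on a
    # clean link FEC is pure overhead, which is the point being made above.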

Also, I don't see why a few packets matter enough to introduce a new networking stack with all of its own problems, e.g. increased complexity. I just opened a news site on desktop; it was over 2 MB without the ads. If we should be concerned about percentages, surely we would be cutting down on the JS/CSS bloat first.


TCP is a stream-oriented protocol: the bytes must arrive in the order they were sent, so packet loss stalls the sliding window until the hole is filled.

For sending an entire file over the network, you don't care in which order the packets arrive. If a packet goes missing in the middle, there's no reason to stop transmitting and receiving later packets; just re-transmit the missing piece at any point in the future. This assumes the entire file is stored on the sender's side.
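
A minimal sketch of that approach, assuming each chunk carries its byte offset and chunks don't overlap:

    def reassemble(chunks, total_size):
        # Chunks are keyed by byte offset; arrival order is irrelevant.
        buf = bytearray(total_size)
        covered = 0
        for offset, data in chunks.items():
            buf[offset:offset + len(data)] = data
            covered += len(data)
        return bytes(buf) if covered == total_size else None  # None: still waiting

    file = b"hello world!"
    arrivals = [(8, b"rld!"), (0, b"hell"), (4, b"o wo")]  # middle chunk last
    got = {}
    for offset, data in arrivals:
        got[offset] = data
    assert reassemble(got, len(file)) == file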

edit: the protocol design document (linked in this thread) also mentions multiplexing many "connections" over the same socket, as in SPDY. With TCP, a lost packet in any of the muxed connections will slow down the others.

Using UDP instead of TCP allows increasing bandwidth and decreasing latency. Website bloat is an orthogonal issue that should be addressed separately, though QUIC can mitigate its problems a little.


TCP has a few ugly problems that compound each other. First you need to wait till all the handshaking stuff is done, and even then you can't transmit at full speed.

If you use UDP, you have to implement much of the TCP machinery yourself. But you can draw on the accumulated experience with web connections to implement it and leave out the things that weren't needed.


> [...] and even then you can't transmit at full speed.

This is nothing specific to TCP. "Full speed" is a fundamentally unknowable quantity in advance in the general case. It varies with time and endpoints. If you try to start a connection transmitting at the full speed of your first-hop link (the only one you have any chance of knowing the bandwidth of), you just put the congestion control a hop or more away from the box that has the information necessary to do it right.
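
As a toy model of that probing, slow start doubles the sender's window every RTT until the network pushes back, so even a clean fat pipe takes many round trips to fill:

    def rtts_to_fill(bottleneck_pkts, init_cwnd=10):
        # Double the congestion window each RTT until it covers the path's
        # bandwidth-delay product (no loss assumed along the way).
        cwnd, rtts = init_cwnd, 0
        while cwnd < bottleneck_pkts:
            cwnd *= 2
            rtts += 1
        return rtts

    # 1 Gbps at 100 ms RTT holds ~8300 full-size packets in flight:
    print(rtts_to_fill(8300))  # 10 RTTs, i.e. a full second at 100 ms RTT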

QUIC can have an advantage in re-using a connection in situations where multiple sequential TCP connections might have been made. Probing for link bandwidth fewer times is not the same as not having to probe for link bandwidth. And HTTP/2 has already addressed the most common case of this without abandoning TCP.


Ah, didn't know this.

I thought this was because of TCP slow start.


Check out "QUIC: Design Document and Specification Rationale", available at https://docs.google.com/document/d/1RNHkx_VvKWyWg6Lr8SZ-saqs... for a far more detailed description of what is going on and why.


Shouldn't this eventually be put at OSI layer 4 (on top of IP instead of on top of UDP)?


The problem is, we've basically lost L4. Too many networks (firewalls, and especially NATs) just don't pass anything but protocols 6 (TCP) and 17 (UDP). Even ICMP is not always reliably available (leading to all sort of weird PMTU issues[1]), or heavily filtered.

I think that's why many good protocols (e.g. SCTP) are rare sights and frequently aren't even considered as an option.

[1] Yes, some idiots just block ICMP completely because they heard it's more "secure", and make DHCP or PPP cap the MTU at 1200 with an "uh, it works just fine" attitude.


It's too late for IPv4, but maybe a strong push could get a protocol like SCTP supported on IPv6 networks?


Given that IPv6 doesn't normally use NAT (and for IPv4 it's NAT, not firewalls, that is the primary reason non-TCP/UDP protocols don't work in practice), it's possible that it would be okay.

IPv6 adoption is a problem, though. Despite all of IANA's efforts to push v6, I think we'll be stuck with IPv4 for years to come.


IIRC IPv6 requires ICMPv6 to work. I'd be in favour of SCTP support as well; dunno how it compares to QUIC for mobile devices though.


I think that this is basically what SCTP is/was (maybe with a few improvements: I don't recall how SCTP handled client mobility). The problem, as drdaeman indicates, is that too many networks consider themselves TCP (really, TCP/{80,443}) networks, not actual networks providing IP the way they are supposed to.


It's hard for me to imagine what they would do differently from UDP (because UDP is minimal; its header is little more than two port numbers).

Having not yet read the QUIC spec, there must be some amount of header that immediately follows the eight UDP header bytes. So, assuming QUIC takes off and gains broad support, all that would have to be done is give it a new IP protocol number and redefine what is the transport layer and what is the session/application layer.


It would be much more difficult and laborious to implement, with no obvious advantages. UDP is already well specified and implemented, it's lightweight, and it goes through existing firewalls.

The overhead of UDP here is pretty much negligible.


The advantage is obvious for Google, and for the rest of us who run hobby search engines.

For example: https://tools.ietf.org/html/draft-tsvwg-quic-protocol-00

Section 4.3: "A QUIC receiver advertises the absolute byte offset within each stream up to which the receiver is willing to receive data." This saves a crawler a ton of bandwidth: limit each stream to 64k and skip all non-text data upfront in a very clear-cut fashion.

6.2.1.2 prevents hosts from spamming you with data after you've decided to stop receiving from them.

6.3.2 makes this system fire-and-forget on a 10m limit, after which you decide what to do with the host that did or didn't respond with data for the request.

10. Properly download everything from the website (the actual priority stuff, central to the protocol, isn't even written; it's a to-do :).

All in all, great protocol to help whoever is running search engines.

For the rest, it won't be the default (they seem to be aware of this in 11.5).

So, trying to guess the future: push QUIC to Apache/nginx (because 11.5 bypasses lots of deployment stuff too), hit websites once to determine who has the latest code, and cash in on bandwidth.

Not bad for a company start.

Do we need another protocol? Probably not. Will this see the light of day? Probably yes, since Google's money is pushing it.


As I read this post, QUIC has been introduced at least partly to solve the problem of head-of-line blocking. Head-of-line blocking in turn became a problem with the adoption of HTTP/2, formerly known as SPDY, which is the protocol Google previously foisted upon the world. So the core protocols of the web are now the playthings of Google, and when Google screws up, the answer is yet another new and shiny protocol with unknown side effects.

I know this misconstrues Google's role in all of this somewhat, but it's an interpretation that crossed my mind.


And it does ring true.


Do mobile (cellular) networks these days commonly allow anything but TCP? (Or even HTTP? Most firewalls block anything that is not HTTP/HTTPS.)


Carrier firewall policies generally allow any outbound traffic. Some carriers in specific countries may have more restrictive policies for either regulatory or carrier-specific reasons, such as the United Arab Emirates Telecommunications Regulatory Authority mandating all VoIP services obtain licenses[1], resulting in specific blocks on VoIP traffic such as Skype, WhatsApp, Viber, etc.[2].

[1] https://www.tra.gov.ae/en/faq.aspx (check out the VOIP section)

[2] http://gulfbusiness.com/snapchat-voice-and-video-calling-blo...


Don't see why not -- many popular VoIP protocols run over UDP.


Lots of video games (including mobile) use UDP so I'd be very surprised if mobile networks blocked UDP.


In Finland, the big operators use CGN by default. So even if it works, it only works as a one-way street, and P2P or hosting your game at home is a no-go.

At least one operator allows you to lift the restriction, though. With money.


> The QUIC protocol implements its own crypto-layer so does not make use of the existing TLS 1.2.

So in addition to proving a lower-layer solution to things like network congestion, they also have to prove their crypto layer? Sounds like an equally large task, if not larger.

And even if all TCP implementors decide to adopt it, the next task is to make servers and clients adopt the crypto layer.


How about creating a QUIC protocol tunneling tool (which would be needed on both sides of the connection), for example for tunneling MySQL TCP traffic between client and server in configurations where those are on different networks (pings >1ms)?


I think Googlers should stop adding drama to our daily jobs. Maybe I'd kindly sort of suggest they dedup the WWW and provide web-wide random file access instead.


How long until this becomes a transport over BATMAN?



