For many Internet services, reducing latency improves the user experience and increases revenue for the service provider. While in principle latencies could nearly match the speed of light, we find that infrastructural inefficiencies and protocol overheads cause today’s Internet to be much slower than this bound: typically by more than one order of magnitude, and often by more than two. Bridging this large gap would not only add value to today’s Internet applications, but could also open the door to exciting new applications. Thus, we propose a grand challenge for the networking research community: a speed-of-light Internet.
The Internet is too slow!
We fetched just the HTML for landing pages of 22,800 popular Websites from 102 PlanetLab nodes using cURL. For each connection, we geolocated the Web server using six commercial geolocation services, and (since we do not have any basis for deciding which service is better than another) used the location identified by their majority vote. We computed the time it would take for light to travel round-trip along the shortest path between the same end-points, i.e., the c-latency. We refer to the ratio of the fetch time to c-latency as the Internet’s latency inflation. The figure below shows the CDF of this inflation over 1.9 million connections. The time to finish HTML retrieval is, in the median, 36.5× the c-latency, while the 80th percentile exceeds 100×. Thus, the Internet is typically more than an order of magnitude slower than the speed of light. Moreover, since PlanetLab nodes are generally well-connected, latency can be expected to be poorer from the Internet’s true edge.
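To make the two definitions concrete, here is a minimal sketch of how c-latency and latency inflation could be computed for one connection. It is not the measurement code behind the numbers above: it assumes the shortest path is the great-circle distance on a spherical Earth, uses the speed of light in vacuum, and all function and constant names are hypothetical.

```python
import math

C_KM_PER_S = 299_792.458   # speed of light in vacuum
EARTH_RADIUS_KM = 6371.0   # mean Earth radius; spherical Earth is an assumption


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def c_latency_s(lat1, lon1, lat2, lon2):
    """Round-trip travel time of light along the great-circle path, in seconds."""
    return 2 * great_circle_km(lat1, lon1, lat2, lon2) / C_KM_PER_S


def inflation(fetch_time_s, c_latency):
    """Ratio of the measured fetch time to the c-latency."""
    return fetch_time_s / c_latency
```

Given a client location, the majority-vote server location, and a measured fetch time, `inflation(fetch_time, c_latency_s(...))` yields one sample of the distribution whose CDF is discussed above.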
But why is the Internet so slow?
To identify the causes of Internet latency inflation, we break down the fetch time across layers, from inflation in the physical path followed by packets to the TCP transfer time.
DNS resolutions are shown to be faster than c-latency 14% of the time. This is an artifact of the baseline we use — in these cases, the Web server happens to be farther than the DNS resolver, and we always use the c-latency to the Web server as the baseline. (The DNS curve is clipped at the left to more clearly display the other results.) In the median, DNS resolutions are 6.6× inflated over c-latency.
The TCP transfer time shows significant inflation — 12.6× in the median. With most pages being at most tens of KB (median page size is 73 KB), bandwidth is not the problem, but TCP’s slow start causes even small data transfers to require several RTTs. 6% of all pages have transfer times less than the c-latency — this is due to all the data being received in the first TCP window. The TCP handshake (counting only the SYN and SYN-ACK) and the minimum ping time are 3.2× and 3.1× inflated in the median. The request-response time is 6.5× inflated in the median, i.e., roughly twice the median RTT. 24% of the connections, however, use less than 10 ms of server processing time (estimated by subtracting one RTT from the request-response time). The median c-latency, in comparison, is 47 ms. The medians of inflation in DNS time, TCP handshake time, request-response time, and TCP transfer time add up to 28.8×, lower than the measured median total time of 36.5×, since the distributions are heavy-tailed.
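The effect of slow start on small transfers can be illustrated with a simplified model. The sketch below counts the round trips needed to deliver a page when the congestion window doubles every RTT. It ignores loss, delayed ACKs, and receive-window limits, and the segment size and initial window values are assumptions for illustration, not parameters taken from the measurements.

```python
import math


def slow_start_rounds(size_bytes, init_cwnd_segments=10, mss_bytes=1460):
    """Round trips needed to deliver size_bytes when cwnd doubles each RTT.

    Idealized model: no loss, no delayed ACKs, no receive-window limit.
    """
    segments_left = math.ceil(size_bytes / mss_bytes)
    cwnd = init_cwnd_segments
    rounds = 0
    while segments_left > 0:
        segments_left -= cwnd   # send one full window this round
        cwnd *= 2               # exponential growth during slow start
        rounds += 1
    return rounds


# Median page size from the text, read here as 73,000 bytes (an assumption).
print(slow_start_rounds(73_000, init_cwnd_segments=10))  # 3
print(slow_start_rounds(73_000, init_cwnd_segments=3))   # 5
print(slow_start_rounds(10_000, init_cwnd_segments=10))  # 1
```

Under these assumptions, a median-sized page needs several round trips regardless of available bandwidth, while a page that fits in the initial window completes in a single round, which is consistent with the small fraction of transfers observed to finish faster than the c-latency.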
In line with the community’s understanding, our measurements affirm that TCP transfer and DNS resolution are important factors causing latency inflation. However, our measurements also reveal that the Internet’s infrastructural inefficiencies are an equally important culprit, if not a more important one.
Based on the above figure, DNS resolution (6.6× inflated over c-latency), TCP handshake (3.2×), request-response time (6.5×), and TCP transfer (12.6×) all contribute to a total time inflation of 36.5×. With these numbers, it may be tempting to dismiss the 3.1× inflation in the minimum ping time. But this would be incorrect, because lower-layer inflation, embodied in RTT, has a multiplicative effect on each of DNS, TCP handshake, request-response, and TCP transfer time. The total time for a page fetch (without TLS) can be broken down roughly (ignoring minor factors like the client stack) as: T_total = T_DNS + T_handshake + T_request + T_serverproc + T_response + T_transfer. If we changed the network’s RTTs as a whole by a factor of x, every term on the right-hand side except the server processing time T_serverproc (which can be made quite small in practice) would change by a factor of x (to an approximation; TCP transfer time’s dependence on RTTs is a bit more complex), thus changing T_total by approximately a factor of x as well.
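The scaling argument can be sketched as a small model. The function below treats every component of the fetch-time breakdown except server processing as proportional to RTT, which is the approximation stated above. The function name, the `rtt_scale` parameter, and the example timings are hypothetical and chosen only for illustration.

```python
def total_fetch_time(t_dns, t_handshake, t_request, t_serverproc,
                     t_response, t_transfer, rtt_scale=1.0):
    """Total fetch time after scaling all network RTTs by rtt_scale.

    Approximation: every component except server processing scales
    linearly with RTT (TCP transfer time is more complex in reality).
    """
    network = t_dns + t_handshake + t_request + t_response + t_transfer
    return rtt_scale * network + t_serverproc


# Hypothetical component times in seconds, not measured values.
parts = dict(t_dns=0.300, t_handshake=0.150, t_request=0.075,
             t_serverproc=0.010, t_response=0.075, t_transfer=0.600)

baseline = total_fetch_time(**parts)
halved = total_fetch_time(**parts, rtt_scale=0.5)
print(f"baseline {baseline:.3f} s, halved RTT {halved:.3f} s, "
      f"ratio {halved / baseline:.3f}")
```

With a small server processing time, halving the RTT roughly halves the total fetch time in this model, which is why a seemingly modest inflation in the minimum ping time matters as much as it does.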
The Grand Challenge
There is a large body of work on reducing Internet latency. This work, however, has been limited in its scope, its scale, and most crucially, its ambition. The central question we have not seen answered, or even posed before, is “Why are we so far from the speed of light?” Even the ramifications of a speed-of-light Internet have not been explored in any depth.
Speed-of-light Internet connectivity would be a technological leap with the potential for new applications, instant response, and radical changes in the interactions between people and computing. To shed light on what’s keeping us from this vision, we quantified the latency gaps introduced by the Internet’s physical infrastructure and its network protocols. Our analysis suggests that the networking community should, in addition to continuing efforts for protocol improvements, also explore methods of reducing latency at the lowest layers.
For more information on our analysis, data sets, and vision, refer to our publications.