Use of latency in broadband ranking is silly

By George Ou, 2 October 2009

The University of Oxford just released its 2009 Broadband Quality Study (BQS) report.  This is the second such report since last year, and it appears to be another study of user-generated speedtest.net data.  The difference is that the BQS metric is a composite weighted score of downstream, upstream, and latency measurements.  While the report may have compiled some useful data, the latency values appear to be completely arbitrary and unnecessary, which means the BQS rankings may be inaccurate.

There are two problems with the latency measurements used in the Oxford study (note that these are all round-trip measurements).  The first problem is that latency was used at all, and the second is that the measured values are completely arbitrary and at times nonsensical.

Why comparing latency between nations is wrong

It is wrong to compare the latency measured within each nation because latency is primarily a function of the speed of light (within glass) and the distance between the two endpoints, plus some additional switching and routing delay.  For example, Korea is a little smaller than Japan, and Japan is a little smaller than the state of California.  Because intra-California latencies tend to be within 30 milliseconds unless there is a problem on one of the core routers, intra-Japan and intra-Korea latencies will also tend to be within 30 milliseconds.  But coast-to-coast traffic across the 48 contiguous states of the United States will always see a latency of around 80 ms because of the size of the country.  That would put the U.S. at a permanent disadvantage in the BQS score, which makes the BQS ranking dubious.

Example 1:
The distance from my city to Cambridge, Massachusetts is roughly 3127 road miles.  Assuming that my packets follow a similarly short path (it may actually be longer), and since we know that the speed of light in fiber-optic cabling is approximately 123 miles per millisecond, it takes 51 milliseconds for light to make the round trip.  When I ping (a tool used to determine round-trip latency) OC11-RTR-1-BACKBONE-2.MIT.EDU, which is one of the Massachusetts Institute of Technology’s primary routers, it returns 90 ms.  Of that, 12 milliseconds is the switching delay on my DSL connection, and this is no different anywhere in the world for DSL (including FTTN variants) or cable broadband.  The total switching and routing delay was therefore no more than 39 milliseconds (90 minus 51).

Example 2:
The distance from my city to San Diego is 457 miles, and this is generally the sort of distance you’ll see within Japan or Korea or most other nations.  The round-trip time for light over fiber is 7.4 milliseconds if the path over the Internet is the same distance as the road.  My ping to adcom1-nodem-720-p2p-10ge.ucsd.edu, which is a router at UCSD in San Diego, is a very consistent 27 milliseconds.  At these much shorter distances, the switching and routing delay is more prominent in percentage terms and accounts for 19.6 milliseconds, most of which is the 12 millisecond latency of my DSL connection.

Example 3:
The distance from my city to Australia is about 7,416 miles.  The theoretical round trip constrained by the speed of light would be 121 milliseconds.  Pinging po4-0-0.cr1.nrt1.asianetcom.net, located somewhere on the continent of Australia (not necessarily the mainland), yields a very consistent 140 milliseconds.  The switching and routing overhead is approximately 19 milliseconds, which is tiny compared to the speed-of-light delay.
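The arithmetic behind the three examples can be reproduced with a short Python sketch.  It uses only the figures quoted above (123 miles per millisecond in fiber, the road distances, and my ping results); the function and variable names are made up for illustration, and it assumes the fiber path is exactly as long as the road distance, which is optimistic.

```python
# Approximate speed of light in fiber, as quoted in the examples above.
FIBER_MILES_PER_MS = 123.0


def propagation_rtt_ms(one_way_miles):
    """Round-trip time imposed by the speed of light alone."""
    return 2 * one_way_miles / FIBER_MILES_PER_MS


def breakdown(one_way_miles, measured_rtt_ms):
    """Split a measured ping into propagation delay and everything else."""
    propagation = propagation_rtt_ms(one_way_miles)
    overhead = measured_rtt_ms - propagation  # switching + routing delay
    return propagation, overhead


# (destination, one-way distance in miles, measured ping in ms)
examples = [
    ("Cambridge, MA", 3127, 90),
    ("San Diego", 457, 27),
    ("Australia", 7416, 140),
]

for name, miles, ping_ms in examples:
    propagation, overhead = breakdown(miles, ping_ms)
    share = 100 * propagation / ping_ms
    print(f"{name}: propagation {propagation:.1f} ms, "
          f"overhead {overhead:.1f} ms, "
          f"speed of light accounts for {share:.0f}% of the ping")
```

The overhead column stays in a narrow band while the propagation column grows with distance, which is the whole point of the comparison.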

It is important to note that the routing and switching delay is fairly uniform from nation to nation since the routing and switching equipment used worldwide is more or less the same, so this is not a basis for differentiating the quality of Internet infrastructure between nations.  The routing and switching delay is minimal to begin with, and it becomes less pronounced at greater distances.  Furthermore, routing and switching delay hasn’t really improved much over the last decade, whereas bandwidth capacity at a given price has improved by a factor of thousands.  That means latency measurements aren’t a good way of measuring technological progress over time.

Simply put, physically larger countries have higher latencies within them and smaller countries have lower latencies within their borders.  It’s not as if Japanese fiber is better than Korean fiber, which is better than American fiber.  The routers that each nation uses are more or less the same, and latencies are generally stable beyond the broadband connection.  Problems can occur on some segments of the Internet if there’s a problem circuit, but these are generally resolved within a few hours.  The biggest problem with latency is jitter, the sudden spikes in latency that mainly happen on the broadband segment, especially when someone uses a P2P application.  Even low-bandwidth P2P causes high jitter because of the erratic clumping of packets that get shoved into the network at the same time.  On the other hand, high-bandwidth applications like IPTV don’t cause any jitter because their packets are nicely spaced out.
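To illustrate the clumping effect, here is a minimal Python sketch of a single first-in, first-out queue in front of one link.  The 1 ms per-packet send time, the packet counts, and the burst size are all assumptions picked for illustration, not measurements of any real P2P or IPTV traffic.

```python
def queueing_delays(arrival_times_ms, service_ms):
    """Time each packet waits in a FIFO queue before the link can send it."""
    delays = []
    link_free_at = 0.0
    for arrival in sorted(arrival_times_ms):
        start = max(arrival, link_free_at)  # wait if the link is still busy
        delays.append(start - arrival)
        link_free_at = start + service_ms
    return delays


SERVICE_MS = 1.0  # assumed time to put one packet on the wire

# 100 packets evenly spaced every 2 ms (smooth, IPTV-like pacing).
paced = [i * 2.0 for i in range(100)]

# The same 100 packets over the same 200 ms, but arriving in clumps
# of 10 every 20 ms (bursty, P2P-like behavior).
bursty = [(i // 10) * 20.0 for i in range(100)]

paced_delays = queueing_delays(paced, SERVICE_MS)
bursty_delays = queueing_delays(bursty, SERVICE_MS)

print("paced:  worst queueing delay", max(paced_delays), "ms")
print("bursty: worst queueing delay", max(bursty_delays), "ms")
```

Both streams carry the same number of packets in the same time, so the average bandwidth is identical.  Only the clumped stream builds a queue, and the delay a packet sees depends on where in the clump it lands, which is what shows up as jitter.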

Why the BQS latency data is dubious to begin with

When we look at the BQS 2009 Appendix which contains a latency chart, we can see that it makes very little sense to begin with.  I’ve marked up a version of the chart below to highlight some of the problems.

BQS-latency

Last year’s data stuck out to me as highly questionable because of outrageous latency values such as 157 ms for countries like Korea, which should be well within 30 ms if measured inside Korea.  This year’s data doesn’t look much better because it still looks completely arbitrary.  The 2009 data shows Japan and Korea at approximately 51 ms and 70 ms respectively, which is still much too high for countries that small.  The U.S. is listed at around 80 ms, which is accurate for coast to coast, but gamers typically find servers in their own region that are less than 30 ms away round trip, and that hasn’t changed since I started gaming on broadband 10 years ago.

China came in at an unbelievable 270 ms, which is only valid if a Chinese citizen was trying to reach a US-based server, but most Chinese citizens I know visit Chinese sites hosted in China, and those are typically within 50 milliseconds.  The extremely wealthy broadband users of Luxembourg received an eye-popping 310 milliseconds, which doesn’t even pass the smell test since Luxembourg can reach any part of the European Union within 50 milliseconds round trip.  My tests show that Luxembourg to the UK is 25 ms round trip, and much less for the bordering nations of France and Germany.

Now imagine if a Chinese university had performed a broadband study where it selected some Chinese servers as the reference point.  China would have scored within 50 milliseconds for most of traditional China, while European nations would have scored poorly at over 100 milliseconds and the U.S. would have scored extremely poorly with latency measurements of 270 milliseconds.  Would Americans and Europeans take this report seriously if it had declared China number 1?  I seriously doubt it, so why would nations like China and Luxembourg take the Oxford BQS study seriously?

The reality is that for applications where low latency is crucial, such as online gaming, users will tend to stick to their own geographic regions.  Not only is the latency lower, but it’s also a bit more fun yelling at opponents who speak the same language.  Gaming is one of the best examples of an application where razor-slim latencies matter the most because it is an application based on human reaction times, and gamers will try to find servers with the lowest latencies (preferably within 30 ms), or “ping” as they refer to it.  American gamers have lots of nearby servers to choose from within 30 ms, and European gamers tend to play on servers residing within the EU or sometimes on the East Coast of the U.S., which measures at least 110 ms away.  VoIP is a little more tolerant than gaming and can tolerate stable latency values of up to 200 ms (1/5th of a second) before the delay becomes noticeable to humans.  This can mostly be achieved from any wired broadband connection to any other wired broadband connection on Earth.

Clarifying the role of latency

The focus on latency appears to have caused much confusion in the media, as this particular story mistakenly suggests that latency has something to do with large file transfers or iPlayer video streaming.  The fact is that neither file transfer, nor the Flash-based web streaming version of iPlayer, nor the peer-to-peer (P2P) version of iPlayer cares about latency or jitter.  File transfers are largely unaffected by latency, especially if they use multiple TCP connections to transmit the data; what matters to them is the number of bytes transferred per minute, not how many fractions of a second each packet takes to traverse the network.  Web streaming is buffered by tens of thousands of milliseconds, so it’s very unlikely to be affected by a few hundred milliseconds of latency.

Web applications may be sensitive to latency if they are not well tuned to minimize the number of round trips per transaction.  When I was responsible for server and network infrastructure at an IT consulting firm, the application developers would always tell the CIO that the servers were to blame.  When I proved to them that it wasn’t the servers, they blamed the network latency.  Once I explained that we can’t change the speed of light or do much about the small amount of switching and routing overhead, they went back and tuned their application to minimize round trips.  Some TCP accelerators can also reduce the number of round trips.  Putting the web application on something like Citrix terminal services, which hosts the application close to the server and forwards only the minimal screen changes to the user, is another option that works extremely well.  These are really the only options available to a web application developer.  While this is an expense businesses would prefer not to incur, that’s one of the fundamental challenges of developing web applications for global access.
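A back-of-the-envelope Python sketch shows why trimming round trips is the lever that matters.  The round-trip counts (40 for a chatty transaction, 4 for a tuned one) are hypothetical numbers chosen for illustration, the RTT values are the three ping results from my examples earlier, and the model assumes the round trips happen strictly one after another.

```python
def latency_cost_ms(round_trips, rtt_ms):
    """Time a transaction spends waiting on the network, assuming
    its round trips are sequential and each costs one full RTT."""
    return round_trips * rtt_ms


CHATTY_ROUND_TRIPS = 40  # hypothetical untuned transaction
TUNED_ROUND_TRIPS = 4    # hypothetical transaction after tuning

# Round-trip times taken from the three ping examples earlier.
for label, rtt_ms in [("San Diego", 27), ("Cambridge, MA", 90), ("Australia", 140)]:
    chatty = latency_cost_ms(CHATTY_ROUND_TRIPS, rtt_ms)
    tuned = latency_cost_ms(TUNED_ROUND_TRIPS, rtt_ms)
    print(f"{label}: chatty {chatty} ms, tuned {tuned} ms")
```

The RTT is fixed by geography, so the only term a developer can shrink is the number of round trips.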

The most important point that has been missed here is that it is jitter (the variation in latency, which can spike above 1000 ms) that really matters.  VoIP, video communications, and online gaming are extremely sensitive to jitter.  Good network engineering, modern advanced routers, and good public policy that allows for reasonable discrimination and doesn’t buy into the “dumb pipe” dogma of the end-to-end movement are what will keep a nation’s Internet infrastructure strong and ready for the future.
