Networks and Buffering



    The predominant transport protocol on the Internet is TCP/IP. TCP guarantees delivery of streams of bytes. It does so by keeping buffers at the sending and receiving side so that it is capable of retransmitting lost data. The buffers are refilled with new data once both ends agree that the previous data has been received and delivered to the application. Another problem that TCP tries to solve is congestion. If all computers connected to the network try to send and receive at full speed at the same time (collective events are notorious, like everybody wanting to watch the same football game at the same time), congestion is unavoidable. It is not unlike traffic jams on the highways during rush hour. The TCP protocol tries to avoid an Internet meltdown by monitoring dropped packets and reducing its transmission rate when certain patterns of lost packets occur. Conversely, TCP tries to increase the rate when no packets are lost for some time, under the assumption that there is more headroom. For efficient operation at the highest throughput, the buffers mentioned above need to be as big as the amount of data that can be transported during one round-trip time from sender to receiver and back (RTT). TCP typically implements congestion avoidance by growing and shrinking these buffers. This implicitly rate-limits TCP: when the buffers are shrunk by the protocol because of lost packets, the sender cannot send new data once it has transmitted everything in the buffer. It has to wait for confirmation from the other side that everything arrived before moving on, or, in case of lost packets, start retransmitting.
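The buffer sizing rule above (buffers as big as the data in flight during one RTT) is the bandwidth-delay product. A minimal sketch, with illustrative numbers not taken from the text (a 10 Gbit/s path with 150 ms RTT):

```python
# Sketch: bandwidth-delay product (BDP) buffer sizing.
# The example figures (10 Gbit/s, 150 ms) are assumptions for illustration.

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Buffer needed to keep a link busy for one round trip."""
    return bandwidth_bps / 8 * rtt_s

buf = bdp_bytes(10e9, 0.150)
print(f"required buffer: {buf / 1e6:.1f} MB")  # 187.5 MB for this path
```

This makes concrete why long fat pipes need buffers far beyond default OS settings.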

    This approach has worked well over the last 20 years, although in several corner cases protocol adjustments and parameter tuning have been necessary. Typical problems arise in single-stream, large-bandwidth, high-latency transport of scientific or cloud data, where the necessary buffers are far bigger than those for typical home usage of mail/browsing/twitter. The scientific community has investigated this problem since the emergence of big eScience projects such as the Large Hadron Collider, astronomy and earth observation. Several alternative tuned and modified TCP-like stacks exist, and the TCP standard is slowly evolving to scale better.

    The problem that now arises is this: suppose we have a concatenation of network devices with different bottlenecks or points of congestion in the chain, but at all those bottlenecks there is an infinite amount of memory, so that no packet is ever dropped. We also assume here that the sender is not the bottleneck, but can send at a higher rate than one or more downstream devices can forward. The current TCP algorithms then assume that there is even more headroom, increase their buffers to the maximum, and continue to transmit, if possible at the line rate of their interface. The net effect is that the perceived latency increases, as it takes longer and longer to get an answer related to a specific packet inserted into the stream. That packet encounters increasingly filled queues along the path and takes longer to reach its destination, at which point the reaction to the sender can be transmitted. The result is a steadily increasing RTT. At some point TCP will typically time out and consider the connection broken.
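The RTT growth described above can be sketched with a toy queue model. The rates and base delay below are assumptions for illustration, not figures from the text: a sender offering 100 Mbit/s into a bottleneck that drains 50 Mbit/s with an "infinite" queue.

```python
# Toy model of the unbounded-buffer scenario: the backlog at the
# bottleneck grows without limit, and so does the observed RTT.
# All numbers are illustrative assumptions.

send_rate = 100e6 / 8    # offered load in bytes/s
drain_rate = 50e6 / 8    # what the bottleneck forwards, bytes/s
base_rtt = 0.020         # 20 ms propagation delay (assumed)

queue = 0.0
for second in range(1, 6):
    queue += send_rate - drain_rate       # backlog grows every second
    queuing_delay = queue / drain_rate    # time to drain the queue
    print(f"t={second}s  queue={queue / 1e6:.2f} MB  "
          f"RTT={base_rtt + queuing_delay:.2f} s")
```

With no drops, nothing signals the sender to slow down, so the queuing delay (and hence the RTT) climbs until TCP gives up.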

    Surprisingly, this situation is now starting to occur as memory becomes very cheap, home computers and their interfaces very fast, and the typical bottlenecks may in fact be the modems and routers at home and the Ethernet switches at the ISPs' locations. Many ISPs nowadays proudly claim interfaces with many gigabytes of buffer memory. For years I have countered those remarks with the question: "Is that good or bad news?", after which they typically look as if they see water burning.

Proposed solutions

If we consider one end-to-end link with a bottleneck somewhere in the middle (it actually does not matter where the bottleneck is, or how long the stretch of lower bandwidth behind it is), then, with B1 the bandwidth before the bottleneck, B2 the bottleneck bandwidth, W the TCP window size, S the size of a burst arriving at rate B1, and M the buffer memory the bottleneck needs to absorb that burst:
    M = S * (1 - B2/B1)
The window needed to fill the bottleneck is the bandwidth-delay product:
    W = RTT * B2    (equivalently: B2 = W/RTT)
In the worst case a whole window arrives as a single burst, S = W, so:
    M = RTT * B2 * (1 - B2/B1)
For an RTT of 200 ms:
    M = 0.200 * B2 * (1 - B2/B1)
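The buffer estimate above can be evaluated directly. A minimal sketch, assuming B1 is the sender-side bandwidth and B2 the bottleneck bandwidth, with an illustrative 10 Gbit/s sender feeding a 1 Gbit/s bottleneck at the 200 ms RTT used in the last formula:

```python
# Sketch of M = RTT * B2 * (1 - B2/B1), the bottleneck buffer needed
# to absorb one full-window burst. Example rates are assumptions.

def bottleneck_memory(rtt_s: float, b1_bps: float, b2_bps: float) -> float:
    """Buffer (bytes) needed at the bottleneck for one window burst."""
    w = rtt_s * b2_bps / 8             # window in bytes: W = RTT * B2
    return w * (1 - b2_bps / b1_bps)   # M = S * (1 - B2/B1), with S = W

m = bottleneck_memory(0.200, 10e9, 1e9)
print(f"M = {m / 1e6:.1f} MB")  # 22.5 MB for this example
```

Note that M shrinks as B2 approaches B1: the smaller the rate mismatch, the less buffering the bottleneck needs.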

The best solution in my opinion would be the combination of two actions:
  1. TCP implementations should only send shaped traffic, where packets are spaced in time to approach the average data rate, thus eliminating bursts.
  2. TCP should tune its window size to approach the minimum RTT.
If one would eliminate all memory in the network, then the RTT would always be the minimum, because a packet would either travel at light speed or be dropped. Therefore, dropped packets should also play a role in the window decrease/increase decision. Note that multiple (n) TCP streams, each using a small part (1/n) of the bandwidth, often give enough statistical multiplexing to saturate the link even with too little memory present.
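The pacing idea in point 1 above comes down to computing an inter-packet gap from the target rate. A minimal sketch, with an assumed 1500-byte packet size and an illustrative 1 Gbit/s target rate:

```python
# Sketch of traffic pacing: instead of sending a window as one burst,
# space packets so the instantaneous rate matches the average rate.
# Packet size and target rate are illustrative assumptions.

def inter_packet_gap(rate_bps: float, packet_bytes: int = 1500) -> float:
    """Seconds to wait between packet transmissions to average rate_bps."""
    return packet_bytes * 8 / rate_bps

gap = inter_packet_gap(1e9)
print(f"gap = {gap * 1e6:.0f} us per 1500-byte packet")  # 12 us
```

A paced sender never presents the bottleneck with a burst larger than one packet, so the buffer requirement M derived above drops to almost nothing.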


Related projects

In February-March 2011 this problem was posed to the students of the Grid Master at the University of Amsterdam. This resulted in the following two reports:
And in June 2011 two students of the SNE Master looked into detection:

Related publications & talks

  1. Antony Antony, Johan Blom, Cees de Laat, Jason Lee, Wim Sjouw, "Microscopic Examination of TCP flows over transatlantic Links", Future Generation Computer Systems, Volume 19, Issue 6, August 2003, Pages 1017-1029. Link to publication:  http://www.delaat.net/pubs/2003-j-3.pdf
  2. 17-apr 2002: Nordunet conference, Copenhagen, 15-17 april, Keynote:  "The road to optical networking": http://www.delaat.net/talks/cdl-2002-04-17.pdf
