The TCP Window, Latency, and the Bandwidth Delay product
The relation between the TCP Window, packet delay (pings) and maximum bandwidth
2008-09-15 (updated: 2009-12-13) by Philip
Tags: BDP, RFC, TCP Window, latency, bandwidth, MSS, MTU, packet
This article is intended as a primer on some TCP/IP networking concepts and factors that determine an optimal TCP Receive Window.
The TCP Window
The TCP Window is the amount of outstanding data (unacknowledged by the recipient) that can remain in the network. After sending that amount of data, the sender stops and waits for the receiver to acknowledge that some of it has arrived. As such, this value is probably the single most important setting when tuning broadband internet connections. The TCP Window (including any scale factor) is first advertised at the beginning of every connection, during the TCP "handshake" stage.
In the original DARPA TCP/IP standard, the TCP Receive Window (RWIN) was limited to 64K (65535 bytes), since there are only 16 bits in the TCP header for the RWIN value, and 2^16 = 65536. This limitation needed to be addressed, and in 1992 RFC 1323 defined a window scale option, carried in the "TCP Options" part of the header, which allows expanding the maximum TCP Window size by using one more byte as a "scale factor" for the RWIN value. The RFC 1323 scale factor can take any value between 0 and 14, and the advertised RWIN is multiplied by 2 raised to that value.
For example, let's assume an unscaled RWIN value of 64240 and a scale factor of 3. The actual RWIN value would then be: 64240 * 2^3 = 513920.
Note that the scale factor is limited to 14 (2^14 = 16384), and the maximum unscaled RWIN is 65535. 16384 * 65535 = 1,073,725,440 (about one gigabyte). Thus, RFC 1323 allows for a maximum TCP Receive Window of up to one gigabyte.
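The scaling arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name scaled_window and the two constants are made up for this example and are not part of any TCP stack or library.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value the 16-bit window field can hold
MAX_SCALE_FACTOR = 14        # largest scale factor allowed by RFC 1323


def scaled_window(unscaled: int, scale_factor: int) -> int:
    """Return the effective receive window in bytes (hypothetical helper)."""
    if not 0 <= unscaled <= MAX_UNSCALED_WINDOW:
        raise ValueError("unscaled window must fit in 16 bits")
    if not 0 <= scale_factor <= MAX_SCALE_FACTOR:
        raise ValueError("scale factor must be between 0 and 14")
    # The scale factor is a left-shift count, i.e. multiply by 2**scale_factor.
    return unscaled << scale_factor


print(scaled_window(64240, 3))   # 513920
print(scaled_window(65535, 14))  # 1073725440
```

The second call reproduces the theoretical maximum window of roughly one gigabyte.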
See also: TCP Header structure
The speed of every data transfer, including TCP, is of course largely determined by the line speed. In addition, however, let's consider the delay, or RTT (round trip time), of each data packet.
Any time a client computer asks a server a question, there is an RTT delay until it receives a response. Data packets have to travel through a number of high-traffic (sometimes congested) routers, and the speed of light (or of electrical signals in copper lines) is always a limiting factor, considering the huge distances involved in internet communication.
Let's examine a client computer communicating with a server over a geosynchronous satellite link. The client's request (every packet) has to travel 22,300 miles up to the satellite, then 22,300 miles down to the server. Then, when the server sends its response, it has to travel the same distance back to the client, adding another 22,300 miles up + 22,300 miles down. Thus, that simple packet of data and its response travel at least 89,200 miles. Considering the speed of light (186,000 miles per second), we can conclude that there is a minimum round-trip delay on a satellite connection of just under half a second (roughly 480 ms).
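The satellite estimate can be reproduced with a short calculation. The sketch below uses only the figures quoted in the text (22,300 miles of altitude and 186,000 miles per second); the function name is hypothetical, and real links add processing and queuing delay on top of this lower bound.

```python
GEO_ALTITUDE_MILES = 22_300           # altitude figure used in the text
SPEED_OF_LIGHT_MILES_PER_S = 186_000  # speed of light figure used in the text


def min_satellite_rtt_seconds(altitude_miles: float = GEO_ALTITUDE_MILES,
                              speed: float = SPEED_OF_LIGHT_MILES_PER_S) -> float:
    """Lower bound on the round-trip time over one satellite hop."""
    # Request goes up and down, response goes up and down: four legs in total.
    total_distance = 4 * altitude_miles
    return total_distance / speed


rtt = min_satellite_rtt_seconds()
print(f"{rtt * 1000:.0f} ms")  # 480 ms
```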
The Bandwidth * Delay Product
The Bandwidth*Delay product, or BDP for short, determines the amount of data that can be in transit in the network (just as RWIN does). It is the product of the available bandwidth and the latency (RTT). BDP is a very important concept in a window-based protocol such as TCP, as throughput is bound by the BDP. The BDP states that:
BDP (bits) = bandwidth (bits/second) * latency (seconds)
What does it mean? The BDP and the TCP Receive Window limit our connection to the product of the latency and the bandwidth. The throughput of a transmission cannot exceed the RWIN / latency value.
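As a rough illustration of the two relations above, the following sketch computes the BDP for a link and the throughput ceiling imposed by a given RWIN. The function names and the example link (10 Mbit/s with a 100 ms RTT) are assumptions chosen only for this example.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """BDP (bits) = bandwidth (bits/second) * latency (seconds), returned in bytes."""
    return bandwidth_bps * rtt_seconds / 8


def max_throughput_bps(rwin_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput: at most one full window per round trip."""
    return rwin_bytes * 8 / rtt_seconds


# Hypothetical link: 10 Mbit/s with a 100 ms round-trip time.
print(f"{bdp_bytes(10_000_000, 0.100):,.0f} bytes")               # 125,000 bytes
# An unscaled 65535-byte window on the same 100 ms path:
print(f"{max_throughput_bps(65535, 0.100) / 1e6:.2f} Mbit/s")     # 5.24 Mbit/s
```

In this example the 64K window is smaller than the 125,000-byte BDP, so the window, not the line speed, caps the transfer at about half of the available bandwidth.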
See also: SG BDP calculator
Optimizing the TCP Receive Window
When calculating an optimal RWIN value, one should try to use the highest possible unscaled RWIN value (usually the highest MSS multiple under 65535) together with the smallest scale factor that reaches the target window. This approach better accommodates older routers and some wireless networks that don't work well with TCP Options (RFC 1323) or with large scale factors.
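One way to apply this rule of thumb is sketched below. It is not the SG TCP Analyzer's method, only a minimal illustration: the helper name pick_rwin is hypothetical, and the default MSS of 1460 bytes assumes a 1500-byte MTU minus 40 bytes of IP and TCP headers.

```python
def pick_rwin(target_bytes: int, mss: int = 1460) -> tuple[int, int, int]:
    """Return (unscaled RWIN, scale factor, effective RWIN) for a target window.

    Hypothetical helper: uses the largest MSS multiple that fits in 16 bits,
    then the smallest scale factor (0..14) that reaches the target.
    """
    unscaled = (65535 // mss) * mss  # highest MSS multiple not above 65535
    scale = 0
    while (unscaled << scale) < target_bytes and scale < 14:
        scale += 1
    return unscaled, scale, unscaled << scale


# Target window of about 500,000 bytes with the assumed MSS of 1460:
print(pick_rwin(500_000))  # (64240, 3, 513920)
```

With these inputs the result matches the earlier example of an unscaled RWIN of 64240 and a scale factor of 3.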
To determine the optimal TCP Receive Window, you can simply use one of the SG TCP Analyzer recommended values, or perform the following calculations:
TCP Window in Vista / Windows 7 / 2008 Server
In Windows Vista and 2008 server, Microsoft introduced a new TCP/IP stack with a number of improvements. It also includes a concept called TCP Window "Auto-Tuning" that's been used in Linux for years. The idea is, a small initial RWIN value is advertised, which is then adjusted on the fly depending on the current line speed and latency. This new implementation works much better by default, compared to previous Windows versions. In theory, the new automatic RWIN algorithm adjusts the TCP Window size based on three main factors:
The algorithm can control the TCP Window value on a per-connection basis. Also, by default, Vista/2008 will not allocate RWIN values larger than 16 MB.
There are still a couple of downsides to the new approach:
For additional information on tuning TCP/IP under Vista, see our Windows Vista/2008 tweaks article.
by Philip - 2012-07-10 11:44
The TCP Receive Window/RWIN value in most OSes is set in bytes/kilobytes, and the latency is measured in milliseconds. Hence, the BDP formula is converted to different units, with emphasis on the BDP/TCP Window in kilobytes, the bandwidth in kilobits/second, and the latency in milliseconds for ease of use.
by anonymous - 2014-02-25 03:51
Hi, thx for this article. One question: why don't you consider the time needed for serializing the data? It seems to me that the total time required for sending an entire TCP Window and receiving back an Ack is:
Time required for serializing data on the serial link +
time required to transfer data between data source to receiver +
time required to transfer Ack between receiver to data source
So the bandwidth is = (Total amount of data ) / ( total time ) = Rwin / (Ts + RTT)
where Ts is the time required for serializing the data.
Isn't it?
by anonymous - 2014-07-09 16:01
The serialization delay does factor in, but in today's world it is dwarfed by the other factors typically. Serialization delay becomes irrelevant around 768Kbps and is close to irrelevant even below that speed, around 384Kbps. Most links we deal with now are T1 or better. As for one-way versus two-way latency, typically you use the two-way latency in this calculation because the sender will stop sending if the window fills and won't send more data until it receives an ack back from the recipient for a portion of what the sender sent. Thus, both directions of the link factor into that sending-data / wait-for-ack transaction that occurs during a TCP transaction.
by Ilweran - 2016-05-06 08:55
The article is not completely exact. There is also the TCP congestion window (CWND), and the transmitter is free to send data only up to the minimum of RWND and CWND. CWND is not exchanged in the TCP header; it is calculated by the sender based on the ACKs received from the other side of the communication (i.e. from the receiver). So not only the TCP Window size matters, but also the TCP congestion avoidance algorithm. With a long fat pipe and TCP Reno you'll get only crumbs of the speed that is available with, say, TCP YeAH.