diff --git a/src/include/gpxe/tcp.h b/src/include/gpxe/tcp.h index c8988cf5..494e2ab9 100644 --- a/src/include/gpxe/tcp.h +++ b/src/include/gpxe/tcp.h @@ -213,18 +213,39 @@ struct tcp_mss_option { /** * Advertised TCP window size * + * * Our TCP window is actually limited by the amount of space available * for RX packets in the NIC's RX ring; we tend to populate the rings - * with far fewer descriptors than a typical driver. Since we have no - * way of knowing how much of this RX ring space will be available for - * received TCP packets (consider, for example, that they may all be - * consumed by a series of unrelated ARP requests between other - * machines on the network), it is actually not even theoretically - * possible for us to specify an accurate window size. We therefore - * guess an arbitrary number that is empirically as large as possible - * while avoiding retransmissions due to dropped packets. + * with far fewer descriptors than a typical driver. This would + * result in a desperately small window size, which kills WAN download + * performance; the maximum bandwidth on any link is limited to + * + * max_bandwidth = ( tcp_window / round_trip_time ) + * + * With a 4kB window, which probably accurately reflects our amount of + * buffer space, and a WAN RTT of say 200ms, this gives a maximum + * achievable bandwidth of 20kB/s, which is not acceptable. + * + * We therefore aim to process packets as fast as they arrive, and + * advertise an "infinite" window. If we don't process packets as + * fast as they arrive, then we will drop packets and have to incur + * the retransmission penalty. + * + * Since we don't store out-of-order received packets, the + * retransmission penalty is that the whole window contents must be + * resent. + * + * We choose to compromise on a window size of 64kB (which is the + * maximum that can be represented without using TCP options). This + * gives a maximum bandwidth of 320kB/s at 200ms RTT, which is + * probably faster than the actual link bandwidth. It also limits + * retransmissions to 64kB, which is reasonable. + * + * Finally, since the window goes into a 16-bit field and we cannot + * actually use 65536, we use a window size of (65536-4) to ensure + * that payloads remain dword-aligned. */ -#define TCP_WINDOW_SIZE 4096 +#define TCP_WINDOW_SIZE ( 65536 - 4 ) /** * Advertised TCP MSS