david/ipxe
david
/
ipxe
Archived
1
0
Fork 0
Commit Graph

107 Commits

Author SHA1 Message Date
Michael Brown 469bd11f39 [tcp] Allow sufficient headroom for TCP headers
TCP currently neglects to allow sufficient space for its own headers
when allocating I/O buffers.  This problem is masked by the fact that
the maximum link-layer header size (802.11) is substantially larger
than the common Ethernet link-layer header.

Fix by allowing sufficient space for any TCP headers, as well as the
network-layer and link-layer headers.

Reported-by: Scott K Logan <logans@cottsay.net>
Debugged-by: Scott K Logan <logans@cottsay.net>
Tested-by: Scott K Logan <logans@cottsay.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-09-19 15:52:54 +01:00
Michael Brown c68bf14559 [tcp] Send xfer_window_changed() when window opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:08 +01:00
Michael Brown c4369eb6c2 [tcp] Update ts_recent whenever window is advanced
Commit 3f442d3 ("[tcp] Record ts_recent on first received packet")
failed to achieve its stated intention.

Fix this (and reduce the code size) by moving the ts_recent update to
tcp_rx_seq().  This is the code responsible for advancing the window,
called by both tcp_rx_syn() and tcp_rx_data(), and so the window check
is now redundant.

Reported-by: Frank Weed <zorbustheknight@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-04-03 00:44:22 +01:00
Michael Brown 3f442d3f60 [tcp] Record ts_recent on first received packet
Commit 6861304 ("[tcp] Handle out-of-order received packets")
introduced a regression in which ts_recent would not be updated until
the first packet is received in the ESTABLISHED state, i.e. the
timestamp from the SYN+ACK packet would be ignored.  This causes the
connection to be dropped by strictly-conforming TCP peers, such as
FreeBSD.

Fix by delaying the timestamp window check until after processing the
received SYN flag.

Reported-by: winders@sonnet.com
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-26 15:02:41 +00:00
Michael Brown d012f87018 [tcp] Use MAX_LL_NET_HEADER_LEN instead of defining our own MAX_HDR_LEN
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-19 16:08:05 +00:00
Michael Brown 67dc832d15 [tcp] Set PSH flag only on packets containing data
Suggested-by: Yelena Kadach <klenusik@hotmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-11 01:14:05 +00:00
Michael Brown ea631f6fb8 [list] Add list_first_entry()
There are several points in the iPXE codebase where
list_for_each_entry() is (ab)used to extract only the first entry from
a list.  Add a macro list_first_entry() to make this code easier to
read.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:15:28 +00:00
Michael Brown 28934eef81 [retry] Hold reference while timer is running and during expiry callback
Guarantee that a retry timer cannot go out of scope while the timer is
running, and provide a guarantee to the expiry callback that the timer
will remain in scope during the entire callback (similar to the
guarantee provided to interface methods).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-03 21:28:43 +01:00
Piotr Jaroszyński 02e6092cd5 [tcp] Fix a 64bit compile time error
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-22 21:25:40 +01:00
Michael Brown 1d3b6619e5 [tcp] Allow out-of-order receive queue to be discarded
Allow packets in the receive queue to be discarded in order to free up
memory.  This avoids a potential deadlock condition in which the
missing packet can never be received because the receive queue is
occupying all of the memory available for further RX buffers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-21 12:01:50 +01:00
Michael Brown 68613047f0 [tcp] Handle out-of-order received packets
Maintain a queue of received packets, so that lost packets need not
result in retransmission of the entire TCP window.

Increase the TCP window to 8kB, in order that we can potentially
transmit enough duplicate ACKs to trigger Fast Retransmission at the
sender.

Using a 10MB HTTP download in qemu-kvm with an artificial drop rate of
1 in 64 packets, this reduces the download time from around 26s to
around 4s.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-21 00:00:38 +01:00
Michael Brown f033694356 [tcp] Treat ACKs as sent only when successfully transmitted
iPXE currently forces sending (i.e. sends a pure ACK even in the
absence of fresh data to send) only in response to packets that
consume sequence space or that lie outside of the receive window.
This ignores the possibility that a previous ACK was not actually sent
(due to, for example, the retransmission timer running).

This does not cause incorrect behaviour, but does cause unnecessary
retransmissions from our peer.  For example:

 1. Peer sends final data packet (ack      106 seq 521..523)
 2. We send FIN                  (seq 106..107 ack      523)
 3. Peer sends FIN               (ack      106 seq 523..524)
 4. We send nothing since retransmission timer is running for our FIN
 5. Peer ACKs our FIN            (ack      107 seq 524..524)
 6. We send nothing since this packet consumes no sequence space
 7. Peer retransmits FIN         (ack      107 seq 523..524)
 8. We ACK peer's FIN            (seq 107..107 ack      524)

What should happen at step (6) is that we should ACK the peer's FIN,
since we can deduce that we have never sent this ACK.

Fix by maintaining an "ACK pending" flag that is set whenever we are
made aware that our peer needs an ACK (whether by consuming sequence
space or by sending a packet that appears out of order), and is
cleared only when the ACK packet has been transmitted.

Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-15 19:59:57 +01:00
Michael Brown 75505942ac [tcp] Merge boolean flags into a single "flags" field
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-15 19:59:57 +01:00
Michael Brown c57e26381c [tcp] Use a dedicated timer for the TIME_WAIT state
iPXE currently repurposes the retransmission timer to hold the TCP
connection in the TIME_WAIT state (i.e. waiting for up to 2*MSL in
case we are required to re-ACK our peer's FIN due to a lost ACK).
However, the fact that this timer is running will prevent such an ACK
from ever being sent, since the logic in tcp_xmit() assumes that a
running timer indicates that we ourselves are waiting for an ACK and
so blocks the transmission.  (We always wait for an ACK before sending
our next packet, to keep our transmit data path as simple as
possible.)

Fix by using an entirely separate timer for the TIME_WAIT state, so
that packets can still be sent.

Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-15 19:59:34 +01:00
Guo-Fu Tseng 1e7e4c9a61 [tcp] Randomise local TCP port
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-13 17:29:54 +01:00
Michael Brown 73e3672468 [tcp] Fix typos by changing ntohl() to htonl() where appropriate
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-13 17:19:37 +01:00
Michael Brown 43450342a9 [tcp] Store local port in host byte order
Every other scalar integer value in struct tcp_connection is in host
byte order; change the definition of local_port to match.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-13 17:15:57 +01:00
Michael Brown 68c2f07f15 [tcp] Fix potential use-after-free when accessing timestamp option
Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-07 12:57:08 +01:00
Michael Brown 4327d5d39f [interface] Convert all data-xfer interfaces to generic interfaces
Remove data-xfer as an interface type, and replace data-xfer
interfaces with generic interfaces supporting the data-xfer methods.

Filter interfaces (as used by the TLS layer) are handled using the
generic pass-through interface capability.  A side-effect of this is
that deliver_raw() no longer exists as a data-xfer method.  (In
practice this doesn't lose any efficiency, since there are no
instances within the current codebase where xfer_deliver_raw() is used
to pass data to an interface supporting the deliver_raw() method.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 15:50:31 +01:00
Michael Brown 5fa6775b61 [retry] Use start_timer_fixed() instead of direct timeout manipulation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 14:32:49 +01:00
Michael Brown c760ac3022 [retry] Add timer_init() wrapper function
Standardise on using timer_init() to initialise an embedded retry
timer, to match the coding style used by other embedded objects.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 14:30:20 +01:00
Michael Brown 4bfd5b52c1 [refcnt] Add ref_init() wrapper function
Standardise on using ref_init() to initialise an embedded reference
count, to match the coding style used by other embedded objects.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 14:26:40 +01:00
Michael Brown 9ff8229693 [tcp] Update received sequence number before delivering received data
iPXE currently updates the TCP sequence number after delivering the
data to the application via xfer_deliver_iob().  If the application
responds to the received data by transmitting more data, this would
result in a stale ACK number appearing in the transmitted packet,
which potentially causes retransmissions and also gives the
undesirable appearance of violating causality (by sending a response
to a message that we claim not to have yet received).

Reported-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-22 00:45:49 +01:00
Michael Brown 8406115834 [build] Rename gPXE to iPXE
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain.  Work
around this problem by renaming project from gPXE to iPXE, and
updating URLs to match.

Also update README, LOG and COPYRIGHTS to remove obsolete information.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-04-19 23:43:39 +01:00
Michael Brown 5552a1b202 [tcp] Avoid printf format warnings on some compilers
In several places, we currently use size_t to represent a difference
between TCP sequence numbers.  This can cause compiler warnings
relating to printf format specifiers, since the result of
(uint32_t+size_t) may be an unsigned long on some compilers.

Fix by using uint32_t for all variables that represent a difference
between TCP sequence numbers.

Tested-by: Joshua Oreman <oremanj@xenon.get-linux.org>
2009-08-02 22:44:57 +01:00
Michael Brown 58f60df66c [tcp] Avoid rewinding sequence numbers on receiving old duplicate ACKs
Commit 558c1a4 ("[tcp] Improve robustness in the presence of duplicated
received packets") introduced a regression in that an old duplicate
ACK received while in the ESTABLISHED state would pass through normal
ACK processing, including updating tcp->snd_seq.

Fix by ensuring that ACK processing ignores all duplicate ACKs.
2009-06-23 16:10:34 +01:00
Michael Brown 99e64f5806 [tcp] Attempt to catch all possible error cases with debug messages
All TCP errors or unusual events should now generate a debugging
message at DBGLVL_LOG, with enough information (SEQ and ACK numbers)
to be able to identify the corresponding packet (or missing packet) in
a network trace from the remote end.
2009-06-23 14:28:00 +01:00
Michael Brown f4605970f4 [tcp] Include current sequence numbers in "timer expired" messages 2009-06-23 14:03:09 +01:00
Michael Brown a2f753ba64 [tcp] Move high-frequency debug messages to DBGLVL_EXTRA
This makes it possible to leave TCP debugging enabled in order to see
interesting TCP events, without flooding the console with at least one
message per packet.
2009-06-23 13:35:45 +01:00
Michael Brown 558c1a45fe [tcp] Improve robustness in the presence of duplicated received packets
gPXE responds to duplicated ACKs with an immediate retransmission,
which can lead to a sorceror's apprentice syndrome.  It also responds
to out-of-range (or old duplicate) ACKs with a RST, which can cause
valid connections to be dropped.

Fix the sorceror's apprentice syndrome by leaving the retransmission
timer running (and so inhibiting the immediate retransmission) when we
receive a potential duplicate ACK.  This seems to match the behaviour
of Linux observed via wireshark traces.

Fix the RST issue by sending RST only on out-of-range ACKs that occur
before the connection is fully established, as per RFC 793.

These problems were exposed during development of the 802.11 wireless
link layer; the 802.11 protocol has a failure mode that can easily
cause duplicated packets.  The fixes were tested in a controlled way
by faking large numbers of duplicated packets in the rtl8139 driver.

Originally-fixed-by: Joshua Oreman <oremanj@rwcr.net>
2009-06-23 09:40:26 +01:00
Michael Brown c44a193d0d [legal] Add a selection of FILE_LICENCE declarations
Add FILE_LICENCE declarations to almost all files that make up the
various standard builds of gPXE.
2009-05-18 08:33:25 +01:00
Michael Brown 3ed468e0c5 [tcp] Avoid setting PSH flag when SYN flag is set
Some firewall devices seem to regard SYN,PSH as an invalid flag
combination and reject the packet.  Fix by setting PSH only if SYN is
not set.

Reported-by: DSE Incorporated <dseinc@gmail.com>
2009-03-10 08:15:47 +00:00
Michael Brown 8ae1cac050 [xfer] Make consistent assumptions that xfer metadata can never be NULL
The documentation in xfer.h and xfer.c does not say that the metadata
parameter is optional in calls such as xfer_deliver_iob_meta() and the
deliver_iob() method.  However, some code in net/ is prepared to
accept a NULL pointer, and xfer_deliver_as_iob() passes a NULL pointer
directly to the deliver_iob() method.

Fix this mess of conflicting assumptions by making everything assume
that the metadata parameter is mandatory, and fixing
xfer_deliver_as_iob() to pass in a dummy metadata structure (as is
already done in xfer_deliver_iob()).
2009-02-15 08:44:22 +00:00
Michael Brown cf53998901 [tcp] Always set PUSH flag on TCP transmissions
Apparently this can cause a major speedup on some iSCSI targets, which
will otherwise wait for a timer to expire before responding.  It
doesn't seem to hurt other simple TCP test cases (e.g. HTTP
downloads).

Problem and solution identified by Shiva Shankar <802.11e@gmail.com>
2009-01-21 04:22:34 +00:00
Michael Brown d230b53df2 [tcpip] Allow for transmission to multicast IPv4 addresses
When sending to a multicast address, it may be necessary to specify
the source address explicitly, since the multicast destination address
does not provide enough information to deduce the source address via
the miniroute table.

Allow the source address specified via the data-xfer metadata to be
passed down through the TCP/IP stack to the IPv4 layer, which can use
it as a default source address.
2009-01-21 03:40:39 +00:00
Michael Brown 0ebbbb95fa [x86_64] Fix assorted 64-bit compilation errors and warnings
Remove various 32-bit assumptions scattered throughout the codebase.
The code is still not necessarily 64-bit clean, but will at least
compile.
2008-11-19 19:33:05 +00:00
Michael Brown b59e0cc56e [i386] Change [u]int32_t to [unsigned] int, rather than [unsigned] long
This brings us in to line with Linux definitions, and also simplifies
adding x86_64 support since both platforms have 2-byte shorts, 4-byte
ints and 8-byte long longs.
2008-11-19 19:15:44 +00:00
Michael Brown 1a68d3fef3 [TCP] Avoid shrinking TCP window
Maintain state for the advertised window length, and only ever increase
it (instead of calculating it afresh on each transmit).  This avoids
triggering "treason uncloaked" messages on Linux peers.

Respond to zero-length TCP keepalives (i.e. empty data packets
transmitted outside the window).  Even if the peer wouldn't otherwise
expect an ACK (because its packet consumed no sequence space), force an
ACK if it was outside the window.

We don't yet generate TCP keepalives.  It could be done, but it's unclear
what benefit this would have.  (Linux, for example, doesn't start sending
keepalives until the connection has been idle for two hours.)
2008-06-05 00:28:17 +01:00
Alexey Zaytsev a1572e0ab0 Modify gPXE core and drivers to work with the new timer subsystem
Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
2008-03-02 03:41:10 +03:00
Michael Brown f6a8158eed Make seek information part of the xfer metadata, rather than an entirely
separate xfer method.

Add missing .alloc_iob entries to several xfer_interface_operations
structures.
2008-01-08 16:46:55 +00:00
Michael Brown df868476e7 Various warnings fixups for OpenBSD with gcc-3.3.5. 2007-12-07 00:11:43 +00:00
Michael Brown 2ff1b1245b Use start_timer_nodelay() in protocols which rely on the retry timer
to generate the initial transmission; this cuts off around 0.3s per
instantiated connection.
2007-08-13 11:03:33 -07:00
Michael Brown 9aa61ad5a2 Add per-file error identifiers 2007-07-24 17:11:31 +01:00
Michael Brown 096fa94f0c Add support for TCP timestamps 2007-07-13 11:32:53 +01:00
Michael Brown eb530845d4 Adjust received length to take into account any already-received data
in tcp_rx_data().

Clarify comments on discarding duplicate or out-of-order data.
2007-07-13 11:31:58 +01:00
Michael Brown d5735c631c Avoid reusing auto-allocated ports after connection close. 2007-07-13 11:25:00 +01:00
Michael Brown edded7546e Limit xmit window to one MTU. (Path MTU discovery not yet
implemented; should be done at some point.)
2007-07-08 14:33:53 +01:00
Michael Brown 35afb379af TCP limits advertised TCP window to size of application window
obtained via xfer_window().
2007-07-08 14:14:59 +01:00
Michael Brown b34d4d0449 Separate the "is data ready" function of xfer_seek() into an
xfer_window() function, which can return a scalar rather than a
boolean.
2007-07-08 14:11:07 +01:00
Michael Brown ca4c6f9eee Kill off unused request() method in data-xfer interface. 2007-07-08 02:10:54 +01:00