david/ipxe

Added diatribe about the mismatch between the PXE spec and the TFTP
protocol, and how we will work around it.
Michael Brown 2005-05-27 11:44:46 +00:00
parent 97675c7129
commit 2ffc960e67
1 changed files with 86 additions and 19 deletions

@ -30,14 +30,14 @@
* @v tftp_open Pointer to a struct s_PXENV_TFTP_OPEN
* @v s_PXENV_TFTP_OPEN::ServerIPAddress TFTP server IP address
* @v s_PXENV_TFTP_OPEN::GatewayIPAddress Relay agent IP address, or 0.0.0.0
-* @v s_PXENV_TFTP_OPEN::Filename Name of file to open
+* @v s_PXENV_TFTP_OPEN::FileName Name of file to open
* @v s_PXENV_TFTP_OPEN::TFTPPort TFTP server UDP port
* @v s_PXENV_TFTP_OPEN::PacketSize TFTP blksize option to request
* @ret #PXENV_EXIT_SUCCESS File was opened
* @ret #PXENV_EXIT_FAILURE File was not opened
* @ret s_PXENV_TFTP_OPEN::Status PXE status code
-* @ret s_PXENV_TFTP_OPEN::PacketSize Negotiated
-* @err ....... ..........
+* @ret s_PXENV_TFTP_OPEN::PacketSize Negotiated blksize
+* @err #PXENV_STATUS_TFTP_INVALID_PACKET_SIZE Requested blksize too small
*
* Opens a TFTP connection for downloading a file a block at a time
* using pxenv_tftp_read().
@ -46,11 +46,21 @@
* routing will take place. See the relevant
* @ref pxe_routing "implementation note" for more details.
*
-* s_PXENV_TFTP_OPEN::PacketSize must be at least 512.
+* The blksize negotiated with the TFTP server will be returned in
+* s_PXENV_TFTP_OPEN::PacketSize, and will be the size of data blocks
+* returned by subsequent calls to pxenv_tftp_read(). The TFTP server
+* may negotiate a smaller blksize than the caller requested.
+*
+* Some TFTP servers do not support TFTP options, and will therefore
+* not be able to use anything other than a fixed 512-byte blksize.
+* The PXE specification version 2.1 requires that the caller must
+* pass in s_PXENV_TFTP_OPEN::PacketSize with a value of 512 or
+* greater.
*
* You can only have one TFTP connection open at a time, because the
-* PXE API requires the PXE stack to keep state about the open TFTP
-* connection (rather than letting the caller do so).
+* PXE API requires the PXE stack to keep state (e.g. local and remote
+* port numbers, data block index) about the open TFTP connection,
+* rather than letting the caller do so.
*
* It is unclear precisely what constitutes a "TFTP open" operation.
* Clearly, we must send the TFTP open request to the server. Since
@ -65,7 +75,15 @@
* solution to this problem.
*
*
+* @note If you pass in a value less than 512 for
+* s_PXENV_TFTP_OPEN::PacketSize, Etherboot will attempt to negotiate
+* this blksize with the TFTP server, even though such a value is not
+* permitted according to the PXE specification. If the TFTP server
+* ends up dictating a blksize larger than the value requested by the
+* caller (which is very probable in the case of a requested blksize
+* less than 512), then Etherboot will return the error
+* #PXENV_STATUS_TFTP_INVALID_PACKET_SIZE.
*
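The blksize rule described in the note above can be sketched as follows. This is only an illustration, not the Etherboot implementation: the function name, the result enum, and the `have_oack` flag are hypothetical, and the real PXE status values are not reproduced here. It assumes that a server which ignores TFTP options falls back to the fixed 512-byte block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TFTP_DEFAULT_BLKSIZE 512 /* fixed block size when no options are used */

/* Hypothetical result codes; the real PXE status values are not shown here */
enum blksize_status {
	BLKSIZE_OK,
	BLKSIZE_INVALID_PACKET_SIZE,
};

/*
 * Decide the effective blksize after the server's first response.
 *
 * requested:    blksize the caller asked for
 * have_oack:    true if the server sent an OACK carrying a blksize
 * oack_blksize: blksize from that OACK (ignored if have_oack is false)
 * negotiated:   receives the effective blksize on BLKSIZE_OK
 */
enum blksize_status negotiate_blksize(uint16_t requested, bool have_oack,
				      uint16_t oack_blksize,
				      uint16_t *negotiated)
{
	/* No OACK means the server ignored our options: plain 512-byte TFTP */
	uint16_t effective = have_oack ? oack_blksize : TFTP_DEFAULT_BLKSIZE;

	assert(negotiated != NULL);

	/* The server must never end up with a larger block than requested */
	if (effective > requested)
		return BLKSIZE_INVALID_PACKET_SIZE;

	*negotiated = effective;
	return BLKSIZE_OK;
}
```

A request for 256 bytes against a server without option support therefore fails, because the server's 512-byte blocks would not fit the caller's expectations.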
* @note According to the PXE specification version 2.1, this call
* "opens a file for reading/writing", though how writing is to be
* achieved without the existence of an API call %pxenv_tftp_write()
@ -253,44 +271,48 @@ file" operations. The problem is the unreliable nature of UDP
transmissions and the lock-step mechanism employed by TFTP to
guarantee file transfer. The lock-step mechanism requires that if we
time out waiting for a packet to arrive, we must trigger its
-retransmission by retransmitting our previously transmitted packet.
+retransmission by retransmitting our own previously transmitted
+packet.
For example, suppose that pxenv_tftp_read() is called to read the
first data block of a file from a server that does not support TFTP
options, and that no data block is received within the timeout period.
-In order to trigger the retransmission of this data block
+In order to trigger the retransmission of this data block,
pxenv_tftp_read() must retransmit the TFTP open request. However, the
information used to build the TFTP open request is not available at
-this time; it was provided only to the pxenv_tftp_open() call.
+this time; it was provided only to the pxenv_tftp_open() call. Even
+if we were able to retransmit a TFTP open request, we would have to
+allocate a new local port number (and be prepared for data to arrive
+from a new remote port number) in order to avoid violating the TFTP
+protocol specification.
The question of when to transmit the ACK packets is also awkward. At
first glance, it would seem to be fairly simple: acknowledge a
packet immediately after receiving it. However, since the ACK packet
may itself be lost, the next call to pxenv_tftp_read() must be
-prepared to re-acknowledge the packet.
+prepared to retransmit the acknowledgement.
Another problem to consider is that the pxenv_tftp_open() API call
must return an indication of whether or not the TFTP open request
succeeded. In the case of a TFTP server that doesn't support TFTP
options, the only indication of a successful open is the reception of
the first data block. However, the pxenv_tftp_open() API provides no
-way to return this data block at this time. Pretending that we lost
-the data block and requesting retransmission is problematic, because
-the only way to request retransmission of the first data block in such
-a case is to reissue the TFTP open request, which has side effects
-such as requiring the allocation of a new local port number.
+way to return this data block at this time.
At least some PXE stacks (e.g. NILO) solve this problem by violating
the TFTP protocol and never bothering with retransmissions, relying on
the TFTP server to retransmit when it times out waiting for an ACK.
-This approach is dubious at best.
+This approach is dubious at best; if, for example, the initial TFTP
+open request is lost then NILO will believe that it has opened the
+file and will eventually time out and give up while waiting for the
+first packet to arrive.
The only viable solution seems to be to allocate a buffer for the
storage of the first data packet returned by the TFTP server, since we
may receive this packet during the pxenv_tftp_open() call but have to
return it from the subsequent pxenv_tftp_read() call. This buffer
must be statically allocated and must be dedicated to providing a
-temporary home to TFTP packets. There is nothing in the PXE
+temporary home for TFTP packets. There is nothing in the PXE
specification that prevents a caller from calling
e.g. pxenv_undi_transmit() between calls to the TFTP API, so we cannot
use the normal transmit/receive buffer for this purpose.
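A minimal sketch of the dedicated, statically allocated buffer described above, together with the per-connection state that the PXE stack has to keep on the caller's behalf. All names here are hypothetical and the 1500-byte size is an assumption (one Ethernet-sized frame); the actual Etherboot structures may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: one Ethernet-sized frame is the largest packet we must hold */
#define TFTP_BUF_SIZE 1500

/* State the PXE stack keeps for the caller (field names are hypothetical) */
struct tftp_conn {
	uint16_t local_port;   /* our UDP port for this transfer */
	uint16_t remote_port;  /* server's transfer port */
	uint16_t block;        /* index of the block held in buf */
	size_t len;            /* number of valid bytes in buf */
	int have_block;        /* non-zero while buf holds an unread block */
	uint8_t buf[TFTP_BUF_SIZE];
};

/* Single static instance: only one TFTP connection may be open at a time */
static struct tftp_conn tftp_conn;

/* Stash a received block, e.g. from within pxenv_tftp_open() */
int tftp_stash(uint16_t block, const void *data, size_t len)
{
	assert(data != NULL || len == 0);
	if (len > sizeof(tftp_conn.buf))
		return -1;
	memcpy(tftp_conn.buf, data, len);
	tftp_conn.block = block;
	tftp_conn.len = len;
	tftp_conn.have_block = 1;
	return 0;
}

/* Hand the stashed block to the caller, e.g. from pxenv_tftp_read() */
int tftp_take(void *dest, size_t dest_size, uint16_t *block, size_t *len)
{
	if (!tftp_conn.have_block || tftp_conn.len > dest_size)
		return -1;
	memcpy(dest, tftp_conn.buf, tftp_conn.len);
	*block = tftp_conn.block;
	*len = tftp_conn.len;
	tftp_conn.have_block = 0;
	return 0;
}
```

Because the buffer is private to the TFTP code, an intervening call such as pxenv_undi_transmit() cannot overwrite a block that has been received but not yet returned to the caller.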
@ -334,6 +356,51 @@ acknowledgement packet.)
In order to set up this invariant condition for the first call to
pxenv_tftp_read(), pxenv_tftp_open() must do the following:
-
+- Construct and transmit the TFTP open request.
+- Retransmit the TFTP open request (using a new local port number as
+necessary) until a response (DATA, OACK, or ERROR) is received.
+- If the response is an OACK, acknowledge the OACK and retransmit
+the acknowledgement until the first DATA packet arrives.
+- If we have a DATA packet, store it in a buffer ready for the first
+call to pxenv_tftp_read().
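The open sequence listed above can be sketched against a scripted stand-in for the network. Everything here is hypothetical and for illustration only: `struct fake_net`, `net_wait()`, and `tftp_open_sketch()` are invented names, the starting port 1024 is arbitrary, and the sketch simply takes a fresh local port on every retry rather than only "as necessary". No packets are actually built or sent.

```c
#include <assert.h>
#include <stddef.h>

enum tftp_event { EV_TIMEOUT, EV_OACK, EV_DATA, EV_ERROR };

/* Scripted stand-in for the network (hypothetical, for illustration only) */
struct fake_net {
	const enum tftp_event *script; /* outcome of each successive wait */
	size_t len;                    /* number of scripted outcomes */
	size_t pos;                    /* next outcome to deliver */
	unsigned int rrq_sent;         /* open requests transmitted */
	unsigned int ack0_sent;        /* ACKs of the OACK transmitted */
	unsigned int last_port;        /* local port of the latest open request */
};

/* Wait for one response; an exhausted script behaves like a dead server */
static enum tftp_event net_wait(struct fake_net *net)
{
	return (net->pos < net->len) ? net->script[net->pos++] : EV_TIMEOUT;
}

/* Returns 0 once the first DATA packet is in hand, -1 otherwise */
int tftp_open_sketch(struct fake_net *net, unsigned int max_tries)
{
	unsigned int port = 1024; /* arbitrary starting local port */
	unsigned int tries;
	enum tftp_event ev = EV_TIMEOUT;

	assert(net != NULL);

	/* Send the open request, retrying from a fresh local port each time */
	for (tries = 0; tries < max_tries && ev == EV_TIMEOUT; tries++) {
		net->last_port = port++;
		net->rrq_sent++;
		ev = net_wait(net);
	}

	/* On OACK, acknowledge it until the first DATA (or an ERROR) arrives */
	if (ev == EV_OACK) {
		for (tries = 0;
		     tries < max_tries && ev != EV_DATA && ev != EV_ERROR;
		     tries++) {
			net->ack0_sent++;
			ev = net_wait(net);
		}
	}

	/* A real implementation would now copy the DATA payload into the
	 * dedicated buffer, ready for the first pxenv_tftp_read() call */
	return (ev == EV_DATA) ? 0 : -1;
}
```

With a script of timeout, OACK, timeout, DATA, the sketch sends two open requests and two acknowledgements of the OACK before reporting success; a server without option support is modelled by a script containing only DATA.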
+This approach has the advantage of being fully compliant with both
+RFC1350 (TFTP) and RFC2347 (TFTP options). It avoids unnecessary
+retransmissions. The cost is approximately 1500 bytes of
+uninitialised storage. Since there is demonstrably no way to avoid
+paying this cost without either violating the protocol specifications
+or introducing unnecessary retransmissions, we deem this to be a cost
+worth paying.
+A small performance gain may be obtained by adding a single extra
+"send ACK" in both pxenv_tftp_open() and pxenv_tftp_read() immediately
+after receiving the DATA packet and copying it into the internal
+buffer. The sequence of events for pxenv_tftp_read() then becomes:
+- Copy the data packet from our buffer to the caller's buffer.
+- If this was the last data packet, return immediately.
+- Check to see if a TFTP data packet is waiting. If not, send an
+ACK for the data packet that we have just copied, and retransmit
+this ACK until the next data packet arrives.
+- Copy the packet into our internal buffer, ready for the next call
+to pxenv_tftp_read().
+- Send a single ACK for this data packet.
+Sending the ACK at this point allows the server to transmit the next
+data block while our caller is processing the current packet. If this
+ACK is lost, or the DATA packet it triggers is lost or is consumed by
+something other than pxenv_tftp_read() (e.g. by calls to
+pxenv_undi_isr()), then the next call to pxenv_tftp_read() will not
+find a TFTP data packet waiting and will retransmit the ACK anyway.
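The read sequence with the extra early ACK can be sketched in the same scripted style. Again, this is a hedged illustration rather than the real code: `struct fake_link`, `data_waiting()`, and `tftp_read_sketch()` are hypothetical names, block payloads are omitted, and the retry cap is an assumption added so that the loop always terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Scripted stand-in for the link (hypothetical, for illustration only) */
struct fake_link {
	unsigned int total_blocks; /* number of DATA blocks in the file */
	unsigned int buffered;     /* block in the internal buffer, 0 if none */
	const int *arrived;        /* script: 1 if DATA is waiting at this poll */
	size_t len;                /* number of scripted polls */
	size_t pos;                /* next scripted poll */
	unsigned int acks_sent;    /* total ACKs transmitted */
};

/* Poll for the next DATA packet; past the end of the script it has arrived */
static int data_waiting(struct fake_link *link)
{
	return (link->pos < link->len) ? link->arrived[link->pos++] : 1;
}

/* Returns 0 and sets *block_out on success, -1 on failure */
int tftp_read_sketch(struct fake_link *link, unsigned int max_tries,
		     unsigned int *block_out)
{
	unsigned int current, tries;

	assert(link != NULL && block_out != NULL);
	if (link->buffered == 0)
		return -1; /* nothing buffered: not open, or already finished */

	/* Hand the buffered block to the caller */
	current = link->buffered;
	*block_out = current;
	link->buffered = 0;

	/* Last block: its single ACK was sent when it was buffered */
	if (current == link->total_blocks)
		return 0;

	/* If the next block is not waiting, (re)send the ACK until it is */
	for (tries = 0; !data_waiting(link); tries++) {
		if (tries >= max_tries)
			return -1;
		link->acks_sent++;
	}

	/* Buffer the new block and send a single early ACK for it */
	link->buffered = current + 1;
	link->acks_sent++;
	return 0;
}
```

When the early ACK does its job, the next block is already waiting on entry and the retransmission loop sends nothing; the loop only costs an extra ACK when that early ACK or the DATA packet it triggered went missing.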
+Note to future API designers at Intel: try to understand the
+underlying network protocol first!
*/