david/ipxe
Archived
Commit Graph

884 Commits

Author SHA1 Message Date
Michael Brown
18d0818f94 [tcp] Do not send RST for unrecognised connections
On large networks with substantial numbers of monitoring agents,
unwanted TCP connection attempts may end up flooding iPXE's ARP cache.

Fix by silently dropping packets received for unrecognised TCP
connections.  This should not cause problems, since many firewalls
will also silently drop any such packets.

Reported-by: Jarrod Johnson <jarrod.b.johnson@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-07-12 03:20:05 +02:00
Michael Brown
c4bce43c3c [netdevice] Reset MAC address when asked to clear the "mac" setting
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-05-16 15:41:20 +01:00
Michael Brown
08bf79582a [netdevice] Add "chip" setting
Suggested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-05-16 15:32:17 +01:00
Michael Brown
15d2f947f5 [settings] Eliminate settings "tag magic"
Create an explicit concept of "settings scope" and eliminate the magic
values used for numerical setting tags.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-05-01 19:52:12 +01:00
Michael Brown
2095ed413e [netdevice] Add netdev_tx_defer() to allow drivers to defer transmissions
Devices with small transmit descriptor rings may temporarily run out
of space.  Provide netdev_tx_defer() to allow drivers to defer packets
for retransmission as soon as a descriptor becomes available.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-05-01 14:05:42 +01:00
Michael Brown
4678864ce6 [build] Fix dubious uses of bitwise operators
Detected by sparse.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-04-28 17:31:23 +01:00
Michael Brown
b9663b8049 [build] Fix uses of literal 0 as a NULL pointer
Detected using sparse.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-04-28 17:13:44 +01:00
Michael Brown
445ac9fbdc [netdevice] Use link-layer address as part of RNG seed
iPXE currently seeds the random number generator using the system
timer tick count.  When large numbers of machines are booted
simultaneously, multiple machines may end up choosing the same DHCP
transaction ID (XID) value; this can cause problems.

Fix by using the least significant (and hence most variable) bits of
each network device's link-layer address to perturb the random number
generator.  This introduces some per-machine unique data into the
random number generator's seed, and so reduces the chances of DHCP XID
collisions.

This does not affect the ANS X9.82-compatible random bit generator
used by TLS and other cryptography code, which uses an entirely
separate source of entropy.

Originally-implemented-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-04-19 14:34:03 +01:00
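The seeding idea described in commit 445ac9fbdc can be illustrated with a minimal Python sketch. The function name `perturb_seed`, the choice of four trailing bytes, and the XOR mixing step are assumptions made for illustration; they are not taken from the iPXE source.

```python
def perturb_seed(tick_seed, ll_addr, nbytes=4):
    """Mix the least significant bytes of a link-layer address into a seed.

    Hypothetical helper: the XOR mixing and the 4-byte default are
    illustrative assumptions, not iPXE's actual implementation.
    """
    # The trailing bytes of a MAC address vary most between machines;
    # the leading bytes are typically a shared vendor prefix.
    low = int.from_bytes(ll_addr[-nbytes:], "big")
    return (tick_seed ^ low) & 0xFFFFFFFF


# Two machines booted on the same timer tick, with MACs that differ
# only in the final byte
tick = 1234
seed_a = perturb_seed(tick, bytes.fromhex("525400123456"))
seed_b = perturb_seed(tick, bytes.fromhex("525400123457"))
```

With a tick-only seed both machines would start from the same value; here the differing low address byte is enough to separate them.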
Michael Brown
e42bc3aa37 [libc] Use __einfo() tuple as first argument to EUNIQ()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-04-19 00:45:13 +01:00
Michael Brown
d938e50136 [uuid] Abstract UUID mangling code out to a separate uuid_mangle() function
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-03-20 15:06:40 +00:00
Michael Brown
a9b63ecda5 [dhcp] Use PXE byte ordering for UUID in DHCP option 97
The PXE spec does not specify a byte ordering for UUIDs, but RFC4578
suggests that it follows the EFI spec, in which the first three fields
are little-endian.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-03-20 00:54:42 +00:00
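The byte ordering discussed in commit a9b63ecda5 can be sketched in Python as follows. The function below borrows the name `uuid_mangle` purely for illustration; it is a stand-in written for this example, not iPXE's C function, and it assumes the standard 4/2/2-byte layout of the first three UUID fields.

```python
import uuid


def uuid_mangle(raw):
    """Byte-swap the first three UUID fields (4, 2 and 2 bytes wide)."""
    if len(raw) != 16:
        raise ValueError("a UUID is exactly 16 bytes")
    # Reverse bytes 0-3, 4-5 and 6-7; leave the remaining 8 bytes alone
    return raw[3::-1] + raw[5:3:-1] + raw[7:5:-1] + raw[8:]


# Big-endian (RFC 4122 wire order) representation of a sample UUID
sample = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
mangled = uuid_mangle(sample.bytes)
```

The swap is its own inverse, so the same routine converts in either direction. Python's `uuid.UUID.bytes_le` attribute produces the same little-endian-first-three-fields layout and can serve as a cross-check.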
Michael Brown
02b914e812 [tftp] Allow TFTP block size to be controlled via the PXE TFTP API
The PXE TFTP API allows the caller to request a particular TFTP block
size.  Since mid-2008, iPXE has appended a "?blksize=xxx" parameter to
the TFTP URI constructed internally; nothing has ever parsed this
parameter.  Nobody seems to have cared that this parameter has been
ignored for almost five years.

Fix by using xfer_window(), which provides a fairly natural way to
convey the block size information from the PXE TFTP API to the TFTP
protocol layer.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-03-06 17:35:30 +00:00
Michael Brown
77f64b11f7 [netdevice] Separate VLAN support from presence of VLAN-supporting drivers
Some NICs (e.g. Hermon) provide hardware support for stripping the
VLAN tag, but do not provide any way for this support to be disabled.
Drivers for this hardware must therefore call vlan_find() to identify
a suitable receiving network device.

Provide a weak version of vlan_find() which will always return NULL if
VLAN support has not been enabled (either directly, or by enabling
a feature such as FCoE which requires VLAN support).  This allows the
VLAN code to be omitted from builds where the user has not requested
support for VLANs.

Inspired-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-03-01 16:36:34 +00:00
Stefan Hajnoczi
7426177d63 [netdevice] Add vlan_tag() to get the VLAN tag of a network device
The iBFT has a VLAN field that should be filled in.  Add the
vlan_tag() function to extract the VLAN tag of a network device.

Since VLAN support is optional, define a weak function that returns 0
when iPXE is built without VLAN support.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-03-01 16:11:40 +00:00
Michael Brown
0acc52519d [tls] Concatenate received non-data records before processing
Allow non-data records to be split across multiple received I/O
buffers, to accommodate large certificate chains.

Reported-by: Nicola Volpini <Nicola.Volpini@kambi.com>
Tested-by: Nicola Volpini <Nicola.Volpini@kambi.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-01-31 09:59:36 +00:00
Stefan Weil
3fcb8cf8dc [src] Fix spelling in comments, debug messages and local variable names
Fixes in comments and debug messages:

  existance -> existence
  unecessary -> unnecessary
  occured -> occurred
  decriptor -> descriptor
  neccessary -> necessary
  addres, adress -> address
  initilize -> initialize
  sucessfully -> successfully
  paramter -> parameter
  acess -> access
  upto -> up to
  likelyhood -> likelihood
  thru -> through
  substracting -> subtracting
  lenght -> length
  isnt -> isn't
  interupt -> interrupt
  publically -> publicly (this one was not wrong, but unusual)
  recieve -> receive
  accessable -> accessible
  seperately -> separately
  pacet -> packet
  controled -> controlled
  dectect -> detect
  indicies -> indices
  extremly -> extremely
  boundry -> boundary
  usefull -> useful
  unuseable -> unusable
  auxilliary -> auxiliary
  embeded -> embedded
  enviroment -> environment
  sturcture -> structure
  complier -> compiler
  constructes -> constructs
  supress -> suppress
  intruduced -> introduced
  compatability -> compatibility
  verfication -> verification
  ths -> the
  reponse -> response

Fixes in local variable names:

  retreive -> retrieve

Most of these fixes were made using codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-01-03 15:18:48 +00:00
Michael Brown
4867085c0c [build] Include version number within only a single object file
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-11-02 14:46:39 +00:00
Michael Brown
88e19fcda9 [netdevice] Clear network device setting before unregistering
Avoid memory leaks by clearing any (non-child) settings immediately
before unregistering the network device settings block.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-10-24 22:44:00 -07:00
Michael Brown
947976da0c [netdevice] Do not force a poll on net_tx()
Including a netdev_poll() within net_tx() can cause the net_step()
loop to end up processing hundreds or thousands of packets within a
single step, since each received packet being processed may trigger a
response which, in turn, causes a poll for further received packets.

Network devices must now ensure that the TX ring is at least as large
as the RX ring, in order to avoid running out of TX descriptors.  This
should not cause any problems; unlike the RX ring, there is no
substantial memory cost incurred by increasing the TX ring size.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-10-24 14:04:41 -07:00
Michael Brown
885384faf3 [arp] Increase robustness of ARP discarder
Take ownership from the ARP cache at the start of arp_destroy(), to
ensure that no code path can lead to arp_destroy() being re-entered.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-10-19 23:03:38 +01:00
Michael Brown
d23db28488 [tls] Fix potential memory leak
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-09-28 10:54:07 +01:00
Michael Brown
1e199c8260 [tls] Fix uninitialised variable
Reported-by: Christian Hesse <list@eworm.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-09-28 10:52:17 +01:00
Michael Brown
72db14640c [tls] Split received records over multiple I/O buffers
TLS servers are not obliged to implement the RFC3546 maximum fragment
length extension, and many common servers (including OpenSSL, as used
in Apache's mod_ssl) do not do so.  iPXE may therefore have to cope
with TLS records of up to 16kB.  Allocations for 16kB have a
non-negligible chance of failing, causing the TLS connection to abort.

Fix by maintaining the received record as a linked list of I/O
buffers, rather than a single contiguous buffer.  To reduce memory
pressure, we also decrypt in situ, and deliver the decrypted data via
xfer_deliver_iob() rather than xfer_deliver_raw().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-09-27 01:56:01 +01:00
Michael Brown
03f0c23f8b [ipoib] Expose Ethernet-compatible eIPoIB link-layer addresses and headers
Almost all clients of the raw-packet interfaces (UNDI and SNP) can
handle only Ethernet link layers.  Expose an Ethernet-compatible link
layer to local clients, while remaining compatible with IPoIB on the
wire.  This requires manipulation of ARP (but not DHCP) packets within
the IPoIB driver.

This is ugly, but it's the only viable way to allow IPoIB devices to
be driven via the raw-packet interfaces.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-31 21:22:59 +01:00
Michael Brown
f54a61e434 [infiniband] Include destination address vector in ib_complete_recv()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-31 21:22:58 +01:00
Michael Brown
cbe41cb31b [infiniband] Use explicit "source" and "dest" address vector parameter names
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-31 21:22:58 +01:00
Michael Brown
f747fac3e1 [infiniband] Allow queue pairs to have a custom allocator for receive iobufs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-31 21:22:57 +01:00
Michael Brown
de802310bc [retry] Expose retry_poll() to explicitly poll all running timers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-31 20:21:15 +01:00
Michael Brown
1cbb1581f1 [ethernet] Expose eth_broadcast as a global constant
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-31 20:21:10 +01:00
Michael Brown
79300e2ddf [tls] Disambiguate most error causes
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-25 04:08:04 +01:00
Michael Brown
8f7cd88af5 [http] Fix HTTP SAN booting
Commit 501527d ("[http] Treat any unexpected connection close as an
error") introduced a regression causing HTTP SAN booting to fail.  At
the end of the response to the HEAD request, the call to http_done()
would erroneously believe that the server had disconnected in the
middle of the HTTP headers.

Fix by treating the header block from a HEAD request as a trailer
block.  This fixes the problem and also simplifies the logic in
http_rx_header().

Reported-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-17 18:00:40 +01:00
Marin Hannache
1170a36e6b [ftp] Add support for the FTP SIZE command
The FTP SIZE command allows us to get the size of a particular file.
As a consequence, we can now show proper transfer progress while
fetching a file using the FTP protocol.

Signed-off-by: Marin Hannache <git@mareo.fr>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-15 17:04:41 +01:00
Michael Brown
501527daab [http] Treat any unexpected connection close as an error
iPXE currently checks that the server has not closed the connection
mid-stream (i.e. in the middle of a chunked transfer, or before the
specified Content-Length has been received), but does not check that
the server got as far as starting to send data.  Consequently, if the
server closes the connection before any data is transferred (e.g. if
the server gives up waiting while iPXE performs the validation steps
for TLS), then iPXE will treat this as a successful transfer of a
zero-length file.

Fix by checking the RX connection state, and forcing an error if the
server has closed the connection at an unexpected point.

Originally-fixed-by: Marin Hannache <mareo@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-08-15 16:29:22 +01:00
Michael Brown
c3b4860ce3 [legal] Update FSF mailing address in GPL licence texts
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-07-20 19:55:45 +01:00
Michael Brown
a5d16a91af [tcp] Truncate TCP window to prevent future packet discards
Whenever memory pressure causes a queued packet to be discarded (and
so retransmitted), reduce the maximum TCP window to a size that would
have prevented the discard.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-07-09 10:13:47 +01:00
Michael Brown
024247317d [arp] Try to avoid discarding ARP cache entries
Discarding the active ARP cache entry in the middle of a download will
substantially disrupt the TCP stream.  Try to minimise any such
disruption by treating ARP cache entries as expensive, and discarding
them only when nothing else is available to discard.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-07-09 10:08:38 +01:00
Michael Brown
b0e236a9ee [netdevice] Process all received packets in net_poll()
The current logic is to process at most one received packet per call
to net_poll(), on the basis that refilling the hardware descriptor
ring should be delayed as little as possible.  However, this limits
the rate at which packets can be processed and ultimately ends up
adding latency which, in turn, limits the achievable throughput.

With temporary modifications in place to essentially remove all
resource constraints (heap size increased to 16MB, RX descriptor ring
increased to 64 descriptors) and a TCP window size of 1MB, the
throughput on a gigabit (i.e. 119MBps) network can be observed to fall
off exponentially from around 115MBps to around 75MBps.  Changing
net_poll() to process all received packets results in a steady
119MBps throughput.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-07-04 13:41:49 +01:00
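The "119MBps" figure quoted for a gigabit link in commit b0e236a9ee is consistent with counting a megabyte as 2^20 bytes. That reading is an inference from the arithmetic rather than something the commit states, and the helper name below is hypothetical.

```python
def link_rate_mib_per_s(bits_per_second):
    """Convert a raw link rate in bits/s to units of 2**20 bytes per second."""
    return bits_per_second / 8 / 2**20


# Raw line rate of gigabit Ethernet, ignoring all protocol overhead
gigabit = link_rate_mib_per_s(1_000_000_000)
```

Under that assumption the raw line rate works out to roughly 119.2, which matches the steady-state throughput reported after the change.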
Michael Brown
19859d8ead [arp] Prevent ARP cache entries from being deleted mid-transmission
Each ARP cache entry maintains a transmission queue, which is sent out
as soon as the link-layer address is known.  If multiple packets are
queued, then it is possible for memory pressure to cause the ARP cache
discarder to be invoked during transmission of the first packet, which
may cause the ARP cache entry to be deleted before the second packet
can be sent.  This results in an invalid pointer dereference.

Avoid this problem by reference-counting ARP cache entries and
ensuring that an extra reference is held while processing the
transmission queue, and by using list_first_entry() rather than
list_for_each_entry_safe() to traverse the queue.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-07-01 18:31:23 +01:00
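The traversal half of the fix in commit 19859d8ead (taking the first entry on each pass rather than iterating with a remembered next element) can be sketched in Python. This is a loose analogy using `collections.deque`, not a translation of the iPXE list macros, and it does not model the reference counting; the names `drain` and `pending` are hypothetical.

```python
from collections import deque


def drain(queue, transmit):
    """Send every queued packet, tolerating changes to the queue.

    The head is re-read on every pass instead of walking the queue with
    a remembered "next" position, so transmit() may safely discard the
    remaining entries.
    """
    sent = []
    while queue:
        packet = queue.popleft()
        transmit(packet)
        sent.append(packet)
    return sent


# Simulate memory pressure: sending the first packet empties the queue
pending = deque(["pkt1", "pkt2", "pkt3"])
sent = drain(pending, lambda packet: pending.clear())
```

Because the loop never holds a reference to a "next" element, a discard that happens during transmission simply ends the loop early instead of leaving a dangling position.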
Michael Brown
55f52bb77a [tcp] Avoid potential NULL pointer dereference
Commit ea61075 ("[tcp] Add support for TCP window scaling") introduced
a potential NULL pointer dereference by referring to the connection's
send window scale before checking whether or not the connection is
known.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-30 19:03:07 +01:00
Michael Brown
49ac629821 [tcp] Use a zero window size for RST packets
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-30 19:00:05 +01:00
Michael Brown
9a8c6b00d4 [tls] Request a maximum fragment length of 2048 bytes
The default maximum plaintext fragment length for TLS is 16kB, which
is a substantial amount of memory for iPXE to have to allocate for a
temporary decryption buffer.

Reduce the memory footprint of TLS connections by requesting a maximum
fragment length of 2kB.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-29 15:28:15 +01:00
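For reference, the maximum fragment length extension mentioned in commit 9a8c6b00d4 is small on the wire. The sketch below encodes it following the layout in RFC 3546 (later RFC 6066); the function and constant names are hypothetical and do not come from the iPXE source.

```python
import struct

# Extension type and length codes as defined by RFC 3546 / RFC 6066
MAX_FRAGMENT_LENGTH_EXT = 1
FRAGMENT_CODES = {512: 1, 1024: 2, 2048: 3, 4096: 4}


def max_fragment_length_extension(length):
    """Encode the extension: 2-byte type, 2-byte data length, 1-byte code."""
    code = FRAGMENT_CODES[length]  # KeyError for sizes the RFC cannot express
    return struct.pack("!HHB", MAX_FRAGMENT_LENGTH_EXT, 1, code)


# The 2048-byte request described in the commit
ext = max_fragment_length_extension(2048)
```

Only four sizes can be requested this way, so 2kB is one of the few options between the 512-byte minimum and the 16kB default.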
Michael Brown
ea61075c60 [tcp] Add support for TCP window scaling
The maximum unscaled TCP window (64kB) implies a maximum bandwidth of
around 300kB/s on a WAN link with an RTT of 200ms.  Add support for
the TCP window scaling option to remove this upper limit.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-29 15:05:33 +01:00
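The bandwidth limit quoted in commit ea61075c60 follows from the fact that at most one window of data can be in flight per round trip. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def max_throughput(window_bytes, rtt_seconds):
    """Upper bound on throughput: one full window per round-trip time."""
    return window_bytes / rtt_seconds


# Largest window expressible in the 16-bit TCP header field, 200ms RTT
unscaled = max_throughput(0xFFFF, 0.2)

# Same link with the largest shift count the window scale option allows (14)
scaled = max_throughput(0xFFFF << 14, 0.2)
```

The unscaled case gives a little over 320,000 bytes per second, in line with the "around 300kB/s" in the commit message. Whether iPXE ever advertises the maximum shift is not stated in the log; the scaled figure only shows the ceiling the option permits.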
Michael Brown
1d77d03216 [tcpip] Allow for architecture-specific TCP/IP checksum routines
Calculating the TCP/IP checksum on received packets accounts for a
substantial fraction of the response latency.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-27 19:15:17 +01:00
Michael Brown
cbc54bf559 [syslog] Include hostname within syslog messages where possible
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-20 14:59:06 +01:00
Michael Brown
7ea6764031 [settings] Move "domain" setting from dns.c to settings.c
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-20 14:39:03 +01:00
Michael Brown
c0942408b7 [dhcp] Request broadcast responses when we already have an IPv4 address
FCoE requires the use of multiple local unicast link-layer addresses.
To avoid the complexity of managing multiple addresses, iPXE operates
in promiscuous mode.  As a consequence, any unicast packets with
non-matching IPv4 addresses are rejected at the IPv4 layer (rather
than at the link layer).

This can cause problems when issuing a second DHCP request: if the
address chosen by the DHCP server does not match the existing address,
then the DHCP response will itself be rejected.

Fix by requesting a broadcast response from the DHCP server if the
network interface already has any IPv4 addresses.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-20 12:01:50 +01:00
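The decision described in commit c0942408b7 amounts to setting the broadcast bit in the BOOTP/DHCP flags field whenever an address is already configured. A minimal sketch, assuming the caller can supply the list of existing addresses; the names are hypothetical, while the bit value comes from RFC 2131.

```python
# Broadcast bit of the 16-bit BOOTP/DHCP "flags" field (RFC 2131)
BOOTP_FL_BROADCAST = 0x8000


def dhcp_request_flags(existing_ipv4_addresses):
    """Ask for a broadcast reply if any IPv4 address is already in use."""
    if existing_ipv4_addresses:
        return BOOTP_FL_BROADCAST
    return 0


first_request = dhcp_request_flags([])
second_request = dhcp_request_flags(["192.0.2.10"])
```

A broadcast reply is not subject to the IPv4-layer address check that rejects unicast packets for non-matching addresses, which is why the second request sets the flag.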
Michael Brown
af47789ef2 [tls] Mark security negotiation as a pending operation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-09 18:59:41 +01:00
Michael Brown
5482b0abb6 [tcp] Mark any unacknowledged transmission as a pending operation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-06-09 18:56:07 +01:00
Michael Brown
5af9ad51c8 [crypto] Fix unused-but-set variable warning
Reported-by: Brandon Penglase <bpenglase-ipxe@spaceservices.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-05-23 23:48:12 +01:00
Michael Brown
658c25aa82 [http] Add support for Digest authentication
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2012-05-22 23:43:44 +01:00