david/ipxe
david
/
ipxe
Archived
1
0
Fork 0
Commit Graph

954 Commits

Author SHA1 Message Date
Michael Brown 2988b26653 [http] Support read-only HTTP block devices
Provide support for HTTP range requests, and expose this functionality
via the iPXE block device API.  This allows SAN booting from a root
path such as:

    sanboot http://boot.ipxe.org/freedos/fdfullcd.iso

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:14 +01:00
Michael Brown 5eb60f4883 [tls] Eliminate polling while TX state machine is idle
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:12 +01:00
Michael Brown bce34e87df [iscsi] Eliminate polling while waiting for window to open
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:12 +01:00
Michael Brown 3ad1a1a60a [http] Eliminate polling while waiting for window to open
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:12 +01:00
Michael Brown 019d4c1c18 [infiniband] Use a one-shot process for CMRC shutdown
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:11 +01:00
Michael Brown ce3bc9e88b [fc] Use a one-shot process for Fibre Channel name server queries
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:10 +01:00
Michael Brown 08ac74b708 [fc] Use a one-shot process for Fibre Channel ELS requests
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:10 +01:00
Michael Brown e01ec74601 [process] Pass containing object pointer to process step() methods
Give the step() method a pointer to the containing object, rather than
a pointer to the process.  This is consistent with the operation of
interface methods, and allows a single function to serve as both an
interface method and a process step() method.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:08 +01:00
Michael Brown c68bf14559 [tcp] Send xfer_window_changed() when window opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:08 +01:00
Michael Brown 1e90ff0eb7 [infiniband] Send xfer_window_changed() when CMRC connection is established
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:07 +01:00
Michael Brown 0cc03ac76a [tls] Send xfer_window_changed() when TLS session is established
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:07 +01:00
Michael Brown 5f608a44a5 [fc] Send xfer_window_changed() when FCP link is established
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:07 +01:00
Michael Brown bf8bfa23e2 [fc] Maintain a list of Fibre Channel upper-layer protocol users
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 14:45:01 +01:00
Michael Brown 5763472b34 [ftp] Remove redundant ftp_data_deliver() method
ftp_data_deliver() does nothing except pass through the received data
to the xfer interface, and so can be eliminated by using a
pass-through interface.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 13:39:14 +01:00
Michael Brown cc7c2a9dcd [ipv4] Record ARP resolution errors
At the time of attempting ARP resolution, we already know the
transmitting network device.  We can therefore record ARP errors using
netdev_tx_err() so that they show up in the output of "ifstat".

Inspired-by: Dominik Russenberger <dominik.russenberger@terreactive.ch>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 10:21:30 +01:00
Michael Brown d6115c91cf [netdevice] Allow non-completion TX errors to be recorded
Allow TX errors to be recorded against a network device even when the
packet didn't make it as far as netdev_tx().

Inspired-by: Dominik Russenberger <dominik.russenberger@terreactive.ch>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-06-28 10:19:23 +01:00
Michael Brown c1cc769ef4 [ipv4] Include network device metadata in packet traces
(Ab)use the "ident" field in transmitted IPv4 packets to convey
metadata about the network device.  In particular:

    bits 0-3 represent the low bits of the "RX" good packet counter
    bits 4-7 represent the low bits of the "RXE" bad packet counter
    bits 8-15 represent the transmitted packet sequence number

This allows some relevant information about the internal state of the
network device to be read out from a packet trace from a non-debug
build of iPXE.  In particular, it allows a packet trace containing
packets transmitted by iPXE to indicate whether or not any packets
have been received by iPXE.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-05-05 18:10:31 +01:00
Michael Brown 8f51db233a [http] Support chunked transfer encoding
Booting from an HTTP SAN will require HTTP range requests, which are
defined only in HTTP/1.1 and above.  HTTP/1.1 mandates support for
"Transfer-Encoding: chunked", so we must support it.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-05-05 15:32:34 +01:00
Michael Brown 0b6808aadc [netdevice] Improve detection of bugs in drivers' TX completion handling
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-05-03 20:07:30 +01:00
Michael Brown 9e3604168a [netdevice] Move high-frequency debug messages to DBGLVL_EXTRA
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-05-03 20:01:11 +01:00
Michael Brown c4369eb6c2 [tcp] Update ts_recent whenever window is advanced
Commit 3f442d3 ("[tcp] Record ts_recent on first received packet")
failed to achieve its stated intention.

Fix this (and reduce the code size) by moving the ts_recent update to
tcp_rx_seq().  This is the code responsible for advancing the window,
called by both tcp_rx_syn() and tcp_rx_data(), and so the window check
is now redundant.

Reported-by: Frank Weed <zorbustheknight@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-04-03 00:44:22 +01:00
Michael Brown 58dcb2e15e [tftp] Avoid setting current working URI to "tftp://0.0.0.0/"
Set the current working URI to NULL rather than to "tftp://0.0.0.0/".

Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-31 04:54:27 +01:00
Michael Brown 1588b9336e [netdevice] Simplify link-down status message
For devices that start in a link-down state, the user will see a
message such as:

  [Link status: The socket is not connected (http://ipxe.org/38086001)]
  Waiting for link-up on net0...

This is potentially misleading, since it suggests that there is a
genuine problem.  Add a dedicated error message for "link down",
giving instead:

  [Link status: Down (http://ipxe.org/38086101)]
  Waiting for link-up on net0...

Reported-by: Tal Aloni <tal.aloni.il@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-30 12:45:12 +01:00
Piotr Jaroszyński 8ab2f51997 [netdevice] Mark devices as open only if opening succeeds
netdev_close() assumes that devices that are open are on the
open_list, which wasn't true if device specific opening failed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-27 18:59:13 +01:00
Michael Brown 3f442d3f60 [tcp] Record ts_recent on first received packet
Commit 6861304 ("[tcp] Handle out-of-order received packets")
introduced a regression in which ts_recent would not be updated until
the first packet is received in the ESTABLISHED state, i.e. the
timestamp from the SYN+ACK packet would be ignored.  This causes the
connection to be dropped by strictly-conforming TCP peers, such as
FreeBSD.

Fix by delaying the timestamp window check until after processing the
received SYN flag.

Reported-by: winders@sonnet.com
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-26 15:02:41 +00:00
Michael Brown 02a6f46c09 [settings] Match terminology in online documentation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-23 21:21:12 +00:00
Michael Brown 8482451812 [settings] Impose a fixed order on settings
Improve the appearance of the "config" user interface by ensuring that
settings appear in some kind of logical order.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-23 11:57:29 +00:00
Michael Brown a04603a070 [settings] Reject attempts to change a network device's bus ID
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-23 01:25:17 +00:00
Michael Brown f5fd4dec3b [settings] Formalise notion of setting applicability
Expose a function setting_applies() to allow a caller to determine
whether or not a particular setting is applicable to a particular
settings block.

Restrict DHCP-backed settings blocks to accepting only DHCP-based
settings.

Restrict network device settings blocks to accepting only DHCP-based
settings and network device-specific settings such as "mac".

Inspired-by: Glenn Brown <glenn@myri.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-22 19:54:58 +00:00
Michael Brown e49d81689c [syslog] Add support for sending console output to a syslog server
Originally-implemented-by: Anselm Martin Hoffmeister <anselm@hoffmeister.be>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-10 05:36:31 +00:00
Michael Brown 960dee6dd0 [iscsi] Change default initiator IQN
The default initiator IQN is "iqn.2000-09.org.etherboot:UNKNOWN".
This is problematic for two reasons:

  a) the etherboot.org domain (and hence the associated IQN namespace)
     is not under the control of the iPXE project, and

  b) some targets (correctly) refuse to allow concurrent connections
     from different initiators using the same initiator IQN.

Solve both problems by changing the default initiator IQN to be

  iqn.2010-04.org.ipxe:<hostname> if a hostname is set, or

  iqn.2010-04.org.ipxe:<uuid> if no hostname is set.

Explicit initiator IQNs set via DHCP option 203 are not affected by
this change.

Unfortunately, this change is likely to break some existing
configurations, where ACL rules have been put in place referring to
the old default initiator IQN.  Users may need to update ACLs, or
force the use of the old IQN using an iPXE script line such as

  set initiator-iqn iqn.2000-09.org.etherboot:UNKNOWN

or a dhcpd.conf option such as

   option iscsi-initiator-iqn "iqn.2000-09.org.etherboot:UNKNOWN"

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-03 22:23:44 +00:00
Michael Brown bbe265e08b [dns] Fix memory leak in settings applicator
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-03-03 20:09:29 +00:00
Michael Brown ef87c4ad08 [iscsi] Clarify support for NOP-In
After a more accurate reading of RFC 3720, it becomes clear how NOPs
are supposed to work.  The current implementation (which just ignores
NOP-Ins) is sufficient to cope with NOP-Ins sent to update CmdSN, but
will need to be extended before it can cope with NOP-Ins sent as iSCSI
keepalives.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-02-25 11:11:30 +00:00
Michael Brown 9625132bf5 [iscsi] Verify the correct tag in NOP-In PDUs
We should be checking the target transfer tag, rather than the
initiator task tag.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-02-25 10:41:23 +00:00
Michael Brown 711df439df [iscsi] Accept NOP-In PDUs sent by the target
Some iSCSI targets (observed with a Synology DS207+ NAS) send
unsolicited NOP-Ins to the initiator.  RFC 3720 is remarkably unclear
and possibly self-contradictory on how NOPs are supposed to work, but
it seems as though we can legitimately just ignore any unsolicited
NOP-In PDU.

Reported-by: Marc Lecuyer <marc@maxiscreen.com>
Originally-implemented-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-02-24 13:25:32 +00:00
Michael Brown 7ef314514c [iscsi] Disambiguate the expected target errors in the login response
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-02-23 09:52:02 +00:00
Michael Brown 66caec3f00 [netdevice] Allow devices to indicate that interrupts are not supported
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-01-25 14:16:11 +00:00
Michael Brown 17d28f4877 [nvo] Allow resizing of non-volatile stored option blocks
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-01-19 13:52:48 +00:00
Michael Brown 17b6a3c506 [dhcp] Allow use of custom reallocation functions for DHCP option blocks
Allow functions other than realloc() to be used to reallocate DHCP
option block data, and specify the reallocation function at the time
of calling dhcpopt_init().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-01-11 21:24:40 +00:00
Michael Brown 310d46c1ed [dhcp] Rename length fields for DHCP options
Rename "len" to "used_len" and "max_len" to "alloc_len".

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-01-10 03:39:26 +00:00
Michael Brown 6cee8904d1 [dhcp] Remove redundant length fields in struct dhcp_packet
The max_len field is never used, and the len field is used only by
dhcp_tx().  Remove these two fields, and perform the necessary trivial
calculation in dhcp_tx() instead.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2011-01-10 03:39:26 +00:00
Michael Brown 708c5060b9 [dhcp] Use Ethernet-compatible chaddr, if possible
For IPoIB, we currently use the hardware address (i.e. the eight-byte
GUID) as the DHCP chaddr.  This works, but some PXE servers (notably
Altiris RDP) refuse to respond if the chaddr field is anything other
than six bytes in length.

We already have the notion of an Ethernet-compatible link-layer
address, which is used in the iBFT (the design of which similarly
fails to account for non-Ethernet link layers).  Use this as the first
preferred alternative to the actual link-layer address when
constructing the DHCP chaddr field.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-12-15 18:46:19 +00:00
Michael Brown 5273c2748c [vlan] Expose vlan_find() to network card drivers
Some network cards automatically strip the VLAN header, providing the
VLAN tag via a side channel such as a completion queue entry.  These
cards need to be able to report receive completions directly against
the relevant VLAN device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-12-01 18:46:50 +00:00
Michael Brown 51a9e517f2 [vlan] Use "-" instead of "." as separator in VLAN device names
VLAN device names have the form "netX.Y", e.g. "net0.5" for VLAN 5 on
net0.  This use of "." conflicts with the use of "." as the
hierarchical separator in settings block names, with the result that
VLAN device settings cannot be accessed by name.

It would be trivial to treat the VLAN device settings as being a child
of the trunk device settings, but this would cause the VLAN device
settings to be applied to the trunk device: for example, setting
"net0.5/ip" would then apply the IP address to both net0.5 and net0.

Fix by changing the VLAN device name to use "-" instead of ".": the
VLAN device "net0.5" is now "net0-5".

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-12-01 17:15:52 +00:00
Michael Brown 67b45186a5 [settings] Apply settings block name in register_settings()
Pass the settings block name as a parameter to register_settings(),
rather than defining it with settings_init() (and then possibly
changing it by directly manipulating settings->name).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-12-01 16:35:00 +00:00
Michael Brown de6a59470b [iscsi] Disambiguate the common EINVAL cases
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-12-01 01:23:50 +00:00
Michael Brown 34dab1007c [dns] Disambiguate "no nameserver" and "no DNS record" errors
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-12-01 00:47:09 +00:00
Michael Brown 54ec712ebe [fcoe] Use only the first instance of a FIP descriptor
Almost all FIP packets contain at most one instance of each
descriptor.  A VLAN notification may contain multiple VLAN
descriptors.  The FCoE specification does not provide any guidance
regarding prioritisation of VLANs, so we may choose to arbitrarily
choose the first listed VLAN.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-27 16:04:57 +00:00
Michael Brown 98817e2c38 [fcoe] Tidy up debug message
The increase in length in Fibre Channel device names causes the
"selected FCF" message to wrap beyond 80 characters.  Fix by using
abbreviations where possible.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-27 16:04:57 +00:00
Michael Brown 1415ec9c9a [fc] Allow Fibre Channel ports to be explicitly named
Use the network interface name as the Fibre Channel port name.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-27 14:37:53 +00:00
Michael Brown d17e87da7d [fcoe] Create Fibre Channel port only when we have selected an FCF
Create the Fibre Channel port only when the FCoE port has selected a
Fibre Channel Forwarder to use.  This avoids the confusion of having
an FC port created for the network device on which only VLAN discovery
is performed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-27 14:21:08 +00:00
Michael Brown 1790f56fb2 [fcoe] Add support for FIP VLAN discovery
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-26 01:09:41 +00:00
Michael Brown b4706c88c9 [vlan] Provide vlan_can_be_trunk()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-26 01:09:40 +00:00
Michael Brown f1e1545372 [vlan] Add non-error debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-26 01:09:40 +00:00
Michael Brown 7e1b1d6145 [vlan] Allow duplicate VLAN creation attempts
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-26 01:09:35 +00:00
Michael Brown 6fd09b541f [vlan] Add support for IEEE 802.1Q VLANs
Originally-implemented-by: michael-dev@fami-braun.de
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-20 16:52:04 +00:00
Michael Brown 4576c2da58 [netdevice] Allow per-device receive queue processing to be frozen
Several use cases (e.g. the UNDI API and the EFI SNP API) require
access to the raw network device receive queue, and so currently use
manual calls to netdev_poll() on a specific network device in order to
prevent received packets from being processed by the network stack.

As an alternative, provide a flag that allows receive queue processing
to be frozen on a per-device basis.  When receive queue processing is
frozen, packets will be enqueued as normal, but will not be
automatically dequeued and passed up the network stack.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-20 15:46:00 +00:00
Michael Brown d012f87018 [tcp] Use MAX_LL_NET_HEADER_LEN instead of defining our own MAX_HDR_LEN
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-19 16:08:05 +00:00
Michael Brown 5de4fba4f9 [udp] Use MAX_LL_NET_HEADER_LEN instead of defining our own UDP_MAX_HLEN
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-19 16:08:05 +00:00
Michael Brown 3d9096f719 [lacp] Fix dumping of raw LACP packets
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-18 17:41:44 +00:00
Michael Brown 24fc6aa5b0 [netdevice] Use net device name in debugging messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-18 17:41:02 +00:00
Michael Brown 67dc832d15 [tcp] Set PSH flag only on packets containing data
Suggested-by: Yelena Kadach <klenusik@hotmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-11 01:14:05 +00:00
Shao Miller 98b3599a65 [list] Fix typographical error from previous commit
Fix typographical error from commit ea631f6 ("[list] Add
list_first_entry()").  The symptom was PXELINUX 3.86 causing a stack
overflow under VMware.

Tested-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-11 00:30:22 +00:00
Michael Brown 8e718df5e1 [fc] Add support for Fibre Channel name server lookups
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown 41231fda9c [fc] Hold ULP's peer reference while ULP exists
Allow fc_ulp_decrement() to guarantee to fc_peer_decrement() that the
peer reference remains valid for the duration of the call, by ensuring
that ulp->peer remains valid while ulp is valid.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown 0cd185e734 [fc] Allow peers and ULPs to log out when usage count reaches zero
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown c09f87e3b7 [fc] Hold reference to peers and ULPs while calling fc_link_examine()
Allow link examination methods to safely assume that their
self-reference remains valid for the duration of the method call.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown 00cffae5f9 [fc] Log out correct port ID after a successful LOGO request
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown 90930be8fe [fc] Support Fibre Channel ECHO
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown f5115f96f7 [fcp] Use EINVAL for URI parsing errors and EPROTO for protocol errors
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown 66e7619099 [retry] Process at most one timer's expiry in each call to retry_step()
Calling a timer's expiry method may cause arbitrary consequences,
including arbitrary modifications of the list of retry timers.
list_for_each_entry_safe() guards against only deletion of the current
list entry; it provides no protection against other list
modifications.  In particular, if a timer's expiry method causes the
subsequent timer in the list to be deleted, then the next loop
iteration will access a timer that may no longer exist.

This is a particularly nasty bug, since absolutely none of the
list-manipulation or reference-counting assertion checks will be
triggered.  (The first assertion failure happens on the next iteration
through list_for_each_entry(), showing that the list has become
corrupted but providing no clue as to when this happened.)

Fix by stopping traversal of the list of retry timers as soon as we
hit an expired timer.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:35:36 +00:00
Michael Brown ea631f6fb8 [list] Add list_first_entry()
There are several points in the iPXE codebase where
list_for_each_entry() is (ab)used to extract only the first entry from
a list.  Add a macro list_first_entry() to make this code easier to
read.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-08 03:15:28 +00:00
Michael Brown a59bb9c313 [fcp] Avoid quoting exchange ID before exchange is created
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-03 01:55:53 +00:00
Michael Brown 0654698cd7 [fcp] Fix potential memory leaks on error paths
Functions that instantiate objects generally own one reference to the
object being created.  The error paths must therefore usually call
ref_put() to release this reference.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-11-03 01:48:59 +00:00
Michael Brown b0e434280e [fc] Do not use the command reference number in FCP_CMND IUs
The FCP command reference number is intended to be used for
controlling precise delivery of FCP commands, rather than being an
essentially arbitrary tag field (as with iSCSI and SRP).

Use the Fibre Channel local exchange ID as the tag for FCP commands,
instead of the FCP command reference.  The local exchange ID does not
appear within the FCP IU itself, but does appear within the FC frame
header; debug traces can therefore still be correlated with packet
captures.

Reported-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-19 18:41:50 +01:00
Michael Brown 19c59bb131 [iscsi] Ensure ISID is consistent within an iSCSI session
Commit 5f4ab0d ("[iscsi] Randomise a portion of the ISID to force new
session instantiation") introduced a regression by randomising the
ISID on each call to iscsi_start_login(), which may be called more
than once per connection, rather than on each call to
iscsi_open_connection(), which is guaranteed to be called only once
per connection.  This is incorrect behaviour that causes our
connection to be rejected by some iSCSI targets (observed with a
COMSTAR target under OpenSolaris).

Fix by generating the ISID in iscsi_open_connection(), and storing the
randomised ISID as part of the session state.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-18 14:40:27 +01:00
Michael Brown 5f4ab0d22a [iscsi] Randomise a portion of the ISID to force new session instantiation
When a connection to an iSCSI target is broken without gracefully
closing the TCP socket, a subsequent connection attempt may fail
because the target believes that we are attempting session
reinstatement (see RFC3720 section 5.3.1).  This has been observed
using the Microsoft iSCSI target.

Section 9.1.1 of RFC3720 states that initiators should use a stable
ISID, however section 5.3.1 shows that the only way to explicitly
request that a new session be created is to use a new ISID.

Fix by randomising the "qualifier" portion of the ISID.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-16 22:11:08 +01:00
Michael Brown 60b690141e [fc] Use port WWN rather than node WWN as the primary Fibre Channel name
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-15 01:54:48 +01:00
Michael Brown a9c799250f [fcoe] Request SPMA iff FIP advertisement indicates support for SPMA
We currently set both the FP and SP bits in our FIP FLOGI, to allow
the FCF the choice of selecting either a fabric-provided or a server-
provided MAC address.  This complies with the FCoE specification, but
has been observed to result in an FLOGI rejection from some FCFs.

Fix by recording whether or not the FCF supports SPMA, and requesting
only one of FPMA or SPMA in our FIP FLOGI.  We choose to prefer SPMA
where available, because many iPXE drivers will not be able to receive
unicast packets sent to a non-default MAC address.

Reported-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-15 00:04:11 +01:00
Michael Brown 6d11229e83 [dhcp] Include session state metadata in packet traces
(Ab)use the "secs" field in transmitted DHCP packets to convey
metadata about the DHCP session state.  In particular:

  bit 0 represents the receipt of a ProxyDHCPOFFER
  bit 1 represents the receipt of a DHCPOFFER
  bits 2+ represent the transmitted packet sequence number

This allows some relevant information about the internal state of the
DHCP session to be read out from a packet trace from a non-debug build
of iPXE.  It also potentially allows replies to be correlated to their
requests (for servers that copy the "secs" field from request to
reply).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-09 01:24:18 +01:00
Michael Brown 831106a875 [dhcp] Omit ProxyDHCPREQUEST if PXE options are present in ProxyDHCPOFFER
Some ProxyDHCP implementations seem to violate the PXE specification
by expecting the client to retain options from the ProxyDHCPOFFER
rather than issuing a separate ProxyDHCPREQUEST.

Work around such broken clients by retaining the ProxyDHCPOFFER
packet, and proceeding to a ProxyDHCPREQUEST only if the
ProxyDHCPOFFER does not already contain PXE options.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-08 01:45:53 +01:00
Michael Brown ba6aca3424 [dhcp] Ignore DHCPACKs containing incorrect IP addresses
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-08 01:45:31 +01:00
Michael Brown c517d0ea7f [dhcp] Revert various patches
A recent patch series breaks compatibility with various common DHCP
implementations.

Revert "[dhcp] Don't consider invalid offers to be duplicates"
This reverts commit 905ea56753.

Revert "[dhcp] Honor PXEBS_SKIP option in discovery control"
This reverts commit 620b98ee4b.

Revert "[dhcp] Keep multiple DHCP offers received, and use them intelligently"
This reverts commit 5efc2fcb60.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-08 01:44:34 +01:00
Michael Brown 0f4fd09180 [fcoe] Add support for the FCoE Initialization Protocol (FIP)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-07 19:20:36 +01:00
Michael Brown 5e56e5f5a3 [fc] Update ELS port IDs when receiving an ELS frame
The port ID assigned by the FLOGI response is implicit in the
destination ID used for the response (which will differ from the
source ID used for the corresponding request).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-07 19:19:50 +01:00
Michael Brown 1775a6f25e [fc] Include port IDs in metadata for received Fibre Channel frames
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-07 19:16:34 +01:00
Michael Brown 88dd921e24 [netdevice] Pass both link-layer addresses in net_tx() and net_rx()
FCoE requires the use of fabric-provided MAC addresses, which breaks
the assumption that the net device's MAC address is implicitly the
source address for net_tx() and the (unicast) destination address for
net_rx().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-10-07 19:15:04 +01:00
Michael Brown a5a4dcd0c7 [fcp] Add support for describing an FCP device using EDD
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-22 17:12:48 +01:00
Michael Brown bddc3835ac [fcoe] Add support for identifying the underlying hardware device
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-22 17:11:52 +01:00
Michael Brown 9e036d32ba [infiniband] Add support for identifying the underlying hardware device
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-22 17:11:29 +01:00
Michael Brown d068049789 [aoe] Add support for identifying the underlying hardware device
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-22 17:10:56 +01:00
Michael Brown adbe63860a [aoe] Fail immediately when network device is closed
Avoid a tedious timeout delay when attempting to issue a command over
a network device that has been closed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-22 16:43:37 +01:00
Michael Brown 26a50c3a11 [infiniband] Add the notion of an Ethernet queue pair type
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-21 02:12:06 +01:00
Michael Brown 118a0ca55a [infiniband] Avoid leaving uninitialised lists in struct ib_device
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-21 02:10:56 +01:00
Michael Brown a8e39a9ca7 [fc] Ignore fabric-assigned port ID for fabricless implicit logouts
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-21 02:08:05 +01:00
Michael Brown 654da534ad [fc] Allow FLOGI response to be sent to newly-assigned peer port ID
The response to a received FLOGI should probably be sent to the peer
port ID assigned as a result of the WWPN comparison.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-21 02:06:06 +01:00
Michael Brown 24efbaefe7 [fc] Maintain port, peer and ULP lists in order of creation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-18 13:23:58 +01:00
Michael Brown 42cf4a720c [infiniband] Add node GUID as distinct from the first port GUID
iPXE currently uses the first port's port GUID as the node GUID,
rather than using the (possibly distinct) real node GUID.  This can
confuse opensm during the handover to a loaded OS: it thinks the port
already belongs to a different node and so discards our port
information with a warning message about duplicate ports.  Everything
is picked up correctly on the second subnet sweep, after opensm has
established that the "old" node no longer exists, but this can delay
link-up unnecessarily by several seconds.

Fix by using the real node GUID.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-16 03:30:45 +01:00
Michael Brown 09555826e9 [infiniband] Always call ib_link_state_changed() in ib_smc_update()
ib_smc_update() potentially updates the Infiniband port state, and so
should almost always be followed by a call to ib_link_state_changed().
The one exception is the call made to ib_smc_update() before the
device is registered.

Fix by removing explicit calls to ib_link_state_changed() from drivers
using ib_smc_update(), including a call to ib_link_state_changed()
within ib_smc_update(), and creating a separate ib_smc_init() for use
prior to device registration.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-16 03:30:45 +01:00
Michael Brown 52e54a8c69 [infiniband] Match GID/GUID terminology as used in the IBA
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-15 19:25:05 +01:00
Michael Brown 6574c55e27 [fcoe] Disambiguate the various error cases and add a CRC failure message
It seems as though several drivers neglect to strip the Ethernet CRC,
which will cause the FCoE footer to be misplaced and result
(coincidentally) in an "invalid CRC" error from FCoE.

Add a human-visible message indicating this, to aid in diagnosis.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-15 05:11:28 +01:00
Michael Brown 85a3169967 [netdevice] Report network-layer errors via network device statistics
Errors generated by the network layer in response to received packets
are liable to be lost, since nothing systematically records these
errors and often the packets do not propagate far enough through the
stack to impact upon user-visible processes.

Improve this situation by recording network-layer errors in the
network device statistics.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-15 05:08:16 +01:00
Michael Brown dace106f82 [fcoe] Add support for Fibre Channel over Ethernet
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-15 03:20:54 +01:00
Michael Brown d2a2618d76 [fcp] Add support for the Fibre Channel Protocol
The Fibre Channel Protocol provides a mechanism for transporting SCSI
commands via a Fibre Channel fabric.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-15 03:20:26 +01:00
Michael Brown 508ff4d614 [fc] Add support for Fibre Channel devices
Add support for Fibre Channel ports, peers, and upper-layer protocols,
and for Fibre Channel extended link services.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-15 03:16:24 +01:00
Michael Brown 220495f8bf [block] Replace gPXE block-device API with an iPXE asynchronous interface
The block device interface used in gPXE predates the invention of even
the old gPXE data-transfer interface, let alone the current iPXE
generic asynchronous interface mechanism.  Bring this old code up to
date, with the following benefits:

 o  Block device commands can be cancelled by the requestor.  The INT 13
    layer uses this to provide a global timeout on all INT 13 calls,
    with the result that an unexpected passive failure mode (such as
    an iSCSI target ACKing the request but never sending a response)
    will lead to a timeout that gets reported back to the INT 13 user,
    rather than simply freezing the system.

 o  INT 13,00 (reset drive) is now able to reset the underlying block
    device.  INT 13 users, such as DOS, that use INT 13,00 as a method
    for error recovery now have a chance of recovering.

 o  All block device commands are tagged, with a numerical tag that
    will show up in debugging output and in packet captures; this will
    allow easier interpretation of bug reports that include both
    sources of information.

 o  The extremely ugly hacks used to generate the boot firmware tables
    have been eradicated and replaced with a generic acpi_describe()
    method (exploiting the ability of iPXE interfaces to pass through
    methods to an underlying interface).  The ACPI tables are now
    built in a shared data block within .bss16, rather than each
    requiring dedicated space in .data16.

 o  The architecture-independent concept of a SAN device has been
    exposed to the iPXE core through the sanboot API, which provides
    calls to hook, unhook, boot, and describe SAN devices.  This
    allows for much more flexible usage patterns (such as hooking an
    empty SAN device and then running an OS installer via TFTP).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-14 20:37:15 +01:00
Michael Brown ef8452a642 [infiniband] Respond to CM disconnection requests
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-12 22:32:02 +01:00
Michael Brown e6519af60d [infiniband] Fix TID magic signature
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-12 22:28:53 +01:00
Michael Brown 35b19d8848 [infiniband] Add the concept of an Infiniband upper-layer driver
Replace the explicit calls from the Infiniband core to the IPoIB layer
with the general concept of an Infiniband upper-layer driver
(analogous to a PCI driver) which can create arbitrary devices on top
of Infiniband devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-05 03:06:16 +01:00
Michael Brown ca4df90a63 [netdevice] Add the concept of a network upper-layer driver
Add the concept of a network upper-layer driver, which can create
arbitrary devices on top of network devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-05 03:03:38 +01:00
Michael Brown 28934eef81 [retry] Hold reference while timer is running and during expiry callback
Guarantee that a retry timer cannot go out of scope while the timer is
running, and provide a guarantee to the expiry callback that the timer
will remain in scope during the entire callback (similar to the
guarantee provided to interface methods).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-03 21:28:43 +01:00
Michael Brown 364b92521a [xfer] Generalise metadata "whence" field to "flags" field
iPXE has never supported SEEK_END; the usage of "whence" offers only
the options of SEEK_SET and SEEK_CUR and so is effectively a boolean
flag.  Further flags will be required to support additional metadata
required by the Fibre Channel network model, so repurpose the "whence"
field as a generic "flags" field.

xfer_seek() has always been used with SEEK_SET, so remove the "whence"
field altogether from its argument list.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-09-03 21:21:14 +01:00
Piotr Jaroszyński b9eaf24df2 [build] Fix misaligned table entries when using gcc 4.5
Declarations without the accompanying __table_entry cause misalignment
of the table entries when using gcc 4.5.  Fix by adding the
appropriate __table_entry macro or (where possible) by removing
unnecessary forward declarations.

Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-08-20 10:13:04 +01:00
Joshua Oreman 49d6f57005 [compiler] Prevent empty weak function stubs from being removed
Even with the noinline specifier added by commit 1a260f8, gcc may skip
calls to non-inlinable functions that it knows have no side
effects. This caused the get_cached_dhcpack() call in start_dhcp(),
the weak stub of which has no code in its body, to be removed,
preventing cached DHCP from working.

Fix by adding a __keepme macro to compiler.h expanding to asm(""), as
recommended by gcc's info page, and using it in the weak stub for
get_cached_dhcpack().

Reported-by: Aaron Brooks <aaron@brooks1.net>
Tested-by: Aaron Brooks <aaron@brooks1.net>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-08-19 13:37:52 +01:00
Joshua Oreman 73aea88a62 [802.11] Fix a use-after-free
When we received an encrypted packet, after replacing it with its
decrypted version and freeing the encrypted original, we would
continue to look at the header of the now-freed original packet. Fix
by moving the header pointer to point at the decrypted packet instead.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-08-01 17:29:57 +01:00
Joshua Oreman 0c593d95e5 [802.11] Use correct name for sec80211_detect()
The workhorse function for detecting 802.11 security was still named
_sec80211_detect(), a holdover from the old style of weak function
handling, with the result that all networks would be identified as
"unknown".

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-08-01 17:29:07 +01:00
Piotr Jaroszyński 02e6092cd5 [tcp] Fix a 64bit compile time error
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-22 21:25:40 +01:00
Michael Brown 1d3b6619e5 [tcp] Allow out-of-order receive queue to be discarded
Allow packets in the receive queue to be discarded in order to free up
memory.  This avoids a potential deadlock condition in which the
missing packet can never be received because the receive queue is
occupying all of the memory available for further RX buffers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-21 12:01:50 +01:00
Michael Brown 68613047f0 [tcp] Handle out-of-order received packets
Maintain a queue of received packets, so that lost packets need not
result in retransmission of the entire TCP window.

Increase the TCP window to 8kB, in order that we can potentially
transmit enough duplicate ACKs to trigger Fast Retransmission at the
sender.

Using a 10MB HTTP download in qemu-kvm with an artificial drop rate of
1 in 64 packets, this reduces the download time from around 26s to
around 4s.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-21 00:00:38 +01:00
Michael Brown 9f2e76ea61 [netdevice] Provide a test mechanism for discarding packets at random
Setting NETDEV_DISCARD_RATE to a non-zero value will cause one in
every NETDEV_DISCARD_RATE packets to be discarded at random on both
the transmit and receive datapaths, allowing the robustness of
upper-layer network protocols to be tested even in simulation
environments that provide wholly reliable packet transmission.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-20 20:58:10 +01:00
Michael Brown f033694356 [tcp] Treat ACKs as sent only when successfully transmitted
iPXE currently forces sending (i.e. sends a pure ACK even in the
absence of fresh data to send) only in response to packets that
consume sequence space or that lie outside of the receive window.
This ignores the possibility that a previous ACK was not actually sent
(due to, for example, the retransmission timer running).

This does not cause incorrect behaviour, but does cause unnecessary
retransmissions from our peer.  For example:

 1. Peer sends final data packet (ack      106 seq 521..523)
 2. We send FIN                  (seq 106..107 ack      523)
 3. Peer sends FIN               (ack      106 seq 523..524)
 4. We send nothing since retransmission timer is running for our FIN
 5. Peer ACKs our FIN            (ack      107 seq 524..524)
 6. We send nothing since this packet consumes no sequence space
 7. Peer retransmits FIN         (ack      107 seq 523..524)
 8. We ACK peer's FIN            (seq 107..107 ack      524)

What should happen at step (6) is that we should ACK the peer's FIN,
since we can deduce that we have never sent this ACK.

Fix by maintaining an "ACK pending" flag that is set whenever we are
made aware that our peer needs an ACK (whether by consuming sequence
space or by sending a packet that appears out of order), and is
cleared only when the ACK packet has been transmitted.

Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-15 19:59:57 +01:00
Michael Brown 75505942ac [tcp] Merge boolean flags into a single "flags" field
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-15 19:59:57 +01:00
Michael Brown c57e26381c [tcp] Use a dedicated timer for the TIME_WAIT state
iPXE currently repurposes the retransmission timer to hold the TCP
connection in the TIME_WAIT state (i.e. waiting for up to 2*MSL in
case we are required to re-ACK our peer's FIN due to a lost ACK).
However, the fact that this timer is running will prevent such an ACK
from ever being sent, since the logic in tcp_xmit() assumes that a
running timer indicates that we ourselves are waiting for an ACK and
so blocks the transmission.  (We always wait for an ACK before sending
our next packet, to keep our transmit data path as simple as
possible.)

Fix by using an entirely separate timer for the TIME_WAIT state, so
that packets can still be sent.

Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-15 19:59:34 +01:00
Guo-Fu Tseng 1e7e4c9a61 [tcp] Randomise local TCP port
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-13 17:29:54 +01:00
Michael Brown 73e3672468 [tcp] Fix typos by changing ntohl() to htonl() where appropriate
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-13 17:19:37 +01:00
Michael Brown 43450342a9 [tcp] Store local port in host byte order
Every other scalar integer value in struct tcp_connection is in host
byte order; change the definition of local_port to match.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-13 17:15:57 +01:00
Michael Brown 68c2f07f15 [tcp] Fix potential use-after-free when accessing timestamp option
Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-07-07 12:57:08 +01:00
Michael Brown 21682afe69 [tls] Handle multiple handshake records
The handshake record in TLS can contain multiple messages.

Originally-fixed-by: Timothy Stack <tstack@vmware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-23 01:01:32 +01:00
Michael Brown b707f15ecb [http] Pass through unknown interface method calls
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 19:33:39 +01:00
Michael Brown 4327d5d39f [interface] Convert all data-xfer interfaces to generic interfaces
Remove data-xfer as an interface type, and replace data-xfer
interfaces with generic interfaces supporting the data-xfer methods.

Filter interfaces (as used by the TLS layer) are handled using the
generic pass-through interface capability.  A side-effect of this is
that deliver_raw() no longer exists as a data-xfer method.  (In
practice this doesn't lose any efficiency, since there are no
instances within the current codebase where xfer_deliver_raw() is used
to pass data to an interface supporting the deliver_raw() method.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 15:50:31 +01:00
Michael Brown 7b4fbd93a5 [interface] Convert all name-resolution interfaces to generic interfaces
Remove name-resolution as an interface type, and replace
name-resolution interfaces with generic interfaces supporting the
resolv_done() method.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 15:45:57 +01:00
Michael Brown a03dd97e6b [interface] Convert all job-control interfaces to generic interfaces
Remove job-control as an interface type, and replace job-control
interfaces with generic interfaces supporting the close() method.
(Both done() and kill() are absorbed into the function of close();
kill() is merely close(-ECANCELED).)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 14:40:09 +01:00
Michael Brown 5fa6775b61 [retry] Use start_timer_fixed() instead of direct timeout manipulation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 14:32:49 +01:00
Michael Brown c760ac3022 [retry] Add timer_init() wrapper function
Standardise on using timer_init() to initialise an embedded retry
timer, to match the coding style used by other embedded objects.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 14:30:20 +01:00
Michael Brown 4bfd5b52c1 [refcnt] Add ref_init() wrapper function
Standardise on using ref_init() to initialise an embedded reference
count, to match the coding style used by other embedded objects.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-06-22 14:26:40 +01:00
Michael Brown 6c0e8c14be [libc] Enable automated extraction of error usage reports
Add preprocessor magic to the error definitions to enable every error
usage to be tracked.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-31 03:11:57 +01:00
Geoff Lywood 6514d6430d [dhcp] Use correct DHCP options on EFI systems
See RFC 4578 for details.

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-29 08:51:46 +01:00
Piotr Jaroszyński 8a16fd05dc [iscsi] Allow base64 encoding in large binary values
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-28 20:04:28 +01:00
Michael Brown b3d8238fd4 [iscsi] Use generic base16 functions for iSCSI reverse CHAP
Yes, I forgot to convert this function before pushing.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-28 19:31:13 +01:00
Michael Brown d6f79d6b6e [infiniband] Use generic base16 functions for SRP
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-28 19:04:59 +01:00
Michael Brown 7b267ee6db [iscsi] Use generic base16 functions for iSCSI
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-28 18:21:24 +01:00
Michael Brown dfcce165a5 [base64] Allow base64_encode() to handle arbitrary data
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-28 12:44:23 +01:00
Piotr Jaroszyński 7c6d3752c9 [compiler] Fix 64bit compile time errors
Apart from format specifier fixes there are two changes in proper code:
- Change type of regs in skge_hw to unsigned long
- Cast result of sizeof in myri10ge to uint32_t

Both don't change anything for i386 and should be fine on x86_64.

Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-27 10:23:06 +01:00
Joshua Oreman 905ea56753 [dhcp] Don't consider invalid offers to be duplicates
This fixes a regression in BOOTP support; since BOOTP requests often
have the `siaddr' field set to 0.0.0.0, they would be considered
duplicates of the first zeroed-out offer slot.

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-27 10:22:05 +01:00
Joshua Oreman 2aad3fab23 [build] Use weak definitions instead of weak declarations
This removes the need for inline safety wrappers, marginally reducing
the size penalty of weak functions, and works around an apparent
binutils bug that causes undefined weak symbols to not actually be
NULL when compiling with -fPIE (as EFI builds do).

A bug in versions of binutils prior to 2.16 (released in 2005) will
cause same-file weak definitions to not work with those
toolchains. Update the README to reflect our new dependency on
binutils >= 2.16.

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-27 10:19:14 +01:00
Joshua Oreman 620b98ee4b [dhcp] Honor PXEBS_SKIP option in discovery control
It is permissible for a DHCP packet containing PXE options to specify
only "discovery control", instead of the more typical boot menu +
prompt options. This is the strategy used by older versions of
dnsmasq; by specifying the discovery control as PXEBS_SKIP, they cause
vendor PXE ROMs to ignore boot server discovery and just use the
filename and next-server options in the initial (Proxy)DHCP packet.
Modify iPXE to accept this behavior, to be more compatible with the
Intel firmware.

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Tested-by: Kyle Kienapfel <kyle@shadowmage.org>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-27 01:18:26 +01:00
Joshua Oreman 723cfad316 [wpa] Remove PMKID checking
PMKID checking is an additional pre-check that helps detect invalid
passphrases before going through the full handshaking procedure. It
takes up some amount of code size, and is not necessary from a
security perspective. It also is implemented improperly by some
routers, which was causing iPXE to give spurious authentication
errors. Remove it for these reasons.

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-27 01:18:25 +01:00
Michael Brown 9ff8229693 [tcp] Update received sequence number before delivering received data
iPXE currently updates the TCP sequence number after delivering the
data to the application via xfer_deliver_iob().  If the application
responds to the received data by transmitting more data, this would
result in a stale ACK number appearing in the transmitted packet,
which potentially causes retransmissions and also gives the
undesirable appearance of violating causality (by sending a response
to a message that we claim not to have yet received).

Reported-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-22 00:45:49 +01:00
Michael Brown 84996b7b09 [lacp] Add simple LACP implementation
Some switch configurations will refuse to enable our port unless we
can speak LACP to inform the switch that we are alive.  Add a very
simple passive LACP implementation that is sufficient to convince at
least Linux's bonding driver (when tested using qemu attached to a tap
device enslaved to a bond device configured as "mode=802.3ad").

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-05-10 16:34:17 +01:00
Michael Brown 8406115834 [build] Rename gPXE to iPXE
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain.  Work
around this problem by renaming project from gPXE to iPXE, and
updating URLs to match.

Also update README, LOG and COPYRIGHTS to remove obsolete information.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2010-04-19 23:43:39 +01:00
Michael Brown 4a7648bd3d [netdevice] Record whether or not interrupts are currently enabled
Signed-off-by: Michael Brown <mcb30@etherboot.org>
2010-03-23 00:55:47 +00:00
Michael Brown 88e436376c [netdevice] Add netdev_is_open() wrapper function
Signed-off-by: Michael Brown <mcb30@etherboot.org>
2010-03-23 00:46:35 +00:00
Michael Brown 73c71f6492 [iscsi] Disambiguate some common authentication errors
Signed-off-by: Michael Brown <mcb30@etherboot.org>
2010-03-17 02:23:42 +00:00
Danny Volkind cd9c94851b [iscsi] Fix interoperability with QNAP TS-639Pro
Modified-by: Michael Brown <mcb30@etherboot.org>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
2010-02-22 04:53:04 +00:00
Joshua Oreman f3467ad169 [http] GET / if URI doesn't contain a path
Commit 3d9dd93 introduced a regression in HTTP: if a URI without a
path is specified (e.g. http://netboot.me), we send the empty string
as our GET request. Reintroduce an extra slash when uri->path is NULL,
to turn this into the expected GET /.

Reported-by: Kyle Kienapfel <doctor.whom@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-27 08:52:39 -05:00
Joshua Oreman 5efc2fcb60 [dhcp] Keep multiple DHCP offers received, and use them intelligently
Instead of keeping only the best IP and PXE offers, store all of them,
and pick the best to use just before a request is sent. This allows
priority differentiation to work even when lower-priority offers
provide PXE options, and improves robustness at sites with broken PXE
servers intermingled with working ones: when a ProxyDHCP request times
out, instead of giving up, we try the next PXE offer we've received.
It also allows us to avoid breaking up combined IP+PXE offers, which
can be important with some firewall configurations. This behavior
matches that of most vendor PXE ROMs.

Store a reference to the DHCPOFFER packet in the offer structure, so
that when registering settings after a successful ACK we can register
the proxy PXE settings we originally received; this removes the need
for a nonstandard duplicate REQUEST/ACK to port 67 of proxy servers
like dnsmasq that provide PXE options in the OFFER.

Total cost: 450 bytes uncompressed.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-21 18:36:26 -05:00
gL2n30Y06arv2 93805d9765 [ftp] User and password URI support for the FTP protocol
The default user and password are used for anonymous FTP by default.
This patch adds support for an explicit user name and password in an FTP
URI:

    imgfetch ftp://user:password@server.com/path/to/file

Edited-by: Stefan Hajnoczi <stefanha@gmail.com>.  Bugs are my fault.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-20 18:18:47 -05:00
Joshua Oreman 3d9dd93a14 [uri] Decode/encode URIs when parsing/unparsing
Currently, handling of URI escapes is ad-hoc; escaped strings are
stored as-is in the URI structure, and it is up to the individual
protocol to unescape as necessary. This is error-prone and expensive
in terms of code size. Modify this behavior by unescaping in
parse_uri() and escaping in unparse_uri() those fields that typically
handle URI escapes (hostname, user, password, path, query, fragment),
and allowing unparse_uri() to accept a subset of fields to print so
it can be easily used to generate e.g. the escaped HTTP path?query
request.

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-20 18:14:28 -05:00
Joshua Oreman b1ba80f8fb [dhcp] Add generic facility for using cached network settings
When a DHCP session is started (using autoboot or a command-line `dhcp
net0'), check whether the new setting use-cached (DHCP option 175.178)
is TRUE; if so, skip DHCP and rely on currently registered
settings. This lets one combine a static IP with autoboot.

Before checking the use-cached setting, call a weak
get_cached_dhcpack() hook that can be implemented by particular builds
of gPXE supporting some fashion of retrieving a cached DHCPACK packet.
If one is available, it is registered as an options source, and then
either that packet's option 175.178 or the user's prior manual
use-cached setting can allow skipping duplicate DHCP.

Using cached packets is not the default because DHCP servers are often
configured to give gPXE different options than they give a vendor PXE
client; in order to break the infinite loop of PXE chaining, one would
need to load a gPXE with an embedded image that does something more
than autoboot.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-20 17:15:51 -05:00
Stefan Hajnoczi 0579ddc834 [tftp] Abort requests with error code 0
There is no defined error code for aborting a request but 0 is commonly
used.  This patch switches the abort request error code from
TFTP_ERR_UNKNOWN_TID (5) to 0.

Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-18 17:24:38 -05:00
Thomas Horsten c124f6360d [tftp] Make TFTP size requests abort transfer with an error
pxenv_tftp_get_fsize is an API call that PXE clients can call to
obtain the size of a remote file. It is implemented by starting a TFTP
transfer with pxe_tftp_open, waiting for the response and then
stopping the transfer with pxe_tftp_close(). This leaves the session
hanging on the TFTP server and it will try to resend the packet
repeatedly (verified with tftpd-hpa) until it times out.

This patch adds a method "tftpsize" that will abort the transfer after
the first packet is received from the server. This will terminate the
session on the server and is the same behaviour as Intel's PXE ROM
exhibits.

Together with a qemu patch to handle the ERROR packet (submitted to
qemu's mailing list), this resolves a specific issue where booting
pxegrub with qemu's TFTP server would be slow or hang.

I've tested this against qemu's tftp server and against my normal boot
infrastructure (tftpd-hpa). Booting pxegrub and loading extra files
now produces a trace similar to Intel's PXE client and there are no
spurious retransmits from tftpd any more.

Signed-off-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Milan Plzik <milan.plzik@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-17 19:18:28 -05:00
Stefan Hajnoczi 245dca9ce6 [tftp] Remove unnecessary delay when opening a connection
The retry timer is used to retransmit TFTP packets lost on the network,
and to start a new connection.  There is an unnecessary delay while
waiting for name resolution because the timer period is fixed and cannot
be shortened when name resolution completes.  This patch keeps the timer
period at zero while name resolution takes place so that no time is lost
once before sending the first packet.

Reported-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-15 16:04:33 -05:00
Stefan Hajnoczi dd99ee95cb [tftp] Allow fetching larger files by wrapping block number
This patch adds TFTP support for files larger than 65535 blocks by
wrapping the 16-bit block number.

Reported-by: Mark Johnson <johnson.nh@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-15 15:54:36 -05:00
Joshua Oreman 734061e9c6 [dhcp] Assume PXE options are in DHCPOFFER only if boot menu is included
IBM's Tivoli Provisioning Manager for OS Deployment, when acting as a
ProxyDHCP server, sends an initial offer with a vendor class of "PXEClient"
and vendor-encapsulated options that have nothing to do with PXE. To
differentiate between this case and the case of a ProxyDHCP server that
sends all PXE options in its initial offer, modify gPXE to check for
the presence of an encapsulated PXE boot menu option (43.9) instead of
simply checking for the existence of any encapsulated options at all.
This is the same check used by the Intel vendor PXE ROM.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-14 18:34:55 -05:00
Joshua Oreman 04e4a4f695 [dhcp] Accept ProxyDHCP replies of type DHCPOFFER
The PXE standard provides examples of ProxyDHCP responses being encoded both
as type DHCPOFFER and DHCPACK, but currently we only accept DHCPACKs. Since
there are PXE servers in existence that respond to ProxyDHCPREQUESTs with
DHCPOFFERs, modify gPXE's ProxyDHCP pruning logic to treat both types of
responses equally.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-14 18:33:31 -05:00
Shao Miller cf5e79adc9 [dhcp] Append new DHCP options versus prepend
Change the behaviour for adding DHCP options into a DHCP packet so
that we now append options, rather than insert them in front of
whatever options might already be present.

Apparently, the DHCP relay logic on a Nortel 470-48T layer 2 switch
cares about the order of DHCP options.  If we build a DHCP packet
pre-populated with some options, their order will now be preserved,
except for encapsulated options.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-14 11:14:24 -05:00
Shao Miller 9de525c34c [dhcp] Ensure message type is first DHCP option
Apparently, the DHCP relay logic on a Nortel 470-48T layer 2 switch
cares about the order of DHCP options.  Specifically, it requires
that the DHCP message type option be the first option present in the
DHCP packet.  We achieve this by having this option appear first in
our dhcp_request_options_data array, which pre-populates DHCP
requests.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-14 11:13:10 -05:00
Joshua Oreman aa1b894ecd [802.11] Allow connecting to spectrum managed networks
Contrary to the IEEE specification, some access points apparently
set the Spectrum Mgmt bit in the capabilities field even when
broadcasting on a 2.4GHz band that does not require spectrum
management. Allow gPXE to attempt to connect to such networks;
if spectrum management is really required, our advertisement
of capabilities not including it will result in an association
failure.

Reported-by: Peter Meyer <residue@xmail.net>

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 10:16:25 -05:00
Joshua Oreman 5240fee38f [wpa] Add CCMP backend (new AES-based cryptosystem)
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 10:11:42 -05:00
Joshua Oreman 8106cb130b [wpa] Add TKIP backend (legacy RC4-based cryptosystem)
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 10:09:44 -05:00
Joshua Oreman 0758111345 [wpa] Add pre-shared key frontend (WPA "Personal" with just a passphrase)
Modified-by: Marty Connor <mdc@etherboot.org>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 10:07:59 -05:00
Joshua Oreman 8ec18a5b50 [wpa] Add general support for WPA-protected 802.11 networks
Modified-by: Marty Connor <mdc@etherboot.org>
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 09:53:03 -05:00
Joshua Oreman 432cc6d1d8 [eapol] Add basic support for 802.1X EAP over LANs
EAPOL is a container protocol that can wrap either EAP packets or
802.11 EAPOL-Key frames. For cleanliness' sake, add a stub that strips
the framing and sends packets off to the appropriate handler if it
is compiled in.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 09:18:12 -05:00
Joshua Oreman 01b4f52089 [802.11] Add support for WEP-protected networks
WEP is a highly flawed cryptosystem, barely better than no encryption at all,
but many people still use it. It does have the advantage of being very simple
and small in code size.

Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 09:14:08 -05:00
Joshua Oreman dd8a3e2e70 [802.11] Add core support for detecting and using encrypted networks
Signed-off-by: Marty Connor <mdc@etherboot.org>
2010-01-05 09:08:37 -05:00
Shao Miller 177389fb73 [settings] Add Bus ID setting
Users can find the bus type and PCI IDs for a network interface with:

netX/busid

Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
2009-12-14 17:54:53 +00:00
Michael Brown 58b6794c11 [infiniband] Rename IB_PKEY_NONE to IB_PKEY_DEFAULT
There is no such thing as a non-existent partition.
2009-11-16 22:14:36 +00:00
Michael Brown bbc530c0dd [infiniband] Report IB link status as IPoIB netdevice status 2009-11-16 22:14:12 +00:00
Michael Brown 228ac9d018 [infiniband] Include hostname in node description, if available 2009-11-16 22:13:44 +00:00
Michael Brown e7018228fa [infiniband] Make node description invariant across all ports
IBA section 14.2.5.2 states that "the contents of the NodeDescription
attribute are the same for all ports on a node".  Satisfy this by
using the HCA GUID rather than the port GUID to form the node
description string.
2009-11-16 22:13:25 +00:00
Michael Brown 4933ccbf65 [ipv4] Ignore non-open net devices when performing routing
We do not discard routing table entries when closing an interface.  It
is plausible that multiple interfaces may be on the same physical
network; if so, then we may end up in a situation whereby outbound
packets attempt to route via a closed interface.

Fix by ignoring non-open net devices in ipv4_route().
2009-11-16 22:12:48 +00:00
Michael Brown 55d23b19a2 [ipv4] Allow calculation of default subnet mask
ipv4.c calculates the default subnet mask before calling
fetch_ipv4_setting() to retrieve the configured subnet mask (if any).

However, as of commit 612f4e7 "[settings] Avoid returning
uninitialised data on error in fetch_xxx_setting()",
fetch_ipv4_setting() will zero the IP address if the setting does not
exist, rather than leaving it unaltered.

Fix by fetching the setting first and calculating the default subnet
mask only if necessary.
2009-11-16 22:11:53 +00:00
Michael Brown 2ce0d8f08b [ipv4] Use a zero address to indicate "no gateway", rather than INADDR_NONE
ipv4.c uses a gateway address of INADDR_NONE to represent "no
gateway".  It initialises the gateway address to INADDR_NONE before
calling fetch_ipv4_setting() to retrieve the configured gateway
address (if any).

However, as of commit 612f4e7 "[settings] Avoid returning
uninitialised data on error in fetch_xxx_setting()",
fetch_ipv4_setting() will zero the IP address if the setting does not
exist, rather than leaving it unaltered.

Fix by using a zero IP address to indicate "no gateway", so that a
non-existent gateway address setting will be treated as such.
2009-11-16 22:09:23 +00:00
Joshua Oreman 67015d1011 [pxebs] Correct endianness of PXE type
The PXE type field is canonically little-endian, but the pxebs command
treats it as big-endian in converting the type number passed on the
command line to a field value to search against. Fix, to prevent the
necessity of incantations like "pxebs net0 1536" to select menu item #6.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
2009-10-24 19:34:35 +01:00
Michael Brown 1b1e63d54d [netdevice] Add the concept of an "Ethernet-compatible" MAC address
The iBFT is Ethernet-centric in providing only six bytes for a MAC
address.  This is most probably an indirect consequence of a similar
design flaw in the Windows NDIS stack.  (The WinOF IPoIB stack
performs all sorts of contortions in order to pretend to the NDIS
layer that it is dealing with six-byte MAC addresses.)

There is no sensible way in which to extend the iBFT without breaking
compatibility with programs that expect to parse it.  Add the notion
of an "Ethernet-compatible" MAC address to our link layer abstraction,
so that link layers can provide their own workarounds for this
limitation.
2009-10-23 22:14:05 +01:00
Michael Brown 224ef7f483 [infiniband] Send CM requests to target node's GSI rather than SM's GSI 2009-10-16 23:03:47 +01:00
Michael Brown a7290a970c [802.11] Support multicast hashing
802.11 multicast hashing is the same as standard Ethernet hashing, so
just expose and use eth_mc_hash().

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
2009-08-12 00:54:29 +01:00
Joshua Oreman 2310c30d1c [802.11] Properly initialize autoassociation process
The recent change to process_add() to detect duplicate process
additions relies on the fact that all processes will be initialized
using process_init_stopped() before being passed to that function.
The autoassociation process was not initialized in this fashion, so
process_add() erroneously detected it as a duplicate.

Fix by using process_init_stopped() to initialize the autoassociation
process instead of setting the step member directly.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-12 00:31:34 +01:00
Michael Brown 444d5550a7 [dhcp] Fall back to using the hardware address to populate the chaddr field
For IPoIB, the chaddr field is too small (16 bytes) to contain the
20-byte IPoIB link-layer address.  RFC4390 mandates that we should
pass an empty chaddr field and rely on the DHCP client identifier
instead.  This has many problems, not least of which is that a client
identifier containing an IPoIB link-layer address is not very useful
from the point of view of creating DHCP reservations, since the QPN
component is assigned at runtime and may vary between boots.

Leave the DHCP client identifier as-is, to avoid breaking existing
setups as far as possible, but expose the real hardware address (the
port GUID) via the DHCP chaddr field, using the broadcast flag to
instruct the DHCP server not to use this chaddr value as a link-layer
address.

This makes it possible (at least with ISC dhcpd) to create DHCP
reservations using host declarations such as:

    host duckling {
        fixed-address 10.252.252.99;
        hardware unknown-32 00:02:c9:02:00:25:a1:b5;
    }
2009-08-12 00:27:08 +01:00
Michael Brown 4eab5bc8ca [netdevice] Allow the hardware and link-layer addresses to differ in size
IPoIB has a 20-byte link-layer address, of which only eight bytes
represent anything relating to a "hardware address".

The PXE and EFI SNP APIs expect the permanent address to be the same
size as the link-layer address, so fill in the "permanent address"
field with the initial link layer address (as generated by
register_netdev() based upon the real hardware address).
2009-08-12 00:23:38 +01:00
Michael Brown 37a0aab4ff [netdevice] Separate out the concept of hardware and link-layer addresses
The hardware address is an intrinsic property of the hardware, while
the link-layer address can be changed at runtime.  This separation is
exposed via APIs such as PXE and EFI, but is currently elided by gPXE.

Expose the hardware and link-layer addresses as separate properties
within a net device.  Drivers should now fill in hw_addr, which will
be used to initialise ll_addr at the time of calling
register_netdev().
2009-08-12 00:19:14 +01:00
Michael Brown 0ff5c456cb [infiniband] Disambiguate CM connection rejection reasons
There is diagnostic value in being able to disambiguate between the
various reasons why an IB CM has rejected a connection attempt.  In
particular, reason 8 "invalid service ID" can be used to identify an
incorrect SRP service_id root-path component, and reason 28 "consumer
reject" corresponds to a genuine SRP login rejection IU, which can be
passed up to the SRP layer.

For rejection reasons other than "consumer reject", we should not pass
through the private data, since it is most likely generated by the CM
without any protocol-specific knowledge.
2009-08-10 22:31:55 +01:00
Michael Brown a0d337912e [infiniband] Generate more specific errors in response to failure MADs
Generate errors within individual MAD transaction consumers such as
ib_pathrec.c and ib_mcast.c, rather than within ib_mi.c.  This allows
for more meaningful error messages to eventually be displayed to the
user.
2009-08-10 22:30:06 +01:00
Michael Brown 0c30dc6bc5 [infiniband] Add support for SRP over Infiniband
SRP is the SCSI RDMA Protocol.  It allows for a method of SAN booting
whereby the target is responsible for reading and writing data using
Remote DMA directly to the initiator's memory.  The software initiator
merely sends and receives SCSI commands; it never has to touch the
actual data.
2009-08-10 22:27:33 +01:00
Michael Brown 8de49af0d2 [infiniband] Add last_opened_ibdev(), analogous to last_opened_netdev()
The minimal-surprise behaviour, when no explicit SRP initiator device
is specified, will probably be to use the most recently opened
Infiniband device.  This matches our behaviour with using the most
recently opened net device for PXE, iSCSI, AoE, NBI, etc.
2009-08-10 22:25:57 +01:00
Michael Brown 419243e7f1 [infiniband] Add find_ibdev() 2009-08-10 22:25:02 +01:00
Michael Brown 4be11f523c [infiniband] Add a "communication-managed reliable connection" protocol
SRP over Infiniband uses a protocol whereby data is sent via a
combination of the CM private data fields and the RC queue pair
itself.  This seems sufficiently generic that it's worth having
available as a separate protocol.
2009-08-10 22:23:28 +01:00
Michael Brown cf716a0ce6 [scsi] Make LUN a property of the SCSI backend only
Nothing within the SCSI core actually refers to the LUN, so we can
simplify matters by treating it as purely a property of the backend.
2009-08-10 19:31:45 +01:00
Michael Brown d944794680 [scsi] Generalise iscsi_parse_lun() to scsi_parse_lun() 2009-08-10 19:30:41 +01:00
Michael Brown 976f12c501 [scsi] Generalise iscsi_detached_command() to scsi_detached_command() 2009-08-10 19:29:40 +01:00
Michael Brown 04878ef745 [process] Make it safe to call process_add() multiple times 2009-08-10 19:27:24 +01:00
Michael Brown 46073f1239 [infiniband] Handle duplicate Communication Management REPs
We will terminate our transaction as soon as we receive the first CM
REP, since that provides all the state that we need.  However, the
peer may resend the REP if it didn't see our RTU, and if we don't
respond with another RTU we risk being disconnected.  (This protocol
appears not to handle retries gracefully.)

Fix by adding a management agent that will listen for these duplicate
REPs and send back an RTU.
2009-08-09 01:31:07 +01:00
Joshua Oreman fc9750a68d [802.11] Fix memory leak on unsuccessful probes
When a probe found no results, the list head of beacons would not be
freed, leaking 16 bytes of memory per probe.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-09 00:12:53 +01:00
Joshua Oreman 1e810bebe9 [802.11] Set channels early on to avoid tuning to an undefined channel
Some cards (such as ath5k) always need to tune to a particular channel
when they are reset; the reset may happen upon open(), which is before
the channels array would be set up (in prepare_probe()). Avoid tuning
the card to an inconsistent state by copying the hardware
supported-channels array to the 802.11 device's allowable-channels
array even before channels are "properly" set up.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-09 00:11:33 +01:00
Joshua Oreman f128a6db21 [802.11] Enhance support for driver PHY differences
The prior net80211 model of physical-layer behavior for drivers was
overly simplistic and limited the drivers that could be written.  To
be more flexible, split the driver-provided list of supported rates by
band, and add a means for specifying a list of supported channels.
Allow drivers to specify a hardware channel value that will be tied to
uses of the channel.

Expose net80211_duration() to drivers, and make the rate it uses in
its computations configurable, so that it can be used in calculating
durations that must be set in hardware for ACK and CTS packets. Add
net80211_cts_duration() for the common case of calculating the
duration for a CTS packet.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-09 00:11:26 +01:00
Michael Brown 34bfc04e4c [infiniband] Update all other MAD users to use a management interface 2009-08-08 23:56:28 +01:00
Michael Brown 44251ebb9a [infiniband] Update subnet management agent to use a management interface 2009-08-08 23:55:29 +01:00
Michael Brown 0e07516f62 [infiniband] Add the concept of a management interface
A management interface is the component through which both local and
remote management agents are accessed.

This new implementation of a management interface allows for the user
to react to timed-out transactions, and also allows for cancellation
of in-progress transactions.
2009-08-08 23:51:27 +01:00
Michael Brown b0c563824b [infiniband] Change IB_{QPN,QKEY,QPT} names from {SMA,GMA} to {SMI,GSI}
The IBA specification refers to management "interfaces" and "agents".
The interface is the component that connects to the queue pair and
sends and receives MADs; the agent is the component that constructs
the reply to the MAD.

Rename the IB_{QPN,QKEY,QPT} constants as a first step towards making
this separation in gPXE.
2009-08-06 01:24:18 +01:00
Michael Brown 5552a1b202 [tcp] Avoid printf format warnings on some compilers
In several places, we currently use size_t to represent a difference
between TCP sequence numbers.  This can cause compiler warnings
relating to printf format specifiers, since the result of
(uint32_t+size_t) may be an unsigned long on some compilers.

Fix by using uint32_t for all variables that represent a difference
between TCP sequence numbers.

Tested-by: Joshua Oreman <oremanj@xenon.get-linux.org>
2009-08-02 22:44:57 +01:00
Joshua Oreman ce64398f87 [802.11] Add support for 802.11 devices with software MAC layer
This is required for all modern 802.11 devices, and allows drivers
to be written for them with minimally more effort than is required
for a wired NIC.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
2009-08-01 19:00:32 +01:00
Michael Brown cc2e767b5a [infiniband] Add Communication Manager (CM)
The Communication Manager is responsible for handling the setup and
teardown of RC connections.
2009-07-17 23:06:35 +01:00
Michael Brown c939bc57ff [infiniband] Add infrastructure for RC queue pairs
Queue pairs are now assumed to be created in the INIT state, with a
call to ib_modify_qp() required to bring the queue pair to the RTS
state.

ib_modify_qp() no longer takes a modification list; callers should
modify the relevant queue pair parameters (e.g. qkey) directly and
then call ib_modify_qp() to synchronise the changes to the hardware.

The packet sequence number is now a property of the queue pair, rather
than of the device.

Each queue pair may have an associated address vector.  For RC queue
pairs, this is the address vector that will be programmed in to the
hardware as the remote address.  For UD queue pairs, it will be used
as the default address vector if none is supplied to ib_post_send().
2009-07-17 23:06:35 +01:00
Michael Brown ea6eb7f7ed [infiniband] Pass a generic MAD to ib_set_port_info() 2009-07-17 23:06:35 +01:00
Michael Brown 0095e18d4c [infiniband] Expose supported and enabled link speeds and widths 2009-07-17 23:06:35 +01:00
Michael Brown 773028d34e [infiniband] Allow MAD handlers to indicate response via return value
Now that MAD handlers no longer return a status code, we can allow
them to return a pointer to a MAD structure if and only if they want
to send a response.  This provides a more natural and flexible
approach than using a "response method" field within the handler's
descriptor.
2009-07-17 23:06:35 +01:00
Michael Brown 94876f4bb6 [infiniband] Remove the return status code from MAD handlers
MAD handlers have to set the status fields within the MAD itself
anyway, in order to provide a meaningful response MAD; the additional
gPXE return status code is just noise.

Note that we probably don't need to ever explicitly set the status to
IB_MGMT_STATUS_OK, since it should already have this value from the
request.  (By not explicitly setting the status in this way, we can
safely have ib_sma_set_xxx() call ib_sma_get_xxx() in order to
generate the GetResponse MAD without worrying that ib_sma_get_xxx()
will clear any error status set by ib_sma_set_xxx().)
2009-07-17 23:06:35 +01:00
Michael Brown f1d92fa886 [infiniband] Allow external QPN to differ from real QPN
Most IB hardware seems not to allow allocation of the genuine QPNs 0
and 1, so allow for the externally-visible QPN (as constructed and
parsed by ib_packet, where used) to differ from the real
hardware-allocated QPN.
2009-07-17 23:06:34 +01:00
Michael Brown 92cf240020 [infiniband] Always create an SMA and a GMA 2009-07-17 23:06:34 +01:00
Michael Brown 80c41b90d2 [infiniband] Add notion of a queue pair type 2009-07-17 23:06:34 +01:00
Michael Brown 3f4972db9a [infiniband] Allow completion queue operations to be optional
The send completion handler typically will just free the I/O buffer,
so allow this common case to be handled by the Infiniband core.
2009-07-17 23:06:34 +01:00
Michael Brown 0582a84e66 [infiniband] Improve ib_packet debugging messages 2009-07-17 23:06:34 +01:00
Michael Brown 165074c188 [infiniband] Implement SMA as an instance of a GMA
The GMA code was based upon the SMA code.  We can save space by making
the SMA simply an instance of the GMA.
2009-07-17 23:06:34 +01:00
Michael Brown 8a852280eb [infiniband] Pass GMA as a parameter to GMA MAD handlers 2009-07-17 23:06:34 +01:00
Michael Brown cb9ef4dee2 [ipoib] Remove the queue set abstraction
Now that IPoIB has to deal with only one set of queues, the queue set
abstraction becomes merely an inconvenient wrapper.
2009-07-17 23:06:34 +01:00
Michael Brown 0fbf2f6bda [infiniband] Provide a general mechanism for multicast group joins
Generalise out the multicast group membership record code from IPoIB.
2009-07-17 23:06:34 +01:00
Michael Brown 3c77fe73a5 [infiniband] Allow for sending MADs via GMA without retransmission 2009-07-17 23:06:34 +01:00
Michael Brown b4155c4ab5 [infiniband] Make qkey and rate optional parameters to ib_post_send()
The queue key is stored as a property of the queue pair, and so can
optionally be added by the Infiniband core at the time of calling
ib_post_send(), rather than always having to be specified by the
caller.

This allows IPoIB to avoid explicitly keeping track of the data queue
key.
2009-07-17 23:06:33 +01:00
Michael Brown d6b47871de [infiniband] Provide a general mechanism for path record lookups
Generalise out the path record lookup code from IPoIB.
2009-07-17 23:06:33 +01:00
Michael Brown 1d8c85d112 [infiniband] Create a general management agent
Generalise the subnet management agent into a general management agent
capable of sending and responding to MADs, including support for
retransmissions as necessary.
2009-07-17 23:06:33 +01:00
Michael Brown 365b8db5cf [infiniband] Centralise SMA and GMA queue constants 2009-07-17 23:06:33 +01:00
Michael Brown 887d296b88 [infiniband] Poll completion queues automatically
Currently, all Infiniband users must create a process for polling
their completion queues (or rely on a regular hook such as
netdev_poll() in ipoib.c).

Move instead to a model whereby the Infiniband core maintains a single
process calling ib_poll_eq(), and polling the event queue triggers
polls of the applicable completion queues.  (At present, the
Infiniband core simply polls all of the device's completion queues.)
Polling a completion queue will now implicitly refill all attached
receive work queues; this is analogous to the way that netdev_poll()
implicitly refills the RX ring.

Infiniband users no longer need to create a process just to poll their
completion queues and refill their receive rings.
2009-07-17 23:06:33 +01:00
Michael Brown 1f5c0239b4 [infiniband] Centralise assumption of 2048-byte payloads
IPoIB and the SMA have separate constants for the packet size to be
used to I/O buffer allocations.  Merge these into the single
IB_MAX_PAYLOAD_SIZE constant.

(Various other points in the Infiniband stack have hard-coded
assumptions of a 2048-byte payload; we don't currently support
variable MTUs.)
2009-07-17 23:06:33 +01:00
Michael Brown 7ba33f7826 [infiniband] Provide ib_get_hca_info() as a commonly-available function 2009-07-17 23:06:33 +01:00
Michael Brown b25a4b6c8a [infiniband] Split queue set functionality out of ipoib.c to ib_qset.c 2009-07-17 23:06:33 +01:00
Michael Brown 8868956268 [infiniband] Move non-driver-specific code to net/infiniband 2009-07-17 23:04:07 +01:00
Michael Brown d09290161e [netdevice] Make ll_broadcast per-netdevice rather than per-ll_protocol
IPoIB has a link-layer broadcast address that varies according to the
partition key.  We currently go through several contortions to pretend
that the link-layer address is a fixed constant; by making the
broadcast address a property of the network device rather than the
link-layer protocol it will be possible to simplify IPoIB's broadcast
handling.
2009-07-17 23:02:48 +01:00
Michael Brown 54ec3673cc [ata] Make ATA command issuing partially asynchronous
Move the icky call to step() from aoe.c to ata.c; this takes it at
least one step further away from where it really doesn't belong.

Unfortunately, AoE has the ugly aoe_discover() mechanism which means
that we still have a step() loop in aoe.c for now; this needs to be
replaced at some future point.
2009-07-17 23:01:20 +01:00
Michael Brown 1d8d8ef2c8 [scsi] Make SCSI command issuing partially asynchronous
Move the icky call to step() from iscsi.c to scsi.c; this takes it at
least one step further away from where it really doesn't belong.
2009-07-17 23:00:09 +01:00
Michael Brown a310d00d37 [netdevice] Add mechanism for reporting detailed link status codes
Expand the NETDEV_LINK_UP bit into a link_rc status code field,
allowing specific reasons for link failure to be reported via
"ifstat".

Originally-authored-by: Joshua Oreman <oremanj@rwcr.net>
2009-06-24 13:04:36 +01:00
Michael Brown 58f60df66c [tcp] Avoid rewinding sequence numbers on receiving old duplicate ACKs
Commit 558c1a4 ("[tcp] Improve robustness in the presence of duplicated
received packets") introduced a regression in that an old duplicate
ACK received while in the ESTABLISHED state would pass through normal
ACK processing, including updating tcp->snd_seq.

Fix by ensuring that ACK processing ignores all duplicate ACKs.
2009-06-23 16:10:34 +01:00
Michael Brown 99e64f5806 [tcp] Attempt to catch all possible error cases with debug messages
All TCP errors or unusual events should now generate a debugging
message at DBGLVL_LOG, with enough information (SEQ and ACK numbers)
to be able to identify the corresponding packet (or missing packet) in
a network trace from the remote end.
2009-06-23 14:28:00 +01:00
Michael Brown f4605970f4 [tcp] Include current sequence numbers in "timer expired" messages 2009-06-23 14:03:09 +01:00
Michael Brown a2f753ba64 [tcp] Move high-frequency debug messages to DBGLVL_EXTRA
This makes it possible to leave TCP debugging enabled in order to see
interesting TCP events, without flooding the console with at least one
message per packet.
2009-06-23 13:35:45 +01:00
Joshua Oreman eb3ca2a36f [netdevice] Add netdev argument to link-layer push and pull handlers
In order to construct outgoing link-layer frames or parse incoming
ones properly, some protocols (such as 802.11) need more state than is
available in the existing variables passed to the link-layer protocol
handlers. To remedy this, add struct net_device *netdev as the first
argument to each of these functions, so that more information can be
fetched from the link layer-private part of the network device.

Updated all three call sites (netdevice.c, efi_snp.c, pxe_undi.c) and
both implementations (ethernet.c, ipoib.c) of ll_protocol to use the
new argument.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-06-23 10:41:57 +01:00
Michael Brown 558c1a45fe [tcp] Improve robustness in the presence of duplicated received packets
gPXE responds to duplicated ACKs with an immediate retransmission,
which can lead to a sorceror's apprentice syndrome.  It also responds
to out-of-range (or old duplicate) ACKs with a RST, which can cause
valid connections to be dropped.

Fix the sorceror's apprentice syndrome by leaving the retransmission
timer running (and so inhibiting the immediate retransmission) when we
receive a potential duplicate ACK.  This seems to match the behaviour
of Linux observed via wireshark traces.

Fix the RST issue by sending RST only on out-of-range ACKs that occur
before the connection is fully established, as per RFC 793.

These problems were exposed during development of the 802.11 wireless
link layer; the 802.11 protocol has a failure mode that can easily
cause duplicated packets.  The fixes were tested in a controlled way
by faking large numbers of duplicated packets in the rtl8139 driver.

Originally-fixed-by: Joshua Oreman <oremanj@rwcr.net>
2009-06-23 09:40:26 +01:00
Daniel Verkamp 1f80b2dcd5 [ethernet] Add MII link status functions from Linux
Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-05-26 11:37:46 +01:00
Michael Brown 3c06277bbb [settings] Allow for arbitrarily-named settings
This provides a mechanism for using arbitrarily-named variables within
gPXE, using the existing syntax for settings.
2009-05-26 11:05:58 +01:00
Michael Brown c345336435 [dhcp] Choose ProxyDHCP port based on presence of PXE options
If the ProxyDHCPOFFER already includes PXE options (i.e. option 60 is
set to "PXEClient" and option 43 is present) then assume that the
ProxyDHCPREQUEST can be sent to port 67, rather than port 4011.  This
is a reasonable assumption, since in that case the ProxyDHCP server
has already demonstrated by responding to the DHCPDISCOVER that it is
listening on port 67.  (If the ProxyDHCP server were not listening on
port 67, then the standard DHCP server would have been configured to
respond with option 60 set to "PXEClient" but no option 43 present.)

The PXE specification is ambiguous on this point; the specified
behaviour covers only the cases in which option 43 is *not* present in
the ProxyDHCPOFFER.  In these cases, we will continue to send the
ProxyDHCPREQUEST to port 4011.

This change is required in order to allow us to interoperate with
dnsmasq, which listens only on port 67.  (dnsmasq relies on
unspecified behaviour of the Intel PXE stack, which it seems will
retain the ProxyDHCPOFFER as an options source and never issue a
ProxyDHCPREQUEST, thereby enabling dnsmasq to omit listening on port
4011.)
2009-05-22 05:42:57 +01:00
Michael Brown 1958974d0a [tftp] Process OACKs even if malformed
IBM Tivoli PXE Server 5.1.0.3 is reported to send trailing garbage
bytes at the end of the OACK packet, which causes gPXE to reject the
packet and abort the TFTP transfer.

Work around the problem by processing as much as possible of the OACK,
and treating name/value parsing errors as non-fatal.

Reported-by: Shao Miller <Shao.Miller@yrdsb.edu.on.ca>
2009-05-20 10:04:50 +01:00