david/ipxe
Archived
1
0
Commit Graph

570 Commits

Author SHA1 Message Date
Michael Brown
1b1e63d54d [netdevice] Add the concept of an "Ethernet-compatible" MAC address
The iBFT is Ethernet-centric in providing only six bytes for a MAC
address.  This is most probably an indirect consequence of a similar
design flaw in the Windows NDIS stack.  (The WinOF IPoIB stack
performs all sorts of contortions in order to pretend to the NDIS
layer that it is dealing with six-byte MAC addresses.)

There is no sensible way in which to extend the iBFT without breaking
compatibility with programs that expect to parse it.  Add the notion
of an "Ethernet-compatible" MAC address to our link layer abstraction,
so that link layers can provide their own workarounds for this
limitation.
2009-10-23 22:14:05 +01:00
Michael Brown
224ef7f483 [infiniband] Send CM requests to target node's GSI rather than SM's GSI 2009-10-16 23:03:47 +01:00
Michael Brown
a7290a970c [802.11] Support multicast hashing
802.11 multicast hashing is the same as standard Ethernet hashing, so
just expose and use eth_mc_hash().

Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
2009-08-12 00:54:29 +01:00
Joshua Oreman
2310c30d1c [802.11] Properly initialize autoassociation process
The recent change to process_add() to detect duplicate process
additions relies on the fact that all processes will be initialized
using process_init_stopped() before being passed to that function.
The autoassociation process was not initialized in this fashion, so
process_add() erroneously detected it as a duplicate.

Fix by using process_init_stopped() to initialize the autoassociation
process instead of setting the step member directly.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-12 00:31:34 +01:00
Michael Brown
444d5550a7 [dhcp] Fall back to using the hardware address to populate the chaddr field
For IPoIB, the chaddr field is too small (16 bytes) to contain the
20-byte IPoIB link-layer address.  RFC4390 mandates that we should
pass an empty chaddr field and rely on the DHCP client identifier
instead.  This has many problems, not least of which is that a client
identifier containing an IPoIB link-layer address is not very useful
from the point of view of creating DHCP reservations, since the QPN
component is assigned at runtime and may vary between boots.

Leave the DHCP client identifier as-is, to avoid breaking existing
setups as far as possible, but expose the real hardware address (the
port GUID) via the DHCP chaddr field, using the broadcast flag to
instruct the DHCP server not to use this chaddr value as a link-layer
address.

This makes it possible (at least with ISC dhcpd) to create DHCP
reservations using host declarations such as:

    host duckling {
        fixed-address 10.252.252.99;
        hardware unknown-32 00:02:c9:02:00:25:a1:b5;
    }
2009-08-12 00:27:08 +01:00
Michael Brown
4eab5bc8ca [netdevice] Allow the hardware and link-layer addresses to differ in size
IPoIB has a 20-byte link-layer address, of which only eight bytes
represent anything relating to a "hardware address".

The PXE and EFI SNP APIs expect the permanent address to be the same
size as the link-layer address, so fill in the "permanent address"
field with the initial link layer address (as generated by
register_netdev() based upon the real hardware address).
2009-08-12 00:23:38 +01:00
Michael Brown
37a0aab4ff [netdevice] Separate out the concept of hardware and link-layer addresses
The hardware address is an intrinsic property of the hardware, while
the link-layer address can be changed at runtime.  This separation is
exposed via APIs such as PXE and EFI, but is currently elided by gPXE.

Expose the hardware and link-layer addresses as separate properties
within a net device.  Drivers should now fill in hw_addr, which will
be used to initialise ll_addr at the time of calling
register_netdev().
2009-08-12 00:19:14 +01:00
Michael Brown
0ff5c456cb [infiniband] Disambiguate CM connection rejection reasons
There is diagnostic value in being able to disambiguate between the
various reasons why an IB CM has rejected a connection attempt.  In
particular, reason 8 "invalid service ID" can be used to identify an
incorrect SRP service_id root-path component, and reason 28 "consumer
reject" corresponds to a genuine SRP login rejection IU, which can be
passed up to the SRP layer.

For rejection reasons other than "consumer reject", we should not pass
through the private data, since it is most likely generated by the CM
without any protocol-specific knowledge.
2009-08-10 22:31:55 +01:00
Michael Brown
a0d337912e [infiniband] Generate more specific errors in response to failure MADs
Generate errors within individual MAD transaction consumers such as
ib_pathrec.c and ib_mcast.c, rather than within ib_mi.c.  This allows
for more meaningful error messages to eventually be displayed to the
user.
2009-08-10 22:30:06 +01:00
Michael Brown
0c30dc6bc5 [infiniband] Add support for SRP over Infiniband
SRP is the SCSI RDMA Protocol.  It allows for a method of SAN booting
whereby the target is responsible for reading and writing data using
Remote DMA directly to the initiator's memory.  The software initiator
merely sends and receives SCSI commands; it never has to touch the
actual data.
2009-08-10 22:27:33 +01:00
Michael Brown
8de49af0d2 [infiniband] Add last_opened_ibdev(), analogous to last_opened_netdev()
The minimal-surprise behaviour, when no explicit SRP initiator device
is specified, will probably be to use the most recently opened
Infiniband device.  This matches our behaviour with using the most
recently opened net device for PXE, iSCSI, AoE, NBI, etc.
2009-08-10 22:25:57 +01:00
Michael Brown
419243e7f1 [infiniband] Add find_ibdev() 2009-08-10 22:25:02 +01:00
Michael Brown
4be11f523c [infiniband] Add a "communication-managed reliable connection" protocol
SRP over Infiniband uses a protocol whereby data is sent via a
combination of the CM private data fields and the RC queue pair
itself.  This seems sufficiently generic that it's worth having
available as a separate protocol.
2009-08-10 22:23:28 +01:00
Michael Brown
cf716a0ce6 [scsi] Make LUN a property of the SCSI backend only
Nothing within the SCSI core actually refers to the LUN, so we can
simplify matters by treating it as purely a property of the backend.
2009-08-10 19:31:45 +01:00
Michael Brown
d944794680 [scsi] Generalise iscsi_parse_lun() to scsi_parse_lun() 2009-08-10 19:30:41 +01:00
Michael Brown
976f12c501 [scsi] Generalise iscsi_detached_command() to scsi_detached_command() 2009-08-10 19:29:40 +01:00
Michael Brown
04878ef745 [process] Make it safe to call process_add() multiple times 2009-08-10 19:27:24 +01:00
Michael Brown
46073f1239 [infiniband] Handle duplicate Communication Management REPs
We will terminate our transaction as soon as we receive the first CM
REP, since that provides all the state that we need.  However, the
peer may resend the REP if it didn't see our RTU, and if we don't
respond with another RTU we risk being disconnected.  (This protocol
appears not to handle retries gracefully.)

Fix by adding a management agent that will listen for these duplicate
REPs and send back an RTU.
2009-08-09 01:31:07 +01:00
Joshua Oreman
fc9750a68d [802.11] Fix memory leak on unsuccessful probes
When a probe found no results, the list head of beacons would not be
freed, leaking 16 bytes of memory per probe.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-09 00:12:53 +01:00
Joshua Oreman
1e810bebe9 [802.11] Set channels early on to avoid tuning to an undefined channel
Some cards (such as ath5k) always need to tune to a particular channel
when they are reset; the reset may happen upon open(), which is before
the channels array would be set up (in prepare_probe()). Avoid tuning
the card to an inconsistent state by copying the hardware
supported-channels array to the 802.11 device's allowable-channels
array even before channels are "properly" set up.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-09 00:11:33 +01:00
Joshua Oreman
f128a6db21 [802.11] Enhance support for driver PHY differences
The prior net80211 model of physical-layer behavior for drivers was
overly simplistic and limited the drivers that could be written.  To
be more flexible, split the driver-provided list of supported rates by
band, and add a means for specifying a list of supported channels.
Allow drivers to specify a hardware channel value that will be tied to
uses of the channel.

Expose net80211_duration() to drivers, and make the rate it uses in
its computations configurable, so that it can be used in calculating
durations that must be set in hardware for ACK and CTS packets. Add
net80211_cts_duration() for the common case of calculating the
duration for a CTS packet.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
2009-08-09 00:11:26 +01:00
Michael Brown
34bfc04e4c [infiniband] Update all other MAD users to use a management interface 2009-08-08 23:56:28 +01:00
Michael Brown
44251ebb9a [infiniband] Update subnet management agent to use a management interface 2009-08-08 23:55:29 +01:00
Michael Brown
0e07516f62 [infiniband] Add the concept of a management interface
A management interface is the component through which both local and
remote management agents are accessed.

This new implementation of a management interface allows for the user
to react to timed-out transactions, and also allows for cancellation
of in-progress transactions.
2009-08-08 23:51:27 +01:00
Michael Brown
b0c563824b [infiniband] Change IB_{QPN,QKEY,QPT} names from {SMA,GMA} to {SMI,GSI}
The IBA specification refers to management "interfaces" and "agents".
The interface is the component that connects to the queue pair and
sends and receives MADs; the agent is the component that constructs
the reply to the MAD.

Rename the IB_{QPN,QKEY,QPT} constants as a first step towards making
this separation in gPXE.
2009-08-06 01:24:18 +01:00
Michael Brown
5552a1b202 [tcp] Avoid printf format warnings on some compilers
In several places, we currently use size_t to represent a difference
between TCP sequence numbers.  This can cause compiler warnings
relating to printf format specifiers, since the result of
(uint32_t+size_t) may be an unsigned long on some compilers.

Fix by using uint32_t for all variables that represent a difference
between TCP sequence numbers.

Tested-by: Joshua Oreman <oremanj@xenon.get-linux.org>
2009-08-02 22:44:57 +01:00
Joshua Oreman
ce64398f87 [802.11] Add support for 802.11 devices with software MAC layer
This is required for all modern 802.11 devices, and allows drivers
to be written for them with minimally more effort than is required
for a wired NIC.

Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
2009-08-01 19:00:32 +01:00
Michael Brown
cc2e767b5a [infiniband] Add Communication Manager (CM)
The Communication Manager is responsible for handling the setup and
teardown of RC connections.
2009-07-17 23:06:35 +01:00
Michael Brown
c939bc57ff [infiniband] Add infrastructure for RC queue pairs
Queue pairs are now assumed to be created in the INIT state, with a
call to ib_modify_qp() required to bring the queue pair to the RTS
state.

ib_modify_qp() no longer takes a modification list; callers should
modify the relevant queue pair parameters (e.g. qkey) directly and
then call ib_modify_qp() to synchronise the changes to the hardware.

The packet sequence number is now a property of the queue pair, rather
than of the device.

Each queue pair may have an associated address vector.  For RC queue
pairs, this is the address vector that will be programmed in to the
hardware as the remote address.  For UD queue pairs, it will be used
as the default address vector if none is supplied to ib_post_send().
2009-07-17 23:06:35 +01:00
Michael Brown
ea6eb7f7ed [infiniband] Pass a generic MAD to ib_set_port_info() 2009-07-17 23:06:35 +01:00
Michael Brown
0095e18d4c [infiniband] Expose supported and enabled link speeds and widths 2009-07-17 23:06:35 +01:00
Michael Brown
773028d34e [infiniband] Allow MAD handlers to indicate response via return value
Now that MAD handlers no longer return a status code, we can allow
them to return a pointer to a MAD structure if and only if they want
to send a response.  This provides a more natural and flexible
approach than using a "response method" field within the handler's
descriptor.
2009-07-17 23:06:35 +01:00
Michael Brown
94876f4bb6 [infiniband] Remove the return status code from MAD handlers
MAD handlers have to set the status fields within the MAD itself
anyway, in order to provide a meaningful response MAD; the additional
gPXE return status code is just noise.

Note that we probably don't need to ever explicitly set the status to
IB_MGMT_STATUS_OK, since it should already have this value from the
request.  (By not explicitly setting the status in this way, we can
safely have ib_sma_set_xxx() call ib_sma_get_xxx() in order to
generate the GetResponse MAD without worrying that ib_sma_get_xxx()
will clear any error status set by ib_sma_set_xxx().)
2009-07-17 23:06:35 +01:00
Michael Brown
f1d92fa886 [infiniband] Allow external QPN to differ from real QPN
Most IB hardware seems not to allow allocation of the genuine QPNs 0
and 1, so allow for the externally-visible QPN (as constructed and
parsed by ib_packet, where used) to differ from the real
hardware-allocated QPN.
2009-07-17 23:06:34 +01:00
Michael Brown
92cf240020 [infiniband] Always create an SMA and a GMA 2009-07-17 23:06:34 +01:00
Michael Brown
80c41b90d2 [infiniband] Add notion of a queue pair type 2009-07-17 23:06:34 +01:00
Michael Brown
3f4972db9a [infiniband] Allow completion queue operations to be optional
The send completion handler typically will just free the I/O buffer,
so allow this common case to be handled by the Infiniband core.
2009-07-17 23:06:34 +01:00
Michael Brown
0582a84e66 [infiniband] Improve ib_packet debugging messages 2009-07-17 23:06:34 +01:00
Michael Brown
165074c188 [infiniband] Implement SMA as an instance of a GMA
The GMA code was based upon the SMA code.  We can save space by making
the SMA simply an instance of the GMA.
2009-07-17 23:06:34 +01:00
Michael Brown
8a852280eb [infiniband] Pass GMA as a parameter to GMA MAD handlers 2009-07-17 23:06:34 +01:00
Michael Brown
cb9ef4dee2 [ipoib] Remove the queue set abstraction
Now that IPoIB has to deal with only one set of queues, the queue set
abstraction becomes merely an inconvenient wrapper.
2009-07-17 23:06:34 +01:00
Michael Brown
0fbf2f6bda [infiniband] Provide a general mechanism for multicast group joins
Generalise out the multicast group membership record code from IPoIB.
2009-07-17 23:06:34 +01:00
Michael Brown
3c77fe73a5 [infiniband] Allow for sending MADs via GMA without retransmission 2009-07-17 23:06:34 +01:00
Michael Brown
b4155c4ab5 [infiniband] Make qkey and rate optional parameters to ib_post_send()
The queue key is stored as a property of the queue pair, and so can
optionally be added by the Infiniband core at the time of calling
ib_post_send(), rather than always having to be specified by the
caller.

This allows IPoIB to avoid explicitly keeping track of the data queue
key.
2009-07-17 23:06:33 +01:00
Michael Brown
d6b47871de [infiniband] Provide a general mechanism for path record lookups
Generalise out the path record lookup code from IPoIB.
2009-07-17 23:06:33 +01:00
Michael Brown
1d8c85d112 [infiniband] Create a general management agent
Generalise the subnet management agent into a general management agent
capable of sending and responding to MADs, including support for
retransmissions as necessary.
2009-07-17 23:06:33 +01:00
Michael Brown
365b8db5cf [infiniband] Centralise SMA and GMA queue constants 2009-07-17 23:06:33 +01:00
Michael Brown
887d296b88 [infiniband] Poll completion queues automatically
Currently, all Infiniband users must create a process for polling
their completion queues (or rely on a regular hook such as
netdev_poll() in ipoib.c).

Move instead to a model whereby the Infiniband core maintains a single
process calling ib_poll_eq(), and polling the event queue triggers
polls of the applicable completion queues.  (At present, the
Infiniband core simply polls all of the device's completion queues.)
Polling a completion queue will now implicitly refill all attached
receive work queues; this is analogous to the way that netdev_poll()
implicitly refills the RX ring.

Infiniband users no longer need to create a process just to poll their
completion queues and refill their receive rings.
2009-07-17 23:06:33 +01:00
Michael Brown
1f5c0239b4 [infiniband] Centralise assumption of 2048-byte payloads
IPoIB and the SMA have separate constants for the packet size to be
used to I/O buffer allocations.  Merge these into the single
IB_MAX_PAYLOAD_SIZE constant.

(Various other points in the Infiniband stack have hard-coded
assumptions of a 2048-byte payload; we don't currently support
variable MTUs.)
2009-07-17 23:06:33 +01:00
Michael Brown
7ba33f7826 [infiniband] Provide ib_get_hca_info() as a commonly-available function 2009-07-17 23:06:33 +01:00