Apart from format specifier fixes there are two changes in proper code:
- Change type of regs in skge_hw to unsigned long
- Cast result of sizeof in myri10ge to uint32_t
Both don't change anything for i386 and should be fine on x86_64.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This fixes a regression in BOOTP support; since BOOTP requests often
have the `siaddr' field set to 0.0.0.0, they would be considered
duplicates of the first zeroed-out offer slot.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This removes the need for inline safety wrappers, marginally reducing
the size penalty of weak functions, and works around an apparent
binutils bug that causes undefined weak symbols to not actually be
NULL when compiling with -fPIE (as EFI builds do).
A bug in versions of binutils prior to 2.16 (released in 2005) will
cause same-file weak definitions to not work with those
toolchains. Update the README to reflect our new dependency on
binutils >= 2.16.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
It is permissible for a DHCP packet containing PXE options to specify
only "discovery control", instead of the more typical boot menu +
prompt options. This is the strategy used by older versions of
dnsmasq; by specifying the discovery control as PXEBS_SKIP, they cause
vendor PXE ROMs to ignore boot server discovery and just use the
filename and next-server options in the initial (Proxy)DHCP packet.
Modify iPXE to accept this behavior, to be more compatible with the
Intel firmware.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Tested-by: Kyle Kienapfel <kyle@shadowmage.org>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PMKID checking is an additional pre-check that helps detect invalid
passphrases before going through the full handshaking procedure. It
takes up some amount of code size, and is not necessary from a
security perspective. It also is implemented improperly by some
routers, which was causing iPXE to give spurious authentication
errors. Remove it for these reasons.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE currently updates the TCP sequence number after delivering the
data to the application via xfer_deliver_iob(). If the application
responds to the received data by transmitting more data, this would
result in a stale ACK number appearing in the transmitted packet,
which potentially causes retransmissions and also gives the
undesirable appearance of violating causality (by sending a response
to a message that we claim not to have yet received).
Reported-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some switch configurations will refuse to enable our port unless we
can speak LACP to inform the switch that we are alive. Add a very
simple passive LACP implementation that is sufficient to convince at
least Linux's bonding driver (when tested using qemu attached to a tap
device enslaved to a bond device configured as "mode=802.3ad").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain. Work
around this problem by renaming project from gPXE to iPXE, and
updating URLs to match.
Also update README, LOG and COPYRIGHTS to remove obsolete information.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 3d9dd93 introduced a regression in HTTP: if a URI without a
path is specified (e.g. http://netboot.me), we send the empty string
as our GET request. Reintroduce an extra slash when uri->path is NULL,
to turn this into the expected GET /.
Reported-by: Kyle Kienapfel <doctor.whom@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Instead of keeping only the best IP and PXE offers, store all of them,
and pick the best to use just before a request is sent. This allows
priority differentiation to work even when lower-priority offers
provide PXE options, and improves robustness at sites with broken PXE
servers intermingled with working ones: when a ProxyDHCP request times
out, instead of giving up, we try the next PXE offer we've received.
It also allows us to avoid breaking up combined IP+PXE offers, which
can be important with some firewall configurations. This behavior
matches that of most vendor PXE ROMs.
Store a reference to the DHCPOFFER packet in the offer structure, so
that when registering settings after a successful ACK we can register
the proxy PXE settings we originally received; this removes the need
for a nonstandard duplicate REQUEST/ACK to port 67 of proxy servers
like dnsmasq that provide PXE options in the OFFER.
Total cost: 450 bytes uncompressed.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The default user and password are used for anonymous FTP by default.
This patch adds support for an explicit user name and password in an FTP
URI:
imgfetch ftp://user:password@server.com/path/to/file
Edited-by: Stefan Hajnoczi <stefanha@gmail.com>. Bugs are my fault.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Currently, handling of URI escapes is ad-hoc; escaped strings are
stored as-is in the URI structure, and it is up to the individual
protocol to unescape as necessary. This is error-prone and expensive
in terms of code size. Modify this behavior by unescaping in
parse_uri() and escaping in unparse_uri() those fields that typically
handle URI escapes (hostname, user, password, path, query, fragment),
and allowing unparse_uri() to accept a subset of fields to print so
it can be easily used to generate e.g. the escaped HTTP path?query
request.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
When a DHCP session is started (using autoboot or a command-line `dhcp
net0'), check whether the new setting use-cached (DHCP option 175.178)
is TRUE; if so, skip DHCP and rely on currently registered
settings. This lets one combine a static IP with autoboot.
Before checking the use-cached setting, call a weak
get_cached_dhcpack() hook that can be implemented by particular builds
of gPXE supporting some fashion of retrieving a cached DHCPACK packet.
If one is available, it is registered as an options source, and then
either that packet's option 175.178 or the user's prior manual
use-cached setting can allow skipping duplicate DHCP.
Using cached packets is not the default because DHCP servers are often
configured to give gPXE different options than they give a vendor PXE
client; in order to break the infinite loop of PXE chaining, one would
need to load a gPXE with an embedded image that does something more
than autoboot.
Signed-off-by: Marty Connor <mdc@etherboot.org>
There is no defined error code for aborting a request but 0 is commonly
used. This patch switches the abort request error code from
TFTP_ERR_UNKNOWN_TID (5) to 0.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
pxenv_tftp_get_fsize is an API call that PXE clients can call to
obtain the size of a remote file. It is implemented by starting a TFTP
transfer with pxe_tftp_open, waiting for the response and then
stopping the transfer with pxe_tftp_close(). This leaves the session
hanging on the TFTP server and it will try to resend the packet
repeatedly (verified with tftpd-hpa) until it times out.
This patch adds a method "tftpsize" that will abort the transfer after
the first packet is received from the server. This will terminate the
session on the server and is the same behaviour as Intel's PXE ROM
exhibits.
Together with a qemu patch to handle the ERROR packet (submitted to
qemu's mailing list), this resolves a specific issue where booting
pxegrub with qemu's TFTP server would be slow or hang.
I've tested this against qemu's tftp server and against my normal boot
infrastructure (tftpd-hpa). Booting pxegrub and loading extra files
now produces a trace similar to Intel's PXE client and there are no
spurious retransmits from tftpd any more.
Signed-off-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Milan Plzik <milan.plzik@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The retry timer is used to retransmit TFTP packets lost on the network,
and to start a new connection. There is an unnecessary delay while
waiting for name resolution because the timer period is fixed and cannot
be shortened when name resolution completes. This patch keeps the timer
period at zero while name resolution takes place so that no time is lost
once before sending the first packet.
Reported-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
This patch adds TFTP support for files larger than 65535 blocks by
wrapping the 16-bit block number.
Reported-by: Mark Johnson <johnson.nh@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
IBM's Tivoli Provisioning Manager for OS Deployment, when acting as a
ProxyDHCP server, sends an initial offer with a vendor class of "PXEClient"
and vendor-encapsulated options that have nothing to do with PXE. To
differentiate between this case and the case of a ProxyDHCP server that
sends all PXE options in its initial offer, modify gPXE to check for
the presence of an encapsulated PXE boot menu option (43.9) instead of
simply checking for the existence of any encapsulated options at all.
This is the same check used by the Intel vendor PXE ROM.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The PXE standard provides examples of ProxyDHCP responses being encoded both
as type DHCPOFFER and DHCPACK, but currently we only accept DHCPACKs. Since
there are PXE servers in existence that respond to ProxyDHCPREQUESTs with
DHCPOFFERs, modify gPXE's ProxyDHCP pruning logic to treat both types of
responses equally.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Change the behaviour for adding DHCP options into a DHCP packet so
that we now append options, rather than insert them in front of
whatever options might already be present.
Apparently, the DHCP relay logic on a Nortel 470-48T layer 2 switch
cares about the order of DHCP options. If we build a DHCP packet
pre-populated with some options, their order will now be preserved,
except for encapsulated options.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Apparently, the DHCP relay logic on a Nortel 470-48T layer 2 switch
cares about the order of DHCP options. Specifically, it requires
that the DHCP message type option be the first option present in the
DHCP packet. We achieve this by having this option appear first in
our dhcp_request_options_data array, which pre-populates DHCP
requests.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Contrary to the IEEE specification, some access points apparently
set the Spectrum Mgmt bit in the capabilities field even when
broadcasting on a 2.4GHz band that does not require spectrum
management. Allow gPXE to attempt to connect to such networks;
if spectrum management is really required, our advertisement
of capabilities not including it will result in an association
failure.
Reported-by: Peter Meyer <residue@xmail.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
EAPOL is a container protocol that can wrap either EAP packets or
802.11 EAPOL-Key frames. For cleanliness' sake, add a stub that strips
the framing and sends packets off to the appropriate handler if it
is compiled in.
Signed-off-by: Marty Connor <mdc@etherboot.org>
WEP is a highly flawed cryptosystem, barely better than no encryption at all,
but many people still use it. It does have the advantage of being very simple
and small in code size.
Signed-off-by: Marty Connor <mdc@etherboot.org>
IBA section 14.2.5.2 states that "the contents of the NodeDescription
attribute are the same for all ports on a node". Satisfy this by
using the HCA GUID rather than the port GUID to form the node
description string.
We do not discard routing table entries when closing an interface. It
is plausible that multiple interfaces may be on the same physical
network; if so, then we may end up in a situation whereby outbound
packets attempt to route via a closed interface.
Fix by ignoring non-open net devices in ipv4_route().
ipv4.c calculates the default subnet mask before calling
fetch_ipv4_setting() to retrieve the configured subnet mask (if any).
However, as of commit 612f4e7 "[settings] Avoid returning
uninitialised data on error in fetch_xxx_setting()",
fetch_ipv4_setting() will zero the IP address if the setting does not
exist, rather than leaving it unaltered.
Fix by fetching the setting first and calculating the default subnet
mask only if necessary.
ipv4.c uses a gateway address of INADDR_NONE to represent "no
gateway". It initialises the gateway address to INADDR_NONE before
calling fetch_ipv4_setting() to retrieve the configured gateway
address (if any).
However, as of commit 612f4e7 "[settings] Avoid returning
uninitialised data on error in fetch_xxx_setting()",
fetch_ipv4_setting() will zero the IP address if the setting does not
exist, rather than leaving it unaltered.
Fix by using a zero IP address to indicate "no gateway", so that a
non-existent gateway address setting will be treated as such.
The PXE type field is canonically little-endian, but the pxebs command
treats it as big-endian in converting the type number passed on the
command line to a field value to search against. Fix, to prevent the
necessity of incantations like "pxebs net0 1536" to select menu item #6.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
The iBFT is Ethernet-centric in providing only six bytes for a MAC
address. This is most probably an indirect consequence of a similar
design flaw in the Windows NDIS stack. (The WinOF IPoIB stack
performs all sorts of contortions in order to pretend to the NDIS
layer that it is dealing with six-byte MAC addresses.)
There is no sensible way in which to extend the iBFT without breaking
compatibility with programs that expect to parse it. Add the notion
of an "Ethernet-compatible" MAC address to our link layer abstraction,
so that link layers can provide their own workarounds for this
limitation.
802.11 multicast hashing is the same as standard Ethernet hashing, so
just expose and use eth_mc_hash().
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
The recent change to process_add() to detect duplicate process
additions relies on the fact that all processes will be initialized
using process_init_stopped() before being passed to that function.
The autoassociation process was not initialized in this fashion, so
process_add() erroneously detected it as a duplicate.
Fix by using process_init_stopped() to initialize the autoassociation
process instead of setting the step member directly.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
For IPoIB, the chaddr field is too small (16 bytes) to contain the
20-byte IPoIB link-layer address. RFC4390 mandates that we should
pass an empty chaddr field and rely on the DHCP client identifier
instead. This has many problems, not least of which is that a client
identifier containing an IPoIB link-layer address is not very useful
from the point of view of creating DHCP reservations, since the QPN
component is assigned at runtime and may vary between boots.
Leave the DHCP client identifier as-is, to avoid breaking existing
setups as far as possible, but expose the real hardware address (the
port GUID) via the DHCP chaddr field, using the broadcast flag to
instruct the DHCP server not to use this chaddr value as a link-layer
address.
This makes it possible (at least with ISC dhcpd) to create DHCP
reservations using host declarations such as:
host duckling {
fixed-address 10.252.252.99;
hardware unknown-32 00:02:c9:02:00:25:a1:b5;
}
IPoIB has a 20-byte link-layer address, of which only eight bytes
represent anything relating to a "hardware address".
The PXE and EFI SNP APIs expect the permanent address to be the same
size as the link-layer address, so fill in the "permanent address"
field with the initial link layer address (as generated by
register_netdev() based upon the real hardware address).
The hardware address is an intrinsic property of the hardware, while
the link-layer address can be changed at runtime. This separation is
exposed via APIs such as PXE and EFI, but is currently elided by gPXE.
Expose the hardware and link-layer addresses as separate properties
within a net device. Drivers should now fill in hw_addr, which will
be used to initialise ll_addr at the time of calling
register_netdev().
There is diagnostic value in being able to disambiguate between the
various reasons why an IB CM has rejected a connection attempt. In
particular, reason 8 "invalid service ID" can be used to identify an
incorrect SRP service_id root-path component, and reason 28 "consumer
reject" corresponds to a genuine SRP login rejection IU, which can be
passed up to the SRP layer.
For rejection reasons other than "consumer reject", we should not pass
through the private data, since it is most likely generated by the CM
without any protocol-specific knowledge.
Generate errors within individual MAD transaction consumers such as
ib_pathrec.c and ib_mcast.c, rather than within ib_mi.c. This allows
for more meaningful error messages to eventually be displayed to the
user.
SRP is the SCSI RDMA Protocol. It allows for a method of SAN booting
whereby the target is responsible for reading and writing data using
Remote DMA directly to the initiator's memory. The software initiator
merely sends and receives SCSI commands; it never has to touch the
actual data.
The minimal-surprise behaviour, when no explicit SRP initiator device
is specified, will probably be to use the most recently opened
Infiniband device. This matches our behaviour with using the most
recently opened net device for PXE, iSCSI, AoE, NBI, etc.
SRP over Infiniband uses a protocol whereby data is sent via a
combination of the CM private data fields and the RC queue pair
itself. This seems sufficiently generic that it's worth having
available as a separate protocol.
We will terminate our transaction as soon as we receive the first CM
REP, since that provides all the state that we need. However, the
peer may resend the REP if it didn't see our RTU, and if we don't
respond with another RTU we risk being disconnected. (This protocol
appears not to handle retries gracefully.)
Fix by adding a management agent that will listen for these duplicate
REPs and send back an RTU.
When a probe found no results, the list head of beacons would not be
freed, leaking 16 bytes of memory per probe.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Some cards (such as ath5k) always need to tune to a particular channel
when they are reset; the reset may happen upon open(), which is before
the channels array would be set up (in prepare_probe()). Avoid tuning
the card to an inconsistent state by copying the hardware
supported-channels array to the 802.11 device's allowable-channels
array even before channels are "properly" set up.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
The prior net80211 model of physical-layer behavior for drivers was
overly simplistic and limited the drivers that could be written. To
be more flexible, split the driver-provided list of supported rates by
band, and add a means for specifying a list of supported channels.
Allow drivers to specify a hardware channel value that will be tied to
uses of the channel.
Expose net80211_duration() to drivers, and make the rate it uses in
its computations configurable, so that it can be used in calculating
durations that must be set in hardware for ACK and CTS packets. Add
net80211_cts_duration() for the common case of calculating the
duration for a CTS packet.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
A management interface is the component through which both local and
remote management agents are accessed.
This new implementation of a management interface allows for the user
to react to timed-out transactions, and also allows for cancellation
of in-progress transactions.
The IBA specification refers to management "interfaces" and "agents".
The interface is the component that connects to the queue pair and
sends and receives MADs; the agent is the component that constructs
the reply to the MAD.
Rename the IB_{QPN,QKEY,QPT} constants as a first step towards making
this separation in gPXE.
In several places, we currently use size_t to represent a difference
between TCP sequence numbers. This can cause compiler warnings
relating to printf format specifiers, since the result of
(uint32_t+size_t) may be an unsigned long on some compilers.
Fix by using uint32_t for all variables that represent a difference
between TCP sequence numbers.
Tested-by: Joshua Oreman <oremanj@xenon.get-linux.org>
This is required for all modern 802.11 devices, and allows drivers
to be written for them with minimally more effort than is required
for a wired NIC.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
Queue pairs are now assumed to be created in the INIT state, with a
call to ib_modify_qp() required to bring the queue pair to the RTS
state.
ib_modify_qp() no longer takes a modification list; callers should
modify the relevant queue pair parameters (e.g. qkey) directly and
then call ib_modify_qp() to synchronise the changes to the hardware.
The packet sequence number is now a property of the queue pair, rather
than of the device.
Each queue pair may have an associated address vector. For RC queue
pairs, this is the address vector that will be programmed in to the
hardware as the remote address. For UD queue pairs, it will be used
as the default address vector if none is supplied to ib_post_send().
Now that MAD handlers no longer return a status code, we can allow
them to return a pointer to a MAD structure if and only if they want
to send a response. This provides a more natural and flexible
approach than using a "response method" field within the handler's
descriptor.
MAD handlers have to set the status fields within the MAD itself
anyway, in order to provide a meaningful response MAD; the additional
gPXE return status code is just noise.
Note that we probably don't need to ever explicitly set the status to
IB_MGMT_STATUS_OK, since it should already have this value from the
request. (By not explicitly setting the status in this way, we can
safely have ib_sma_set_xxx() call ib_sma_get_xxx() in order to
generate the GetResponse MAD without worrying that ib_sma_get_xxx()
will clear any error status set by ib_sma_set_xxx().)
Most IB hardware seems not to allow allocation of the genuine QPNs 0
and 1, so allow for the externally-visible QPN (as constructed and
parsed by ib_packet, where used) to differ from the real
hardware-allocated QPN.
The queue key is stored as a property of the queue pair, and so can
optionally be added by the Infiniband core at the time of calling
ib_post_send(), rather than always having to be specified by the
caller.
This allows IPoIB to avoid explicitly keeping track of the data queue
key.
Generalise the subnet management agent into a general management agent
capable of sending and responding to MADs, including support for
retransmissions as necessary.
Currently, all Infiniband users must create a process for polling
their completion queues (or rely on a regular hook such as
netdev_poll() in ipoib.c).
Move instead to a model whereby the Infiniband core maintains a single
process calling ib_poll_eq(), and polling the event queue triggers
polls of the applicable completion queues. (At present, the
Infiniband core simply polls all of the device's completion queues.)
Polling a completion queue will now implicitly refill all attached
receive work queues; this is analogous to the way that netdev_poll()
implicitly refills the RX ring.
Infiniband users no longer need to create a process just to poll their
completion queues and refill their receive rings.
IPoIB and the SMA have separate constants for the packet size to be
used to I/O buffer allocations. Merge these into the single
IB_MAX_PAYLOAD_SIZE constant.
(Various other points in the Infiniband stack have hard-coded
assumptions of a 2048-byte payload; we don't currently support
variable MTUs.)
IPoIB has a link-layer broadcast address that varies according to the
partition key. We currently go through several contortions to pretend
that the link-layer address is a fixed constant; by making the
broadcast address a property of the network device rather than the
link-layer protocol it will be possible to simplify IPoIB's broadcast
handling.