david/ipxe
Archived
1
0
Commit Graph

5525 Commits

Author SHA1 Message Date
Michael Brown
4775dd3835 [thunderx] Add driver for Cavium ThunderX SoC NICs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-13 18:41:26 +01:00
Michael Brown
3c61e11fe1 [cmdline] Add "ntp" command
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-13 15:57:16 +01:00
Michael Brown
fce6117ad9 [ntp] Add simple NTP client
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-13 15:55:49 +01:00
Michael Brown
e6111c1517 [time] Allow system clock to be adjusted at runtime
Provide a mechanism to allow an arbitrary adjustment to be applied to
all subsequent calls to time().

Note that the underlying clock source (e.g. the RTC clock) will not be
changed; only the time as reported within iPXE will be affected.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-13 15:29:05 +01:00
Leendert van Doorn
02d5cfff22 [tg3] Add missing memory barrier
ARM64 has a weaker memory order model than x86.  The missing memory
barrier caused phy initialization notification to be delayed beyond
the link-wait timeout (15 secs).

Signed-off-by: Leendert van Doorn <leendert@paramecium.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-13 15:14:43 +01:00
Michael Brown
188789eb3c [tcp] Send TCP keepalives on idle established connections
In some circumstances, intermediate devices may lose state in a way
that temporarily prevents the successful delivery of packets from a
TCP peer.  For example, a firewall may drop a NAT forwarding table
entry.

Since iPXE spends most of its time downloading files (and hence purely
receiving data, sending only TCP ACKs), this can easily happen in a
situation in which there is no reason for iPXE's TCP stack to generate
any retransmissions.  The temporary loss of connectivity can therefore
effectively become permanent.

Work around this problem by sending TCP keepalives after a period of
inactivity on an established connection.

TCP keepalives usually send a single garbage byte in sequence number
space that has already been ACKed by the peer.  Since we do not need
to elicit a response from the peer, we instead send pure ACKs (with no
garbage data) in order to keep the transmit code path simple.

Originally-implemented-by: Ladi Prosek <lprosek@redhat.com>
Debugged-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-13 09:58:32 +01:00
Leendert van Doorn
5c2a959a72 [tg3] Fix address truncation bug on 64-bit machines
Signed-off-by: Leendert van Doorn <leendert@paramecium.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-10 15:45:19 +01:00
Michael Brown
b42e71921f [http] Accept headers with no whitespace following the colon
Reported-by: Raphael Cohn <raphael.cohn@stormmq.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-09 12:27:04 +01:00
Michael Brown
f76210961c [pci] Support systems with multiple PCI root bridges
Extend the 16-bit PCI bus:dev.fn address to a 32-bit seg🚌dev.fn
address, assuming a segment value of zero in contexts where multiple
segments are unsupported by the underlying data structures (e.g. in
the iBFT or BOFM tables).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-09 09:36:28 +01:00
Michael Brown
2c197517f2 [libc] Always use a non-zero seed for the (non-crypto) RNG
The non-cryptographic RNG implemented by random() has the property
that a seed value of zero will result in a generated sequence of
all-zero values.  This situation can arise if currticks() returns zero
at start of day.

Work around this problem by falling back to a fixed non-zero seed if
necessary.

This has no effect on the separate DRBG used by cryptographic code.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-09 08:44:32 +01:00
Vinson Lee
f6e8b800be [build] Remove nested "my" declaration
Fix build error with perl >= 5.23.2:

  Can't redeclare "my" in "my" at ./util/parserom.pl line 160

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-06-03 18:09:54 +01:00
Michael Brown
aa4b038c70 [efi] Expose DHCP packets via the Apple NetBoot protocol
Mac OS X uses non-standard EFI protocols to obtain the DHCP packets
from the UEFI firmware.

Originally-implemented-by: Michael Kuron <m.kuron@gmx.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-29 13:10:14 +01:00
Michael Brown
af9afd0a86 [dhcp] Fix definitions for x86_64 and EFI BC client architectures
There has been a longstanding disagreement between RFC4578 and the
IANA "Processor Architecture Types" registry.  RFC4578 section 2.1
defines type 7 as "EFI BC" and type 9 as "EFI x86-64"; the IANA
registry quotes RFC4578 as its source but has these values erroneously
swapped.  The EDK2 codebase uses the IANA values.

As of March 2016, RFC4578 has been modified by an errata to match the
values as recorded in the IANA registry.

Fix our definitions to match the consensus values.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-26 13:58:37 +01:00
Michael Brown
31d4a7b8db [arm] Use correct DHCP client architecture values
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-26 13:43:33 +01:00
Michael Brown
ee5dfb75aa [axge] Add driver for ASIX 10/100/1000 USB Ethernet NICs
Add driver for the AX88178A (USB2) and AX88179 (USB3) 10/100/1000
Ethernet NICs.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-26 12:52:06 +01:00
Michael Brown
8dd39b9572 [efi] Work around broken UEFI keyboard drivers
Some UEFI keyboard drivers are blissfully unaware of the existence of
either Ctrl key, and will report "Ctrl-<key>" as just "<key>".  This
breaks substantial portions of the iPXE user interface.

Work around these broken UEFI drivers by allowing "ESC <key>" to be
used as a substitute for "Ctrl-<key>".

Tested-by: Dreamcat4 <dreamcat4@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-25 23:28:41 +01:00
Michael Brown
f42b2585fe [http] Ignore unrecognised "Connection" header tokens
Some HTTP/2 servers send the header "Connection: upgrade, close".  This
currently causes iPXE to fail due to the unrecognised "upgrade" token.

Fix by ignoring any unrecognised tokens in the "Connection" header.

Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-25 15:35:43 +01:00
Michael Brown
80dd6cbcc4 [lotest] Add option to use broadcast packets for loopback testing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-23 14:17:47 +01:00
Michael Brown
231adda40f [netdevice] Fix failure path in register_netdev()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-23 14:17:47 +01:00
Michael Brown
56c0147deb [settings] Extend numerical setting tags to "unsigned long"
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-20 16:51:56 +01:00
Michael Brown
6d2bdc4ea3 [pci] Add support for PCI Enhanced Allocation
Some embedded devices have immovable BARs, which are described via a
PCI Enhanced Allocation capability.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-20 16:51:56 +01:00
Michael Brown
276d7c31c5 [undi] Work around broken HP EliteBook 745 G3 PXE ROM
Reported-by: Arturino Mazzei <mazzeia@hotmail.com>
Tested-by: Arturino Mazzei <mazzeia@hotmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-13 13:22:06 +01:00
Christian Hesse
858f56e68b [ath9k] Fix buffer overrun for ar9287
This backport is from linux kernel upstream commit 83d6f1f ("ath9k:
fix buffer overrun for ar9287").

Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-12 14:55:13 +01:00
Michael Brown
601706688b [arm] Use CNTVCT_EL0 as profiling timestamp
The raw cycle counter at PMCCNTR_EL0 works in qemu but seems to always
read as zero on physical hardware (tested on Juno r1 and Cavium
ThunderX), even after ensuring that PMCR_EL0.E and PMCNTENSET_EL0.C
are both enabled.

Use CNTVCT_EL0 instead; this seems to count at a lower resolution
(tens of CPU cycles), but is usable for profiling.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-12 11:16:41 +01:00
Michael Brown
6164741f81 [efi] Guard against GetStatus() failing to return a NULL TX buffer
The UEFI specification requires the EFI_SIMPLE_NETWORK_PROTOCOL
GetStatus() method to set TxBuf to NULL if there are no transmit
buffers to recycle.

Some implementations (observed with Lan9118Dxe in EDK2) fill in TxBuf
only when there is a transmit buffer to recycle, which leads to large
numbers of "spurious TX completion" errors.

Work around this problem by initialising TxBuf to NULL before calling
the GetStatus() method.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-11 23:02:10 +01:00
Michael Brown
47931a4de5 [arm] Add optimised TCP/IP checksumming for 64-bit ARM
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-11 08:16:36 +01:00
Michael Brown
95716ece91 [arm] Add optimised string functions for 64-bit ARM
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-11 08:15:52 +01:00
Michael Brown
a966570dce [libc] Avoid implicit assumptions about potentially-optimised memcpy()
Do not assume that an architecture-specific optimised memcpy() will
have the same properties as generic_memcpy() in terms of handling
overlapping regions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-09 16:23:38 +01:00
Michael Brown
45cd68c0fb [efi] Allow for building with older versions of elf.h system header
Reported-by: Ahmad Mahagna <ahmhad@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-09 16:18:10 +01:00
Michael Brown
17c6f322ee [arm] Add support for 64-bit ARM (Aarch64)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-08 00:20:20 +01:00
Michael Brown
edea3a434c [arm] Split out 32-bit-specific code to arch/arm32
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-08 00:18:35 +01:00
Michael Brown
2a187f480e [arm] Avoid instruction references to symbols defined via ".equ"
When building for 64-bit ARM, some symbol references may be resolved
via an "adrp" instruction (to obtain the start of the 4kB page
containing the symbol) and a separate 12-bit offset.  For example
(taken from the GNU assembler documentation):

  adrp x0, foo
  ldr  x0, [x0, #:lo12:foo]

We occasionally refer to symbols defined via mechanisms that are not
directly visible to gcc.  For example:

  extern char some_magic_symbol[];
  __asm__ ( ".equ some_magic_symbol, some_magic_expression" );

The subsequent use of the ":lo12:" prefix on such magically-defined
symbols triggers an assertion failure in the assembler.

This problem seems to affect only "private_key_len" in the current
codebase.  Fix by storing this value as static data; this avoids the
need to provide the value as a literal within the instruction stream,
and so avoids the problematic use of the ":lo12:" prefix.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-08 00:08:48 +01:00
Michael Brown
1a16f67a28 [arm] Add support for 32-bit ARM
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-06 12:08:44 +01:00
Michael Brown
49a5bcfba6 [bitops] Fix typo in test case
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-05 23:42:57 +01:00
Michael Brown
67f539fa80 [libgcc] Provide __divmoddi4()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-05 23:42:57 +01:00
Michael Brown
a5885fbc19 [legacy] Fix building with GCC 6
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 16:01:33 +01:00
Michael Brown
63037bdce4 [ath] Fix building with GCC 6
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 16:01:33 +01:00
Michael Brown
08230599ef [golan] Fix building with GCC 6
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 16:01:32 +01:00
Michael Brown
76ec2a0540 [skge] Fix building with GCC 6
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 16:01:32 +01:00
Michael Brown
65b32a0b70 [sis190] Fix building with GCC 6
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 16:01:32 +01:00
Vinson Lee
e2f14c2f8c [mucurses] Fix GCC 6 nonnull-compare errors
Remove null checks for arguments declared as nonnull.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 16:01:32 +01:00
Michael Brown
57d0ea7c46 [efi] Generalise EFI entropy generation to non-x86 CPUs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 14:34:24 +01:00
Michael Brown
757ab98381 [efi] Use a timer event to generate the currticks() timer
We currently use the EFI_CPU_ARCH_PROTOCOL's GetTimerValue() method to
generate the currticks() timer, calibrated against a 1ms delay from
the boot services Stall() method.

This does not work on ARM platforms, where GetTimerValue() is an empty
stub which just returns EFI_UNSUPPORTED.

Fix by instead creating a periodic timer event, and using this event
to increment a current tick counter.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 13:38:33 +01:00
Michael Brown
1e066431a4 [tcpip] Do not fall back to using unoptimised TCP/IP checksumming
Require architecture-specific code to make a deliberate choice to use
the unoptimised generic_tcpip_continue_chksum() function, if there is
no optimised version available.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-04 13:38:25 +01:00
Michael Brown
9f91df422b [build] Remove unnecessary dependency on zlib
The dependency on zlib seems to have been introduced in commit 3dd7ce1
("[efi] Allow building with non-system libbfd") as an indirect
requirement of either libbfd or libiberty when building on Mac OS X.
Since we no longer use either of these libraries, remove the
unnecessary link against zlib.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-02 23:09:49 +01:00
Michael Brown
efd5cf9aad [efi] Eliminate use of libbfd
Parse the intermediate ELF file directly instead of using libbfd, in
order to allow for cross-compiled ELF objects.

As a side bonus, this eliminates libbfd as a build requirement.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-02 22:35:14 +01:00
Michael Brown
71560d1854 [librm] Preserve FPU, MMX and SSE state across calls to virt_call()
The IBM Tivoli Provisioning Manager for OS Deployment (also known as
TPMfOSD, Rembo-ia32, or Rembo Auto-Deploy) has a serious bug in some
older versions (observed with v5.1.1.0, apparently fixed by v7.1.1.0)
which can lead to arbitrary data corruption.

As mentioned in commit 87723a0 ("[libflat] Test A20 gate without
switching to flat real mode"), Tivoli's NBP sets up a VMM and makes
calls to the PXE stack in VM86 mode.  This appears to be some kind of
attempt to run PXE API calls inside a sandbox.  The VMM is fairly
sophisticated: for example, it handles our attempts to switch into
protected mode and patches our GDT so that our protected-mode code
runs in ring 1 instead of ring 0.  However, it neglects to apply any
memory protections.  In particular, it does not enable paging and
leaves us with 4GB segment limits.  We can therefore trivially break
out of the sandbox by simply overwriting the GDT (or by modifying any
of Tivoli's VMM code or data structures).

When we attempt to execute privileged instructions (such as "lidt"),
the CPU raises an exception and control is passed to the Tivoli VMM.
This may result in a call to Tivoli's memcpy() function.

Tivoli's memcpy() function includes optimisations which use the SSE
registers %xmm0-%xmm3 to speed up aligned memory copies.

Unfortunately, the Tivoli VMM's exception handler does not save or
restore %xmm0-%xmm3.  The net effect of this bug in the Tivoli VMM is
that any privileged instruction (such as "lidt") issued by iPXE may
result in unexpected corruption of the %xmm0-%xmm3 registers.

Even more unfortunately, this problem affects the code path taken in
response to a hardware interrupt from the NIC, since that code path
will call PXENV_UNDI_ISR.  The net effect therefore becomes that any
NIC hardware interrupt (e.g. due to a received packet) may result in
unexpected corruption of the %xmm0-%xmm3 registers.

If a packet arrives while Tivoli is in the middle of using its
memcpy() function, then the unexpected corruption of the %xmm0-%xmm3
registers will result in unexpected corruption in the destination
buffer.  The net effect therefore becomes that any received packet may
result in a 16-byte block of corruption somewhere in any data that
Tivoli copied using its memcpy() function.

We can work around this bug in the Tivoli VMM by saving and restoring
the %xmm0-%xmm3 registers across calls to virt_call().  To work around
the problem, we need to save registers before attempting to execute
any privileged instructions, and ensure that we attempt no further
privileged instructions after restoring the registers.

This is less simple than it may sound.  We can use the "movups"
instruction to save and restore individual registers, but this will
itself generate an undefined opcode exception if SSE is not currently
enabled according to the flags in %cr0 and %cr4.  We can't access %cr0
or %cr4 before attempting the "movups" instruction, because access a
control register is itself a privileged instruction (which may
therefore trigger corruption of the registers that we're trying to
save).

The best solution seems to be to use the "fxsave" and "fxrstor"
instructions.  If SSE is not enabled, then these instructions may fail
to save and restore the SSE register contents, but will not generate
an undefined opcode exception.  (If SSE is not enabled, then we don't
really care about preserving the SSE register contents anyway.)

The use of "fxsave" and "fxrstor" introduces an implicit assumption
that the CPU supports SSE instructions (even though we make no
assumption about whether or not SSE is currently enabled).  SSE was
introduced in 1999 with the Pentium III (and added by AMD in 2001),
and is an architectural requirement for x86_64.  Experimentation with
current versions of gcc suggest that it may generate SSE instructions
even when using "-m32", unless an explicit "-march=i386" or "-mno-sse"
is used to inhibit this.  It therefore seems reasonable to assume that
SSE will be supported on any hardware that might realistically be used
with new iPXE builds.

As a side benefit of this change, the MMX register %mm0 will now be
preserved across virt_call() even in an i386 build of iPXE using a
driver that requires readq()/writeq(), and the SSE registers
%xmm0-%xmm5 will now be preserved across virt_call() even in an x86_64
build of iPXE using the Hyper-V netvsc driver.

Experimentation suggests that this change adds around 10% to the
number of cycles required for a do-nothing virt_call(), most of which
are due to the extra bytes copied using "rep movsb".  Since the number
of bytes copied is a compile-time constant local to librm.S, we could
potentially reduce this impact by ensuring that we always copy a whole
number of dwords and so can use "rep movsl" instead of "rep movsb".

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-02 14:02:31 +01:00
Michael Brown
fe62f3c831 [tg3] Fix _tg3_flag() for 64-bit builds
Commit 86f96a4 ("[tg3] Remove x86-specific inline assembly")
introduced a regression in _tg3_flag() in 64-bit builds, since any
flags in the upper 32 bits of a 64-bit unsigned long would be
discarded when truncating to a 32-bit int.

Debugged-by: Shane Thompson <shane.thompson@aeontech.com.au>
Tested-by: Shane Thompson <shane.thompson@aeontech.com.au>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-05-02 13:25:56 +01:00
Michael Brown
2d42d3cff6 [librm] Reduce real-mode stack consumption in virt_call()
Some PXE NBPs are known to make PXE API calls with very little space
available on the real-mode stack.  For example, the Rembo-ia32 NBP
from some versions of IBM's Tivoli Provisioning Manager for Operating
System Deployment (TPMfOSD) will issue calls with the real-mode stack
placed at 0000:03d2; this is at the end of the interrupt vector table
and leaves only 498 bytes of stack space available before overwriting
the hardware IRQ vectors.  This limits the amount of state that we can
preserve before transitioning to protected mode.

Work around these challenging conditions by preserving everything
other than the initial register dump in a temporary static buffer
within our real-mode data segment, and copying the contents of this
buffer to the protected-mode stack.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-29 11:48:39 +01:00
Michael Brown
b696a5063e [image] Skip misleading "format not recognised" error message
Return success (rather than failure) after an image format has been
correctly identified.

This has no practical effect, since the return value from
image_probe() is deliberately never used, but avoids a somewhat
surprising and misleading "format not recognised" error message when
debugging is enabled.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-28 12:12:50 +01:00
Michael Brown
55e409b14f [libgcc] Provide symbol to handle gcc's implicit calls to memset()
On some architectures (such as ARM), gcc will insert implicit calls to
memset().  Handle these using the same mechanism as for the implicit
calls to memcpy() used by x86.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-20 16:46:24 +01:00
Michael Brown
91aa188fbb [libc] Allow CPU architectures to use unoptimised string functions
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-19 16:30:49 +01:00
Michael Brown
40a8a5294c [ethernet] Make LACP support configurable at build time
Add a build configuration option NET_PROTO_LACP to control whether or
not LACP support is included for Ethernet devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-18 10:08:46 +01:00
Ladi Prosek
988243c93f [virtio] Add virtio-net 1.0 support
This commit makes virtio-net support devices with VEN 0x1af4 and DEV
0x1041, which is how non-transitional (modern-only) virtio-net devices
are exposed on the PCI bus.

Transitional devices supporting both the old 0.9.5 and new 1.0 version
of the virtio spec are driven using the new protocol.  Legacy devices
are driven using the old protocol, same as before this commit.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-15 17:43:07 +01:00
Ladi Prosek
8a055a2a70 [virtio] Add virtio 1.0 PCI support
This commit adds support for driving virtio 1.0 PCI devices.  In
addition to various helpers, a number of vpm_ functions are introduced
to be used instead of their legacy vp_ counterparts when accessing
virtio 1.0 (aka modern) devices.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-15 17:41:26 +01:00
Ladi Prosek
7b499f849e [virtio] Add virtio 1.0 constants and data structures
Virtio 1.0 introduces new constants and data structures, common to all
devices as well as specific to virtio-net.  This commit adds a subset
of these to be able to drive the virtio-net 1.0 network device.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-15 17:28:06 +01:00
Ladi Prosek
2379494918 [pci] Add pci_find_next_capability()
PCI devices may support more capabilities of the same type (for
example PCI_CAP_ID_VNDR) and there was no way to discover all of them.
This commit adds a new API pci_find_next_capability which provides
this functionality.  It would typically be used like so:

  for (pos = pci_find_capability(pci, PCI_CAP_ID_VNDR);
       pos > 0;
       pos = pci_find_next_capability(pci, pos, PCI_CAP_ID_VNDR)) {
    ...
  }

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-15 17:27:35 +01:00
Michael Brown
5e5450c2d0 [comboot] Support COMBOOT in 64-bit builds
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-15 15:31:36 +01:00
Suresh Sundriyal
4afb758423 [pool] Fix check for reopenable pooled connections
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-12 14:18:17 +01:00
Wissam Shoukair
0eea8b5c3b [golan] Add missing iounmap()
Signed-off-by: Wissam Shoukair <wissams@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-12 13:40:50 +01:00
Wissam Shoukair
ffd959a1d6 [mlx_icmd] Fix compilation error in GCC versions newer than 4.6.4
Signed-off-by: Wissam Shoukair <wissams@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-12 13:38:10 +01:00
Michael Brown
5238c85b62 [efi] Work around broken EFI HII specification
The EFI_HII_CONFIG_ACCESS_PROTOCOL's ExtractConfig() method is passed
a request string which includes the parameters being queried plus an
apparently meaningless blob of information (the ConfigHdr), and is
expected to include this same meaningless blob of information in the
results string.

Neither the specification nor the existing EDK2 code (including the
nominal reference implementation in the DriverSampleDxe driver)
provide any reason for the existence of this meaningless blob of
information.  It appears to be consumed in its entirety by the
EFI_HII_CONFIG_ROUTING_PROTOCOL, and to contain zero bits of
information by the time it reaches an EFI_HII_CONFIG_ACCESS_PROTOCOL
instance.  It would potentially allow for multiple configuration data
sets to be handled by a single EFI_HII_CONFIG_ACCESS_PROTOCOL
instance, in a style alien to the rest of the UEFI specification
(which implicitly assumes that the instance pointer is always
sufficient to uniquely identify the instance).

iPXE currently handles this by simply copying the ConfigHdr from the
request string to the results string, and otherwise ignoring it.  This
approach is also used by some code in EDK2, such as OVMF's PlatformDxe
driver.

As of EDK2 commit 8a45f80 ("MdeModulePkg: Make HII configuration
settings available to OS runtime"), this causes an assertion failure
inside EDK2.  The failure arises when iPXE is handled a NULL request
string, and responds (as per the specification) with a results string
including all settings.  Since there is no meaningless blob to copy
from the request string, there is no corresponding meaningless blob in
the results string.  This now causes an assertion failure in
HiiDatabaseDxe's HiiConfigRoutingExportConfig().

The same failure does not affect the OVMF PlatformDxe driver, which
simply passes the request string to the HII BlockToConfig() utility
function.  The BlockToConfig() function returns EFI_INVALID_PARAMETER
when passed a null request string, and PlatformDxe propagates this
error directly to the caller.

Fix by matching the behaviour of OVMF's PlatformDxe driver: explicitly
return EFI_INVALID_PARAMETER if the request string is NULL or empty.
This violates the specification (insofar as it is feasible to
determine what the specification actually requires), but causes
correct behaviour with the EDK2 codebase.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-12 12:24:14 +01:00
Michael Brown
cc8824ad4e [libc] Print "<NULL>" for wide-character NULL strings
The existing code intends to print NULL strings as "<NULL>" (for the
sake of debug messages), but the logic is incorrect when handling
wide-character strings.  Fix the logic and add applicable unit tests.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-12 11:53:06 +01:00
Michael Brown
320488d0f9 [test] Update snprintf_ok() to use okx()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-04-12 11:45:58 +01:00
Michael Brown
597521ef53 [qib7322] Validate payload length
There is no way for the hardware to give us an invalid length in the
LRH, since it must have parsed this length field in order to perform
header splitting.  However, this is difficult to prove conclusively.

Add an unnecessary length check to explicitly reject any packets
larger than the posted receive I/O buffer.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-30 07:31:51 +01:00
Michael Brown
c9af896314 [linda] Validate payload length
There is no way for the hardware to give us an invalid length in the
LRH, since it must have parsed this length field in order to perform
header splitting.  However, this is difficult to prove conclusively.

Add an unnecessary length check to explicitly reject any packets
larger than the posted receive I/O buffer.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-30 07:27:09 +01:00
Michael Brown
70509e6a03 [netdevice] Return ENOENT for an unknown bus type
It is possible for the preloaded UNDI device to end up with no
specified bus type, since it may not be recognised as either a PCI or
an ISAPnP device.  This will result in a bus type value of zero, which
currently results in NULL being treated as a string pointer by
netdev_fetch_bustype().

Fix by returning ENOENT if an unknown bus type is specified.

Reported-by: Todd Stansell <todd@stansell.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-29 20:59:30 +01:00
Christian Nilsson
ef1c4b1c90 [intel] Add PCI device ID for another I219-V
Signed-off-by: Christian Nilsson <nikize@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-29 19:53:20 +01:00
Michael Brown
97c3f6e55a [iscsi] Include DHCP server address in iBFT
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-29 19:38:18 +01:00
Michael Brown
f8e1678b84 [crypto] Allow cross-certificate source to be configured at build time
Provide a build option CROSSCERT in config/crypto.h to allow the
default cross-signed certificate source to be configured at build
time.  The ${crosscert} setting may still be used to reconfigure the
cross-signed certificate source at runtime.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-24 19:25:03 +00:00
Michael Brown
c4e8c40227 [prefix] Use CRC32 to verify each block prior to decompression
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-24 16:52:26 +00:00
Christian Hesse
05027a7a12 [golan] Fix build error on some versions of gcc
Some versions of gcc complain that "'__bswap_variable_32' is static
but used in inline function 'golan_check_rc_and_cmd_status' which is
not static".

Fix by making golan_check_rc_and_cmd_status() a static inline.

Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-23 05:59:44 +00:00
Wissam Shoukair
0a20373a2f [golan] Add Connect-IB, ConnectX-4 and ConnectX-4 Lx (Infiniband) support
Signed-off-by: Wissam Shoukair <wissams@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 17:55:55 +00:00
Michael Brown
3df598849b [pxe] Implicitly open network device in PXENV_UDP_OPEN
Some end-user configurations have been observed in which the first NBP
(such as GRUB2) uses the UNDI API and then transfers control to a
second NBP (such as pxelinux) which uses the UDP API.  The first NBP
closes the network device using PXENV_UNDI_CLOSE, which renders the
UDP API unable to transmit or receive packets.

The correct behaviour under these circumstances is (as often) simply
not documented by the PXE specification.  Testing with the Intel PXE
stack suggests that PXENV_UDP_OPEN will implicitly reopen the network
device if necessary, so match this behaviour.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 17:33:21 +00:00
Michael Brown
ee3122bc54 [libc] Make sleep() interruptible
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 16:18:42 +00:00
Michael Brown
860d5904fb [arbel] Fix received packet length
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 16:11:58 +00:00
Michael Brown
3ad028cf1c [hermon] Fix received packet length
Debugged-by: Wissam Shoukair <wissams@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 16:09:18 +00:00
Michael Brown
59bae324c0 [etherfabric] Avoid use of sleep() in driver code
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 15:19:25 +00:00
Michael Brown
c640b954cd [3c5x9] Avoid use of sleep() in driver code
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 15:14:07 +00:00
Michael Brown
c32b07b81b [int13] Allow default drive to be specified via "san-drive" setting
The DHCP option 175.189 has been defined (by us) since 2006 as
containing the drive number to be used for a SAN boot, but has never
been automatically used as such by iPXE.

Use this option (if specified) to override the default SAN drive
number.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 09:55:09 +00:00
Michael Brown
ab5b3abbba [int13] Allow drive to be hooked using the natural drive number
Interpret the maximum drive number (0xff for hard disks, 0x7f for
floppy disks) as meaning "use natural drive number".

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 09:55:09 +00:00
Michael Brown
311a5732c8 [gdb] Add support for x86_64
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-22 08:44:32 +00:00
Michael Brown
1afcccd5fd [build] Do not use "objcopy -O binary" for objects with relocation records
The mbr.bin and usbdisk.bin standalone blobs are currently generated
using "objcopy -O binary", which does not process relocation records.

For the i386 build, this does not matter since the section start
address is zero and so the ".rel" relocation records are effectively
no-ops anyway.

For the x86_64 build, the ".rela" relocation records are not no-ops,
since the addend is included as part of the relocation record (rather
than inline).  Using "objcopy -O binary" will silently discard the
relocation records, with the result that all symbols are effectively
given a value of zero.

Fix by using "ld --oformat binary" instead of "objcopy -O binary" to
generate mbr.bin and usbdisk.bin.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-21 17:49:58 +00:00
Michael Brown
173c0c2536 [infiniband] Allow drivers to override the eIPoIB LEMAC
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-21 09:30:42 +00:00
Michael Brown
57c63047e3 [arbel] Allocate space for GRH on UD queue pairs
As with the previous commit (for Hermon), allocate a separate ring
buffer to hold received GRHs.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-21 08:55:02 +00:00
Michael Brown
e84c917f39 [hermon] Allocate space for GRH on UD queue pairs
The Infiniband specification (volume 1, section 11.4.1.2 "Post Receive
Request") notes that for UD QPs, the GRH will be placed in the first
40 bytes of the receive buffer if present.  (If no GRH is present,
which is normal, then the first 40 bytes of the receive buffer will be
unused.)

Mellanox hardware performs this placement automatically: other headers
will be stripped (and their values returned via the CQE), but the
first 40 bytes of the data buffer will be consumed by the (probably
non-existent) GRH.

This does not fit neatly into iPXE's internal abstraction, which
expects the data buffer to represent just the data payload with the
addresses from the GRH (if present) passed as additional parameters to
ib_complete_recv().

The end result of this discrepancy is that attempts to receive
full-sized 2048-byte IPoIB packets on Mellanox hardware will fail.

Fix by allocating a separate ring buffer to hold the received GRHs.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-21 08:18:15 +00:00
Michael Brown
0141ea3a77 [crypto] Allow trusted certificates to be stored in non-volatile options
The intention of the existing code (as documented in its own comments)
is that it should be possible to override the list of trusted root
certificates using a "trust" setting held in non-volatile stored
options.  However, the rootcert_init() function currently executes
before any devices have been probed, and so will not be able to
retrieve any such non-volatile stored options.

Fix by executing rootcert_init() only after devices have been probed.
Since startup functions may be executed multiple times (unlike
initialisation functions), add an explicit flag to preserve the
property that rootcert_init() should run only once.

As before, if an explicit root of trust is specified at build time,
then any runtime "trust" setting will be ignored.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-20 17:26:09 +00:00
Michael Brown
4a861cc61c [qib7322] Add missing iounmap()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-20 14:55:18 +00:00
Michael Brown
bea9ee2397 [linda] Add missing iounmap()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-20 14:54:08 +00:00
Michael Brown
692324905e [arbel] Add missing iounmap()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-20 14:52:01 +00:00
Michael Brown
e2cdbd51a8 [hermon] Add missing iounmap()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-20 14:46:40 +00:00
Michael Brown
750a2efeb2 [ipoib] Allow external code to identify IPoIB network devices
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-20 09:22:55 +00:00
Michael Brown
ef0297b527 [libc] Allow container_of() to be used on volatile pointers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-18 08:18:31 +00:00
Michael Brown
04ef198d2f [efi] Move architecture-independent EFI prefixes to interface/efi
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-17 14:51:14 +00:00
Michael Brown
dbc9e591a5 [test] Move i386-specific tests to arch/i386/tests
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-17 14:32:00 +00:00
Michael Brown
c14971bf88 [xen] Use generic test_and_clear_bit() function
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-16 22:46:05 +00:00
Michael Brown
9bab13a772 [hyperv] Use generic set_bit() function
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-16 22:33:41 +00:00
Michael Brown
c867b5ab1f [bitops] Add generic atomic bit test, set, and clear functions
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-16 22:33:40 +00:00
Michael Brown
2246a6b274 [pseudobit] Rename bitops.h to pseudobit.h
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-16 17:03:33 +00:00
Michael Brown
36fbc3f4bd [build] Remove long-obsolete header file
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-16 16:53:16 +00:00
Michael Brown
9913a405ea [efi] Provide access to files stored on EFI filesystems
Provide access to local files via the "file://" URI scheme.  There are
three syntaxes:

  - An opaque URI with a relative path (e.g. "file:script.ipxe").
    This will be interpreted as a path relative to the iPXE binary.

  - A hierarchical URI with a non-network absolute path
    (e.g. "file:/boot/script.ipxe").  This will be interpreted as a
    path relative to the root of the filesystem from which the iPXE
    binary was loaded.

  - A hierarchical URI with a network path in which the authority is a
    volume label (e.g. "file://bootdisk/script.ipxe").  This will be
    interpreted as a path relative to the root of the filesystem with
    the specified volume label.

Note that the potentially desirable shell mappings (e.g. "fs0:" and
"blk0:") are concepts internal to the UEFI shell binary, and do not
seem to be exposed in any way to external executables.  The old
EFI_SHELL_PROTOCOL (which did provide access to these mappings) is no
longer installed by current versions of the UEFI shell.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-14 21:11:01 +00:00
Michael Brown
75496817c2 [uri] Support "file:" URIs describing relative paths
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-14 18:03:13 +00:00
Michael Brown
17c1488a44 [uri] Support URIs containing only scheme and path components
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-13 14:51:15 +00:00
Michael Brown
11ccfb67fa [efi] Add processor binding headers for ARM and AArch64
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-13 11:54:33 +00:00
Michael Brown
24415a3eee [efi] Update to current EDK2 headers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-13 11:47:30 +00:00
Michael Brown
0d29cf2a4d [build] Accept CROSS= as a synonym for CROSS_COMPILE=
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-13 11:32:54 +00:00
Michael Brown
1f65ed53da [build] Allow assembler section type character to vary by architecture
On some architectures (such as ARM) the "@" character is used as a
comment delimiter.  A section type argument such as "@progbits"
therefore becomes "%progbits".

This is further complicated by the fact that the "%" character has
special meaning for inline assembly when input or output operands are
used, in which cases "@progbits" becomes "%%progbits".

Allow the section type character(s) to be defined via Makefile
variables.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-13 11:20:53 +00:00
Michael Brown
a8037ee131 [efi] Centralise architecture-independent EFI Makefile and linker script
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 21:47:13 +00:00
Michael Brown
86f96a40f4 [tg3] Remove x86-specific inline assembly
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 21:15:43 +00:00
Michael Brown
7e78cdddc8 [3c595] Fix compilation when "char" is unsigned by default
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 18:06:47 +00:00
Michael Brown
3c84178003 [serial] Add missing #include <string.h>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 18:02:20 +00:00
Michael Brown
9eff4284bd [test] Add missing #include <string.h>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 17:55:38 +00:00
Michael Brown
4350d26a04 [qib7322] Use standard readq() and writeq() implementations
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 17:51:59 +00:00
Michael Brown
5229662b7f [linda] Use standard readq() and writeq() implementations
This driver is the original source of the current readq() and writeq()
implementations for 32-bit iPXE.  Switch to using the now-centralised
definitions, to avoid including architecture-specific code in an
otherwise architecture-independent driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 17:42:30 +00:00
Michael Brown
cc9f31ee0c [librm] Do not unconditionally preserve flags across virt_call()
Commit 196f0f2 ("[librm] Convert prot_call() to a real-mode near
call") introduced a regression in which any deliberate modification to
the low 16 bits of the CPU flags (in struct i386_all_regs) would be
overwritten with the original flags value at the time of entry to
prot_call().

The regression arose because the alignment requirements of the
protected-mode stack necessitated the insertion of two bytes of
padding immediately below the prot_call() return address.  The
solution chosen was to extend the existing "pushfl / popfl" pair to
"pushfw;pushfl / popfl;popfw".  The extra "pushfw / popfw" appears at
first glance to be a no-op, but fails to take into account the fact
that the flags restored by popfl may have been deliberately modified
by the protected-mode function.

Fix by replacing "pushfw / popfw" with "pushw %ss / popw %ss".  While
%ss does appear within struct i386_all_regs, any modification to the
stored value has always been ignored by prot_call() anyway.

The most visible symptom of this regression was that SAN booting would
fail since every INT 13 call would be chained to the original INT 13
vector.

Reported-by: Vishvananda Ishaya <vishvananda@gmail.com>
Reported-by: Jamie Thompson <forum.ipxe@jamie-thompson.co.uk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 12:39:17 +00:00
Michael Brown
64acfd9ddd [arp] Validate length of ARP packet
There is no practical way to generate an underlength ARP packet since
an ARP packet is always padded up to the minimum Ethernet frame length
(or dropped by the receiving Ethernet hardware if incorrectly padded),
but the absence of an explicit check causes warnings from some
analysis tools.

Fix by adding an explicit check on the I/O buffer length.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 01:24:03 +00:00
Michael Brown
11396473f5 [pixbuf] Check for unsigned integer overflow on multiplication
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-12 00:09:23 +00:00
Michael Brown
5a6ed90a00 [crypto] Allow for zero-length ASN.1 cursors
The assumption in asn1_type() that an ASN.1 cursor will always contain
a type byte is incorrect.  A cursor that has been cleanly invalidated
via asn1_invalidate_cursor() will contain a type byte, but there are
other ways in which to arrive at a zero-length cursor.

Fix by explicitly checking the cursor length in asn1_type().  This
allows asn1_invalidate_cursor() to be reduced to simply zeroing the
length field.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-11 16:58:51 +00:00
Michael Brown
05dcb07cb2 [tls] Avoid potential out-of-bound reads in length fields
Many TLS records contain variable-length fields.  We currently
validate the overall record length, but do so only after reading the
length of the variable-length field.  If the record is too short to
even contain the length field, then we may read uninitialised data
from beyond the end of the record.

This is harmless in practice (since the subsequent overall record
length check would fail regardless of the value read from the
uninitialised length field), but causes warnings from some analysis
tools.

Fix by validating that the overall record length is sufficient to
contain the length field before reading from the length field.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-11 16:09:40 +00:00
Michael Brown
e303a6b387 [efi] Work around broken GetFontInfo() implementations
Several UEFI platforms are known to return EFI_NOT_FOUND when asked to
retrieve the system default font information via GetFontInfo().  Work
around these broken platforms by iterating over the glyphs to find the
maximum height used by a printable character.

Originally-fixed-by: Jonathan Dieter <jdieter@lesbg.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-10 18:09:59 +00:00
Michael Brown
e44f6dcb89 [xsigo] Add support for Xsigo virtual Ethernet (XVE) EoIB devices
Add support for EoIB devices as implemented by Xsigo.  Based on the
public (but out-of-tree) Linux kernel drivers at

  https://oss.oracle.com/git/?p=linux-uek.git;a=log;h=v4.1.12-32.2.1

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-09 08:46:24 +00:00
Michael Brown
3144e4fb64 [eoib] Support non-FullMember gateway devices
Some EoIB implementations utilise an EoIB-to-Ethernet gateway device
that does not perform a FullMember join to the multicast group for the
EoIB broadcast domain.  This has various exciting side-effects, such
as requiring every EoIB node to send every broadcast packet twice.

As an added bonus, the gateway may also break the EoIB MAC address to
GID mapping protocol by sending Ethernet-sourced packets from the
wrong QPN.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-09 08:43:40 +00:00
Michael Brown
1a9ed68cbb [eoib] Allow the multicast group to be forcefully created
Some EoIB implementations require each individual EoIB node to create
the multicast group for the EoIB broadcast domain.

It is left as an exercise for the interested reader to determine how
such an implementation might ever allow the parameters of such a
multicast group to be changed without requiring a simultaneous upgrade
of every driver on every operating system on every machine currently
attached to the fabric.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-09 08:43:40 +00:00
Michael Brown
ecd93cfc11 [eoib] Silently ignore EoIB heartbeat packets
Some EoIB implementations transmit a vendor-proprietary heartbeat
packet on the same multicast group used to provide the EoIB broadcast
domain.

Silently ignore these heartbeat packets, to avoid cluttering up the
network interface error statistics.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-09 08:43:40 +00:00
Michael Brown
9154d7a65c [eoib] Add Ethernet over Infiniband (EoIB) driver
EoIB is a fairly simple protocol in which raw Ethernet frames
(excluding the CRC) are encapsulated within Infiniband Unreliable
Datagrams, with a four-byte fixed EoIB header (which conveys no actual
information).  The Ethernet broadcast domain is provided by a
multicast group, similar to the IPoIB IPv4 multicast group.

The mapping from Ethernet MAC addresses to Infiniband address vectors
is achieved by snooping incoming traffic and building a peer cache
which can then be used to map a MAC address into a port GID.  The
address vector is completed using a path record lookup, as for IPoIB.
Note that this requires every packet to include a GRH.

Add basic support for EoIB devices.  This driver is substantially
derived from the IPoIB driver.  There is currently no mechanism for
automatically creating EoIB devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-09 08:43:40 +00:00
Michael Brown
5bcaa1e4d4 [infiniband] Make IPoIB support configurable at build time
Add a build configuration option VNIC_IPOIB to control whether or not
IPoIB support is included for Infiniband devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-09 08:43:40 +00:00
Michael Brown
8290a10aba [ifmgmt] Include human-readable error message for configuration failure
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 17:45:30 +00:00
Michael Brown
9939b704f1 [ipoib] Increase number of transmit work queue entries
Avoid running out of transmit work queue entries under heavy load.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 17:44:28 +00:00
Michael Brown
b5aa51ac62 [ipoib] Resimplify test for received broadcast packets
Commit e62e52b ("[ipoib] Simplify test for received broadcast
packets") relies upon the multicast LID being present in the
destination address vector as passed to ipoib_complete_recv().
Unfortunately, this information is not present in many Infiniband
devices' completion queue entries.

Fix by testing instead for the presence of a multicast GID.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 17:43:26 +00:00
Michael Brown
076d772648 [infiniband] Retrieve GID flag from cached path entries
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 17:40:52 +00:00
Michael Brown
299fdabe48 [infiniband] Add "ibstat" command
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 17:38:06 +00:00
Michael Brown
6a3ffa0114 [infiniband] Assign names to queue pairs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 15:51:53 +00:00
Michael Brown
174bf6b569 [infiniband] Assign names to CMRC connections
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 15:51:19 +00:00
Michael Brown
d3db00ecf9 [pcbios] Restrict external memory allocations to the low 4GB
When running the 64-bit BIOS version of iPXE, restrict external memory
allocations to the low 4GB to ensure that allocations (such as for
initrds) fall within our identity-mapped memory region, and will be
accessible to the potentially 32-bit operating system.

Move largest_memblock() back to memtop_umalloc.c, since this change
imposes a restriction that applies only to BIOS builds.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 13:25:09 +00:00
Michael Brown
5a7fd2cc90 [infiniband] Allow for the creation of multicast groups
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:23:30 +00:00
Michael Brown
e62e52b2b9 [ipoib] Simplify test for received broadcast packets
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:23:30 +00:00
Michael Brown
ffdf8ea757 [ipoib] Avoid unnecessary path record lookup for broadcast address
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:23:30 +00:00
Michael Brown
14ad9cbd67 [infiniband] Parse MLID, rate, and SL from multicast membership record
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:23:30 +00:00
Michael Brown
c335f8eae4 [infiniband] Record multicast GID attachment as part of group membership
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:23:30 +00:00
Michael Brown
114a2f19a6 [infiniband] Do not use GRH for local paths
Avoid including an unnecessary GRH in packets sent to unicast
destinations within the local subnet.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:23:24 +00:00
Michael Brown
bd1687465c [infiniband] Use correct transaction identifier in CM responses
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:08:58 +00:00
Michael Brown
8336186564 [infiniband] Use connection's local ID as debug message identifier
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:08:58 +00:00
Michael Brown
36c4779356 [infiniband] Use "%d" as format specifier for LIDs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:08:58 +00:00
Michael Brown
7aef4d4c94 [infiniband] Use "%#lx" as format specifier for queue pair numbers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:08:58 +00:00
Michael Brown
d7794dcac7 [infiniband] Assign names to Infiniband devices for debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:08:58 +00:00
Michael Brown
ff13eeb747 [infiniband] Add support for performing service record lookups
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:08:58 +00:00
Michael Brown
7544763626 [infiniband] Avoid multiple calls to ib_cmrc_shutdown()
When a CMRC connection is closed, the deferred shutdown process calls
ib_destroy_qp().  This will cause the receive work queue entries to
complete in error (since they are being cancelled), which will in turn
reschedule the deferred shutdown process.  This eventually leads to
ib_destroy_conn() being called on a connection that has already been
freed.

Fix by explicitly cancelling any pending shutdown process after the
shutdown process has completed.

Ironically, this almost exactly reverts commit 019d4c1 ("[infiniband]
Use a one-shot process for CMRC shutdown"); prior to the introduction
of one-shot processes the only way to achieve a one-shot process was
for the process to cancel itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-08 12:07:03 +00:00
Michael Brown
60e205a551 [infiniband] Remove concept of whole-device owner data
Remove the implicit assumption that the IPoIB protocol owns the whole
Infiniband device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-07 21:04:40 +00:00
Michael Brown
fcf3b03544 [netdevice] Refuse to create duplicate network device names
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-03-07 21:04:40 +00:00
Michael Brown
99b5216b1c [librm] Support ioremap() for addresses above 4GB in a 64-bit build
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-02-26 15:34:28 +00:00