david/ipxe
Archived
1
0
Commit Graph

971 Commits

Author SHA1 Message Date
Michael Brown
0ce3c97095 [efi] Allow for non-PCI snpnet devices
We currently require information about the underlying PCI device to
populate the snpnet device's name and description.  If the underlying
device is not a PCI device, this will fail and prevent the device from
being registered.

Fix by falling back to populating the device description with
information based on the EFI handle, if no PCI device information is
available.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-09-04 16:18:08 +01:00
Michael Brown
4c5b7945c3 [efi] Use the SNP protocol instance to match the SNP chainloading device
Some systems will install a child of the SNP device and use this as
our loaded image's device handle, duplicating the installation of the
underlying SNP protocol onto the child device handle.  On such
systems, we want to end up driving the parent device (and
disconnecting any other drivers, such as MNP, which may be attached to
the parent device).

Fix by recording the SNP protocol instance at initialisation time, and
using this to match against device handles (rather than simply
comparing the handles themselves).

Reported-by: Jarrod Johnson <jarrod.b.johnson@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-09-04 15:39:02 +01:00
Michael Brown
ead70bf920 [intel] Apply PBS/PBA errata workaround only to ICH8 PCI device IDs
ICH8 devices have an errata which requires us to reconfigure the
packet buffer size (PBS) register, and correspondingly adjust the
packet buffer allocation (PBA) register.  The "Intel I/O Controller
Hub ICH8/9/10 and 82566/82567/82562V Software Developer's Manual"
notes for the PBS register that:

  10.4.20   Packet Buffer Size - PBS (01008h; R/W)

  Note: The default setting of this register is 20 KB and is
        incorrect. This register must be programmed to 16 KB.

  Initial value: 0014h
                 0018h (ICH9/ICH10)

It is unclear from this comment precisely which devices require the
workaround to be applied.  We currently attempt to err on the side of
caution: if we detect an initial value of either 0x14 or 0x18 then the
workaround will be applied.  If the workaround is applied
unnecessarily, then the effect should be just that we use less than
the full amount of the available packet buffer memory.

Unfortunately this approach does not play nicely with other device
drivers.  For example, the Linux e1000e driver will rewrite PBA while
assuming that PBS still contains the default value, which can result
in inconsistent values between the two registers, and a corresponding
inability to transmit or receive packets.  Even more unfortunately,
the contents of PBS and PBA are not reset by anything less than a
power cycle, meaning that this error condition will survive a hardware
reset.

The Linux driver (written and maintained by Intel) applies the PBS/PBA
errata workaround only for devices in the ICH8 family, identified via
the PCI device ID.  Adopt a similar approach, using the PCI_ROM()
driver data field to indicate when the workaround is required.

Reported-by: Donald Bindner <dbindner@truman.edu>
Debugged-by: Donald Bindner <dbindner@truman.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-21 00:40:22 +01:00
Michael Brown
d461b8ddf2 [intel] Display before and after values for both PBS and PBA
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-20 23:16:01 +01:00
Michael Brown
c845740b88 [intel] Display PBS value when applying ICH errata workaround
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-20 22:59:54 +01:00
Michael Brown
3953ddd2ac [smc9000] Avoid using CONFIG as a preprocessor macro
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-19 14:38:27 +01:00
Michael Brown
8b2942a7db [xen] Cope with unexpected initial backend states
Under some circumstances (e.g. if iPXE itself is booted via iSCSI, or
after an unclean reboot), the backend may not be in the expected
InitWait state when iPXE starts up.

There is no generic reset mechanism for Xenbus devices.  Recent
versions of xen-netback will gracefully perform all of the required
steps if the frontend sets its state to Initialising.  Older versions
(such as that found in XenServer 6.2.0) require the frontend to
transition through Closed before reaching Initialising.

Add a reset mechanism for netfront devices which does the following:

 - read current backend state

 - if backend state is anything other than InitWait, then set the
   frontend state to Closed and wait for the backend to also reach
   Closed

 - set the frontend state to Initialising and wait for the backend to
   reach InitWait.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-14 00:14:51 +01:00
Michael Brown
be79ca535a [xen] Use version 1 grant tables by default
Using version 1 grant tables limits guests to using 16TB of grantable
RAM, and prevents the use of subpage grants.  Some versions of the Xen
hypervisor refuse to allow the grant table version to be set after the
first grant references have been created, so the loaded operating
system may be stuck with whatever choice we make here.  We therefore
currently use version 2 grant tables, since they give the most
flexibility to the loaded OS.

Current versions (7.2.0) of the Windows PV drivers have no support for
version 2 grant tables, and will merrily create version 1 entries in
what the hypervisor believes to be a version 2 table.  This causes
some confusion.

Avoid this problem by attempting to use version 1 tables, since
otherwise we may render Windows unable to boot.

Play nicely with other potential bootloaders by accepting either
version 1 or version 2 grant tables (if we are unable to set our
requested version).

Note that the use of version 1 tables on a 64-bit system introduces a
possible failure path in which a frame number cannot fit into the
32-bit field within the v1 structure.  This in turn introduces
additional failure paths into netfront_transmit() and
netfront_refill_rx().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-13 19:21:42 +01:00
Michael Brown
5c4f1da2ce [efi] Generalise snpnet_pci_info() to efi_locate_device()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-06 14:27:45 +01:00
Curtis Larsen
2ce1c70166 [efi] Try various possible SNP receive filters
The behavior observed in the Apple EFI (1.10) RecieveFilters() call
is:

  - failure if any of the PROMISCUOUS or MULTICAST filters are
    included

  - success if only UNICAST is included, however the result is
    UNICAST|BROADCAST

  - success if only UNICAST and BROADCAST are included

  - if UNICAST, or UNICAST|BROADCAST are used, but the previous call
    tried (and failed) to set UNICAST|BROADCAST|MULTICAST, then the
    result is UNICAST|BROADCAST|MULTICAST

Work around this apparently broken SNP implementation by trying
RecieveFilterMask, then falling back to UNICAST|BROADCAST|MULTICAST,
then UNICAST|BROADCAST, and finally UNICAST.

Modified-by: Michael Brown <mcb30@ipxe.org>
Tested-by: Curtis Larsen <larsen@dixie.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-05 23:10:34 +01:00
Michael Brown
7b3cc18462 [efi] Open device path protocol only at point of use
Some EFI 1.10 systems (observed on an Apple iMac) do not allow us to
open the device path protocol with an attribute of
EFI_OPEN_PROTOCOL_BY_DRIVER and so we cannot maintain a safe,
long-lived pointer to the device path.  Work around this by instead
opening the device path protocol with an attribute of
EFI_OPEN_PROTOCOL_GET_PROTOCOL whenever we need to use it.

Debugged-by: Curtis Larsen <larsen@dixie.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-05 23:10:33 +01:00
Michael Brown
3b42ed477f [efi] Provide centralised definitions of commonly-used GUIDs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-05 23:08:32 +01:00
Michael Brown
471fdfab79 [efi] Reset multicast filter list when setting SNP receive filters
According to the UEFI specification, the MCastFilter parameter (which
we currently pass as NULL, along with a zero MCastFilterCnt) is
optional only if ResetMCastFilter is true.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-05 18:00:17 +01:00
Michael Brown
1f7b269954 [efi] Add ability to dump SNP device mode information
Originally-implemented-by: Curtis Larsen <larsen@dixie.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-08-05 17:32:16 +01:00
Michael Brown
16d99cc8ef [efi] Dump existing openers when we are unable to open a protocol
Dump the existing openers of a protocol whenever we are unable to open
a protocol using attributes of BY_DEVICE, EXCLUSIVE, or
BY_CHILD_CONTROLLER.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-31 12:50:14 +01:00
Michael Brown
60891f699a [efi] Use efi_handle_name() instead of efi_devpath_text() where applicable
Using efi_devpath_text() is marginally more efficient if we already
have the device path protocol available, but the mild increase in
efficiency is not worth compromising the clarity of the pattern:

  DBGC ( device, "THING %p %s ...", device, efi_handle_name ( device ) );

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-31 11:57:31 +01:00
Michael Brown
2e0821b9ed [efi] Use efi_handle_name() instead of efi_handle_devpath_text()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-31 11:56:44 +01:00
Michael Brown
793a806611 [xen] Add support for Xen netfront virtual NICs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-29 15:57:56 +01:00
Michael Brown
3a02409fc8 [natsemi] Check for ioremap() failures
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-16 15:54:49 +01:00
Michael Brown
720ae17aa4 [myson] Check for ioremap() failures
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-16 15:53:43 +01:00
Michael Brown
022ef91984 [skel] Check for ioremap() failures
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-16 15:52:48 +01:00
Michael Brown
7ab3035749 [vmxnet3] Check for ioremap() failures
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-16 15:51:38 +01:00
Michael Brown
857e4f56a7 [realtek] Check for ioremap() failures
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-16 15:50:18 +01:00
Michael Brown
9ce2b56af6 [intel] Check for ioremap() failures
Debugged-by: Anton D. Kachalov <mouse@yandex-team.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-16 15:49:08 +01:00
Michael Brown
d0cfbd01f5 [efi] Rewrite SNP NIC driver
Rewrite the SNP NIC driver to use non-blocking and deferrable
transmissions, to provide link status detection, to provide
information about the underlying (PCI) hardware device, and to avoid
unnecessary I/O buffer allocations during receive polling.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-08 14:01:55 +01:00
Michael Brown
c7051d826b [efi] Allow network devices to be created on top of arbitrary SNP devices
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-07-03 15:28:17 +01:00
Hannes Reinecke
bb5a4a111b [igbvf] Allow changing of MAC address
The VF might not have assigned a MAC address upon startup, and will
end up with a random MAC address during probe().  With this patch the
MAC address can be changed later on.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-06-12 17:46:12 +01:00
Hannes Reinecke
f63ec19dca [igbvf] Assign random MAC address if none is set
If the VF doesn't have a MAC address assigned we should create a
random MAC address.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-06-12 17:38:08 +01:00
Michael Brown
d5cf058994 [iscsi] Include IP address origin in iBFT
The iBFT includes an "origin" field to indicate the source of the IP
address.  We use the heuristic of assuming that the source should be
"manual" if the IP address originates directly from the network device
settings block, and "DHCP" otherwise.  This is an imperfect guess, but
is likely to be correct in most common situations.

Originally-implemented-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-06-12 17:09:16 +01:00
Michael Brown
059adae434 [iscsi] Read IPv4 settings only from the relevant network device
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-06-12 17:03:14 +01:00
Michael Brown
e047811c85 [scsi] Improve sense code parsing
Parse the sense data to extract the reponse code, the sense key, the
additional sense code, and the additional sense code qualifier.

Originally-implemented-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-06-03 02:04:46 +01:00
Michael Brown
4e4fc678c2 [intel] Increase receive ring fill level
As of commit d28bb51 ("[tcp] Defer sending ACKs until all received
packets have been processed"), increasing the RX ring size will
increase the number of received packets per transmitted ACK (since
each poll will process up to one complete receive ring).  Under KVM,
this can make a substantial (up to ~200%) difference to the overall
download speed, since transmissions are very expensive.

Increase the ring fill level from four to eight packets: this
increases the download speed by around 50% at a cost of around 8kB of
heap space.  Further speedups are possible by increasing the ring size
further, but it would be preferable to find alternative methods which
do not use noticeable amounts of heap space.

Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-16 13:15:40 +01:00
Michael Brown
abf875a2e5 [intel] Exclude time spent in hypervisor from profiling
When profiling, exclude any time spent inside the hypervisor
responding to our MMIO accesses.  This substantially reduces the
variance accumulated on many other profilers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-06 22:53:33 +01:00
Michael Brown
b2c7b6a85e [intel] Push new RX descriptors in batches
Inside a virtual machine, writing the RX ring tail pointer may incur a
substantial overhead of processing inside the hypervisor.  Minimise
this overhead by writing the tail pointer once per batch of
descriptors, rather than once per descriptor.

Profiling under qemu-kvm (version 1.6.2) shows that this reduces the
amount of time taken to refill the RX descriptor ring by around 90%.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:48 +01:00
Michael Brown
8a3dcefc0c [intel] Profile common virtual machine operations
Operations which are negligible on physical hardware (such as issuing
a posted write to the transmit ring tail register) may involve
substantial amounts of processing within the hypervisor if running in
a virtual machine.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:48 +01:00
Michael Brown
27884298a3 [intel] Avoid completely filling the TX descriptor ring
It is unclear from the datasheets whether or not the TX ring can be
completely filled (i.e. whether writing the tail value as equal to the
current head value will cause the ring to be treated as completely
full or completely empty).  It is very plausible that this edge case
could differ in behaviour between real hardware and the many
implementations of an emulated Intel NIC found in various virtual
machines.  Err on the side of caution and always leave at least one
ring entry empty.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-22 13:12:54 +01:00
Michael Brown
ccb6e5c627 [realtek] Clear bit 24 of RCR
On an Asus Z87-K motherboard with an onboard 8168 NIC, booting into
Windows 7 and then warm rebooting into iPXE results in a broken RX
datapath: packets can be transmitted successfully but garbage is
received.  A cold reboot clears the problem.

A dump of the PHY registers reveals only one difference: in the
failure case the bits ADVERTISE_PAUSE_CAP and ADVERTISE_PAUSE_ASYM are
cleared.  Explicitly setting these bits does not fix the problem.

A dump of the MAC registers reveals a few differences, of which the
most obvious culprit is the undocumented bit 24 of the Receive
Configuration Register (RCR), which is set in the failure case.
Explicitly clearing this bit does fix the problem.

Reported-by: Sebastian Nielsen <ipxe@sebbe.eu>
Reported-by: Oliver Rath <rath@mglug.de>
Debugged-by: Sebastian Nielsen <ipxe@sebbe.eu>
Tested-by: Sebastian Nielsen <ipxe@sebbe.eu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-20 15:54:25 +00:00
Michael Brown
87b59677ba [realtek] Add ability to dump all internal registers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-20 12:36:14 +00:00
Michael Brown
145fc26ed5 [realtek] Dump all MII register contents when link status changes
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-10 12:22:23 +00:00
Michael Brown
ac5c2e851b [realtek] Include link status register details in debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-04 16:30:06 +00:00
Michael Brown
3fa7a3b136 [intel] Add some missing PCI IDs
Tested-by: Philipp Hagen <Philipp.Hagen@she.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-01-29 16:43:39 +00:00
Michael Brown
22001cb206 [settings] Explicitly separate the concept of a completed fetched setting
The fetch_setting() family of functions may currently modify the
definition of the specified setting (e.g. to add missing type
information).  Clean up this interface by requiring callers to provide
an explicit buffer to contain the completed definition of the fetched
setting, if required.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-12-05 00:37:02 +00:00
lolipop
0780bccb68 [intel] Add Intel I217 Gigabit Ethernet PCI ID
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-10-23 12:03:58 +01:00
Michael Brown
8a2dc7a588 [linux] Apply MAC address prior to registering network device
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-09-03 02:02:58 +01:00
Michael Brown
0b65c8cad6 [netdevice] Add method for generating EUI-64 address from link-layer address
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-09-03 01:24:15 +01:00
Michael Brown
ae0124cd40 [linux] Give tap devices a name and bus type
Give tap devices a meaningful name, and avoid segmentation faults when
attempting to retrieve ${net0/bustype} by assigning a new bus type for
tap devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-08-27 16:39:43 +01:00
Thomas Miletich
6d72b498c2 [3c90x] Fix High-MTU packet reception
Prevent the card from flagging packets of 1518 bytes length as
overlength.

This fixes the High-MTU loopback test.

Signed-off-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-08-20 14:38:33 +01:00
Thomas Miletich
e5f6471525 [3c90x] Don't round up transmit packet length
The 3c90x B and C revisions support rounding up the packet length to a
specific boundary.  Disable this feature to avoid overlength packets.

This fixes the loopback test.

Signed-off-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-08-20 14:37:05 +01:00
Thomas Miletich
b324a9c243 [3c90x] Stall upload engine before setting RX ring address
According to the 3c90x datasheet we have to stall the upload (receive)
engine before setting the receive ring address.

Signed-off-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-08-20 14:34:53 +01:00
Michael Brown
6d910559b3 [pci] Add pci_find_next() to iterate over existent PCI devices
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-08-05 13:51:16 +01:00