david/ipxe
Archived
1
0
Commit Graph

1029 Commits

Author SHA1 Message Date
Michael Brown
c64747db50 [librm] Speed up real-to-protected mode transition under KVM
Ensure that all segment registers have zero in the low two bits before
transitioning to protected mode.  This allows the CPU state to
immediately be deemed to be "valid", and eliminates the need for any
further emulated instructions.

Load the protected-mode interrupt descriptor table after switching to
protected mode, since this avoids triggering an EXCEPTION_NMI and
corresponding VM exit.

This reduces the time taken by real_to_prot under KVM by around 50%.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:21 +01:00
Michael Brown
5a08b63cb7 [librm] Speed up protected-to-real mode transition under KVM
On an Intel CPU supporting VMX, KVM will emulate instructions while
the CPU state remains "invalid".  In real mode, the CPU state is
defined to be "invalid" if any segment register has a base which is
not equal to (sreg<<4) or a limit which is not equal to 64kB.

We don't actually use the base stored in the REAL_DS descriptor for
any significant purpose.  Change the base stored in this descriptor to
be equal to (REAL_DS<<4).  A segment register loaded with REAL_DS is
then automatically valid in both real and protected modes.  This
allows KVM to stop emulating instructions much sooner.

The only use of REAL_DS for memory accesses currently occurs in the
indirect ljmp within prot_to_real.  Change this to a direct ljmp,
storing rm_cs in .text16 as part of the ljmp instruction.  This
removes the only memory access via REAL_DS (thereby allowing for the
above descriptor base address hack), and also simplifies the ljmp
instruction (which will still have to be emulated).

Load the real-mode interrupt descriptor table register before
switching to real mode, since this avoids triggering an EXCEPTION_NMI
and corresponding VM exit.

This reduces the time taken by prot_to_real under KVM by around 65%.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:20 +01:00
Michael Brown
03e76c34d8 [librm] Add meaningful labels at section changes
The mode-transition code involves paths which switch back and forth
between the .text and .text16 sections.  At present, only the start of
each function is labelled, which makes it difficult to decode
addresses within the parts of the function existing in a different
section.

Add explicit labels at the start of each section change, so that
addresses can be meaningfully decoded to the nearest label.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:20 +01:00
Michael Brown
bd640bc364 [librm] Add a profiling self-test for measuring mode transition times
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:20 +01:00
Michael Brown
34eaf69ddf [pcbios] Do not switch to real mode to sleep the CPU
Now that we can handle interrupts while in protected mode, there is no
need to switch to real mode just to halt the CPU.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:10 +01:00
Michael Brown
e4593909a8 [pcbios] Do not switch to real mode to check for timer interrupt
The currticks() function is called at least once per TCP packet, and
so is performance-critical.  Switching to real mode just to allow the
timer interrupt to fire is expensive when running inside a virtual
machine, and imposes a significant performance cost.

Fix by enabling interrupts without switching to real mode.  This
results in an approximately 100% increase in download speed when
running under KVM.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:10 +01:00
Michael Brown
aaf276ccd4 [comboot] Use built-in interrupt reflector
We now have the ability to handle interrupts while in protected mode,
and so no longer need to set up a dedicated interrupt descriptor table
while running COM32 executables.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:10 +01:00
Michael Brown
23b671daf4 [librm] Allow interrupts in protected mode
When running in a virtual machine, switching to real mode may be
expensive.  Allow interrupts to be enabled while in protected mode and
reflected down to the real-mode interrupt handlers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:04 +01:00
Michael Brown
6d4deeeb6c [librm] Use genuine real mode to accelerate operation in virtual machines
We currently use flat real mode wherever real mode is required.  This
guarantees that we will not surprise some unsuspecting external caller
which has carefully set up flat real mode by suddenly reducing the
segment limits to 64kB.

However, operating in flat real mode imposes a severe performance
penalty in some virtualisation environments, since some CPUs cannot
fully virtualise flat real mode and so the hypervisor must fall back
to emulation.  In particular, operating under KVM on a pre-Westmere
Intel CPU will be at least an order of magnitude slower, to the point
that there is a visible teletype effect when printing anything to the
BIOS console.  (Older versions of KVM used to cheat and ignore the
"flat" part of flat real mode, which masked the problem.)

Switch (back) to using genuine real mode with 64kB segment limits
instead of flat real mode.  Hopefully this won't break anything.

Add an explicit switch to flat real mode before returning to the BIOS
from the ROM prefix, since we know that a PMM BIOS will call the ROM
initialisation point (and potentially the BEV) in flat real mode.

As noted in previous commit messages, it is not possible to restore
the real-mode segment limits after a transition to protected mode,
since there is no way to know which protected-mode segment descriptor
was originally used to initialise the limit portion of the segment
register.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-28 01:21:08 +01:00
Michael Brown
e5f6a9be38 [profile] Add generic profiling infrastructure
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:43 +01:00
Michael Brown
d36e814b8a [libc] Add flsll()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 16:56:09 +01:00
Michael Brown
082cedb3c3 [build] Fix __libgcc attribute for recent gcc versions
We observed some time ago (in commit 4ce8d61 "Import various libgcc
functions from syslinux") that gcc seems to treat calls to the
implicit arithmetic functions (e.g. __udivdi3()) as being affected by
-mregparm but unaffected by -mrtd.

This seems to be no longer the case with current gcc versions, which
treat calls to these functions as being affected by both -mregparm and
-mrtd, as expected.

There is nothing obvious in the gcc changelogs to indicate precisely
when this happened.  From experimentation with available gcc versions,
the change occurred sometime between v4.6.3 and v4.7.2.  We assume
that only versions up to v4.6.x require the special treatment.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-25 16:06:37 +01:00
Michael Brown
dce7107fc0 [libc] Add inline assembly implementation of flsl() using BSR instruction
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-24 14:49:08 +01:00
Michael Brown
c1595129b5 [bios] Fix screen clearing on even more buggy BIOSes
Some BIOSes (observed with a ProLiant DL360p Gen8 SE) perform no range
checking whatsoever on the parameters passed to INT10,06 and will
therefore happily write to an area beyond the end of video RAM.  The
area immediately following the video RAM tends to be the VGA BIOS ROM
image.  Overwriting the VGA BIOS leads to an interesting variety of
crashes and reboots.

Fix by specifying an exact width and height to be cleared, rather than
passing in large values and relying upon the BIOS to truncate them to
the appropriate range.

Reported-by: Alex Davies <adavies@jumptrading.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-21 16:56:34 +00:00
Michael Brown
f473b9c3f6 [efi] Disable SNP devices when running iPXE as the application
Some UEFI builds will set up a timer to continuously poll any SNP
devices.  This can drain packets from the network device's receive
queue before iPXE gets a chance to process them.

Use netdev_rx_[un]freeze() to explicitly indicate when we expect our
network devices to be driven via the external SNP API (as we do with
the UNDI API on the standard BIOS build), and disable the SNP API
except when receive queue processing is frozen.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-14 17:09:51 +00:00
Michael Brown
f618178e60 [efi] Unload our own image before exiting UEFI application
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-14 16:20:55 +00:00
Michael Brown
e662912c53 [efi] Avoid accidentally calling main() twice
EFIRC() uses PLATFORM_TO_ERRNO(), which evaluates its argument twice
(and can't trivially use a braced-group expression or an inline
function to avoid this, since it gets used outside of function
context).

The expression "EFIRC(main())" will therefore end up calling main()
twice, which is not the intended behaviour.  Every other instance of
EFIRC() is of the simple form "EFIRC(rc)", so fix by converting this
instance to match.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-14 16:20:55 +00:00
Michael Brown
3f43c1354e [image] Add "--timeout" parameter to image downloading commands
iPXE will detect timeout failures in several situations: network
link-up, DHCP, TCP connection attempts, unacknowledged TCP data, etc.
This does not cover all possible circumstances.  For example, if a
connection to a web server is successfully established and the web
server acknowledges the HTTP request but never sends any data in
response, then no timeout will be triggered.  There is no timeout
defined within the HTTP specifications, and the underlying TCP
connection will not generate a timeout since it has no way to know
that the HTTP layer is expecting to receive data from the server.

Add a "--timeout" parameter to "imgfetch", "chain", etc.  If no
progress is made (i.e. no data is downloaded) within the timeout
period, then the download will be aborted.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-10 13:32:39 +00:00
Michael Brown
1137fa3268 [romprefix] Do not clobber stack segment when returning to BIOS
Commit c429bf0 ("[romprefix] Store boot bus:dev.fn address as autoboot
device location") introduced a regression by using register %cx to
temporarily hold the PCI bus:dev.fn address, despite the fact that %cx
was already being used to hold the stored BIOS stack segment.
Consequently, when returning to the BIOS after a failed or cancelled
boot attempt, iPXE would end up calling INT 18 with the stack segment
set equal to the PCI bus:dev.fn address.  Writing to essentially
random areas of memory tends to upset even the more robust BIOSes.

Fix by using register %ax to temporarily hold the PCI bus:dev.fn
address.

Reported-by: Anton D. Kachalov <mouse@yandex-team.ru>
Tested-by: Anton D. Kachalov <mouse@yandex-team.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-05 12:32:00 +00:00
Michael Brown
0fac055119 [bzimage] Report exact initrd length via bzImage header
iPXE currently pads initrd images to a multiple of 4kB and inserts
zero padding between images, as required by some versions of the Linux
kernel.  The overall length reported via the ramdisk_size field in the
bzImage header includes this zero padding.

This causes problems when using memdisk to load a gzip-compressed disk
image.  memdisk treats the ramdisk_size field as containing the exact
length of the initrd image, and uses this length to locate the 8-byte
gzip footer.  This will generally cause memdisk to fail to decompress
the disk image.

Fix by reporting the exact length of the initrd image set, including
any padding inserted between images but excluding any padding added at
the end of the final image.

Reported-by: Levente LEVAI <levail@aviatronic.hu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-04 14:38:16 +00:00
Michael Brown
ff1e7fc72b [prefix] Ignore PCI autoboot device location if set to 00:00.0
qemu can load an option ROM which is not associated with a particular
PCI device using the "-option-rom" syntax.  Under these circumstances,
we should ignore the PCI bus:dev.fn address that we expect to find in
%ax on entry to the initialisation vector.

Fix by using the PCI bus:dev.fn address only if it is non-zero.  Since
00:00.0 will always be the host bridge, it can never be the address of
a network card.

Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-03 16:28:43 +00:00
Alex Williamson
c429bf0aa2 [romprefix] Store boot bus:dev.fn address as autoboot device location
Per the BIOS Boot Specification, the initialization phase of the ROM
is called with the PFA (PCI Function Address) in the %ax register.
The intention is that the ROM code will store that device address
somewhere and use it for booting from that device when the Boot Entry
Vector (BEV) is called.  iPXE does store the PFA, but doesn't use it
to select the boot network device.  This renders BIOS IPL lists fairly
ineffective.

Fix by using the BBS-specified bus:dev.fn address as the autoboot
device location.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-03 15:35:08 +00:00
Alex Williamson
90fc273b2b [prefix] Allow prefix to specify a PCI autoboot device location
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-03 15:34:17 +00:00
Alex Williamson
27d1b40ee9 [romprefix] Allow ROM banner timeout to be configured independently
iPXE currently prints a "Press Ctrl-B" banner twice: once when the ROM
is first called for initialisation and again if we attempt to boot
from the ROM.  This slows boot, especially when the NIC is not the
primary boot device.  Tools such as libguestfs make use of QEMU VMs
for performing maintenance on disk images and may make use of NICs in
the VM for network support.  If iPXE introduces a static init-time
delay, that directly translates to increased runtime for the tools.

Fix by allowing the ROM banner timeout to be configured independently
of the main banner timeout.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-03 13:09:25 +00:00
Michael Brown
7667536527 [uri] Refactor URI parsing and formatting
Add support for parsing of URIs containing literal IPv6 addresses
(e.g. "http://[fe80::69ff:fe50:5845%25net0]/boot.ipxe").

Duplicate URIs by directly copying the relevant fields, rather than by
formatting and reparsing a URI string.  This relaxes the requirements
on the URI formatting code and allows it to focus on generating
human-readable URIs (e.g. by not escaping ':' characters within
literal IPv6 addresses).  As a side-effect, this allows relative URIs
containing parameter lists (e.g. "../boot.php##params") to function
as expected.

Add validity check for FTP paths to ensure that only printable
characters are accepted (since FTP is a human-readable line-based
protocol with no support for character escaping).

Construct TFTP next-server+filename URIs directly, rather than parsing
a constructed "tftp://..." string,

Add self-tests for URI functions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-02-27 13:32:53 +00:00
Michael Brown
43c8c272ae [cmdline] Rename "console" command's --bpp option to --depth
Rename the "--bpp" option to "--depth", to free up the single-letter
option "-b" for "--bottom" in preparation for adding margin support.

This does not break backwards compatibility with documented features,
since the "console" command has not yet been documented.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-01-22 14:28:49 +00:00
Michael Brown
11ad25933f [vesafb] Allow for an arbitrary margin around the text area
Allow for an arbitrary margin to be specified in the console
configuration.  If the actual screen size does not match the requested
screen size, then update any margins specified so that they remain in
the same place relative to the requested screen size.  If margins are
unspecified (i.e. zero), then leave them as zero.

The underlying assumption here is that any specified margins are
likely to describe an area within a background picture, and so should
remain in the same place relative to that background picture.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-01-22 14:28:32 +00:00
Michael Brown
608d6cac9e [fbcon] Allow for an arbitrary margin around the text area
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-01-22 14:26:31 +00:00
Michael Brown
b20fe32315 [vesafb] Handle failures from fbcon_init()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-01-22 14:16:30 +00:00
Michael Brown
fffd98bd37 [uaccess] Add memcmp_user()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-01-12 22:53:16 +01:00
Michael Brown
8f0173b5c8 [vesafb] Set "magic" colour to transparent when a background picture is used
Use the magic colour facility to cause the user interface background
to become transparent when we have a background picture.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-12-09 15:34:14 +00:00
Michael Brown
153748cce9 [lkrnprefix] Include iPXE version string in image header
Originally-implemented-by: Christian Hesse <list@eworm.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-12-06 20:11:36 +00:00
Michael Brown
54c5d08df1 [vesafb] Work around data corruption bug in bochs/qemu VBE implementation
The vgabios used by bochs and qemu (and other virtualisation products)
has a bug in its implementation of INT 10,4f00 which causes the high
16 bits of %ebx and %edx to become corrupted.

The vgabios code uses a "pushaw"/"popaw" pair to preserve the low 16
bits of all non-segment registers.  The vgabios code is compiled using
bcc, which generates 8086-compatible code and so never touches the
high 16 bits of the 32-bit registers.  However, the function
vbe_biosfn_return_controller_information() includes the line:

    size_64k = (Bit16u)((Bit32u)cur_info->info.XResolution *
				cur_info->info.XResolution *
				cur_info->info.BitsPerPixel) >> 19;

which generates an implicit call to the "lmulul" function.  This
function is implemented in vbe.c as:

    ; helper function for memory size calculation
    lmulul:
      and eax, #0x0000FFFF
      shl ebx, #16
      or  eax, ebx
      SEG SS
      mul eax, dword ptr [di]
      mov ebx, eax
      shr ebx, #16
      ret

which modifies %eax, %ebx, and %edx (as a result of the "mul"
instruction, which places its result into %edx:%eax).

Work around this problem by marking %ebx and %edx as being clobbered
by the call to INT 10,4f00.  (%eax is already used as an output
register, so does not need to be on the clobber list.)

Reported-by: Oliver Rath <rath@mglug.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-12-06 02:54:13 +00:00
Michael Brown
b0942534eb [settings] Force settings into alphabetical order within sections
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-12-05 12:43:28 +00:00
Michael Brown
03957bcb47 [linux] Provide access to SMBIOS via /dev/mem
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-12-05 03:16:27 +00:00
Michael Brown
22001cb206 [settings] Explicitly separate the concept of a completed fetched setting
The fetch_setting() family of functions may currently modify the
definition of the specified setting (e.g. to add missing type
information).  Clean up this interface by requiring callers to provide
an explicit buffer to contain the completed definition of the fetched
setting, if required.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-12-05 00:37:02 +00:00
Michael Brown
d4f7816de7 [vesafb] Select an optimal mode, rather than the first acceptable mode
There is no requirement for VBE modes to be listed in increasing order
of resolution.  With the present logic, this can cause e.g. a 1024x768
mode to be selected if the user asks for 640x480, if the 1024x768 mode
is earlier in the mode list.

Define a scoring system for modes as

  score = ( width * height - bpp )

and choose the mode with the lowest score among all acceptable modes.
This should prefer to choose the mode closest to the requested
resolution, with a slight preference for higher colour depths.

Reported-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-28 14:59:48 +00:00
Michael Brown
00bb19257f [vesafb] Return meaningful error when no suitable mode is found
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-28 12:12:05 +00:00
Michael Brown
405416e4c4 [vesafb] Skip modes for which we cannot get mode information
The VirtualBox BIOS fails to retrieve mode information (with status
0x0100) for some modes within the mode list.  Skip any such modes,
rather than treating this as a fatal error.

Reported-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-28 11:51:47 +00:00
Michael Brown
9678fedbe4 [vesafb] Include raw status value within VBE error messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-28 11:51:32 +00:00
Michael Brown
aa2e04fe1c [vesafb] Add VESA frame buffer console
The VESA frame buffer console uses the VESA BIOS extensions (VBE) to
enumerate video modes, selects an appropriate mode, and then hands off
to the generic frame buffer code.

The font is extracted from the VGA BIOS, avoiding the need to provide
an external font file.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-28 05:55:45 +00:00
Michael Brown
c501c980e0 [console] Add concept of generic console configuration
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-28 05:55:43 +00:00
Michael Brown
b2251743d8 [console] Allow console input and output to be disabled independently
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-28 05:54:53 +00:00
Michael Brown
02a63c6dec [console] Pass escape sequence context to ANSI escape sequence handlers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-27 11:27:50 +00:00
Christian Hesse
3f9a482b88 [build] Update build system for Syslinux 6.x
Syslinux 6.x places its files into a bios subdirectory, and requires
that a ldlinux.c32 module be included within the ISO image.  Add the
relevant search paths for isolinux.bin, and include the file
ldlinux.c32 within the ISO image if it exists.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-15 11:59:44 +00:00
Michael Brown
c3d1e78697 [pxe] Ensure cached DHCPACK is retrieved prior to network device creation
The retrieval of the cached DHCPACK and the creation of network
devices are both currently scheduled as STARTUP_NORMAL.  It is
therefore possible that the cached DHCPACK will not be retrieved in
time for cachedhcp_probe() to apply it to the relevant network device.

Fix by retrieving the cached DHCPACK at initialisation time rather
than at startup time.

As an optimisation, an unclaimed cached DHCPACK can be freed
immediately after the last network device has been created, rather
than waiting until shutdown.

Reported-by: Espen Braastad <espen.braastad@redpill-linpro.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-08 12:45:22 +00:00
Michael Brown
43eba2f555 [cmdline] Generate command option help text automatically
Generate the command option help text automatically from the list of
defined options.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-07 17:00:51 +00:00
Michael Brown
55e85ad1ee [cmdline] Allow "if<xxx>" commands to take options
Allow commands implemented using ifcommon_exec() to accept
command-specific options.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-05 17:15:24 +00:00
Michael Brown
5c11ff6304 [netdevice] Make all net_driver methods optional
Most network upper-layer drivers do not implement all three methods
(probe, notify, and remove).  Save code by making all methods
optional.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-11-01 02:26:44 +00:00
Michael Brown
10d19bd2ac [pxe] Always retrieve cached DHCPACK and apply to relevant network device
When chainloading, always retrieve the cached DHCPACK packet from the
underlying PXE stack, and apply it as the original contents of the
"net<X>.dhcp" settings block.  This allows cached DHCP settings to be
used for any chainloaded iPXE binary (not just undionly.kkpxe).

This change eliminates the undocumented "use-cached" setting.  Issuing
the "dhcp" command will now always result in a fresh DHCP request.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2013-10-25 17:29:25 +01:00