david/ipxe
Archived
Commit Graph

4981 Commits

Author SHA1 Message Date
Michael Brown bcfaf119a7 [librm] Speed up protected-mode calls under KVM
When making a call from real mode to protected mode, we save and
restore the global and interrupt descriptor table registers.  The
restore currently takes place after returning to real mode, which
generates two EXCEPTION_NMIs and corresponding VM exits when running
under KVM on an Intel CPU.

Avoid the VM exits by restoring the descriptor table registers inside
prot_to_real, while still running in protected mode.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 21:00:53 +01:00
Michael Brown c64747db50 [librm] Speed up real-to-protected mode transition under KVM
Ensure that all segment registers have zero in the low two bits before
transitioning to protected mode.  This allows the CPU state to
immediately be deemed to be "valid", and eliminates the need for any
further emulated instructions.

Load the protected-mode interrupt descriptor table after switching to
protected mode, since this avoids triggering an EXCEPTION_NMI and
corresponding VM exit.

This reduces the time taken by real_to_prot under KVM by around 50%.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:21 +01:00
Michael Brown 5a08b63cb7 [librm] Speed up protected-to-real mode transition under KVM
On an Intel CPU supporting VMX, KVM will emulate instructions while
the CPU state remains "invalid".  In real mode, the CPU state is
defined to be "invalid" if any segment register has a base which is
not equal to (sreg<<4) or a limit which is not equal to 64kB.

We don't actually use the base stored in the REAL_DS descriptor for
any significant purpose.  Change the base stored in this descriptor to
be equal to (REAL_DS<<4).  A segment register loaded with REAL_DS is
then automatically valid in both real and protected modes.  This
allows KVM to stop emulating instructions much sooner.

The only use of REAL_DS for memory accesses currently occurs in the
indirect ljmp within prot_to_real.  Change this to a direct ljmp,
storing rm_cs in .text16 as part of the ljmp instruction.  This
removes the only memory access via REAL_DS (thereby allowing for the
above descriptor base address hack), and also simplifies the ljmp
instruction (which will still have to be emulated).

Load the real-mode interrupt descriptor table register before
switching to real mode, since this avoids triggering an EXCEPTION_NMI
and corresponding VM exit.

This reduces the time taken by prot_to_real under KVM by around 65%.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:20 +01:00
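Note on 5a08b63cb7: the validity rule quoted in this commit message can be sketched in C as below. This only illustrates the rule as stated in the message; it is not code from KVM or iPXE. The function name is hypothetical, and encoding the 64kB limit as the value 0xffff is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of iPXE or KVM.
 *
 * Reports whether a segment register would count as "valid" for real
 * mode under the rule quoted in the commit message: the base must
 * equal (sreg << 4) and the limit must be 64kB.
 *
 * Assumption: a 64kB limit is encoded as the limit value 0xffff.
 */
bool real_mode_segment_valid ( uint16_t sreg, uint32_t base,
			       uint32_t limit ) {
	return ( ( base == ( ( uint32_t ) sreg << 4 ) ) &&
		 ( limit == 0xffff ) );
}
```

Under this rule, a descriptor whose base is set to (REAL_DS << 4) passes the check as soon as REAL_DS is loaded, which is the property the commit relies upon.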
Michael Brown 03e76c34d8 [librm] Add meaningful labels at section changes
The mode-transition code involves paths which switch back and forth
between the .text and .text16 sections.  At present, only the start of
each function is labelled, which makes it difficult to decode
addresses within the parts of the function existing in a different
section.

Add explicit labels at the start of each section change, so that
addresses can be meaningfully decoded to the nearest label.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:20 +01:00
Michael Brown bd640bc364 [librm] Add a profiling self-test for measuring mode transition times
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:20 +01:00
Michael Brown 9c16548506 [test] Print out profiling statistics after a successful test run
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-05-02 15:23:20 +01:00
Michael Brown 34eaf69ddf [pcbios] Do not switch to real mode to sleep the CPU
Now that we can handle interrupts while in protected mode, there is no
need to switch to real mode just to halt the CPU.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:10 +01:00
Michael Brown e4593909a8 [pcbios] Do not switch to real mode to check for timer interrupt
The currticks() function is called at least once per TCP packet, and
so is performance-critical.  Switching to real mode just to allow the
timer interrupt to fire is expensive when running inside a virtual
machine, and imposes a significant performance cost.

Fix by enabling interrupts without switching to real mode.  This
results in an approximately 100% increase in download speed when
running under KVM.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:10 +01:00
Michael Brown aaf276ccd4 [comboot] Use built-in interrupt reflector
We now have the ability to handle interrupts while in protected mode,
and so no longer need to set up a dedicated interrupt descriptor table
while running COM32 executables.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:10 +01:00
Michael Brown 23b671daf4 [librm] Allow interrupts in protected mode
When running in a virtual machine, switching to real mode may be
expensive.  Allow interrupts to be enabled while in protected mode and
reflected down to the real-mode interrupt handlers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-29 18:24:04 +01:00
Michael Brown 4413ab4f5a [build] Allow for a debug level of zero
Allow for an explicit debug level of zero, which will enable
assertions and profiling (i.e. anything controlled by NDEBUG) without
generating any debug messages.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-28 14:45:47 +01:00
Michael Brown 4e78733094 [downloader] Profile receive datapath
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-28 12:31:39 +01:00
Michael Brown e825a96a25 [http] Profile receive datapath
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-28 12:31:23 +01:00
Michael Brown 767f2acb98 [tcp] Profile transmit and receive datapaths
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-28 12:30:57 +01:00
Michael Brown f65c81b1d0 [ipv4] Profile transmit and receive datapaths
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-28 12:30:30 +01:00
Michael Brown 6d4deeeb6c [librm] Use genuine real mode to accelerate operation in virtual machines
We currently use flat real mode wherever real mode is required.  This
guarantees that we will not surprise some unsuspecting external caller
which has carefully set up flat real mode by suddenly reducing the
segment limits to 64kB.

However, operating in flat real mode imposes a severe performance
penalty in some virtualisation environments, since some CPUs cannot
fully virtualise flat real mode and so the hypervisor must fall back
to emulation.  In particular, operating under KVM on a pre-Westmere
Intel CPU will be at least an order of magnitude slower, to the point
that there is a visible teletype effect when printing anything to the
BIOS console.  (Older versions of KVM used to cheat and ignore the
"flat" part of flat real mode, which masked the problem.)

Switch (back) to using genuine real mode with 64kB segment limits
instead of flat real mode.  Hopefully this won't break anything.

Add an explicit switch to flat real mode before returning to the BIOS
from the ROM prefix, since we know that a PMM BIOS will call the ROM
initialisation point (and potentially the BEV) in flat real mode.

As noted in previous commit messages, it is not possible to restore
the real-mode segment limits after a transition to protected mode,
since there is no way to know which protected-mode segment descriptor
was originally used to initialise the limit portion of the segment
register.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-28 01:21:08 +01:00
Michael Brown b2c7b6a85e [intel] Push new RX descriptors in batches
Inside a virtual machine, writing the RX ring tail pointer may incur a
substantial overhead of processing inside the hypervisor.  Minimise
this overhead by writing the tail pointer once per batch of
descriptors, rather than once per descriptor.

Profiling under qemu-kvm (version 1.6.2) shows that this reduces the
amount of time taken to refill the RX descriptor ring by around 90%.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:48 +01:00
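Note on b2c7b6a85e: the batching idea can be sketched with a small software model of a descriptor ring. This is a simplified illustration and not the iPXE intel driver; the structure, the function names and the ring size are all hypothetical, and the "tail write" is an ordinary function that merely counts how often it is called.

```c
#define RX_NUM_DESC 8	/* Hypothetical ring size */

/* Hypothetical model of an RX descriptor ring */
struct rx_ring {
	unsigned int prod;		/* Producer counter */
	unsigned int cons;		/* Consumer counter */
	unsigned int tail;		/* Last value written to the tail */
	unsigned int tail_writes;	/* Number of tail writes issued */
};

/* Stand-in for the register write that may trap to the hypervisor */
void rx_write_tail ( struct rx_ring *ring, unsigned int tail ) {
	ring->tail = tail;
	ring->tail_writes++;
}

/* Refill every empty descriptor, then write the tail pointer once
 * for the whole batch (rather than once per descriptor).
 */
unsigned int rx_refill ( struct rx_ring *ring ) {
	unsigned int refilled = 0;

	while ( ( ring->prod - ring->cons ) < RX_NUM_DESC ) {
		/* A real driver would populate the descriptor here */
		ring->prod++;
		refilled++;
	}
	if ( refilled )
		rx_write_tail ( ring, ( ring->prod % RX_NUM_DESC ) );
	return refilled;
}
```

In this model, refilling an empty eight-entry ring issues one tail write instead of eight, which is the kind of saving the profiling figure in the commit message refers to.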
Michael Brown 8a3dcefc0c [intel] Profile common virtual machine operations
Operations which are negligible on physical hardware (such as issuing
a posted write to the transmit ring tail register) may involve
substantial amounts of processing within the hypervisor if running in
a virtual machine.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:48 +01:00
Michael Brown 2c820d684a [netdevice] Profile common operations
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:47 +01:00
Michael Brown 7c44fd68f0 [cmdline] Add "profstat" command to display profiling statistics
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:47 +01:00
Michael Brown e5f6a9be38 [profile] Add generic profiling infrastructure
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 23:14:43 +01:00
Michael Brown d36e814b8a [libc] Add flsll()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-27 16:56:09 +01:00
Michael Brown 3ffd309375 [libc] Add isqrt() function to find integer square roots
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-26 18:19:49 +01:00
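Note on 3ffd309375: the commit message does not describe the algorithm used. One common way to compute an integer square root without floating point is the digit-by-digit (binary) method sketched below. This is an assumption about a plausible approach, not a copy of iPXE's isqrt(); the function name is hypothetical.

```c
#include <limits.h>

/* Hypothetical sketch: returns the largest integer whose square does
 * not exceed the input value, using the binary digit-by-digit method.
 */
unsigned long isqrt_sketch ( unsigned long value ) {
	unsigned long result = 0;
	/* Highest power of four representable in an unsigned long */
	unsigned long bit =
		( 1UL << ( ( sizeof ( unsigned long ) * CHAR_BIT ) - 2 ) );

	/* Start from the highest power of four not exceeding the value */
	while ( bit > value )
		bit >>= 2;

	/* Determine one result bit per iteration */
	while ( bit ) {
		if ( value >= ( result + bit ) ) {
			value -= ( result + bit );
			result = ( ( result >> 1 ) + bit );
		} else {
			result >>= 1;
		}
		bit >>= 2;
	}
	return result;
}
```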
Michael Brown 9e8c48deea [test] Check for correct -mrtd assumption on libgcc arithmetic functions
As observed in commit 082cedb ("[build] Fix __libgcc attribute for
recent gcc versions"), recent versions of gcc have changed the
semantics of -mrtd as applied to the implicit arithmetic functions.

It is possible for tests to succeed even if our assumptions about
gcc's interpretation of -mrtd are incorrect.  In particular, if gcc
chooses to utilise a frame pointer in the calling function, then it
can tolerate a temporarily incorrect stack pointer (since the stack
pointer will shortly afterwards be restored from the frame pointer
anyway).

Add tests designed specifically to check that our implementations of
the implicit arithmetic functions manipulate the stack pointer as
expected by gcc.

The effect of these tests can be observed by temporarily reverting
commit 082cedb ("[build] Fix __libgcc attribute for recent gcc
versions"): without this fix in place, the tests will fail on gcc 4.7
and later.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-26 16:00:26 +01:00
Michael Brown 082cedb3c3 [build] Fix __libgcc attribute for recent gcc versions
We observed some time ago (in commit 4ce8d61 "Import various libgcc
functions from syslinux") that gcc seems to treat calls to the
implicit arithmetic functions (e.g. __udivdi3()) as being affected by
-mregparm but unaffected by -mrtd.

This seems to be no longer the case with current gcc versions, which
treat calls to these functions as being affected by both -mregparm and
-mrtd, as expected.

There is nothing obvious in the gcc changelogs to indicate precisely
when this happened.  From experimentation with available gcc versions,
the change occurred sometime between v4.6.3 and v4.7.2.  We assume
that only versions up to v4.6.x require the special treatment.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-25 16:06:37 +01:00
Michael Brown ad7d5af5e1 [test] Add tests for 64-bit division
On a 32-bit system, 64-bit division is implemented using the libgcc
functions provided in __udivmoddi4.c etc.  Calls to these functions
are generated automatically by gcc, with a calling convention that is
somewhat empirical in nature.  Add these self-tests primarily as a
check that we are using the correct calling convention.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-25 01:45:13 +01:00
Michael Brown dce7107fc0 [libc] Add inline assembly implementation of flsl() using BSR instruction
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-24 14:49:08 +01:00
Michael Brown 8f0e0e1356 [test] Add self-tests for flsl()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-24 13:40:35 +01:00
Michael Brown 5c6aa56f28 [test] Rewrite TCP/IP tests using okx()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-24 13:01:33 +01:00
Peter Pickford d644ad41f5 [serial] Enable UART FIFOs
Escape sequences received via the serial console can fail since the
cpu_nap() in getchar_timeout() can delay processing for more than the
time it takes for a single character to arrive.

Fix by enabling the UART FIFOs.

Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-22 13:59:21 +01:00
Michael Brown 27884298a3 [intel] Avoid completely filling the TX descriptor ring
It is unclear from the datasheets whether or not the TX ring can be
completely filled (i.e. whether writing the tail value as equal to the
current head value will cause the ring to be treated as completely
full or completely empty).  It is very plausible that this edge case
could differ in behaviour between real hardware and the many
implementations of an emulated Intel NIC found in various virtual
machines.  Err on the side of caution and always leave at least one
ring entry empty.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-22 13:12:54 +01:00
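Note on 27884298a3: the "always leave at least one ring entry empty" policy can be sketched as a fill limit of one less than the ring size. The model below is a simplified illustration with hypothetical names and a hypothetical ring size; it is not the iPXE intel driver.

```c
#include <stdbool.h>

#define TX_NUM_DESC 8			/* Hypothetical ring size */
#define TX_FILL ( TX_NUM_DESC - 1 )	/* Never use the final entry */

/* Hypothetical model of a TX descriptor ring */
struct tx_ring {
	unsigned int prod;	/* Producer counter */
	unsigned int cons;	/* Consumer counter */
};

/* Check for space, treating the ring as full one entry early */
bool tx_ring_has_space ( const struct tx_ring *ring ) {
	return ( ( ring->prod - ring->cons ) < TX_FILL );
}

/* Enqueue one descriptor; returns 0 on success, -1 if ring is full */
int tx_enqueue ( struct tx_ring *ring ) {
	if ( ! tx_ring_has_space ( ring ) )
		return -1;
	/* A real driver would populate the descriptor here */
	ring->prod++;
	return 0;
}
```

With this limit the tail index never becomes equal to the head index while descriptors are outstanding, so the ambiguous "tail equals head" case described in the commit message does not arise.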
Michael Brown 93acb5d8d0 [crypto] Allow wildcard matches on commonName as well as subjectAltName
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-04-01 11:36:11 +01:00
Michael Brown f10726c8bb [crypto] Add support for subjectAltName and wildcard certificates
Originally-implemented-by: Alex Chernyakhovsky <achernya@google.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-31 13:36:54 +01:00
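Note on f10726c8bb: a minimal sketch of wildcard name matching is shown below. It assumes that a wildcard may appear only as a leading "*." and covers exactly one label. The function name is hypothetical, the comparison is case-sensitive for simplicity, and this is not iPXE's x509 code.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch: compare an expected host name against a name
 * taken from a certificate.
 *
 * Assumption: the only supported wildcard is a leading "*.", which
 * matches exactly one leading label of the expected name.
 * Simplification: the comparison is case-sensitive.
 */
bool dnsname_matches ( const char *expected, const char *dnsname ) {
	const char *dot;

	if ( strncmp ( dnsname, "*.", 2 ) == 0 ) {
		/* Skip the first label of the expected name */
		dot = strchr ( expected, '.' );
		if ( ! dot )
			return false;
		expected = ( dot + 1 );
		dnsname += 2;
	}
	return ( strcmp ( expected, dnsname ) == 0 );
}
```

Under these assumptions "*.example.com" matches "boot.example.com" but matches neither "example.com" nor "a.b.example.com".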
Michael Brown f1c5f86eef [test] Add subject alternative names to X.509 server test certificate
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-31 13:33:46 +01:00
Michael Brown 357f23da9a [test] Add tests for x509_check_name()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-31 13:16:46 +01:00
Michael Brown 7945542fb0 [test] Rewrite CMS tests using okx()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-31 13:07:43 +01:00
Michael Brown cc018ca7d4 [test] Rewrite X.509 tests using okx()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-31 13:07:26 +01:00
Michael Brown 7c7c957094 [crypto] Allow signed timestamp error margin to be configured at build time
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-30 20:08:00 +01:00
Michael Brown d90490578d [crypto] Use fingerprint when no common name is available for debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-28 18:42:41 +00:00
Michael Brown bc8ca6b8ce [crypto] Generalise X.509 cache to a full certificate store
Expand the concept of the X.509 cache to provide the functionality of
a certificate store.  Certificates in the store will be automatically
used to complete certificate chains where applicable.

The certificate store may be prepopulated at build time using the
CERT=... build command line option.  For example:

  make bin/ipxe.usb CERT=mycert1.crt,mycert2.crt

Certificates within the certificate store are not implicitly trusted;
the trust list is specified using TRUST=... as before.  For example:

  make bin/ipxe.usb CERT=root.crt TRUST=root.crt

This can be used to embed the full trusted root certificate within the
iPXE binary, which is potentially useful in an HTTPS-only environment
in which there is no HTTP server from which to automatically download
cross-signed certificates or other certificate chain fragments.

This usage of CERT= extends the existing use of CERT= to specify the
client certificate.  The client certificate is now identified
automatically by checking for a match against the private key.  For
example:

  make bin/ipxe.usb CERT=root.crt,client.crt TRUST=root.crt KEY=client.key

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-28 17:09:40 +00:00
Michael Brown 2dd3fffe18 [crypto] Add pubkey_match() to check for matching public/private key pairs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-27 00:30:47 +00:00
Michael Brown c27b3c7c33 [build] Add dependency of generated files upon Makefile
Ensure that any generated files (such as DER forms of X.509
certificates) are rebuilt if the Makefile changes.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-26 21:36:41 +00:00
Michael Brown 8540300951 [build] Disable ccache for all relevant build targets
The build process currently attempts to disable ccache for files using
the .incbin directive, but the rule fails to apply to anything beyond
the simple object target.  Fix by applying to all relevant build
targets (including debug objects, assembly listings, and so on).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-26 21:28:27 +00:00
Michael Brown 9087a03391 [build] Remove long-obsolete mechanism for wrapping embedded images
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-26 21:26:17 +00:00
Michael Brown e1ebc50f81 [crypto] Remove dynamically-allocated storage for certificate OCSP URI
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-25 16:30:43 +00:00
Michael Brown 01fa7efa38 [crypto] Remove dynamically-allocated storage for certificate name
iPXE currently allocates a copy of the certificate's common name as a
string.  This string is used by the TLS and CMS code to check
certificate names against an expected name, and also appears in
debugging messages.

Provide a function x509_check_name() to centralise certificate name
checking (in preparation for adding subjectAlternativeName support),
and a function x509_name() to provide a name to be used in debugging
messages, and remove the dynamically allocated string.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-25 16:30:43 +00:00
Alexander Chernyakhovsky 151e4d9bfa [ocsp] Handle OCSP responses that don't provide certificates
Certificate authorities are not required to send the certificate used
to sign the OCSP response if the response is signed by the original
issuer.

Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-25 16:30:43 +00:00
Michael Brown e845b7da9b [http] Accept Content-Length header with trailing whitespace
At least one HTTP server (Google's OCSP responder) has been observed
to generate a Content-Length header with trailing whitespace.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-25 15:46:14 +00:00
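Note on e845b7da9b: one way to tolerate trailing whitespace is to parse the digits and then require that only whitespace remains. The sketch below uses the standard C library and a hypothetical function name; it is not iPXE's HTTP code, and overflow checking is omitted for brevity.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical sketch: parse a Content-Length header value.
 *
 * Returns 0 and stores the length on success, or -1 if the value
 * does not start with a digit or is followed by anything other than
 * whitespace.  Overflow checking is omitted for brevity.
 */
int parse_content_length ( const char *value, unsigned long *len ) {
	char *endp;

	/* Require at least one leading digit */
	if ( ! isdigit ( ( unsigned char ) *value ) )
		return -1;
	*len = strtoul ( value, &endp, 10 );

	/* Tolerate trailing whitespace, reject anything else */
	while ( isspace ( ( unsigned char ) *endp ) )
		endp++;
	return ( ( *endp == '\0' ) ? 0 : -1 );
}
```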
Michael Brown c1595129b5 [bios] Fix screen clearing on even more buggy BIOSes
Some BIOSes (observed with a ProLiant DL360p Gen8 SE) perform no range
checking whatsoever on the parameters passed to INT10,06 and will
therefore happily write to an area beyond the end of video RAM.  The
area immediately following the video RAM tends to be the VGA BIOS ROM
image.  Overwriting the VGA BIOS leads to an interesting variety of
crashes and reboots.

Fix by specifying an exact width and height to be cleared, rather than
passing in large values and relying upon the BIOS to truncate them to
the appropriate range.

Reported-by: Alex Davies <adavies@jumptrading.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-21 16:56:34 +00:00
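Note on c1595129b5: the fix amounts to passing the exact lower right corner of the screen to INT 10h, AH=06h. The sketch below only computes the register values for such a call and does not invoke the BIOS. The structure and function names are hypothetical, and the caller is assumed to already know the screen width and height in character cells.

```c
#include <stdint.h>

/* Hypothetical register image for an INT 10h, AH=06h call */
struct int10_scroll_regs {
	uint16_t ax;	/* AH = 06h, AL = 00h (clear entire window) */
	uint16_t bx;	/* BH = attribute used for the blanked cells */
	uint16_t cx;	/* CH,CL = row,column of upper left corner */
	uint16_t dx;	/* DH,DL = row,column of lower right corner */
};

/* Build registers to clear exactly width x height character cells,
 * rather than passing oversized values and relying upon the BIOS to
 * truncate them.
 */
struct int10_scroll_regs clear_screen_regs ( unsigned int width,
					     unsigned int height,
					     uint8_t attribute ) {
	struct int10_scroll_regs regs;

	regs.ax = 0x0600;
	regs.bx = ( attribute << 8 );
	regs.cx = 0x0000;
	regs.dx = ( ( ( height - 1 ) << 8 ) | ( width - 1 ) );
	return regs;
}
```

For an 80x25 text mode this gives DH=24 and DL=79, so the BIOS is never asked to touch cells beyond the visible screen.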
Michael Brown ccb6e5c627 [realtek] Clear bit 24 of RCR
On an Asus Z87-K motherboard with an onboard 8168 NIC, booting into
Windows 7 and then warm rebooting into iPXE results in a broken RX
datapath: packets can be transmitted successfully but garbage is
received.  A cold reboot clears the problem.

A dump of the PHY registers reveals only one difference: in the
failure case the bits ADVERTISE_PAUSE_CAP and ADVERTISE_PAUSE_ASYM are
cleared.  Explicitly setting these bits does not fix the problem.

A dump of the MAC registers reveals a few differences, of which the
most obvious culprit is the undocumented bit 24 of the Receive
Configuration Register (RCR), which is set in the failure case.
Explicitly clearing this bit does fix the problem.

Reported-by: Sebastian Nielsen <ipxe@sebbe.eu>
Reported-by: Oliver Rath <rath@mglug.de>
Debugged-by: Sebastian Nielsen <ipxe@sebbe.eu>
Tested-by: Sebastian Nielsen <ipxe@sebbe.eu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2014-03-20 15:54:25 +00:00
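Note on ccb6e5c627: explicitly clearing the bit is a read-modify-write of the register value. The sketch below operates on a plain 32-bit value instead of a real hardware register; the macro and function names are hypothetical and are not taken from the iPXE realtek driver.

```c
#include <stdint.h>

/* Hypothetical name for the undocumented bit 24 of the RCR */
#define RCR_UNDOCUMENTED_BIT24 ( ( uint32_t ) 1 << 24 )

/* Return the RCR value with bit 24 cleared and all other bits kept.
 * A real driver would read the register, apply this mask, and write
 * the result back.
 */
uint32_t rcr_clear_bit24 ( uint32_t rcr ) {
	return ( rcr & ~RCR_UNDOCUMENTED_BIT24 );
}
```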