UEFI PXE booting across subnets

dhcp pxe-boot uefi

I'm trying to boot PCs from a Windows 2012R2 WDS server in UEFI mode. If, and only if, the client is in a different subnet from the DHCP/PXE servers, this fails with some of them. (It always works in BIOS mode, but I need UEFI.)

The symptom is that after the initial DHCP request/offer/request/ack sequence, the working clients contact the PXE server to get their boot information, and the failing ones do not.

There are two DHCP servers (also 2012R2) in addition to the PXE server. There are no boot-related DHCP options configured on them; DHCP relaying is enabled on the network and relays to all three servers.

This is the packet list from booting a working client:

1  DHCP Discover - Transaction ID 0xe828c4bc
2  DHCP Offer    - Transaction ID 0xe828c4bc (from first DHCP)
3  DHCP Offer    - Transaction ID 0xe828c4bc (from PXE)
4  DHCP Offer    - Transaction ID 0xe828c4bc (from second DHCP)
5  DHCP Request  - Transaction ID 0xe828c4bc (to first DHCP)
6  DHCP ACK      - Transaction ID 0xe828c4bc (from first DHCP)

7  4011 → 4011 Len=347                       (to PXE)
8  4011 → 4011 Len=349                       (from PXE)
9  TFTP Read Request, File: boot\x64\wdsmgfw.efi, (to PXE)
...

With a failing client, the trace looks exactly the same up to packet 6, then nothing more happens; the client simply does not contact the PXE server.

I have compared the packet contents in Wireshark, and other than the values which are dependent on what network the client is on (giaddr, router, etc.), all the offers are identical between the working and failing cases.
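A comparison like this can also be scripted rather than done by eye. The following is only a minimal sketch in Python: it assumes the offers have already been exported from the capture into plain dictionaries, and the function name, the field names, and all values in the usage example are hypothetical placeholders, not data from the traces above.

```python
# Fields expected to differ between subnets (assumed names, adjust to your export).
NETWORK_DEPENDENT = {"giaddr", "yiaddr", "router", "subnet_mask"}


def differing_fields(offer_a, offer_b, ignore=NETWORK_DEPENDENT):
    """Return {field: (value_a, value_b)} for fields that differ,
    skipping the fields that are expected to depend on the client's network."""
    keys = (offer_a.keys() | offer_b.keys()) - set(ignore)
    return {
        key: (offer_a.get(key), offer_b.get(key))
        for key in sorted(keys)
        if offer_a.get(key) != offer_b.get(key)
    }


# Hypothetical example values, for illustration only.
working = {"giaddr": "192.0.2.1", "router": "192.0.2.1", "vendor_class": "PXEClient"}
failing = {"giaddr": "198.51.100.1", "router": "198.51.100.1", "vendor_class": "PXEClient"}
print(differing_fields(working, failing))
```

An empty result means the two offers differ only in the ignored, network-dependent fields.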

This seems to affect particular firmware implementations: the working clients include VMware Workstation and ESXi as well as an Intel NUC, while it fails with Asus B150M-C mainboards and at least one Dell OptiPlex. The firmware is current, at most a few months old, on all devices involved.

It looks to me like the UEFI firmware does not know how to use a router. Is there a way to get this working?

Best Answer

The problem is the client; I should have looked more closely at my packet traces. I just figured out that right after it gets the DHCP ACK from the regular DHCP server, the failing client starts ARPing for the PXE server's address directly. That gets nowhere, of course, because the PXE server is in a different subnet and ARP requests do not cross the router.

So the problem really is that the firmware does not understand routers.
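To make the expected behaviour concrete, here is a minimal sketch in Python of the next-hop decision a correct IP stack makes before sending a packet. It is not the firmware's actual code; the function name `arp_target` and the addresses in the usage example are hypothetical and chosen only for illustration.

```python
import ipaddress


def arp_target(client_ip, netmask, router_ip, dest_ip):
    """Return the address the client should ARP for in order to reach dest_ip.

    On-link destinations are resolved directly; everything else must be
    sent via the router handed out in the DHCP lease.
    """
    network = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    dest = ipaddress.ip_address(dest_ip)
    if dest in network:
        return dest  # same subnet: ARP for the destination itself
    return ipaddress.ip_address(router_ip)  # other subnet: ARP for the router


# Hypothetical addresses: client in 192.0.2.0/24, PXE server in another subnet.
print(arp_target("192.0.2.10", "255.255.255.0", "192.0.2.1", "198.51.100.5"))
```

The failing clients behave as if the subnet check always succeeded: they ARP for the PXE server itself even though it is off-link, so the request is never answered.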