Electronic – Does PCIe hotplug actually work in practice


I've got into a discussion in the comments of https://security.stackexchange.com/questions/109199/is-physical-security-less-important-now-for-securing-a-server?noredirect=1#comment194327_109199

The question is simple. Has anyone experience of successfully hotplugging a PCIe card? Does it require special motherboards and cards, or is it supposed to work on all consumer hardware?

Best Answer

I used to design PCI-Express hardware that required full hot-plug support in hardware and software, and it certainly is possible, but it's quite involved and requires extensive software support -- the hardware is actually quite simple. I had to design the hardware, then implement BIOS (UEFI) and kernel (Linux) support for hot-plugging arbitrary PCIe devices over fiber and copper.

From a software point of view, one must remember that PCIe continues with the PCI software model, including the concepts of bus, device, and function addressing. When the PCI bus is enumerated, it's done as a depth-first search: each bridge's downstream buses are numbered before moving on to its siblings (figure: PCI bus topology, from tldp.org).
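To make the bus, device, function addressing concrete: an address packs into 16 bits, 8 for the bus, 5 for the device and 3 for the function. Here is a minimal Python sketch of that packing. The helper names are my own and purely illustrative, not taken from any library.

```python
def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into 16 bits."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus must be 0-255, device 0-31, function 0-7")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf: int) -> tuple:
    """Split a packed 16-bit address back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x07


def format_bdf(bdf: int) -> str:
    """Render the address in the bus:device.function style that lspci prints."""
    bus, device, function = unpack_bdf(bdf)
    return f"{bus:02x}:{device:02x}.{function:x}"
```

For example, `format_bdf(pack_bdf(3, 0, 0))` gives `03:00.0`, which is the form you will see in `lspci` output.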

PCIe enumeration is generally done twice. First, your BIOS (UEFI or otherwise) will do it, to figure out what's present and how much memory each device needs. This data can then be passed on to the host OS, which can take it as-is, but Linux and Windows often perform their own enumeration procedure as well. On Linux, this is done through the core PCI subsystem, which searches the bus, applies any quirks if necessary based on the ID of the device, and then binds a driver whose ID table matches the device by calling its probe function. A PCI device is ID'd through a combination of its Vendor ID (16 bits, e.g. Intel is 0x8086) and Device ID (another 16 bits) -- the most common internet source is here: http://pcidatabase.com/.
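The ID matching step can be sketched in a few lines of Python. This assumes you already have the raw bytes of a device's configuration header, where the Vendor ID sits at offset 0 and the Device ID at offset 2, both little-endian. The driver table and function names are hypothetical stand-ins for the kernel's real ID tables, not an actual API.

```python
import struct

# Hypothetical driver table: (vendor_id, device_id) -> driver name.
# The single entry is only an illustration of the lookup.
DRIVER_TABLE = {
    (0x8086, 0x100E): "e1000",
}


def read_ids(config_space: bytes) -> tuple:
    """Return (vendor_id, device_id) from the start of a config header."""
    vendor_id, device_id = struct.unpack_from("<HH", config_space, 0)
    return vendor_id, device_id


def find_driver(config_space: bytes):
    """Return the matching driver name, or None if nothing matches."""
    vendor_id, device_id = read_ids(config_space)
    if vendor_id == 0xFFFF:
        # All ones means no device answered at this address.
        return None
    return DRIVER_TABLE.get((vendor_id, device_id))
```

A read of all ones for the Vendor ID is how enumeration software recognizes an empty slot, so the sketch treats it as "nothing here" rather than as an unknown device.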

The custom software part comes in during this enumeration process: you must reserve PCI bus numbers and memory segments ahead of time for potential future devices -- this is sometimes called 'bus padding'. This avoids the need to re-enumerate the bus in the future, which often cannot be done without disruption to the system. A PCI device has BARs (base address registers) which tell the host how much address space the device needs and of what type (memory or I/O space) -- this is why you don't need jumpers like ISA anymore :) On the OS side, the Linux kernel implements PCIe hotplug through the pciehp driver. Windows does different things based on the version -- older versions (I think XP) ignore anything the BIOS says and do their own probing. Newer versions I believe are more respectful of the ACPI DSDT provided by the host firmware (BIOS/EFI) and will incorporate that information.
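As a rough illustration of how a BAR reports its needs: enumeration software writes all ones to the BAR and reads it back. The device hard-wires the low address bits to zero, so the lowest address bit that reads back as one gives the size of the region. The Python sketch below decodes such a read-back value. It only covers 32-bit BARs (64-bit memory BARs span two registers and are ignored here), and the function name is my own.

```python
def decode_bar_probe(readback: int) -> dict:
    """Interpret the value read from a 32-bit BAR after writing all ones."""
    readback &= 0xFFFFFFFF
    if readback == 0:
        # A BAR that reads back zero is not implemented by the device.
        return {"kind": "unimplemented", "size": 0}
    is_io = bool(readback & 0x1)        # bit 0: 1 = I/O space, 0 = memory
    flag_mask = 0x3 if is_io else 0xF   # low bits carry type flags, not address
    address_bits = readback & ~flag_mask
    # The lowest writable address bit equals the size of the region.
    size = address_bits & -address_bits
    return {"kind": "io" if is_io else "memory", "size": size}
```

A read-back of `0xFFF00000` therefore decodes as a 1 MiB memory region, and the host has to find (or have padded in advance) a 1 MiB aligned hole for it.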

This may seem pretty involved, and it is! But remember that any laptop / device with an ExpressCard slot (one that implements PCIe, since you can have USB-only ExpressCards) must do this, though generally the padding is pretty simple -- just one bus. My old hardware was a PCIe switch that had another 8 devices behind it, so padding got somewhat more complicated.
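A toy version of the padding bookkeeping, in Python: every hot-plug port gets a fixed window of bus numbers reserved behind it, expressed as the secondary and subordinate bus numbers its bridge would be programmed with. This is only a sketch of the idea under simplified assumptions (equal windows for every port, no memory windows), not what any particular BIOS does. The function name and parameters are made up for illustration.

```python
def pad_buses(port_count: int, first_bus: int, buses_per_port: int = 1) -> list:
    """Reserve a window of bus numbers behind each empty hot-plug port.

    Returns one (secondary, subordinate) pair per port. The bridge for that
    port forwards config cycles for every bus number in the inclusive range.
    """
    if port_count < 0 or buses_per_port < 1:
        raise ValueError("need a non-negative port count and at least one bus per port")
    windows = []
    next_bus = first_bus
    for _ in range(port_count):
        secondary = next_bus
        subordinate = secondary + buses_per_port - 1
        if subordinate > 255:
            raise ValueError("ran out of bus numbers (the maximum is 255)")
        windows.append((secondary, subordinate))
        next_bus = subordinate + 1
    return windows
```

The ExpressCard case is `pad_buses(1, first_bus)`, a single reserved bus. An 8-port switch with room for another bridge behind each port would be something like `pad_buses(8, first_bus, 2)`, and you can see how quickly the 256 available bus numbers get eaten.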

From a hardware point of view, it's a lot easier. GND pins of the card make contact first, and we'd place a hot-swap controller IC from LTC or similar on the card to sequence power once the connection is made. At this point, the on-board ASIC or FPGA begins its power-up sequence and starts attempting to link-train its PCI Express link. Assuming the host supports hot-plugging and the PCI Express SLTCAP/SLTCTRL registers for that port were configured to indicate the port is hot-plug capable, the software can begin to enumerate the new device. (In the spec these are the PCI Express Slot Capability Register and PCI Express Slot Control Register. There is a 1 and 2 for these as well -- enough bits to split across two regs.) The slot status register (SLTSTA, PCI Express Slot Status Register) contains bits that report power faults, the mechanical release latch, and of course presence detect + presence changed.
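For a feel of what the hot-plug software sees, here is a small Python decoder for a raw slot status value. The bit positions follow the `PCI_EXP_SLTSTA_*` definitions in the Linux `pci_regs.h` header as I remember them, so check them against the spec before relying on this. The flag names are my own shorthand.

```python
# Slot Status bit masks, following Linux's PCI_EXP_SLTSTA_* definitions.
# The right-hand names are informal labels chosen for this sketch.
SLTSTA_BITS = {
    0x0001: "attention_button_pressed",
    0x0002: "power_fault_detected",
    0x0004: "mrl_sensor_changed",
    0x0008: "presence_detect_changed",
    0x0010: "command_completed",
    0x0020: "mrl_sensor_open",
    0x0040: "card_present",
    0x0080: "interlock_engaged",
    0x0100: "link_state_changed",
}


def decode_slot_status(value: int) -> list:
    """Return the names of all flags set in a 16-bit slot status value."""
    return [name for mask, name in SLTSTA_BITS.items() if value & mask]
```

A freshly inserted card would typically show up as presence changed plus card present, i.e. `decode_slot_status(0x0048)`, and that is the cue to go and enumerate behind the port.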

The aforementioned registers are located in 'PCI (Express) Configuration Space', which is a small region of the memory map (4K for PCIe) allocated to each potential bdf (bus:device:function). The actual registers generally reside on the peripheral device.
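With memory-mapped configuration access (ECAM), that 4K-per-function layout turns into simple arithmetic: the bus, device and function numbers select a 4K page, and the register offset indexes into it. A minimal Python sketch follows. The function name is my own, and the base address you add the result to comes from your firmware, so it is left out here.

```python
def ecam_offset(bus: int, device: int, function: int, register: int = 0) -> int:
    """Byte offset of a config register from the start of the ECAM region."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus must be 0-255, device 0-31, function 0-7")
    if not 0 <= register < 4096:
        raise ValueError("register offset must fit in the 4K config space")
    # 12 bits of register offset, then 3 of function, 5 of device, 8 of bus.
    return (bus << 20) | (device << 15) | (function << 12) | register
```

Covering every possible bus:device:function this way takes 256 MiB of address space, which is why the region is small per function but not small overall.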

On the host side, we can use PRSNT1#/PRSNT2# as simple DC signals that feed the enable of a power switch IC, or run to GPIO on the chipset / PCH to cause an IRQ and trigger a SW 'hey, something got inserted, go find it and configure it!' routine.

This is a lot of information that doesn't directly answer your question (see below for the quick summary), but hopefully it gives you a better background in understanding the process. If you have any questions about specific parts of the process, let me know in a comment here or shoot me an email and I can discuss further + update this answer with that info.

To summarize -- the peripheral device must have been designed with hot-plug support in mind from a hardware POV. A properly designed host / slot is hot-plug capable as well, and on a high-end motherboard I would expect it to be safe. However, the software support for this is another question entirely and you are unfortunately beholden to the BIOS your OEM has supplied you.

In practice, you use this technology anytime you remove/insert a PCIe ExpressCard from a computer. Additionally, high-performance blade systems (telecom or otherwise) utilize this technology regularly as well.

Final comment -- save the linked PDF of the Base Spec; PCI-SIG usually charges bucks for that :)