It's been a while since this was asked, but I hate orphaned questions :)
First, let's over-simplify a modern x86 platform and pretend it has 32 bits of address space from 0x00000000 to 0xFFFFFFFF. We'll ignore all the special / reserved areas, TOLUD (top of lower usable DRAM, Intel parlance) holes, etc. We'll call this the system memory map.
Second, PCI Express extends PCI. From a software point of view, they are very, very similar.
I'll jump to your 3rd one -- configuration space -- first. Any addresses that point to configuration space are allocated from the system memory map. A PCI device has a 256-byte configuration space; PCI Express extends this to 4 KB. This 4 KB space consumes addresses from the system memory map, but the actual values / bits / contents are generally implemented in registers on the peripheral device. For instance, when you read the Vendor ID or Device ID, the target peripheral device will return the data even though the memory address being used is from the system memory map.
You stated these are "allocated into RAM" -- not true; the actual bits / stateful elements are in the peripheral device, but they are mapped into the system memory map. Next, you asked if it was a common set of registers across all PCIe devices -- yes and no. The way PCI config space works, there's a standard header that everyone implements, followed by a chain of capability structures -- each one has a pointer indicating whether there is more "stuff" to be read. There's a bare minimum that all PCIe devices have to implement, and more advanced devices can implement more. As for how useful it is for functional operation, well, it's mandatory and heavily utilized. :)
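To make that layout a bit more concrete, here's a small user-space sketch for a Linux host that reads the Vendor ID / Device ID and walks the capability chain through sysfs. The BDF in the path is a placeholder (substitute a real device from lspci -D), and reading past the first 64 bytes of config space generally requires root:

```c
/* Sketch: dump a device's IDs and walk its capability list from user space
 * on Linux via the sysfs "config" file. The BDF below is a placeholder. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    const char *path = "/sys/bus/pci/devices/0000:00:00.0/config"; /* placeholder BDF */
    uint8_t cfg[256];
    memset(cfg, 0, sizeof cfg);

    FILE *f = fopen(path, "rb");
    if (!f) { perror("open config"); return 1; }
    size_t n = fread(cfg, 1, sizeof cfg, f);
    fclose(f);
    if (n < 64) { fprintf(stderr, "short read\n"); return 1; }

    uint16_t vendor = cfg[0x00] | (cfg[0x01] << 8);   /* offset 0x00: Vendor ID */
    uint16_t device = cfg[0x02] | (cfg[0x03] << 8);   /* offset 0x02: Device ID */
    printf("vendor %04x device %04x\n", vendor, device);

    /* Status register bit 4 says a capability list exists; offset 0x34 points
     * to the first entry. Each entry starts with (capability ID, next pointer). */
    if (cfg[0x06] & 0x10) {
        uint8_t ptr = cfg[0x34] & 0xFC;
        while (ptr) {
            printf("capability ID 0x%02x at offset 0x%02x\n", cfg[ptr], ptr);
            ptr = cfg[ptr + 1] & 0xFC;
        }
    }
    return 0;
}
```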
Now, your question about BARs (base address registers) is a good place to segue into memory space and I/O space. Being somewhat x86-centric, the specification lets a BAR convey its size as well as its type: a device can request a regular memory-mapped BAR, or an I/O space BAR, which eats into the 4 KB of I/O space an x86 machine has. You'll notice that on PowerPC machines, I/O space BARs are worthless.
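As an aside on how the host learns the size: the enumerating firmware writes all 1s to the BAR and reads it back, and the bits the device refuses to accept encode the size. A rough sketch of that handshake, using hypothetical cfg_read32() / cfg_write32() config-space accessors (stand-ins for whatever your environment provides -- port 0xCF8/0xCFC, ECAM, etc.) and handling only 32-bit memory BARs:

```c
#include <stdint.h>

uint32_t cfg_read32(int bus, int dev, int fn, int off);              /* hypothetical accessor */
void     cfg_write32(int bus, int dev, int fn, int off, uint32_t v); /* hypothetical accessor */

/* Returns the size in bytes requested by a 32-bit memory BAR at config
 * offset 'off' (0x10, 0x14, ...), or 0 if the BAR is unimplemented. */
uint32_t bar_size(int bus, int dev, int fn, int off)
{
    uint32_t orig = cfg_read32(bus, dev, fn, off);

    cfg_write32(bus, dev, fn, off, 0xFFFFFFFF);      /* write all 1s */
    uint32_t probe = cfg_read32(bus, dev, fn, off);  /* device hardwires size bits to 0 */
    cfg_write32(bus, dev, fn, off, orig);            /* restore the original value */

    if (probe == 0)
        return 0;                                    /* no BAR here */

    probe &= ~0xFu;                                  /* bits 3:0 are type flags, not size */
    return ~probe + 1;                               /* e.g. 0xFFF00000 -> 0x00100000 (1 MB) */
}
```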
A BAR is basically the device's way to tell the host how much address space it needs, and of what type (discussed above). If I ask for, say, 1 MB of memory-mapped space, the BIOS may assign me addresses 0x10000000 to 0x10100000. This is not consuming physical RAM, just address space (do you see now why 32-bit systems run into issues with expansion cards like high-end GPUs that have GBs of RAM?). Now a memory write / read to, say, 0x10000004 will be sent to the PCI Express device, and that address may be a byte-wide register that connects to LEDs. So if I write 0xFF to physical memory address 0x10000004, that will turn on 8 LEDs. This is the basic premise of memory-mapped I/O.
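On a Linux host you could poke a register like that from user space by mmap()ing the BAR through sysfs. A sketch using the made-up LED register above at BAR0 offset 4 (the BDF is again a placeholder, and the kernel has already programmed the BAR, so you never touch the physical address 0x10000000 directly):

```c
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *res = "/sys/bus/pci/devices/0000:01:00.0/resource0"; /* placeholder BDF, BAR0 */
    int fd = open(res, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    /* Map one page of the BAR into our address space. */
    void *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    volatile uint8_t *bar = map;
    bar[4] = 0xFF;   /* the hypothetical 0x10000004 write: all 8 LEDs on */

    munmap(map, 4096);
    close(fd);
    return 0;
}
```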
I/O space behaves similarly, except it lives in a separate address space, the x86 I/O space. Address 0x3F8 (COM1) exists both in I/O space and in memory space, and the two are completely different things.
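A little x86 Linux sketch to drive that home: the same number, 0x3F8, reached through IN/OUT instructions is unrelated to a memory load from address 0x3F8. This assumes root (for ioperm) and a machine with a legacy COM1; sys/io.h is x86-specific:

```c
#include <stdio.h>
#include <sys/io.h>   /* x86-only: ioperm(), inb(), outb() */

int main(void)
{
    if (ioperm(0x3F8, 8, 1)) { perror("ioperm"); return 1; }

    /* I/O space: this compiles to an x86 IN instruction, not a memory load. */
    unsigned char lsr = inb(0x3F8 + 5);    /* 16550 UART Line Status Register */
    printf("COM1 LSR = 0x%02x\n", lsr);

    /* By contrast, dereferencing a pointer to address 0x3F8 would be a plain
     * access into the system memory map -- a completely different location. */
    return 0;
}
```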
Your last question: "messages" refer to a newer interrupt mechanism, message signaled interrupts, or MSI for short. Legacy PCI devices had four interrupt pins: INTA, INTB, INTC, INTD. These were generally swizzled among slots such that a device's INTA went to the interrupt controller's INTA line on Slot 0, to INTB on Slot 1, INTC on Slot 2, INTD on Slot 3, and then back to INTA on Slot 4. The reason for this is that most PCI devices implemented only INTA, so with, say, three devices, the swizzling meant each one ended up with its own interrupt signal to the interrupt controller. MSI is simply a way of signaling interrupts using the PCI Express protocol layer, and the PCIe root complex (the host) takes care of interrupting the CPU.
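For flavor, here's roughly what requesting an MSI looks like in a modern Linux driver's probe path. The handler and the "my_*" names are placeholders, but pci_alloc_irq_vectors() / pci_irq_vector() are the real kernel APIs:

```c
#include <linux/pci.h>
#include <linux/interrupt.h>

static irqreturn_t my_irq_handler(int irq, void *data)
{
    /* acknowledge the device, schedule work, etc. */
    return IRQ_HANDLED;
}

static int my_setup_irq(struct pci_dev *pdev, void *my_dev)
{
    int ret;

    /* Ask the PCI core for exactly one MSI vector (no legacy INTx fallback). */
    ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
    if (ret < 0)
        return ret;

    /* The root complex turns the device's MSI write into a CPU interrupt;
     * to the driver it just looks like a normal Linux IRQ number. */
    return request_irq(pci_irq_vector(pdev, 0), my_irq_handler, 0,
                       "my_pcie_device", my_dev);
}
```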
This answer might be too late to help you, but maybe it will help some future Googler / Binger.
Finally, I recommend reading this book from Intel to get a good, detailed introduction to PCIe before you go any further. Another good reference is Linux Device Drivers, which is available free online through LWN.
I used to design PCI-Express hardware that required full hot-plug support in hardware and software, and it certainly is possible, but it's quite involved and requires extensive software support -- the hardware is actually quite simple. I had to design the hardware, then implement BIOS (UEFI) and kernel (Linux) support for hot-plugging arbitrary PCIe devices over fiber and copper.
From a software point of view, one must remember that PCIe continues with the PCI software model, including the concept of bus:device:function addressing. When the PCI bus is enumerated, it's done as a breadth-first search.
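Something like the sketch below, scanning an already-numbered bus. The cfg_read16() / cfg_read8() accessors and queue_bus() are hypothetical placeholders, and real firmware also sizes BARs and programs bridge windows at each step; this only shows the walk itself:

```c
#include <stdint.h>

uint16_t cfg_read16(int bus, int dev, int fn, int off);  /* hypothetical accessor */
uint8_t  cfg_read8(int bus, int dev, int fn, int off);   /* hypothetical accessor */
void     queue_bus(int bus);  /* hypothetical: remember this bus for the next pass */

void enumerate_bus(int bus)
{
    for (int dev = 0; dev < 32; dev++) {
        for (int fn = 0; fn < 8; fn++) {
            if (cfg_read16(bus, dev, fn, 0x00) == 0xFFFF)   /* no device: reads back all 1s */
                continue;

            /* Header type 0x01 marks a PCI-PCI bridge (e.g. a PCIe switch or
             * root port); queue its secondary bus so the next level of the
             * breadth-first walk visits everything behind it. */
            uint8_t hdr = cfg_read8(bus, dev, fn, 0x0E);
            if ((hdr & 0x7F) == 0x01)
                queue_bus(cfg_read8(bus, dev, fn, 0x19));   /* secondary bus number */

            if (fn == 0 && !(hdr & 0x80))   /* bit 7 clear: single-function device */
                break;
        }
    }
}
```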
PCIe enumeration is generally done twice. First, your BIOS (UEFI or otherwise) will do it, to figure out what's present and how much memory it needs. This data can then be passed on to the host OS, which can take it as-is, but Linux and Windows often perform their own enumeration procedure as well. On Linux, this is done through the core PCI subsystem, which walks the bus, applies any quirks based on the ID of the device, and then loads a driver whose ID table matches, calling that driver's probe function. A PCI device is identified through a combination of its Vendor ID (16 bits, e.g. Intel is 0x8086) and Device ID (another 16 bits) -- the most common internet source is here: http://pcidatabase.com/.
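On the Linux side, the "driver with a matching ID" part looks roughly like this skeleton (the device ID 0x1234 is made up; the structure and macros are the standard kernel ones):

```c
#include <linux/module.h>
#include <linux/pci.h>

static const struct pci_device_id my_ids[] = {
    { PCI_DEVICE(0x8086, 0x1234) },   /* vendor 0x8086 (Intel), made-up device ID */
    { 0, }
};
MODULE_DEVICE_TABLE(pci, my_ids);

/* Called by the PCI core when an enumerated device matches my_ids. */
static int my_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    int ret = pci_enable_device(pdev);
    if (ret)
        return ret;
    /* map BARs, request IRQs, register with a subsystem, ... */
    return 0;
}

static void my_remove(struct pci_dev *pdev)
{
    pci_disable_device(pdev);
}

static struct pci_driver my_driver = {
    .name     = "my_pcie_device",
    .id_table = my_ids,
    .probe    = my_probe,
    .remove   = my_remove,
};
module_pci_driver(my_driver);
MODULE_LICENSE("GPL");
```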
The custom software part comes in during this enumeration process: you must reserve PCI bus numbers and memory ranges ahead of time for potential future devices -- this is sometimes called 'bus padding'. This avoids the need to re-enumerate the bus in the future, which often cannot be done without disrupting the system. A PCI device has BARs (base address registers) which tell the host how much address space the device needs and of what type (memory or I/O space) -- this is why you don't need jumpers like you did with ISA anymore :) Likewise, the Linux kernel implements PCIe hotplug through the pciehp driver. Windows does different things based on the version -- older versions (I think XP) ignore anything the BIOS says and do their own probing. Newer versions, I believe, are more respectful of the ACPI DSDT provided by the host firmware (BIOS/EFI) and will incorporate that information.
This may seem pretty involved, and it is! But remember that any laptop / device with an ExpressCard slot (one that implements PCIe, as you can have USB-only ExpressCards) must do this, though generally the padding is pretty simple -- just one bus. My old hardware was a PCIe switch with another 8 devices behind it, so the padding got somewhat more complicated.
From a hardware point of view, it's a lot easier. The GND pins of the card make contact first, and we'd place a hot-swap controller IC from LTC or similar on the card to sequence power once the connection is made. At this point, the on-board ASIC or FPGA begins its power-up sequence and starts link-training its PCI Express link. Assuming the host supports hot-plugging, and the slot capability / control registers (SLTCAP / SLTCTL in the spec: PCI Express Slot Capability Register and PCI Express Slot Control Register; there's a '2' of each as well -- enough bits that they're split across two registers) for that port are configured to indicate the port is hot-plug capable, the software can begin to enumerate the new device. The slot status register (SLTSTA, PCI Express Slot Status Register) contains bits reporting power faults, the mechanical release latch, and of course presence detect + presence changed.
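In Linux kernel terms, checking those slot registers on the downstream port looks roughly like this -- a heavily simplified sketch of what the pciehp driver does, where 'port' is the bridge device for the slot, not the plugged-in card:

```c
#include <linux/pci.h>

static bool slot_has_card(struct pci_dev *port)
{
    u32 sltcap;
    u16 sltsta;

    pcie_capability_read_dword(port, PCI_EXP_SLTCAP, &sltcap);
    if (!(sltcap & PCI_EXP_SLTCAP_HPC))      /* port not marked hot-plug capable */
        return false;

    pcie_capability_read_word(port, PCI_EXP_SLTSTA, &sltsta);

    if (sltsta & PCI_EXP_SLTSTA_PFD)         /* power fault reported for the slot */
        return false;

    return sltsta & PCI_EXP_SLTSTA_PDS;      /* presence detect: a card is in the slot */
}
```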
The aforementioned registers are located in 'PCI (Express) Configuration Space', which is a small region of the memory map (4K for PCIe) allocated to each potential bdf (bus:device:function). The actual registers generally reside on the peripheral device.
On the host side, we can use PRSNT1#/PRSNT2# as simple DC signals that feed the enable of a power switch IC, or run them to a GPIO on the chipset / PCH to raise an IRQ and trigger a software 'hey, something got inserted, go find it and configure it!' routine.
This is a lot of information that doesn't directly answer your question (see below for the quick summary), but hopefully it gives you a better background in understanding the process. If you have any questions about specific parts of the process, let me know in a comment here or shoot me an email and I can discuss further + update this answer with that info.
To summarize -- the peripheral device must have been designed with hot-plug support in mind from a hardware POV. A properly designed host / slot is hot-plug capable as well, and on a high-end motherboard I would expect it to be safe. However, the software support for this is another question entirely and you are unfortunately beholden to the BIOS your OEM has supplied you.
In practice, you use this technology any time you insert or remove a PCIe ExpressCard in a computer. Additionally, high-performance blade systems (telecom or otherwise) utilize this technology regularly as well.
Final comment -- save the PDF of the Base Spec that was linked; PCI-SIG usually charges bucks for that :)