Neither tgt nor IET will give you the functionality of a "shared image", because that is simply not their job. All they do is provide access to a block device through the iSCSI protocol; nothing like a shared image is within the scope of iSCSI at all.
However, you might construct what you are looking for by using LVM snapshots:
http://tldp.org/HOWTO/LVM-HOWTO/snapshotintro.html
where you would use some prepopulated disk as the starting image and create several snapshots with reasonably sized CoW-areas for your thin clients to write on. Exporting the snapshots via iSCSI would give you the desired result.
However, bear in mind that this kind of operation comes with some manageability problems - after you've created the snapshots, changes to the original disk are not propagated to the snapshots, so there is no easy way for central configuration changes or image updates.
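As a rough sketch of the snapshot approach, the following Python builds one lvcreate command per thin client without running anything. The volume path /dev/vg0/base, the client names, the "-snap" suffix and the 2G CoW size are assumptions chosen for illustration only.

```python
import shlex


def snapshot_commands(origin, clients, cow_size="2G"):
    """Return one lvcreate argument list per thin client (nothing is executed)."""
    commands = []
    for client in clients:
        commands.append([
            "lvcreate",
            "--snapshot",               # create a CoW snapshot of the origin
            "--size", cow_size,         # size of the CoW area, fixed at creation
            "--name", f"{client}-snap", # hypothetical naming scheme
            origin,
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical origin volume and client names.
    for cmd in snapshot_commands("/dev/vg0/base", ["client1", "client2"]):
        print(shlex.join(cmd))
```

Each resulting snapshot device would then be exported as its own iSCSI LUN by whichever target software you use.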
Another possible option would be the use of ZFS (either with Solaris or using the zfs-fuse implementation for Linux) and either snapshots or block-based deduplication features. Deduplication is pretty expensive in terms of RAM, but might save more space in some scenarios.
A ZFS setup will give you more flexibility. With LVM snapshots you have to dedicate a fixed amount of storage to the snapshot's CoW area at creation time and guard against overflow afterwards (an overflowing snapshot becomes unusable, so this has to be prevented, e.g. by extending the CoW area). With ZFS' flexible allocation, there is no need for that.
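For the LVM case, the overflow guard could look roughly like the sketch below. It assumes you already have the fill level of each snapshot's CoW area as a percentage (how you collect it, for example from lvs output, is left out), and the 80% threshold and 1G step are arbitrary illustrative values.

```python
def extend_commands(usage_percent, threshold=80.0, step="1G"):
    """Build an lvextend command for each snapshot at or above the threshold.

    usage_percent maps a snapshot device path to how full its CoW area is.
    Nothing is executed; the caller decides what to do with the commands.
    """
    return [
        ["lvextend", "--size", f"+{step}", path]
        for path, used in sorted(usage_percent.items())
        if used >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical fill levels for two snapshots.
    usage = {"/dev/vg0/client1-snap": 91.5, "/dev/vg0/client2-snap": 12.0}
    for cmd in extend_commands(usage):
        print(" ".join(cmd))
```

Running something like this periodically only helps while the volume group still has free extents to grow into.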
Denis
(My initial answer was premature. As promised, I've rewritten it after having gotten everything working.)
First of all, I've found that in general iSCSI-boot-enabling software is half-baked, and the disparate systems involved interoperate very poorly. For this reason I recommend instead going with a hardware-based solution such as iSCSI HBAs if possible. With that said I'll relate my experiences here in case it helps anyone.
To summarize what I found (I'm assuming you've set up DHCP and TFTP for PXE and an iSCSI target and have gotten as far as chaining to gPXE or iPXE):
gPXE and iPXE never write multiple NICs into the iBFT (iSCSI Boot Firmware Table), and this can impact Windows Server. I have discussed this problem in detail in a separate question here.
In addition to the above design limitation, gPXE has an actual bug that likewise affects systems with multiple network ports. I'll explain below. In order to avoid this bug, I used the "UNDI only" build of gPXE. This prevents gPXE from accessing the NICs directly and makes it instead use an API provided by the NIC's PXE loader. This makes gPXE think there is only one network port (the one it was loaded on), and this evades the bug. I am not sure if this bug is present in newer iPXE versions.
I was initially confused about the keep-san option in gPXE/iPXE. The keep-san flag affects gPXE's behaviour only if the boot fails. Therefore, this option is needed only on the very first boot when the installation is started.
Windows Server (at least 2012 and probably others) apparently does not tolerate moving the iSCSI initiator that provides its system disk from one network port to another. If Windows is booted from an initiator on a different network port than the one to which it was installed, Windows will crash (BSOD and/or reboot) during boot, at the handoff to the MS initiator.
There is an acknowledged feature/issue in Windows Server (2003 and up) where it will use the gateway, if one is specified, to access the target, even if the target is on the local subnet. If the gateway is unavailable or doesn't route back onto the same port, the boot will fail at the handoff to the MS initiator. Make sure the DHCP server does not hand out a gateway setting if one isn't needed.
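To decide whether a gateway is needed at all, you can check whether the target is on the initiator's own subnet. This is a minimal sketch using Python's standard ipaddress module; the function name and the addresses are made up for illustration.

```python
import ipaddress


def gateway_needed(initiator_ip, netmask, target_ip):
    """Return True if the target lies outside the initiator's subnet."""
    # strict=False lets us pass a host address plus mask instead of a network address.
    network = ipaddress.ip_network(f"{initiator_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(target_ip) not in network


if __name__ == "__main__":
    # Hypothetical addresses: initiator and target share 192.168.1.0/24.
    print(gateway_needed("192.168.1.10", "255.255.255.0", "192.168.1.200"))
```

If this returns False for your setup, the DHCP scope serving the boot NIC can simply omit the router option.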
The gPXE bug I mention above involves the iBFT (iSCSI Boot Firmware Table). This is an object which is placed into memory by the pre-boot system which contains information about the NICs, iSCSI initiator, and the iSCSI target to use as the system disk. The OS uses this information to continue booting once it switches to protected mode. The format is specified here.
Suspecting a problem in the information gPXE was placing in the iBFT, I programmed a boot sector which dumps the contents of the iBFT to the screen. Using this I found that the data written by gPXE is under certain circumstances erroneous.
As mentioned, gPXE only writes one NIC record into the iBFT, but in some situations, the information written to that one NIC record is jumbled up. The MAC address and PCI address will correspond to one NIC, but the local IP and gateway addresses will correspond to another. This is most likely to happen if the SAN is not on the first NIC.
To add to the confusion, this incorrect iBFT information is written if gPXE boots automatically, but when booting from gPXE's command prompt, depending on the exact sequence of commands entered, the correct information may be written. Throw in the fact that Windows will manifest symptoms identical to those caused by this bug if its NIC has been changed (even given a correct iBFT), and you can see why I tore my hair out.
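The jumbled NIC record can be illustrated with a small consistency check. This sketch does not parse a real iBFT; it assumes the relevant fields have already been extracted by some other means, and the Nic type, field names and sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nic:
    mac: str
    pci: str
    ip: str
    gateway: str


def find_mismatch(record, nics):
    """Return (hardware_owner, address_owner) if the record mixes two NICs, else None."""
    # NIC whose hardware identity (MAC and PCI address) matches the record.
    hw = next((n for n in nics if n.mac == record.mac and n.pci == record.pci), None)
    # NIC whose network settings (local IP and gateway) match the record.
    addr = next((n for n in nics if n.ip == record.ip and n.gateway == record.gateway), None)
    if hw is not None and addr is not None and hw != addr:
        return hw, addr
    return None


if __name__ == "__main__":
    nic_a = Nic("00:11:22:33:44:55", "02:00.0", "192.168.1.10", "192.168.1.1")
    nic_b = Nic("00:11:22:33:44:66", "02:00.1", "10.0.0.10", "0.0.0.0")
    # A record with nic_a's hardware identity but nic_b's addresses, as described above.
    jumbled = Nic(nic_a.mac, nic_a.pci, nic_b.ip, nic_b.gateway)
    print(find_mismatch(jumbled, [nic_a, nic_b]))
```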
By the way, in my original question I had thought that it was working for Server 2008 R2 but not Server 2012. (I'm editing that out as it's misleading.) I suspect there's actually no difference in their underlying behaviour and that the different outcomes were due to the subtleties of the above problems and minor variations in my tests.
Best Answer
This appears to happen when the system page file is located on the iSCSI device. While locating the page file on iSCSI worked fine under Windows 7, it appears to be broken in Windows 10. Unfortunately, Windows defaults to setting up a page file on the primary disk, so when the primary disk is iSCSI, it is broken out-of-the-box.
(Note that the stop code PAGE_FAULT_IN_NONPAGED_AREA does not necessarily relate to the system page file in general, despite containing the word "page". This stop code is more like the NT kernel's version of "Segmentation Fault", a general invalid memory access. But, in my specific case, it coincidentally turned out to be related to the page file.)
I was able to solve the problem by disabling the page file entirely. (It also works to locate the page file on a local disk, if one exists, but this is easier to configure after getting the OS up and running with no page file.)
Disabling page file offline
Since your machine is not bootable, you cannot disable the page file through the UI. Luckily, it's easy to disable the page file via the registry. To do so, locate the following registry key, and set its value to be empty:
If your registry contains ControlSet002 and/or CurrentControlSet in addition to ControlSet001, make sure to make the same changes to those.
Editing registry offline
But how do we edit the registry without booting? There are multiple approaches. You could temporarily mount the iSCSI volume from an existing, working Windows machine, or from a Windows Preinstallation Environment (WinPE) that you booted from USB or maybe even from PXE. Many guides exist describing these options.
In order to edit a registry offline (i.e., edit a registry other than the one of the system that is running regedit):
1. Run regedit ("Registry Editor") normally.
2. Select HKEY_LOCAL_MACHINE.
3. Choose File > Load Hive... and open Windows\System32\config\SYSTEM on the mounted volume.
4. Enter a name for the loaded hive when prompted.
The offline registry file will appear in the tree under HKEY_LOCAL_MACHINE with the name you chose. Edits you make to keys within it will usually be saved automatically, although it is advised that you explicitly unload the offline hive before closing regedit to be sure (see Harry Johnston's comment below). This is a very strange UI, but that's apparently how it is done.
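The same load and unload cycle can also be done from the command line with reg load and reg unload, which may be handier inside WinPE. The sketch below only builds the two commands; the drive letter E and the key name Offline are assumptions, so substitute whatever letter the iSCSI volume is mounted under.

```python
from pathlib import PureWindowsPath


def hive_commands(drive_letter, key_name="Offline"):
    """Return (load, unload) argument lists for the offline SYSTEM hive."""
    hive = PureWindowsPath(f"{drive_letter}:\\") / "Windows" / "System32" / "config" / "SYSTEM"
    mount_point = f"HKLM\\{key_name}"  # hypothetical name for the loaded hive
    load = ["reg", "load", mount_point, str(hive)]
    unload = ["reg", "unload", mount_point]
    return load, unload


if __name__ == "__main__":
    # Hypothetical drive letter of the mounted iSCSI volume.
    load, unload = hive_commands("E")
    print(" ".join(load))
    print(" ".join(unload))
```

Run the load command, make the edits under the loaded key, then run the unload command so the changes are flushed to the hive file.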