Put a suitable resistor in series with your target, and connect your DVM (in voltage mode) across the resistor.
For instance, a current of 100 nA through a 1 MΩ resistor will give you 0.1 V. Check the input impedance of your DVM: for this to work it must be >> 1 MΩ. You will probably need a switch to short the resistor while your target is not yet in low-power mode, because its normal operating current would drop far too much voltage across the resistor.
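To see why the DVM input impedance matters, here is a minimal sketch of the arithmetic. The 10 MΩ meter input resistance is an assumed typical value, not something from your meter's datasheet, and the function name is just for illustration.

```python
def shunt_reading(current_a, shunt_ohm, dvm_input_ohm=10e6):
    """Voltage the DVM shows for a given current through the series resistor.

    The DVM input resistance sits in parallel with the resistor, so the
    effective resistance is a bit lower than the resistor alone.
    dvm_input_ohm=10e6 is an ASSUMED typical value; check your own meter.
    """
    effective_ohm = shunt_ohm * dvm_input_ohm / (shunt_ohm + dvm_input_ohm)
    return current_a * effective_ohm


ideal_v = 100e-9 * 1e6                 # 0.1 V with an ideal (infinite impedance) meter
actual_v = shunt_reading(100e-9, 1e6)  # what an assumed 10 MΩ meter would show
error_pct = 100 * (ideal_v - actual_v) / ideal_v
print(f"ideal {ideal_v:.4f} V, read {actual_v:.4f} V, error {error_pct:.1f} %")
```

With a 10 MΩ meter across a 1 MΩ resistor the reading comes out roughly 9% low, so either use a meter with a much higher input impedance or correct for the parallel resistance.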
Another way would be to use a capacitor of known value as the power supply and see how fast it discharges. Again, your measurement instrument must have a high input impedance, otherwise it will disturb the measurement. You might get around this by connecting the voltmeter only briefly, to read the voltage at specific moments.
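The capacitor method boils down to I = C * dV / dt. A rough sketch, with capacitor value, voltages and interval chosen purely as illustrative assumptions:

```python
def average_current(cap_farads, v_start, v_end, seconds):
    """Average current drawn from a capacitor over an interval (I = C * dV / dt)."""
    return cap_farads * (v_start - v_end) / seconds


# Assumed example: a 100 uF capacitor sags from 3.30 V to 3.20 V in 60 s.
i_avg = average_current(100e-6, 3.30, 3.20, 60.0)
print(f"{i_avg * 1e9:.0f} nA")
```

Keep in mind that the capacitor's own leakage is included in this figure, so a low-leakage type (or a separate measurement with the target disconnected) helps.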
(This is a generic guide. I suspect you could also benefit from code optimisation, but that's outside the scope of this website.)
Step 1: Rough sizing, budget, vendor
Pick one of:
Computer (Raspberry Pi, Beagleboard, PC104 board, Intel Edison, etc). Boots a general-purpose operating system and has lots of processing power. More expensive and power-hungry. $10-$100.
Large MCU. ARM Cortex-A / PIC32 / dsPIC / AVR32 / TI C series DSP etc. Decent computing power, OS optional. ~$5.
Small MCU. Cortex-M / PIC16. Not really enough space for a true OS, maybe just a lightweight task scheduler. ~$2.
Tiny MCU. Only really for applications where you care about every last microamp of power consumption. ~$1 or less.
You should also consider at this stage which vendors and toolchains you like and dislike. Have a look at the cost of things like in-circuit debugging devices and IDEs.
Step 2: Minimum Peripherals
Do you need things like USB? PCI? HDMI? SATA? Unusually fast ADCs or DACs? Most parts in the "small" or "tiny" category lack these, although USB is fairly widely available.
Step 3: Prototype
Pick something which meets the above criteria, at random if necessary, and make a start. Find out how feasible it is and how much space and processing power you need. You've already done some of this. Writing in C should make much of the logic portable.
Once you have the prototype you can say to yourself, "I need one like this, but with more X" and let that guide your decisions.
Step 4: Shrink
It's generally easier to start with the largest (most Flash and RAM) member of a CPU family, write v1 of your application, and then choose a smaller, cheaper one to fit. You can also spend time on the art of fitting software into fewer resources. What's worthwhile depends on how many units you're going to make.
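The shrink step can be sketched as a simple lookup once you know what v1 actually uses. The part names and sizes below are hypothetical placeholders, not real devices, and the 25% headroom is an arbitrary assumption:

```python
# HYPOTHETICAL family members: (name, flash_kib, ram_kib), smallest first.
FAMILY = [
    ("part-16k", 16, 2),
    ("part-32k", 32, 4),
    ("part-64k", 64, 8),
    ("part-128k", 128, 16),
]


def smallest_fit(flash_used_kib, ram_used_kib, headroom=0.25):
    """Return the first (smallest) part that covers measured usage plus headroom."""
    need_flash = flash_used_kib * (1 + headroom)
    need_ram = ram_used_kib * (1 + headroom)
    for name, flash, ram in FAMILY:
        if flash >= need_flash and ram >= need_ram:
            return name
    return None  # nothing in the family is big enough


# Assumed usage figures taken from the v1 build's map file.
print(smallest_fit(flash_used_kib=22, ram_used_kib=3))
```

How much headroom to leave is a judgment call: more if you expect firmware updates, less if unit cost dominates.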
Best Answer
Consider a transimpedance amplifier, which gives you essentially zero burden voltage (negligible voltage drop in the measured circuit). For good accuracy, the amplifier's input bias current must be very low compared to the current you are measuring. For example, an amplifier with 1 pA bias current and a 10 MΩ feedback resistor would give you an accurate 1 V out for -100 nA in. You can feed that into the ADC and display it. The circuit is the usual photodiode transimpedance amplifier, with your measured current taking the place of the photodiode.
You can switch Rf to different values for different ranges (e.g. 3 nA / 30 nA / 300 nA / 3 µA full scale would require 1 GΩ / 100 MΩ / 10 MΩ / 1 MΩ resistors for a 3 V full-scale output).
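The range arithmetic is just Ohm's law on the feedback resistor. A small sketch, where the 3 V full-scale output is inferred from the resistor values above and the function names are made up for illustration:

```python
def feedback_resistor(full_scale_a, full_scale_v=3.0):
    """Rf needed so that full-scale current gives full-scale output voltage.

    full_scale_v=3.0 is INFERRED from the example ranges (3 nA x 1 G = 3 V).
    """
    return full_scale_v / full_scale_a


def output_voltage(input_a, rf_ohm):
    """Ideal inverting transimpedance stage: Vout = -Iin * Rf."""
    return -input_a * rf_ohm


def bias_error_pct(bias_a, measured_a):
    """Worst-case error from amplifier bias current, as a percentage of the reading."""
    return 100 * bias_a / measured_a


for i_fs in (3e-9, 30e-9, 300e-9, 3e-6):
    print(f"{i_fs:.0e} A full scale -> Rf = {feedback_resistor(i_fs):.0e} ohm")

print(output_voltage(-100e-9, 10e6))   # the 10 M / -100 nA example
print(bias_error_pct(1e-12, 100e-9))   # 1 pA bias against a 100 nA reading
```

On the lowest range the same 1 pA bias is a much larger fraction of the reading, which is why the bias current spec matters more as you go down in current.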
Making circuits to measure nA accurately (which means keeping leakage down to pA or fA) requires care. Search for electrometer circuits for more details; there is more to it than is appropriate for this answer format.