In the scenario you gave, Device 1 performs the calculation and asks Device 2 to move accordingly. Therefore Device 1 runs the control algorithm; it is the controller. That suggests Device 1 should be the SPI master, requesting gyroscope data, and commanding moves.
Also presumably, the RF2500 is receiving commands from a transmitter somewhere - another data point suggesting it is the (onboard) controller.
I suspect you can make either approach work, but that's not really the point - starting from the cleanest design makes the job easier.
There are many ways to use a flash memory device to emulate a non-volatile memory store [i.e. something that can accommodate frequent small updates, as opposed to large but infrequent updates]. I've used a number of different approaches in various applications (and sometimes combined multiple approaches within one application), since no single approach will be optimal for all purposes (a lot will depend upon the frequency of updates, the amount of information that changes frequently, etc.).
Another important thing to consider is the probability of partial writes or partial erasures. In many situations, it may be reasonable to expect that in case of power failure, the bypass caps will be able to supply enough energy to complete any write operation which has been started, but not enough to complete an erase. If an erase is interrupted by a power loss, one should make no assumptions about the state of the erased memory, even if it appears blank. It's entirely possible for bytes in a partially-erased block to be in a weird state that sometimes reads as FF and sometimes as something else. Consequently, any time a block of memory is going to be erased, something should first be written in some other area of memory to indicate that.
If there are four or more separately-erasable flash blocks available, it may be helpful to write them in rotation, maintaining the invariant that some particular byte (e.g. the first byte of each block) will be programmed in every block except for the block where new data is being written and the block following that (wrapping around), which should be presumed to be invalid. If there's no room to write any more data to the current block, erase the following block (without regard for whether it already looks blank), write anything necessary to make it valid, and then finally program the first byte of the block that was just filled up (christening the newly-erased block as the new place to put data). Note that power-up code which is trying to decide which block should hold freshly-written data might read the first byte of a block that was partially erased, but the pattern of programming in the other blocks will be such that it won't matter whether that byte reads as programmed or not.
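
As a rough illustration of that power-up decision, here is a minimal host-side sketch in Python (not target firmware). It assumes the marker is the first byte of each block, that FF means "not programmed", and that the current block is the one whose own marker is unprogrammed while its predecessor's marker is programmed. The function name `find_current_block` and that selection rule are my own reading of the scheme, not something taken from a datasheet.

```python
ERASED = 0xFF  # assumption: an unprogrammed marker byte reads as FF


def find_current_block(markers):
    """Return the index of the block that should receive new data.

    markers[i] is the first byte of block i as read at power-up.
    The block after the current one may read as anything (it could be
    partially erased), so it is never trusted: a block only qualifies
    if its predecessor's marker is definitely programmed.
    """
    n = len(markers)
    candidates = [
        i for i in range(n)
        if markers[i] == ERASED and markers[(i - 1) % n] != ERASED
    ]
    if len(candidates) != 1:
        # e.g. factory-fresh flash with every marker blank
        raise ValueError("marker pattern does not match the rotation scheme")
    return candidates[0]


# Block 1 is current; block 2 may read blank, programmed, or garbage.
print(find_current_block([0x00, 0xFF, 0xFF, 0x00]))
print(find_current_block([0x00, 0xFF, 0x5A, 0x00]))
```

Both calls pick block 1, which is the point of the invariant: the doubtful block's marker never affects the outcome. A blank device would need a one-time initialisation before this rule applies.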
If you use this approach and want to use some number of e.g. 4Kbyte flash blocks to simulate a 255-byte RAM (not 256, since an address byte of FF has to mean "blank"), one could use bytes 1-255 of the "currently-being-written" block to hold the contents of the RAM as of the time the program started writing to that block; the remainder of the block would use pairs of bytes to store changes. If the first byte in a pair is not FF, the second byte indicates the value of the corresponding RAM byte. When writing to a byte of RAM, identify the last pair which doesn't read as FFFF, go to the pair after it, write the second byte of that pair with the new value of the RAM byte, and then write the first byte of the pair with the "address". Note that if power fails between the data write and the address write, there will be a "record" with a non-FF data field but an FF address; the next write should go after that.
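
To make the record format concrete, here is a small Python simulation of one block (a sketch for experimenting on a PC, not firmware). Several details are my assumptions rather than part of the description above: RAM addresses run 0x00-0xFE and are stored at block offsets 1-255, the log starts at offset 256, and each new record goes into the first blank pair after the last non-blank one. All function names are hypothetical.

```python
BLOCK_SIZE = 4096
RAM_SIZE = 255     # addresses 0x00..0xFE; FF is reserved to mean "blank"
LOG_START = 256    # byte 0: block marker, bytes 1..255: RAM snapshot


def new_block(snapshot):
    """Freshly erased block with the RAM snapshot already written."""
    if len(snapshot) != RAM_SIZE:
        raise ValueError("snapshot must be exactly 255 bytes")
    block = bytearray([0xFF]) * BLOCK_SIZE
    block[1:1 + RAM_SIZE] = snapshot
    return block


def program(block, offset, value):
    """Flash programming can only clear bits, never set them."""
    block[offset] &= value


def next_free_pair(block):
    """Offset of the pair after the last one that is not FFFF."""
    offset = BLOCK_SIZE - 2
    while (offset >= LOG_START
           and block[offset] == 0xFF and block[offset + 1] == 0xFF):
        offset -= 2
    return offset + 2  # equals BLOCK_SIZE when the block is full


def write_byte(block, addr, value):
    """Append one update record; returns False if the block is full."""
    if not 0 <= addr < RAM_SIZE:
        raise ValueError("address out of range")
    offset = next_free_pair(block)
    if offset >= BLOCK_SIZE:
        return False                   # caller must rotate to the next block
    program(block, offset + 1, value)  # data first
    program(block, offset, addr)       # address last: this commits the record
    return True


def read_ram(block):
    """Rebuild the RAM image: snapshot plus every committed record."""
    ram = bytearray(block[1:1 + RAM_SIZE])
    for offset in range(LOG_START, BLOCK_SIZE, 2):
        addr = block[offset]
        if addr != 0xFF:               # FF address: blank or interrupted write
            ram[addr] = block[offset + 1]
    return ram
```

With this layout a 4Kbyte block holds (4096 - 256) / 2 = 1920 update records before the next block has to be brought into service. An interrupted write leaves a pair with an FF address, which `read_ram` ignores and `next_free_pair` skips over.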
Using this approach, if one had e.g. 64K of flash with 100,000 erase cycles, one could accommodate approximately 3,200,000,000 discrete parameter-update operations. If multiple bytes are frequently written at a time, or if one needs to simulate a RAM larger than 255 bytes, other approaches could be used; the 3,200,000,000 discrete updates is probably within a factor of two of the optimal number one could hope to achieve in the absence of an advance-power-warning interrupt (such an interrupt might allow one to forgo writing any updates to flash until the power-fail-interrupt is tripped, thus allowing parameters to be updated an unlimited number of times).
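
For anyone who wants to check that figure, the arithmetic can be laid out as below. This is only a back-of-envelope sketch: the 4Kbyte block size and the 256 bytes of marker-plus-snapshot overhead per block are assumptions carried over from the earlier example, and all constant names are mine.

```python
FLASH_BYTES = 64 * 1024
BLOCK_BYTES = 4096        # assumed block size
HEADER_BYTES = 256        # assumed: marker byte + 255-byte snapshot
RECORD_BYTES = 2          # one address byte + one data byte
ERASE_CYCLES = 100_000

# Ignoring the per-block overhead: every two bytes of flash hold one update.
upper_bound = FLASH_BYTES // RECORD_BYTES * ERASE_CYCLES

# Allowing for the snapshot area at the start of each block.
blocks = FLASH_BYTES // BLOCK_BYTES
records_per_block = (BLOCK_BYTES - HEADER_BYTES) // RECORD_BYTES
with_overhead = blocks * records_per_block * ERASE_CYCLES

print(f"{upper_bound:,}")    # 3,276,800,000
print(f"{with_overhead:,}")  # 3,072,000,000
```

Either way the result lands close to the roughly 3.2 billion updates quoted above.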
From what I can see, the (main) difference between it and SRAM is it's slower, and the difference between it and EEPROM is it's more expensive.
I'd say it's sort of "in between" both.
Being a pretty new technology, I'd expect the price to drop a fair bit over the next year or so, provided it becomes popular enough. Even though it's not as fast as SRAM, the speed is not bad at all and should suit many applications fine: I can see a 60 ns access time option on Farnell (compared with a low of 3.4 ns for SRAM).
This reminds me - I ordered some Ramtron F-RAM samples quite a while back, still not got round to trying them yet...