"An introduction to asynchronous circuit design" by Davis and Nowick
(in particular, Figure 1 and Figure 2 and the nearby text)
describes two handshaking protocols as "pervasive".
The first is the 4-cycle protocol, also known as RZ (return to zero), 4-phase, or level signaling.
The second is the similar but harder-to-implement 2-cycle protocol, also known as transition, 2-phase, or NRZ (non-return to zero) signaling, which is very similar to the "data strobe encoding" used by SpaceWire and FireWire.
Either one sounds like it has most of the features you requested --
it's SPI-like in that there are exactly 4 signals, all 4 signals are one-way (no passive pull-ups), the master can pause the slave indefinitely until it is ready for the next bit from the slave, etc.
It also has a feature supercat requested that SPI doesn't have: the slave can pause the master indefinitely until it is ready for the next bit from the master.
I don't know of any chips that have the 4-cycle protocol built in, but it looks like it would be easy to bit-bang on a microcontroller or a CPLD.
In fact, it looks like it would be easier to bit-bang than SPI, since (like SPI) the master has no timing requirements, and (unlike SPI) the slave has no timing requirement either.
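To make the bit-banging idea concrete, here is a minimal simulation sketch in Python. It is my own reading of the 4-phase handshake, not code from the Davis and Nowick paper: I assume the four wires are `req` and `mdata` (driven by the master) and `ack` and `sdata` (driven by the slave), and all names here are hypothetical. The two sides are stepped in random order to show that neither depends on the other's timing.

```python
import random

class Wires:
    """Four one-way signals (assumed naming): master drives req/mdata, slave drives ack/sdata."""
    def __init__(self):
        self.req = 0
        self.ack = 0
        self.mdata = 0
        self.sdata = 0

def master(w, tx_bits, rx_bits):
    for bit in tx_bits:
        w.mdata = bit            # phase 1: present data, then raise req
        w.req = 1
        while w.ack == 0:        # wait for the slave, however long it takes
            yield
        rx_bits.append(w.sdata)  # slave's data is valid while ack is high
        w.req = 0                # phase 3: drop req
        while w.ack == 1:        # phase 4: wait for ack to return to zero
            yield

def slave(w, tx_bits, rx_bits):
    for bit in tx_bits:
        while w.req == 0:        # wait for the master, however long it takes
            yield
        rx_bits.append(w.mdata)  # master's data is valid while req is high
        w.sdata = bit            # phase 2: present data, then raise ack
        w.ack = 1
        while w.req == 1:        # wait for req to return to zero
            yield
        w.ack = 0                # phase 4: drop ack

def exchange(master_bits, slave_bits, seed=0):
    """Run both sides in a random interleaving; return (bits master got, bits slave got)."""
    assert len(master_bits) == len(slave_bits), "sketch assumes equal-length transfers"
    rng = random.Random(seed)
    w = Wires()
    m_rx, s_rx = [], []
    tasks = [master(w, master_bits, m_rx), slave(w, slave_bits, s_rx)]
    while tasks:
        task = rng.choice(tasks)  # arbitrary scheduling stands in for arbitrary delays
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)
    return m_rx, s_rx

if __name__ == "__main__":
    print(exchange([1, 0, 1, 1], [0, 0, 1, 0]))
```

Whatever the seed, each side should end up with the other's bits in order, because every wire change is gated on the opposite side's last transition rather than on elapsed time.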
Is it possible to use the 4-phase protocol for synchronous bit transfers, and somehow build a higher-level protocol on top of that to get the other things supercat wants: byte alignment, start-of-command frame alignment, attention/busy/idle states, etc.?
To fully listen in on an SPI conversation, you need to be watching all four lines: chip select, clock, MOSI, and MISO. You could possibly do this with two separate SPI peripherals. Both are connected to chip select and clock, with one watching MOSI and the other MISO. The rest is then firmware to interpret the results.
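As a rough illustration of the "interpret the results" step, here is a Python sketch that decodes raw samples of the four lines into paired bytes. Note that this works on a captured trace of line states (as a logic analyzer would give you) rather than on the two-peripheral setup described above. It assumes active-low chip select, SPI mode 0 (data sampled on the rising clock edge), and MSB-first bytes; `decode_spi` and `make_trace` are hypothetical names for this example only.

```python
def decode_spi(samples):
    """Decode (cs, clk, mosi, miso) samples into a list of (mosi_byte, miso_byte) pairs."""
    pairs = []
    mosi_shift = miso_shift = nbits = 0
    prev_clk = 0
    for cs, clk, mosi, miso in samples:
        if cs:                       # chip deselected: discard any partial byte
            mosi_shift = miso_shift = nbits = 0
        elif clk and not prev_clk:   # rising clock edge while selected
            mosi_shift = (mosi_shift << 1) | mosi
            miso_shift = (miso_shift << 1) | miso
            nbits += 1
            if nbits == 8:           # a full byte has been seen on both data lines
                pairs.append((mosi_shift, miso_shift))
                mosi_shift = miso_shift = nbits = 0
        prev_clk = clk
    return pairs

def make_trace(mosi_byte, miso_byte):
    """Build a synthetic one-byte trace for the demo (not a real capture)."""
    trace = [(1, 0, 0, 0)]                  # idle, chip deselected
    for i in range(7, -1, -1):              # MSB first
        mo = (mosi_byte >> i) & 1
        mi = (miso_byte >> i) & 1
        trace.append((0, 0, mo, mi))        # data set up while clock is low
        trace.append((0, 1, mo, mi))        # rising edge: both sides sample
    trace.append((0, 0, 0, 0))
    trace.append((1, 0, 0, 0))              # chip deselected again
    return trace

if __name__ == "__main__":
    for mosi_byte, miso_byte in decode_spi(make_trace(0x9F, 0xC2)):
        print(f"MOSI=0x{mosi_byte:02X} MISO=0x{miso_byte:02X}")
```

Keeping the MOSI and MISO bytes paired per clock burst is what lets you line up each command with its response; with two independent peripherals the firmware would have to do that pairing itself.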
Best Answer
Usually you don't have to wire the clock, because the sending device, which also transmits the clock, always samples the data on the same clock it uses to send data out.