I guess the output state of a D-flipflop is unknown upon power up. But what are the chances that it is neither 0 nor 1 but an intermediate state such as VDD/2? The D-flipflop in this question has an output driver inverter.
I dunno what the modern chips do internally, but for instance the 8080 used a non-overlapping two-phase clock:
I think the problems with your approach are:
The levels at which CMOS logic is active are not easily controlled, and/or controlling them accurately would put extra demands on other aspects (geometry, size, yield, etc.).
There is not really a single level at which a FF samples the signal. Rather, there is a level 1 at which the first (master?) latch stops sampling, and a level 2 at which the second (slave?) latch starts sampling the master's output. When level 1 << level 2 (a wide band between them) and the clock edge is not too steep, you automatically get the 'deadband' you want.
The simple answer is that they don't on their own. The synchroniser is there not to ensure the data gets across, but to ensure you don't end up with metastable signals feeding lots of other signals and causing problems. The second FF, as the diagram shows, catches the metastable output of the first FF and prevents it propagating further through the design.
There are various sorts of signals, and how you include synchronisers depends on what signal you are talking about. But let's look at a few common types:
Trigger Signals - or any signal which is basically a pulse which must start something else running. These generally carry no data, and all you are interested in is that there is, say, a rising edge in order to start something going in another clock domain. To get these to cross over, you would need a synchroniser (essentially doing what is shown in your diagram), but you need a little bit more.
The simplest option is to extend the pulse - essentially you make sure the input pulse is longer than one clock period of the destination clock (it should be longer than one cycle by at least the larger of the setup and hold times for the destination register). For example, if you are going from a 20MHz clock to a 15MHz clock, you would make the pulse two clock cycles wide at the input (100ns, against a destination period of about 66.7ns), which ensures that it is seen by the destination clock and not lost. This also answers your question of how the signal is guaranteed to get across. If the pulse is wider than one destination clock period, then even if it goes metastable on the first clock edge and ends up being seen as a 0, the second clock edge will definitely catch the pulse.
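To put numbers on the pulse-stretching rule, here is a small Python sketch. The function name and the setup/hold arguments are my own (hypothetical); real timing values would come from the device datasheet or the timing analyser.

```python
import math

def min_source_cycles(f_src_hz, f_dst_hz, t_setup_s=0.0, t_hold_s=0.0):
    """Smallest whole number of source-clock cycles the pulse must last so
    that it is longer than one destination period plus the larger of the
    destination register's setup and hold times."""
    src_period = 1.0 / f_src_hz
    required = 1.0 / f_dst_hz + max(t_setup_s, t_hold_s)
    # floor(ratio) + 1 is the smallest integer strictly greater than ratio
    return math.floor(required / src_period) + 1

# 20MHz -> 15MHz: 66.7ns must fit inside the pulse, so two 50ns cycles
cycles = min_source_cycles(20e6, 15e6)
```

For the 20MHz to 15MHz example this gives two source cycles, matching the figure above.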
Because with this type of signal you are only interested that the pulse has gotten across, it doesn't matter if the output signal ends up with two clock cycles high some of the time and only one cycle the rest. If you need to ensure it is a single cycle pulse, you can instantiate a simple edge detector circuit.
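As a rough behavioural sketch of such an edge detector (Python rather than HDL, one list element per destination clock cycle, names are mine): keep last cycle's value in a register and output a 1 only when the input is high now and was low one cycle ago.

```python
def rising_edge_detect(samples):
    """Return a list with a 1 for exactly one cycle per 0 -> 1 transition.

    `samples` is the already-synchronised signal, one value per clock cycle.
    """
    prev = 0  # delay register, assumed to reset to 0
    out = []
    for s in samples:
        out.append(1 if (s and not prev) else 0)
        prev = s  # register updates on the clock edge
    return out

# A stretched pulse seen for two cycles still yields a single-cycle output
pulses = rising_edge_detect([0, 1, 1, 0, 1, 0, 0])
```

Whether the synchronised pulse lasted one cycle or two, the output is one cycle wide.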
Control Buses - or possibly some types of data buses. These are arguably more difficult because you have a multi-bit value whose bits need to stay synchronised with each other. In this case what you would do is implement something called "handshaking". You basically load your data on the source clock and hold it. Then you send a request signal (like the trigger signal described above) across through a synchroniser. Once the request signal is across, you know that the data bus has also stabilised in the destination domain. You can then clock it into a register bank in the destination. The destination then sends an acknowledge pulse back again to inform the source that it can load the next word.
You would use this sort of bus if you needed to send a control word to the destination clock for which you need to know that it has gotten there before you send another (e.g. if you are sending a command to do something).
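Here is a toy Python model of the handshake, just to show the ordering of events. Assumptions on my part: it uses a four-phase level handshake (request and acknowledge are levels, not the acknowledge pulse described above), each synchroniser is two ideal flip-flops, and metastability itself is not modelled. The class and function names are made up for the sketch.

```python
class Handshake:
    def __init__(self, words):
        self.pending = list(words)  # words waiting in the source domain
        self.data_bus = None        # held stable while req is high
        self.req = 0                # driven by source
        self.ack = 0                # driven by destination
        self.req_ff = [0, 0]        # 2-FF synchroniser, destination clock
        self.ack_ff = [0, 0]        # 2-FF synchroniser, source clock
        self.received = []

    def src_tick(self):
        ack_sync = self.ack_ff[1]   # value visible before this edge
        self.ack_ff = [self.ack, self.ack_ff[0]]
        if not self.req and not ack_sync and self.pending:
            self.data_bus = self.pending.pop(0)  # load the word and hold it
            self.req = 1
        elif self.req and ack_sync:
            self.req = 0            # ack seen, release the request

    def dst_tick(self):
        req_sync = self.req_ff[1]   # value visible before this edge
        self.req_ff = [self.req, self.req_ff[0]]
        if req_sync and not self.ack:
            self.received.append(self.data_bus)  # bus is stable by now
            self.ack = 1
        elif not req_sync and self.ack:
            self.ack = 0            # request dropped, finish the cycle

def run(words, src_period=50, dst_period=67, max_time=100_000):
    """Tick both domains at unrelated periods until all words arrive."""
    hs = Handshake(words)
    for t in range(1, max_time):
        if t % src_period == 0:
            hs.src_tick()
        if t % dst_period == 0:
            hs.dst_tick()
        if len(hs.received) == len(words):
            break
    return hs.received
```

The point to notice is that the data bus itself never goes through a synchroniser. Only the single-bit request and acknowledge do, and the bus is guaranteed not to change while the destination is reading it.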
Data buses - for data where you have a source that spits out data continuously or in bursts, you are arguably better off using a FIFO than synchronisers. The FIFO uses a dual-clock memory to hold the data, along with counters to keep track of how much data is in the FIFO. You write the data into the FIFO when there is space, and then increment the write address. This address is then typically encoded into a "Gray Coding" scheme which ensures that each increment in address causes only one bit in the address bus to change (so a sample taken mid-change yields either the old or the new address, and the bits do not need to be kept aligned with each other). This address is then transferred to the destination domain (through one of your synchroniser chains), where it is compared with the read address. If there is data in the FIFO, it can then be read out of the memory using the destination clock port. The read address is similarly Gray coded and sent back to the source through another synchroniser so that the write port can calculate if there is any space in the FIFO.
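The Gray coding step is easy to show. A minimal Python sketch of the usual binary-reflected Gray code (function names are mine), assuming the FIFO depth is a power of two so the wrap-around is also a one-bit change:

```python
def bin_to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_bin(g):
    """Inverse of bin_to_gray: XOR together all right-shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

DEPTH = 16  # hypothetical FIFO depth, must be a power of two
# Bits that change on each address increment, including the wrap to 0
changes = [bits_changed(bin_to_gray(i), bin_to_gray((i + 1) % DEPTH))
           for i in range(DEPTH)]
```

Every entry in `changes` is 1. Compare that with plain binary, where going from 7 to 8 flips four bits at once and a badly timed sample could read almost anything.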
Reset Signals - these typically use a modified version of the synchroniser in what is known as "Asynchronous Assert, Synchronous Deassert". In this modified version, the data input to the first flip-flop is tied to GND, and instead the incoming reset signal is connected to the asynchronous preset input of each flip-flop in the synchroniser. This results in an output signal which is entirely asynchronous when it goes high, but the synchroniser chain ensures that it goes low synchronously with the destination clock by clocking zeros through the register chain.
This type of synchroniser is terrible for data and control, but perfectly suited to reset signals. If the output of this chain feeds the asynchronous reset input of every register in the destination domain, then there is little worry of metastability on assert (even though it is asynchronous), as all the registers are forced to a known state. Then when the reset signal is deasserted in the source domain, it deasserts synchronously in the destination domain, meaning all registers come out of reset on the same clock cycle (rather than +/- 1 cycle if the deassert were asynchronous).
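A behavioural sketch of that reset synchroniser in Python, assuming an active-high reset and two stages. The class name is hypothetical, and the model is ideal: it does not show the first flip-flop going metastable on release, which is what the second stage is there to absorb.

```python
class ResetSynchroniser:
    def __init__(self, stages=2):
        self.ff = [0] * stages  # flip-flop chain, D of first stage tied to 0
        self.rst_in = 0         # asynchronous reset from the other domain

    def set_reset(self, level):
        """Change the asynchronous input; takes effect without a clock."""
        self.rst_in = level
        if level:
            self.ff = [1] * len(self.ff)  # async preset on every stage

    def clock(self):
        """One rising edge of the destination clock."""
        if self.rst_in:
            self.ff = [1] * len(self.ff)  # preset overrides the clock
        else:
            self.ff = [0] + self.ff[:-1]  # shift a zero in from GND

    @property
    def rst_out(self):
        return self.ff[-1]  # reset distributed to the destination domain
```

Asserting the input drives `rst_out` high immediately, with no clock edge needed. After release, `rst_out` stays high until the second destination clock edge and then drops on that edge.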
As you can see from the above, it is a lot more complex to do clock-domain crossing than to just stick a 2 flip-flop synchroniser on the signal. The exact method used depends on the application.