RF bandwidth and data rate are related by the modulation format. Different modulation formats require different bandwidths for the same data rate. For FM, the bandwidth is approximately 2*(df + fm) (Carson's rule), where df is the maximum frequency deviation and fm is the highest frequency in the message. FSK is basically FM where the message signal is a square wave. The highest frequency component of a binary bit sequence transmitted serially occurs when the sequence is 01010101. This component is one half of the bit rate. Taking df as half the separation between the two frequencies, the FSK bandwidth is approximately 2*(Δf/2 + r/2) = Δf + r, where Δf is the separation between the two frequencies and r is the bit rate. This is bigger than Δf because whenever the frequency is changed, extra frequency components are generated. Switching between frequencies more often (higher data rate) puts more power into these extra frequency components. They can be filtered out to some extent, but if you filter the signal to a bandwidth narrower than Δf + r, the result will be too distorted to reliably extract the original bitstream.
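As a rough sanity check, the FSK estimate can be computed from Carson's rule by treating the peak deviation as half the tone separation and the highest message frequency as half the bit rate. This is only a sketch; the function names and the example numbers are made up for illustration.

```python
def carson_bandwidth(peak_deviation_hz, max_message_hz):
    """Carson's rule estimate: 2 * (peak deviation + highest message frequency)."""
    return 2 * (peak_deviation_hz + max_message_hz)


def fsk_bandwidth(tone_separation_hz, bit_rate_bps):
    """Approximate binary FSK bandwidth via Carson's rule.

    Assumes peak deviation = separation / 2 and that the highest
    message frequency is the 0101... pattern at bit_rate / 2.
    """
    return carson_bandwidth(tone_separation_hz / 2, bit_rate_bps / 2)


# Hypothetical link: tones 10 kHz apart, 4800 bit/s
print(fsk_bandwidth(10_000, 4_800))  # 14800.0
```

The result is simply the separation plus the bit rate, so doubling the bit rate at a fixed separation adds r, not 2r, to the estimate.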
Think about it this way: a pure sinewave consumes zero bandwidth, but it also contains zero information. As soon as you start changing a characteristic of a pure sinewave (frequency, phase, amplitude, etc.) its bandwidth must increase accordingly. In the case of amplitude modulation, modulating the amplitude of a sinewave of frequency fc at frequency fm will result in a signal with components at fc, fc+fm, and fc-fm. If the message contains components all the way down to DC, then the resulting modulated signal will have twice the bandwidth of the message signal. FSK is basically transmitting two AM signals at the same time on different frequencies, so the bandwidth will naturally be increased by the separation of these two carrier frequencies.
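The three AM components can be seen numerically with a small DFT. The sketch below uses only the standard library; the window length, bin numbers, and modulation depth are arbitrary illustrative choices, with the carrier and message placed on exact DFT bins so there is no leakage.

```python
import cmath
import math

N = 64        # samples in the analysis window (illustrative)
FC = 16       # carrier: 16 cycles per window
FM = 4        # message: 4 cycles per window
DEPTH = 0.5   # modulation depth

# Amplitude-modulated carrier: (1 + m*cos(wm*t)) * cos(wc*t)
signal = [
    (1 + DEPTH * math.cos(2 * math.pi * FM * n / N))
    * math.cos(2 * math.pi * FC * n / N)
    for n in range(N)
]


def dft_magnitude(x, k):
    """Magnitude of DFT bin k, scaled so a unit-amplitude cosine gives 0.5."""
    acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / len(x))
              for n in range(len(x)))
    return abs(acc) / len(x)


# Bins in the lower half of the spectrum that carry significant energy
occupied = [k for k in range(N // 2) if dft_magnitude(signal, k) > 1e-6]
print(occupied)  # [12, 16, 20]
```

The occupied bins are FC - FM, FC, and FC + FM, so the modulated signal spans 2 * FM even though the message itself only reaches FM.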
For FSK, the bit rate and the symbol rate are the same. But for higher order modulations like QPSK and QAM, each transmitted symbol can code for more than one bit, so the bit rate can be significantly higher than the symbol rate. This means that, for the same bit rate, the required transmit bandwidth is less than what would be required for AM or FSK. In other words, QPSK and QAM have higher spectral efficiency. However, QPSK and QAM are more susceptible to noise and distortion and therefore require a relatively higher SNR.
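To put numbers on the bit rate versus symbol rate relationship, here is a small sketch. It assumes the number of symbol states is a power of two and uses a hypothetical 9600 bit/s stream; the helper names are illustrative.

```python
def bits_per_symbol(num_states):
    """Bits carried by one symbol that can take num_states distinct values."""
    if num_states < 2 or num_states & (num_states - 1):
        raise ValueError("num_states must be a power of two >= 2")
    return num_states.bit_length() - 1


def symbol_rate(bit_rate_bps, num_states):
    """Symbols per second needed to carry bit_rate_bps."""
    return bit_rate_bps / bits_per_symbol(num_states)


for name, states in [("2-FSK", 2), ("QPSK", 4), ("16-QAM", 16), ("64-QAM", 64)]:
    print(name, bits_per_symbol(states), symbol_rate(9600, states))
```

For the same 9600 bit/s, 2-FSK needs 9600 symbols per second while 16-QAM needs 2400, which is where the bandwidth saving comes from.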
Also, for FSK, you want the two frequencies to be integer multiples of the data rate. This will result in an integer number of cycles in each bit period so that the carrier always ends up at the same level on data bit transitions. This probably won't be done at RF, though. Generally the FSK signal would be generated at an intermediate frequency which would then be mixed up to the actual RF carrier frequency.
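Whether a given tone completes a whole number of cycles per bit is easy to check with exact arithmetic. A minimal sketch, with hypothetical tone frequencies and bit rate chosen only to show one passing and one failing case:

```python
from fractions import Fraction


def cycles_per_bit(tone_hz, bit_rate_bps):
    """Exact number of carrier cycles that fit in one bit period."""
    return Fraction(tone_hz, bit_rate_bps)


def has_integer_cycles(tone_hz, bit_rate_bps):
    """True if the tone finishes each bit period at the phase it started with."""
    return cycles_per_bit(tone_hz, bit_rate_bps).denominator == 1


# Hypothetical tones for a 1200 bit/s link
for tone in (1200, 2400, 2200):
    print(tone, cycles_per_bit(tone, 1200), has_integer_cycles(tone, 1200))
```

Here 1200 Hz and 2400 Hz give 1 and 2 cycles per bit, while 2200 Hz gives 11/6 of a cycle and so would not end each bit at the same level.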
By reading the standards documents for each one.
I know that in LF at least, the data rate is normally a simple division of the carrier frequency. A 128 kHz carrier might be divided by a factor of 32 or 16 to create a data rate of 4 or 8 kbps, for example.
Best Answer
The number '2' refers to the number of tones used to encode the signal. For example, 2-FSK is essentially sending binary data using two frequencies. One symbol (time slice) has the potential for two values only. 2-FSK is usually what is meant by an unqualified mention of FSK.
It's possible to use more than two tones. These modulation schemes usually are called MFSK for Multiple Frequency Shift Keying. A common MFSK scheme in amateur radio is MFSK16, which uses 16 different frequencies to represent 4 binary bits in each symbol. By your notation, this would be 16-FSK.
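To illustrate how 4 bits select one of 16 tones, here is a sketch of a plain binary mapping. It is not the actual MFSK16 encoding (which is not described here); the MSB-first ordering, base frequency, and tone spacing are all assumptions made for the example.

```python
def bits_to_tone_indices(bits, bits_per_symbol=4):
    """Group a bit sequence into symbols; each symbol value selects one tone."""
    if len(bits) % bits_per_symbol:
        raise ValueError("bit count must be a multiple of bits_per_symbol")
    indices = []
    for i in range(0, len(bits), bits_per_symbol):
        value = 0
        for bit in bits[i:i + bits_per_symbol]:
            value = (value << 1) | bit  # MSB first (an assumption)
        indices.append(value)
    return indices


def tone_frequency(index, base_hz=1000.0, spacing_hz=100.0):
    """Map a tone index to a frequency; base and spacing are hypothetical."""
    return base_hz + index * spacing_hz


bits = [1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
indices = bits_to_tone_indices(bits)
print(indices)                               # [11, 0, 15]
print([tone_frequency(i) for i in indices])  # [2100.0, 1000.0, 2500.0]
```

Twelve bits become three symbols, so the tone changes only once per 4 bits.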
The more frequencies that you use in a symbol, the more complicated your modulation/demodulation circuit. Typical MFSK applications in amateur radio require a computer sound card and CPU for signal processing. Analog circuitry can be developed somewhat easily for FSK using two band-pass filters.
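The two-filter idea can be mimicked in software by measuring, per bit period, how much energy sits at each tone and picking the larger. A minimal sketch, assuming a hypothetical 48 kHz sample rate, 1200 bit/s, tones at 2400 Hz and 4800 Hz, a noiseless signal, and perfect bit timing (none of these values come from a standard):

```python
import math

SAMPLE_RATE = 48_000                       # Hz (hypothetical)
BIT_RATE = 1_200                           # bit/s (hypothetical)
F0, F1 = 2_400, 4_800                      # tones for 0 and 1 (hypothetical)
SAMPLES_PER_BIT = SAMPLE_RATE // BIT_RATE  # 40


def modulate(bits):
    """Generate one burst of F0 or F1 per bit."""
    samples = []
    for bit in bits:
        freq = F1 if bit else F0
        for n in range(SAMPLES_PER_BIT):
            samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def tone_energy(chunk, freq):
    """Energy of chunk at freq, via correlation with a cosine and a sine."""
    i = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
            for n, s in enumerate(chunk))
    q = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n, s in enumerate(chunk))
    return i * i + q * q


def demodulate(samples):
    """Decide each bit by comparing the energy at the two tones."""
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        bits.append(1 if tone_energy(chunk, F1) > tone_energy(chunk, F0) else 0)
    return bits


message = [1, 0, 1, 1, 0, 0, 1]
print(demodulate(modulate(message)) == message)  # True
```

Extending this to MFSK means one energy measurement per tone instead of two, which hints at why the processing load grows with the number of tones.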