As Leon Heller said, this is not RF. However, it sure is an interesting experiment.
You have noticed that the magnetic field of the primary coil isn't strong enough to transfer energy over such a distance. Amplifying is a good idea indeed, but the question is: how much do you need to amplify?
The transistor you're using in your circuit needs a minimum base-emitter voltage (roughly 0.6 to 0.7 V for a silicon part) before it starts conducting. The secondary coil probably won't deliver that much. What you can do is use the transistor as an amplifier:
As you can see, a pull-up (R1) and a pull-down (R2) are used to give the NPN transistor the minimum voltage it needs. With this circuit, even a tiny fluctuation in Vin will affect the current through collector and emitter. Vout is Vin, but amplified (and inverted, but that's not a problem here). You can use Vout to feed a transistor as a switch, as your circuit shows.
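As a rough sanity check of the biasing idea, the unloaded voltage that R1 and R2 put on the base can be computed directly. The supply voltage, resistor values and turn-on voltage below are hypothetical (the original circuit's values aren't given), and base current is ignored, so treat this as a sketch rather than a design.

```python
def divider_voltage(vcc, r1, r2):
    """Unloaded voltage at the junction of pull-up r1 and pull-down r2."""
    return vcc * r2 / (r1 + r2)

# Hypothetical values, not taken from the original circuit.
VCC = 9.0        # supply voltage, volts
R1 = 120e3       # pull-up to VCC, ohms
R2 = 10e3        # pull-down to ground, ohms
V_BE_ON = 0.65   # assumed turn-on voltage of a silicon NPN, volts

v_base = divider_voltage(VCC, R1, R2)
print(f"base bias = {v_base:.3f} V, assumed threshold = {V_BE_ON} V")
```

With the base sitting right around the turn-on voltage, a small Vin added on top of it is enough to change the collector current.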
However, this is theory. How much you have to amplify depends heavily on the distance between the coils, and you might need so much gain that it isn't worth trying.
Do you have an oscilloscope? I would recommend plotting the amplitude of the voltage on the secondary coil as a function of the distance between the coils. I'm guessing here, but I expect it to fall off very steeply (for small coils that are far apart compared to their size, roughly with the cube of the distance rather than truly exponentially). If the voltage is a clean AC signal, you might be able to do this with a multimeter as well. With that data you can calculate the amplification you need at a specific distance. My guess is that the needed amplification will increase dramatically with distance. That makes this setup not very useful over longer distances, and that's why we use RF.
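One way to turn such measurements into a gain estimate is to fit a simple model to them. The sketch below assumes a power law V = k·d^n, which is my assumption; plot your data on log-log axes to see whether it holds. The data points are synthetic placeholders generated from a formula, not measurements, and `fit_power_law` and `required_gain` are hypothetical helper names.

```python
import math

def fit_power_law(distances, voltages):
    """Least-squares line through (log d, log V); returns (k, n) for V = k * d**n."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(v) for v in voltages]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx
    k = math.exp(mean_y - n * mean_x)
    return k, n

def required_gain(v_needed, k, n, distance):
    """Voltage gain needed to reach v_needed at the given distance."""
    return v_needed / (k * distance ** n)

# Synthetic placeholder data from V = 0.002 / d^3. Replace with real readings.
distances_m = [0.05, 0.10, 0.20, 0.40]
voltages_v = [0.002 / d ** 3 for d in distances_m]

k, n = fit_power_law(distances_m, voltages_v)
gain_at_1m = required_gain(0.7, k, n, 1.0)
print(f"fitted exponent = {n:.2f}, gain needed at 1 m = {gain_at_1m:.0f}")
```

The fitted exponent tells you how quickly the required gain grows: each doubling of the distance multiplies it by 2 to the power of -n.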
To get you started in RF, I can recommend the book Crystal Sets to Sideband by Frank W. Harris, K0IYE. Skip or skim chapter 1 about the history of radio. Chapter 2 is basic knowledge which I think you already have, so skim that too. Chapter 3 is some blah-blah about a workspace, which I found demotivating because Harris expects you to have a lot of equipment. In chapter 4 the fun starts, with a crystal set.
Your FM radio works because it receives a band of frequencies around the frequency it shows or that you tune it at. As you say, the nature of FM means that the frequency will vary. However, the extent of these variations is well defined, and the radio is designed to "find" the carrier anywhere in this range.
This may surprise you, but AM also requires the receiver to pick up a range. I'm not going to go into Fourier analysis right now, but basically changing amplitude is adding frequencies. Put another way, a true pure single frequency can't ever change in amplitude and can't carry any information.
The way AM works, there is a frequency band on either side of the carrier that is as wide as the highest frequency being transmitted. For example, if the signal being modulated onto the AM carrier can be up to 10 kHz, then there is a 10 kHz band of stuff on either side of the carrier. In fact, the carrier is constant and the actual information is in these sidebands. Yes, I know that may be unintuitive, but at the level you are asking and with the time I have to explain here, you'll just have to trust me on this. Look up Fourier analysis if you want to learn more.
For example, an AM radio station at 1 MHz carrying up to 10 kHz content will have a signal spread out over the range of 990 kHz to 1.01 MHz.
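If you'd rather check the sideband claim than take it on trust, a small numerical experiment does it. The sketch below amplitude-modulates a carrier with a single tone and measures how much signal sits at a few frequencies. The numbers are scaled down for speed (100 Hz stands in for the 1 MHz carrier, 10 Hz for a 10 kHz audio tone), and the sample rate and modulation depth are arbitrary choices of mine.

```python
import cmath
import math

def tone_amplitude(samples, k):
    """Amplitude of the sinusoid completing exactly k cycles over the record."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return 2 * abs(acc) / n

FS = 1000          # samples per second (hypothetical, scaled down)
N = 1000           # one second of data, so k cycles per record means k Hz
F_CARRIER = 100    # Hz, stands in for the 1 MHz carrier
F_AUDIO = 10       # Hz, stands in for a 10 kHz audio tone
DEPTH = 0.5        # modulation depth

am = [(1 + DEPTH * math.cos(2 * math.pi * F_AUDIO * i / FS))
      * math.cos(2 * math.pi * F_CARRIER * i / FS) for i in range(N)]

for f in (90, 95, 100, 110):
    print(f"{f} Hz: amplitude {tone_amplitude(am, f):.3f}")
```

The carrier at 100 Hz comes out with amplitude 1.0 whatever the audio does, each sideband (90 and 110 Hz) has amplitude 0.25, which is half the modulation depth, and there is essentially nothing at 95 Hz.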
Best Answer
It uses something called a filter. You can build filters out of all sorts of different things.
RC filters made out of resistors and capacitors are probably the simplest to understand. Basically, the capacitor acts as a resistor, but with a different resistance at different frequencies. When you add a resistor, you can build a voltage divider that is frequency dependent. This is called an RC filter. You can make high pass and low pass filters with one resistor and one capacitor. A low pass filter is designed to pass low frequencies and block high frequencies, while a high pass filter does the opposite. A low pass in series with a high pass forms a bandpass, which passes frequencies within some range and blocks other frequencies. Note that the operation of an RC filter (and most filters, for that matter) will depend on the source and load impedance. This is especially important when cascading simple filter stages to construct larger filters as the operation of each stage will be affected by the impedance of the adjacent stages.
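To put numbers on the frequency-dependent divider, here is a minimal sketch of the unloaded first-order RC lowpass and highpass responses. It assumes an ideal source and no load (which, as noted above, is rarely quite true), and the component values are hypothetical.

```python
import math

def rc_cutoff_hz(r_ohms, c_farads):
    """Frequency at which the capacitor's reactance equals the resistance."""
    return 1.0 / (2 * math.pi * r_ohms * c_farads)

def capacitor_reactance(f_hz, c_farads):
    """Magnitude of the capacitor's impedance at f_hz."""
    return 1.0 / (2 * math.pi * f_hz * c_farads)

def lowpass_gain(f_hz, r_ohms, c_farads):
    """|Vout/Vin| with R in series and the output taken across C."""
    xc = capacitor_reactance(f_hz, c_farads)
    return xc / math.hypot(r_ohms, xc)

def highpass_gain(f_hz, r_ohms, c_farads):
    """|Vout/Vin| with C in series and the output taken across R."""
    xc = capacitor_reactance(f_hz, c_farads)
    return r_ohms / math.hypot(r_ohms, xc)

R = 1e3      # ohms (hypothetical)
C = 100e-9   # farads (hypothetical)
fc = rc_cutoff_hz(R, C)
for f in (fc / 10, fc, fc * 10):
    print(f"{f:9.1f} Hz  lowpass {lowpass_gain(f, R, C):.3f}  "
          f"highpass {highpass_gain(f, R, C):.3f}")
```

At the cutoff frequency both filters pass about 0.707 of the input (the -3 dB point); a decade away, one passes nearly everything and the other about a tenth.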
Filters can also be made with other components, such as inductors. Inductors also act like resistors, but they change in the opposite direction as capacitors. At low frequencies, an inductor looks like a short while a capacitor looks like an open. At high frequencies, an inductor looks like an open while a capacitor looks like a short. LC filters are a type of filter built with inductors and capacitors. It is possible to make a rather sharp LC filter that cuts off quickly and is easy to tune with a variable capacitor. This is what is normally done for simple radios like crystal radios.
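The tuning range of such an LC circuit follows from the resonance formula f0 = 1 / (2π√(LC)). A quick sketch, with a coil and variable capacitor whose values are hypothetical (picked only because they are plausible for a medium-wave crystal set):

```python
import math

def lc_resonance_hz(l_henries, c_farads):
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2 * math.pi * math.sqrt(l_henries * c_farads))

L_COIL = 240e-6               # henries (hypothetical coil)
C_MIN, C_MAX = 40e-12, 400e-12  # farads (hypothetical variable capacitor)

f_high = lc_resonance_hz(L_COIL, C_MIN)
f_low = lc_resonance_hz(L_COIL, C_MAX)
print(f"tunes from {f_low / 1e3:.0f} kHz to {f_high / 1e3:.0f} kHz")
```

Because the frequency goes with the inverse square root of C, a 10:1 capacitance range only gives about a 3.2:1 tuning range.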
You can make bandpass filters out of anything that has a resonant frequency. A capacitor and an inductor in series or in parallel form a resonant tank circuit that can be used as a bandpass or bandstop filter, depending on precisely how you hook it up. An antenna is also a bandpass filter: it only receives well those frequencies whose wavelengths are comparable to the size of the antenna. Too large or too small and it won't work. Cavities can also be used as filters: a sealed metal box has various standing wave modes, and these can be exploited for filtering. Electrical signals can also be converted to other kinds of waves, such as acoustic waves, and filtered in that form. SAW (surface acoustic wave) filters and crystal filters both work by mechanical resonance and use the piezoelectric effect to interface with the circuit. It is also possible to build filters out of transmission lines by exploiting their inherent inductance and capacitance, as well as the constructive and destructive interference that results from reflections. I have seen a number of microwave band filters that are made out of a crazily shaped piece of copper printed on a PCB. These are called distributed element filters. Incidentally, most of these other filters can be modeled as LC or RLC circuits.
Now, a software defined radio is a different animal altogether. Since you are working with digital data, you can't just throw some resistors and capacitors at the problem. Instead, you can use some standard filter topologies like FIR or IIR. These are built out of a cascade of multipliers and adders. The basic idea is to create a time-domain representation of the filter you need, and then convolve this filter with the data. The result is filtered data. It is possible to build low pass and bandpass FIR filters.
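To make the "multipliers and adders" idea concrete, here is a minimal pure-Python sketch of an FIR filter applied by direct convolution. The taps are an 8-point moving average, about the crudest lowpass there is; I picked it for illustration, and a real SDR would use a properly designed filter and an optimized library.

```python
import math

def fir_filter(signal, taps):
    """y[n] = sum over k of taps[k] * x[n - k], treating x as zero before the start."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

taps = [1 / 8] * 8   # 8-tap moving average (crude lowpass, illustration only)

slow = [math.sin(2 * math.pi * i / 64) for i in range(256)]  # 64 samples per cycle
fast = [math.sin(2 * math.pi * i / 4) for i in range(256)]   # 4 samples per cycle

# Skip the first len(taps) outputs, where the filter is still filling up.
slow_peak = max(abs(y) for y in fir_filter(slow, taps)[8:])
fast_peak = max(abs(y) for y in fir_filter(fast, taps)[8:])
print(f"slow tone peak after filtering: {slow_peak:.3f}")
print(f"fast tone peak after filtering: {fast_peak:.3e}")
```

The slow tone passes almost untouched, while the fast tone averages out to practically nothing because the eight taps span exactly two of its cycles.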
Filtering goes hand in hand with frequency conversion. There is a parameter that you will see all over the place called Q, the quality factor. For a bandpass filter, it is the ratio of the center frequency to the bandwidth. If you want to make a 100 Hz wide filter at 1 GHz, you would need a filter with an astronomically high Q (about 10 million), which is infeasible to build. So instead, you filter with a low Q (wide) filter, downconvert to a lower frequency, and then filter again there with another filter of modest Q. If you convert 1 GHz to, say, 10 MHz, a 100 Hz filter has a much more reasonable Q. This is often done in radios, possibly with more than one frequency conversion. Additionally, this method makes it very easy to tune the receiver: you just change the frequency of the oscillator used for the frequency translation instead of changing the filters.
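The arithmetic behind that argument is short enough to spell out, using the usual bandpass definition Q = center frequency / bandwidth (`required_q` is just a hypothetical helper name):

```python
def required_q(center_hz, bandwidth_hz):
    """Quality factor a bandpass filter needs for the given bandwidth."""
    return center_hz / bandwidth_hz

q_at_rf = required_q(1e9, 100.0)    # 100 Hz wide filter directly at 1 GHz
q_at_if = required_q(10e6, 100.0)   # same bandwidth after conversion to 10 MHz

print(f"Q at 1 GHz:  {q_at_rf:.0e}")
print(f"Q at 10 MHz: {q_at_if:.0e}")
```

Downconverting by a factor of 100 in frequency relaxes the Q requirement by the same factor.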
In the case of digital filters, the longer the filter, the higher the Q and the more selective the filter becomes. Here is an example of an FIR bandpass filter:
The top curve is the frequency response of the filter and the bottom curve is a plot of the filter coefficients. You can think of this type of filter as a way of searching for matching shapes. The filter coefficients contain specific frequency components. As you can see, the response oscillates a bit. The idea is that this oscillation will match up with the input waveform. Frequency components that match closely will appear in the output and frequency components that do not will get cancelled out. A signal is filtered by sliding the filter coefficients along the input signal one sample at a time, and at each offset the corresponding signal samples and filter coefficients are multiplied and summed. This ends up basically averaging out signal components that do not match the filter.
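A comparable bandpass can be sketched with the windowed-sinc method. That method is my choice for illustration; the filter described above may well have been designed differently. Frequencies are in cycles per sample, the band edges and tap counts are arbitrary, and `bandpass_taps` and `response` are hypothetical helper names.

```python
import cmath
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def bandpass_taps(num_taps, f_low, f_high):
    """Hamming-windowed difference of two ideal lowpass impulse responses."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * f_high * sinc(2 * f_high * t) - 2 * f_low * sinc(2 * f_low * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def response(taps, f):
    """Magnitude of the filter's frequency response at f cycles per sample."""
    return abs(sum(h * cmath.exp(-2j * math.pi * f * n) for n, h in enumerate(taps)))

short_filter = bandpass_taps(21, 0.10, 0.20)
long_filter = bandpass_taps(101, 0.10, 0.20)

for f in (0.02, 0.08, 0.15, 0.35):
    print(f"f = {f:.2f}: 21 taps {response(short_filter, f):.3f}, "
          f"101 taps {response(long_filter, f):.3f}")
```

Comparing the two columns at 0.08, just below the passband, shows the point about length: the longer filter has much steeper skirts and rejects a nearby frequency that the short one lets through.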
Frequency conversion is also performed both in software and in hardware. In hardware, this is required to get the band you're interested in inside the ADC's IF bandwidth. For example, if you want to look at a signal at 100 MHz but your ADC can only receive 5 MHz of bandwidth, you will have to downconvert it by around 95 MHz. Frequency conversion is performed with a mixer and a reference frequency, generally called the local oscillator (LO). Mixing exploits a trig identity, $$\cos(A)\cos(B) = \frac{1}{2}(\cos(A+B)+\cos(A-B))$$. Mixing requires a component that multiplies the amplitudes of the two input signals together, and the result is a pair of frequency components at the sum and difference of the input frequencies. After mixing, you'll need to use a filter to select the mixer output that you want.
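The identity is easy to check numerically. The sketch below multiplies two cosines sample by sample, as an ideal mixer would, and measures what comes out. Frequencies are scaled down (100 Hz and 95 Hz stand in for the 100 MHz signal and 95 MHz LO), and the sample rate is an arbitrary choice of mine.

```python
import cmath
import math

def tone_amplitude(samples, k):
    """Amplitude of the sinusoid completing exactly k cycles over the record."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return 2 * abs(acc) / n

FS = 1000     # samples per second (hypothetical, scaled down)
N = 1000      # one second of data, so k cycles per record means k Hz
F_RF = 100    # Hz, stands in for the 100 MHz signal
F_LO = 95     # Hz, stands in for the 95 MHz local oscillator

# Ideal mixer: multiply the two inputs sample by sample.
mixed = [math.cos(2 * math.pi * F_RF * i / FS) * math.cos(2 * math.pi * F_LO * i / FS)
         for i in range(N)]

for f in (F_RF - F_LO, F_LO, F_RF, F_RF + F_LO):
    print(f"{f:3d} Hz: amplitude {tone_amplitude(mixed, f):.3f}")
```

The output has amplitude 0.5 at the difference (5 Hz) and at the sum (195 Hz), and essentially nothing at either input frequency. A lowpass filter after the mixer would keep the 5 Hz product and reject the one at 195 Hz.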