The response of any LTI system to an input can be computed, in the time domain, by convolving the input signal with the system's impulse response, which is the time-domain counterpart (the inverse Laplace transform) of the system's transfer function. The first equation you have there is the convolution equation.
Things are much easier to compute if everything is mapped to the s domain by using a Laplace transform. In the s domain, convolution becomes a simple multiplication.
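As a rough numerical illustration of "convolution becomes multiplication", here is a minimal sketch. It assumes a sampled (discrete-time) signal and uses the FFT as a stand-in for the Laplace transform; the arrays `x` and `h` are made-up values, not anything from your system.

```python
import numpy as np

# Hypothetical sampled input and impulse response (illustrative values only).
x = np.array([1.0, 2.0, 0.0, -1.0])
h = np.array([0.5, 0.25, 0.125])

# Time domain: direct (linear) convolution of input and impulse response.
y_time = np.convolve(x, h)

# Transform domain: zero-pad both to the full output length,
# multiply the transforms, then transform back.
n = len(x) + len(h) - 1
y_freq = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)

# Both routes should agree up to floating-point rounding.
print(np.allclose(y_time, y_freq))
```

The zero-padding matters: without it the FFT product gives a circular convolution rather than the linear one.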
You can find more details in the Wikipedia article on LTI systems and in many textbooks.
1- Does "an impulse transmitted at time t−τ" mean δ(t−τ)?
The phrase "which \$h(\tau,t)\$ is the response at time t to an impulse transmitted at time \$t−\tau\$" is a not-so-great way of saying "the impulse response of the system", so you are correct. Note that since the system is considered time invariant, the absolute time at which the impulse is transmitted doesn't matter.
2- How was the formula for h(τ,t) derived?
This is the system's impulse response, i.e. the transfer function expressed in the time domain. It can be derived by combining the dynamic equations that describe how the system evolves over time. It can also be obtained experimentally using system identification techniques, the simplest being to excite the system with an impulse and record the response, although this is not always desirable. More info can be found here
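To make the "excite with an impulse and record the response" idea concrete, here is a small sketch. The system is a made-up first-order discrete-time filter (the function `simulate` and the coefficient `a` are assumptions for illustration), standing in for whatever real system you would measure.

```python
import numpy as np

def simulate(x, a=0.8):
    """Hypothetical LTI system: y[n] = a*y[n-1] + (1-a)*x[n], zero initial state."""
    y = np.zeros(len(x))
    prev = 0.0
    for n, xn in enumerate(x):
        prev = a * prev + (1.0 - a) * xn
        y[n] = prev
    return y

N = 50

# Step 1: excite the system with a unit impulse and record what comes out.
impulse = np.zeros(N)
impulse[0] = 1.0
h = simulate(impulse)

# Step 2: predict the response to a different input by convolution with h.
x = np.sin(0.3 * np.arange(N))
y_pred = np.convolve(x, h)[:N]

# Step 3: compare against driving the system directly with that input.
y_direct = simulate(x)
print(np.allclose(y_pred, y_direct))
```

In a real measurement the recorded response would be noisy and truncated, which is one reason the impulse method is not always the preferred identification technique.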
When a transceiver is transmitting, the PLL frequency is set to Fs. When it is receiving, the PLL frequency is set either above Fs by the intermediate frequency or below Fs by the intermediate frequency, i.e.:
When transmitting:
$$
F_{\text{PLL}} = F_{\text{S}}
$$
and when receiving:
$$
F_{\text{PLL}} = \begin{cases} F_{\text{S}} - F_{\text{IF}} & \text{for low-side injection} \\
F_{\text{S}} + F_{\text{IF}} & \text{for high-side injection}\end{cases}
$$
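The two cases above can be written as a small helper. This is only a sketch: the function name `pll_frequency` and the example values (a 2.4 GHz channel, a 10.7 MHz IF) are assumptions chosen for illustration, not values from your design.

```python
def pll_frequency(f_s, f_if, receiving, high_side=False):
    """Return the PLL frequency in Hz for transmit or receive.

    f_s       -- channel (signal) frequency in Hz
    f_if      -- intermediate frequency in Hz
    receiving -- False when transmitting, True when receiving
    high_side -- True for high-side injection, False for low-side
    """
    if not receiving:
        return f_s                 # transmit: PLL sits on the channel itself
    if high_side:
        return f_s + f_if          # receive, high-side injection
    return f_s - f_if              # receive, low-side injection

# Hypothetical numbers, in Hz.
f_s = 2_400_000_000
f_if = 10_700_000

print(pll_frequency(f_s, f_if, receiving=False))
print(pll_frequency(f_s, f_if, receiving=True))
print(pll_frequency(f_s, f_if, receiving=True, high_side=True))
```

Whichever injection side is used, switching between transmit and receive moves the PLL by exactly one IF, which is why it has to retune.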
Remember that we are only permitted to transmit over a very narrow frequency range, e.g. 2.39 GHz to 2.42 GHz, so we cannot use any of the approaches you propose above because Fs has to be constant.
Further, notice that for us the only way to avoid retuning the PLL is to set the intermediate frequency to zero, which essentially turns our superhet receiver into a single-stage TRF receiver (you can read about the disadvantages of TRF receivers in the link provided).
As for your proposal to reduce the IF to a very low value so that less retuning is needed, remember that the lower the IF, the poorer the image channel rejection of our system: the image sits only twice the IF away from the wanted channel, so it becomes harder to filter out. There is therefore a limit to how low the IF can be. We can't just set the IF to a very low value.
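To see why a low IF hurts image rejection, it helps to compute where the image lands. The sketch below uses the standard superhet relation that the image is mirrored about the LO, so it sits 2 × IF from the wanted channel. The function name `image_frequency` and the IF values are assumptions for illustration.

```python
def image_frequency(f_s, f_if, high_side=False):
    """Return the image frequency in Hz for a superhet receiver.

    The image is the mirror of the wanted channel about the LO,
    so it lies 2 * f_if away from f_s.
    """
    f_lo = f_s + f_if if high_side else f_s - f_if
    return 2 * f_lo - f_s

f_s = 2_400_000_000  # hypothetical wanted channel, Hz

# Lower the IF and watch the image move closer to the wanted channel.
for f_if in (70_000_000, 10_700_000, 455_000):
    f_img = image_frequency(f_s, f_if)
    print(f"IF = {f_if} Hz, image = {f_img} Hz, separation = {abs(f_s - f_img)} Hz")
```

The closer the image is to the wanted channel, the sharper the RF filter ahead of the mixer has to be in order to reject it.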
Freescale AN2253 "Channel Estimation for a WCDMA Rake Receiver" by Ahsan Aziz 2004 has a pretty good explanation -- see Figure 1.
The signal from an antenna is split into several fingers.
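Each finger takes that input signal and produces one output signal by:
(a) delaying it by the time delay of one particular path;
(b) despreading it, i.e. correlating it with the spreading code;
(c) scaling it by a complex weight that compensates for the complex attenuation of that path.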
Then the receiver combines the outputs from all the fingers from all the antennas, hopefully producing a combined signal with a better signal-to-noise ratio than it would get without all this rake complexity.
The part of the radio that independently tunes the delay for each path in part (a) is the Path Searcher. The part of the radio that independently tunes the complex attenuation for each path in part (c) is the Channel Estimator.
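Here is a toy sketch of what the fingers and the combiner do, under heavy assumptions: one antenna, one BPSK symbol, no noise, integer chip delays, a random ±1 code instead of a real WCDMA code, and delays and complex gains that are known exactly (in a real radio the Path Searcher and the Channel Estimator would supply them). The correlation step between the delay and the complex weighting is also an assumption of this sketch. All names (`paths`, `finger`, `code`) are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_chips = 256
code = rng.choice([-1.0, 1.0], size=n_chips)   # hypothetical spreading code
symbol = 1.0                                   # one transmitted BPSK symbol

# Hypothetical multipath channel: (delay in chips, complex attenuation).
paths = [
    (0, 0.8 * np.exp(1j * 0.4)),
    (3, 0.5 * np.exp(-1j * 1.1)),
    (7, 0.3 * np.exp(1j * 2.0)),
]

# Received signal: the sum of delayed, attenuated copies of the spread symbol.
max_delay = max(delay for delay, _ in paths)
rx = np.zeros(n_chips + max_delay, dtype=complex)
for delay, gain in paths:
    rx[delay:delay + n_chips] += gain * symbol * code

def finger(rx, delay, gain_estimate):
    """One rake finger, assuming the delay and gain estimates are given."""
    aligned = rx[delay:delay + n_chips]          # (a) align to this path's delay
    despread = np.dot(aligned, code) / n_chips   # correlate with the code (assumed step)
    return np.conj(gain_estimate) * despread     # (c) undo this path's complex attenuation

# Combiner: add up the outputs of all fingers.
outputs = [finger(rx, delay, gain) for delay, gain in paths]
combined = sum(outputs)

print("per-finger outputs:", np.round(outputs, 3))
print("combined output:   ", np.round(combined, 3))
```

Weighting each finger by the conjugate of its path gain is maximal-ratio combining; it is one common choice, not necessarily what the application note uses.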
People familiar with AM and FM radio may think that part (c) is the only one needed -- in those kinds of radios, time delays have approximately the same effect as phase changes. But with spread spectrum technology, time delays act very differently from phase changes. A phase change of +20 degrees can be compensated by either an additional phase change of -20 degrees or a phase change of +340 degrees or a phase change of +700 degrees -- all three methods of compensation are identical. A spread-spectrum signal with a relative time delay of -21.5 chips must be compensated by a time delay of close to +21.5 chips; anything below +20.5 chips and anything above +22.5 chips is useless.
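A quick numerical check of both claims, as a sketch only. It assumes a random ±1 code of 1024 chips (not a real WCDMA code), integer chip offsets, and a circular shift to model the misalignment; the function name `correlation` is made up.

```python
import numpy as np

# Phase: -20, +340 and +700 degrees are the same rotation.
phasors = np.exp(1j * np.deg2rad([-20.0, 340.0, 700.0]))
print(np.allclose(phasors, phasors[0]))

# Time delay: correlate a hypothetical spreading code against a shifted copy.
rng = np.random.default_rng(1)
code = rng.choice([-1.0, 1.0], size=1024)

def correlation(offset_chips):
    """Normalized correlation between the code and a circularly shifted copy."""
    return np.dot(code, np.roll(code, offset_chips)) / len(code)

print("aligned:      ", correlation(0))
print("2 chips off:  ", correlation(2))
print("1024 chips off:", correlation(1024))
```

The aligned case gives full correlation, while an error of a couple of chips leaves only a small residue. Unlike phase, delay does not "wrap around" within a few chips; it only repeats after a full code period, which is the last case shown.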
The part of the radio that "can be used to find the multipaths" is the Path Searcher. I suppose you might have one integrated component that simultaneously tunes the time delays in (a) and the complex attenuations in (c); that component includes both the Path Searcher and the Channel Estimator. (This happens often with highly-integrated electronics -- you have a bunch of conceptually distinct blocks on the overview block diagram, but often there is some single physical component that is shared between two blocks.)