I am not sure about intuition in general, but regarding the rectangular pulse's FT being a sinc function:
Note that the shape will remain the same, but the frequency range over which the FT of a particular pulse resides is a function of the pulse width of the original signal. Namely, expanding a function in the time domain actually shrinks the corresponding frequency-domain function (think of slowing down a voice recording: the sound gets very low, i.e. lower frequency).
That being said, as you decrease the pulse width, the frequency content of the signal increases, because now there is more change happening (to use a loose descriptor) in a shorter amount of time.
In contrast, if we expand the pulse in the time domain to a longer pulse width, then there is less change, and the corresponding frequency components must be much lower.
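This intuition is exactly the time-scaling property of the Fourier transform: if \$\mathcal{F}\{x(t)\}=X(\omega)\$, then

$$\mathcal{F}\{x(at)\}=\frac{1}{|a|}X\!\left(\frac{\omega}{a}\right)$$

so stretching a signal in time (\$|a|<1\$) compresses its spectrum, and compressing it in time (\$|a|>1\$) stretches its spectrum.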
In general, I look at a function and try to get a feel for how quickly it might be changing, to get a rough idea. But as I said, I don't know of any general rule of thumb here.
There is an important trick for solving such problems. If you have a standard transform pair and you just interchange the variables \$t\$ and \$\omega\$, then you can still use your table if you know the following:
$$\mathcal{F}\{f(t)\}=F(\omega)\Longrightarrow\mathcal{F}\{F(t)\}=2\pi f(-\omega)\tag{1}$$
This is a consequence of the fact that the Fourier transform and the inverse transform are essentially identical, apart from the factor \$2\pi\$ and a minus sign in the exponent. This is exactly what you see in (1): you get a factor of \$2\pi\$ and you have to invert the frequency variable \$\omega\$.
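To see this, compare the two definitions:

$$F(\omega)=\int_{-\infty}^{\infty}f(t)e^{-j\omega t}\,\mathrm dt,\qquad f(t)=\frac{1}{2\pi}\int_{-\infty}^{\infty}F(\omega)e^{j\omega t}\,\mathrm d\omega$$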
So in your case the standard transform pair is
$$\mathcal{F}\{e^{-t}u(t)\}=\frac{1}{1+j\omega}$$
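(This pair is just the defining integral, since \$u(t)\$ makes the integrand vanish for \$t<0\$:

$$\int_{0}^{\infty}e^{-t}e^{-j\omega t}\,\mathrm dt=\int_{0}^{\infty}e^{-(1+j\omega)t}\,\mathrm dt=\frac{1}{1+j\omega}$$

which converges for all real \$\omega\$.)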
from which you get using (1)
$$\mathcal{F}\left\{\frac{1}{1+jt}\right\}=2\pi e^{\omega}u(-\omega)\tag{2}$$
PS: The mistake in your calculations is in the last derivative. You computed the derivative as if the function were \$e^{-\omega}\$ instead of \$e^{-|\omega|}\$. Using \$\frac{\mathrm d}{\mathrm d\omega}|\omega|=\text{sign}(\omega)\$, the correct derivative is
$$\pi\text{sign}(\omega)e^{-|\omega|}$$
which gives the answer
$$X(\omega)=\pi e^{-|\omega|}(1-\text{sign}(\omega))=2\pi e^{-|\omega|}u(-\omega)=2\pi e^{\omega}u(-\omega)$$
which is of course identical to (2).
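As a rough sanity check of (2), you can approximate the transform integral numerically; a sketch in Python (the grid bounds and step size are arbitrary choices, and the \$1/t\$ tail converges slowly, so expect only a few correct digits):

```python
import numpy as np

# Rough numerical check of (2): the transform integral X(w) = int x(t) e^{-jwt} dt,
# approximated by a Riemann sum on a finite grid.
dt = 0.01
t = np.arange(-2000, 2000, dt)
x = 1.0 / (1.0 + 1j * t)

for w in (-2.0, -1.0, 1.0):
    X = np.sum(x * np.exp(-1j * w * t)) * dt        # X(w), truncated integral
    expected = 2 * np.pi * np.exp(w) * (w < 0)      # 2*pi*e^w*u(-w) from (2)
    print(f"w = {w:+.0f}:  numeric = {X.real:7.4f},  expected = {expected:7.4f}")
```

(The imaginary part of the sum is negligible, since \$x(-t)=x^*(t)\$ makes \$X(\omega)\$ real.)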
Best Answer
Devices using the Fourier Transform
It's not.
There are actually quite a few devices that do that, explicitly.
First of all, you'll have to distinguish between the continuous Fourier transform (which you probably know as \$\mathcal F\left\{x(t)\right\}(f)=\int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,\mathrm dt\$) and the Discrete Fourier Transform (DFT), which is what you can apply to a sampled signal.
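For reference, the \$N\$-point DFT of a sampled signal \$x[n]\$ is the finite sum

$$X[k]=\sum_{n=0}^{N-1}x[n]\,e^{-j2\pi\frac{nk}{N}},\qquad k=0,\dots,N-1$$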
For both, there are devices that implement them.
Continuous Fourier Transform
There's little actual need for this in digital electronics – digital signals are sampled, so you'd use the DFT.
In optics and photonics, you'll notice that there's an actual chance of getting perfectly periodic signals over a "large" (read: nearly as infinite as the integral above) length. Effectively, an acousto-optic element can be excited with one or multiple tones, and it will have the same correlating effect as the integral above. You don't have to look further than the 2018 Physics Nobel Prize winners to find an example of Fourier optics.
Discrete Fourier Transform
This is really all over the place; it's such a standard processing step that, as communication engineers, we often even forget where it is.
So, this list is far from complete; these are just examples:

- OFDM-based receivers (WiFi, LTE, DSL modems), which compute an FFT for every received symbol
- FFT-based spectrum analyzers
- Audio gear: spectrum displays, codecs, room-correction equalizers
- Medical imaging such as MRI, where the image is reconstructed from spatial-frequency-domain data
Note that the above list only contains things that do DFTs during operation. You can be 100% sure that during the design of anything remotely related to RF, especially antennas, mixers, amplifiers, (de)modulators, a lot of Fourier transforms / spectral analysis was involved. The same goes for audio device design, any high-speed data link design, image analysis…
How is it done?
I'll just address the DFT here.
Usually, that's implemented as an FFT (Fast Fourier Transform). That's one of the most important algorithmic discoveries of the 20th century, so I'll spend only a few words on it, because there are literally thousands of articles out there that explain the FFT.
You go in and look at the \$e^{-j2\pi \frac{nk}{N}}\$ multipliers of a DFT. You'll notice that these can all be written as powers of a single factor, \$\left(e^{-j2\pi \frac 1N}\right)^{nk}=W^{nk}\$ – and there you have your twiddle factor. Now you avoid recalculating products you've already calculated, and just swap a sign where necessary, since \$W^{k+N/2}=-W^{k}\$.
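To make that concrete, here's a minimal radix-2 decimation-in-time FFT in Python – purely an illustrative sketch (power-of-two lengths only, no optimization; `fft_radix2` is just a name I picked, and in practice you'd call a library routine):

```python
import numpy as np

def fft_radix2(x):
    # Minimal radix-2 decimation-in-time FFT; N must be a power of two.
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    even = fft_radix2(x[0::2])   # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of the odd-indexed samples
    W = np.exp(-2j * np.pi * np.arange(N // 2) / N)   # twiddle factors W^k
    t = W * odd
    # One butterfly per output pair: the product t is reused with a
    # swapped sign, because W^(k + N/2) = -W^k.
    return np.concatenate([even + t, even - t])

x = np.random.randn(8)
print(np.allclose(fft_radix2(x), np.fft.fft(x)))   # True
```

Each `even + t` / `even - t` pair is one butterfly: the product \$W^{k}\$ times the odd-half DFT is computed once and reused with a swapped sign.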
That way, you can reduce the complexity of a DFT from \$N^2\$ (which would be the complexity if you implemented the DFT as the naive sum) to something on the order of \$N\log N\$ – a huge win, even for relatively small \$N\$. For \$N=1024\$, that's roughly \$10^4\$ instead of \$10^6\$ operations.
It's relatively straightforward to implement that in hardware, if you can get your whole input vector at once – you get \$\log N\$ of combinatorial depth and fixed coefficients at every step. The trick is knowing how (and whether) to pipeline the individual layers, and how to use the specific hardware type you have (ASIC? FPGA? FPGA with hardware multipliers?). You can basically piece together an \$N=2^l\$-length transform from nothing but what we call butterflies, which you'll recognize once you read about the FFT.
In software, the principle is the same, but you need to know how to multi-thread very large transforms, and how to access memory as fast as possible by utilizing your CPU caches optimally.
However, for both hardware and software, there are libraries that you'd just use to calculate the DFT (FFT). For hardware, that usually comes from your FPGA vendor (e.g. Altera/Intel, Xilinx, Lattice…), from a large ASIC design tool company (e.g. Cadence), or from your ASIC house.
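As a software usage example, a minimal sketch with NumPy (any FFT library looks much the same):

```python
import numpy as np

# Spectrum of a 1 kHz tone sampled at 48 kHz, using a library FFT.
fs = 48000
t = np.arange(4800) / fs                 # 100 ms of signal
x = np.sin(2 * np.pi * 1000 * t)

X = np.fft.rfft(x)                       # one-sided DFT of a real signal
f = np.fft.rfftfreq(len(x), 1 / fs)      # the frequency (Hz) of each bin
print(f[np.argmax(np.abs(X))])           # 1000.0 – the tone shows up as a peak
```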