You have two functions. One is \$f(t)\$ and the other is the time shift of \$f\$, \$g(t) = f(t+\Delta)\$. You would like to find \$\Delta\$.
We can take the Laplace/Fourier transform (say via Cooley-Tukey FFT algorithm) and denote the transformed signals by \$\hat{f}(s)\$ and \$\hat{g}(s)\$.
Now \$\hat{g}(s) = e^{\Delta s} \hat{f}(s)\$ so the quantity you seek is
$$\Delta = \frac{ln(\hat{g}(s))-ln(\hat{f}(s))}{s}.$$
In other words, the natural log of the quotient \$\frac{\hat{g}}{\hat{f}}\$ will be linear in \$s\$ and the slope of this line will be \$\Delta\$.
First, as Dave Tweed has answered, you generally use a lock-in to recover a small signal buried in noise.
That said, your script is not properly implementing a lock-in amplifier, as evidenced by your second trace. Your problem is that the DC component of your original signal needs to be suppressed (the signal should be AC-coupled). If your reference sine wave has a DC component of zero (which it should) then for a signal with zero degrees of phase shift and an average of zero, the output will be a sine-squared wave (plus noise). Note that this will be rectified, with no signal component negative.This will allow a low-pass filter to recover the amplitude of the desired frequency, but not its shape.
What you seem to be trying to do is simple noise rejection, and there are two possibilities. Either your noise is broadband, with significant noise energy both above and below your frequency of interest, or the noise is only significant above your fundamental.
Assuming the latter, you can process your signal using only a high-pass filter, made arbitrarily sharp and close to your fundamental. If the former, you need a bandpass filter.
In either case, looking at the crossover distortion shown in the last figure is going to be very, very difficult. That's a low-energy, high-frequency artifact, and may not be easily recovered from the noise. If you really want to try, the first thing you need to do is simulate your signal, then perform an FFT on it to establish the frequency response your filter needs in order not to exclude the signal of interest. Then compare this to the noise spectrum and you'll probably see that they overlap.
Other than an extremely large averaging filter (many, many waveforms averaged), I don't see any good way to recover your feature of interest.
EDIT - Having stated that a signal in noise needs a bandpass filter to recover it, I should explain that the multiplier used acts as what is called "mixer" in the RF world, and its effect is to frequency shift the signal by the reference frequency. This is useful in the case of the lock-in amplifier because it shifts the signal frequency to DC. In this case, a bandpass filter on the original signal becomes a low-pass filter on the processed signal, and the trick of the lock-in is that it's MUCH easier to make a very sharp, narrow lowpass filter than it is a very sharp, narrow bandpass filter. To begin with, the lowpass filter response is intrinsically referenced to DC, or zero Hz. This means that there is no central frequency of the filter to drift with time and temperature, which is a major problem with bandpass filters.
On the other hand, since the desired signal is now DC, you cannot recover the signal shape. Every deviation from the fundamental frequency (sine wave) shows up as a frequency deviation in the processed signal. If the artifact of interest is part of the signal at the base frequency, the frequency deviations show up as harmonics, and the closest to the fundamental is at twice the fundamental. This means that any close filtering will eliminate the part of the signal which corresponds to the glitch.
Best Answer
This looks rather similar to this MATLAB example. The maximum of the FFT plot is the estimated amplitude (perhaps with a scaling factor).
Note the ~10% error in the amplitude estimation of the 120Hz signal- not coincidentally it's higher by the noise floor. Perhaps you can estimate the noise floor if it's reasonably flat in your case and subtract it to get a better estimate. More samples will also help, but at 1Hz it may take quite a long time.