What parameters of a real op amp determine the lowest voltage it can
amplify?
It's all about signal-to-noise ratio (SNR), that is, the ratio of the desired signal to the undesired noise. The LM358 has an equivalent input voltage noise density of 55 nV per sqrt(Hz). That probably sounds confusing, but it isn't. Let's say your bandwidth is 10 kHz\$^1\$ - the total input-referred noise will be 55 nV x sqrt(10,000) = 5.5 uV RMS.
If your signal level is 500 uV then your SNR is 20 log (500/5.5) = 39 dB.
Is this acceptable? That depends on your application, but it would be fine for a telephone conversation.
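The noise and SNR arithmetic can be checked with a few lines of Python. This is only a sketch that assumes the noise density is flat (white) across the whole bandwidth; the function names are made up for illustration and the figures are the ones quoted in this answer.

```python
import math

def total_noise_rms(density_v_per_rt_hz, bandwidth_hz):
    """RMS noise for a flat (white) noise density over the given bandwidth."""
    return density_v_per_rt_hz * math.sqrt(bandwidth_hz)

def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in dB for two RMS voltages."""
    return 20.0 * math.log10(signal_rms / noise_rms)

# Figures from the answer: 55 nV/sqrt(Hz), 10 kHz bandwidth, 500 uV signal.
noise = total_noise_rms(55e-9, 10e3)
snr = snr_db(500e-6, noise)

print(f"noise = {noise * 1e6:.2f} uV RMS")  # 5.50 uV RMS
print(f"SNR   = {snr:.1f} dB")              # 39.2 dB
```

Halving the signal level costs 6 dB of SNR, while halving the bandwidth only gains 3 dB, because noise grows with the square root of bandwidth.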
Does it mean that it is not possible to amplify signals on the order
of 2mV?
No, the 2 mV figure is an input offset voltage. It tells you that with a gain of 100 you will see an output offset voltage (an error) of 0.2 V - this shouldn't normally be a problem in an AC amplifier.
\$^1\$ The onus is on the designer to incorporate filtering that sufficiently removes noise above 10 kHz. A 1st order filter (a simple capacitor across the gain-setting feedback resistor) is usually enough but, for this type of filter, the "noise bandwidth" is bigger than the cut-off frequency set by the CR components by a factor of \$\pi/2\$. In other words, a 10 kHz filter will have a noise bandwidth of 15.7 kHz, and this would raise the noise from 5.5 uV RMS to 6.9 uV RMS.
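As a rough check of the \$\pi/2\$ factor, the sketch below integrates the power response of an ideal 1st order low-pass filter, \$1/(1+(f/f_c)^2)\$, and compares the result with \$f_c \cdot \pi/2\$. The function names and the choice of integration limit are assumptions made for illustration only.

```python
import math

def first_order_noise_bandwidth(f_cutoff_hz):
    """Closed-form noise bandwidth of an ideal 1st order low-pass filter."""
    return f_cutoff_hz * math.pi / 2.0

def numeric_noise_bandwidth(f_cutoff_hz, f_max_hz, steps=200_000):
    """Midpoint-rule integral of |H(f)|^2 = 1 / (1 + (f/fc)^2) from 0 to f_max."""
    df = f_max_hz / steps
    total = 0.0
    for i in range(steps):
        f = (i + 0.5) * df
        total += df / (1.0 + (f / f_cutoff_hz) ** 2)
    return total

fc = 10e3
enbw = first_order_noise_bandwidth(fc)
# Integrate up to 1000 x fc; the remaining tail is well under 0.1 % of the total.
enbw_numeric = numeric_noise_bandwidth(fc, 1000 * fc)
noise = 55e-9 * math.sqrt(enbw)

print(f"noise bandwidth (closed form) = {enbw / 1e3:.1f} kHz")
print(f"noise bandwidth (numeric)     = {enbw_numeric / 1e3:.1f} kHz")
print(f"noise = {noise * 1e6:.1f} uV RMS")
```

The closed-form value is about 15.7 kHz and the noise comes out at about 6.9 uV RMS, matching the figures in the footnote.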
This can be easier to understand if you look at the waveforms.
In a push-pull, or bridge, amplifier both lines are driven as shown below.
Notice that \$L-\$ is literally the inverse of \$L+\$.
Similarly, for the other signal, \$R-\$ is the inverse of \$R+\$.
The difference between the + and - voltages is what excites the speakers.
Now, if you connect opposite polarities from the two channels (for example \$L+\$ and \$R-\$) to one speaker, the difference in signal becomes the mixture of both signals. Hey presto, you have a mono system.
Note, however, that the amplitude of each "side" is now effectively reduced by half. To see why, consider the case where there is no signal on the \$R\$ side. The difference between the blue lines is now only half what it was between \$L+\$ and \$L-\$.
If the original sound was recorded central, that is, an equal waveform on both left and right channels, the \$L+\$ waveform would be identical to the \$R+\$ waveform, same for the negatives. As such, joining \$L+\$ with \$R+\$ would result in no voltage difference for that sound. That is why you need to cross connect them.
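The same argument can be put in numbers. The sketch below models each bridged output as an ideal pair in which the - line is the exact inverse of the + line; the function names and sample values are invented for illustration.

```python
def bridge_outputs(sample):
    """Return the (+, -) line voltages of one bridged channel."""
    return sample, -sample

def speaker_voltage(line_a, line_b):
    """A speaker responds to the difference between its two terminals."""
    return line_a - line_b

left, right = 1.0, 0.5
l_pos, l_neg = bridge_outputs(left)
r_pos, r_neg = bridge_outputs(right)

normal_left = speaker_voltage(l_pos, l_neg)  # 2 x left: normal stereo hookup
cross_mono = speaker_voltage(l_pos, r_neg)   # left + right: cross-connected mono

# With the right channel silent, the cross connection gives half the normal swing.
_, silent_neg = bridge_outputs(0.0)
left_only = speaker_voltage(l_pos, silent_neg)

# A centred sound (left == right) cancels if the two + lines are joined instead.
c_pos, _ = bridge_outputs(left)
centre_same_polarity = speaker_voltage(l_pos, c_pos)

print(normal_left, cross_mono, left_only, centre_same_polarity)
```

The cross connection yields the sum of the two channels, whereas joining like polarities yields their difference, which is zero for anything recorded centrally.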
No, using diodes does not accomplish what you want. That will partially rectify the additional signal, making it sound like a mess.
You need to actually mix the new signal onto each of the two stereo signals separately. That might be as simple as a few resistors if the new signal is well buffered so that its impedance is low, and whatever is receiving the mixed signals can tolerate being driven from a somewhat higher source impedance, like a few 100 Ohms.
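To get a feel for what a resistor mixer does to the levels, here is a minimal sketch that treats every source as an ideal voltage source behind one series resistor, all joined at a node loaded by the receiver's input resistance (Millman's theorem). The 470 Ohm and 10 kOhm values and the signal levels are hypothetical, chosen only for illustration.

```python
def passive_mix(sources, load_ohms):
    """Node voltage for (voltage, series_resistance) sources driving one load."""
    current_sum = sum(v / r for v, r in sources)
    conductance_sum = sum(1.0 / r for _, r in sources) + 1.0 / load_ohms
    return current_sum / conductance_sum

R_MIX = 470.0      # hypothetical mixing resistor, "a few 100 Ohms"
R_LOAD = 10_000.0  # hypothetical input resistance of the receiving device

left, right, extra = 1.0, -0.4, 0.5  # instantaneous voltages, arbitrary values

# The new signal is mixed onto each stereo channel separately.
left_out = passive_mix([(left, R_MIX), (extra, R_MIX)], R_LOAD)
right_out = passive_mix([(right, R_MIX), (extra, R_MIX)], R_LOAD)

print(f"left out  = {left_out:.3f} V")
print(f"right out = {right_out:.3f} V")
```

With equal resistors the output is close to the average of the two inputs, so each signal arrives at roughly half level, and the load pulls it down slightly further. A real circuit would also need the stereo sources themselves to have low output impedance for this simple model to hold.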