An amplifier... amplifies. Takes a tiny signal, and beefs it up with more voltage / current / power.
A speaker needs that power to make a coil move in a magnetic field. Attached to the coil is the cone, which displaces air when the coil moves - and we get sound.
A speaker WILL act as a microphone. It's just not very efficient.
If you get real close, and shout really loud, you will get a mediocre signal. In theory, this gets passed back to the input where, since we are reducing signal strength along the way, we would have a microscopic signal.
Then comes the final hurdle: recording this signal.
Computer data, in the form of ones and zeroes, gets chewed through a chip called a Digital-to-Analog Converter [DAC] before being sent to that amplifier. To record our eensy signal, we need an Analog-to-Digital Converter [ADC].
ADCs are weird and wonderful beasts. Your computer has one - it's what the "mic" input goes to. No matter how you quibble about passing signals backwards through an amplifier, I fail to see how you get a digital version without adding your own hardware [i.e. planting a bug].
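To make the DAC/ADC distinction concrete, here is a minimal sketch of what each chip does in the ideal case. The 3.3 V reference, the 16-bit resolution and the function names are assumptions picked for illustration, not the specs of any particular sound card.

```python
def adc_code(voltage, v_ref=3.3, bits=16):
    """Ideal ADC: map a voltage in [0, v_ref] to an integer code."""
    levels = 2 ** bits
    clamped = max(0.0, min(v_ref, voltage))
    # Truncate to the step below, and keep full scale inside the code range
    return min(levels - 1, int(clamped / v_ref * levels))

def dac_voltage(code, v_ref=3.3, bits=16):
    """Ideal DAC: the opposite direction, integer code back to a voltage."""
    return code / 2 ** bits * v_ref

# Size of one ADC step (hypothetical 3.3 V, 16-bit converter)
lsb = 3.3 / 2 ** 16
print(f"one step = {lsb * 1e6:.1f} microvolts")

# A made-up 20 microvolt signal sits below one step and reads as code 0
print(adc_code(20e-6))
```

With these assumed numbers, a signal of a few tens of microvolts would not even move the converter by one step, which is why the tiny signal would need amplifying again before any ADC could make use of it.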
A summing opamp is the correct way to do this. However, your current configuration is problematic: you have both the opamp's positive input and its negative supply connected to ground. This means that your signals need to be ground referenced (centered on 0 V), but the opamp can't output voltages below 0 V, so half of each waveform will be clipped off.
To fix this, either connect the opamp's negative supply to a negative voltage rail, or bias your signals to VCC/2, and connect the opamp's positive input to a resistor divider that also provides VCC/2.
Whether or not you need another amplifier before a speaker depends on the speaker; a small speaker is likely within the capabilities of your opamp, though it may not behave ideally. If in doubt, find a suitable audio amplifier IC and connect it on the output.
Best Answer
Prompted by a comment by @BruceAbbot (on a previous answer of mine that I deleted because it wasn't spot-on), I did some further research and found a reference that seems perfectly suited to answering your question.
In short: modern audio power amplifiers generally have a very low output impedance (a fraction of an ohm) and act as (almost ideal) voltage sources. Therefore, using your terminology, they "modulate" the voltage across the speakers, which react by drawing the current needed for their operation, as their characteristics mandate.
The reference is the Audio Power Amplifier Design Handbook by Douglas Self (link to Google Books), in the "Damping Factor" section. Excerpt:
The few exceptions cited are transconductance power amplifiers used for so-called current-driven loudspeakers, where the amplifier acts like a current source and "modulates" the current into the loudspeaker, which reacts by generating a voltage across itself accordingly.
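The difference between the two drive schemes can be sketched with plain Ohm's law. Everything numeric here is an assumption for illustration: the 0.05 Ω output impedance, the speaker impedance values, and the simplification of treating impedance as a plain resistance (a real speaker's impedance is complex and frequency dependent).

```python
def voltage_drive(v_source, z_out, z_speaker):
    """Amplifier modelled as a voltage source with series output impedance."""
    current = v_source / (z_out + z_speaker)
    return current * z_speaker, current  # (speaker voltage, speaker current)

def current_drive(i_source, z_speaker):
    """Amplifier modelled as an ideal current source."""
    return i_source * z_speaker, i_source  # (speaker voltage, speaker current)

# Hypothetical impedance magnitudes (ohms) of one speaker at three frequencies
speaker_z = {"midband": 8.0, "resonance": 30.0, "treble": 12.0}
Z_OUT = 0.05  # assumed amplifier output impedance, ohms

# Damping factor as conventionally defined: load impedance / output impedance
print("damping factor at 8 ohms:", 8.0 / Z_OUT)

for label, z in speaker_z.items():
    v_v, i_v = voltage_drive(10.0, Z_OUT, z)
    v_i, i_i = current_drive(1.0, z)
    print(f"{label:10s} voltage drive: {v_v:6.3f} V {i_v:6.3f} A | "
          f"current drive: {v_i:6.3f} V {i_i:6.3f} A")
```

Under voltage drive the terminal voltage barely changes while the current follows whatever the speaker impedance dictates; under current drive it is the other way round.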
See also this EE.SE answer: How important is impedance matching in audio applications?