I think RQDQ has the right idea. The best method of sending digital data over restricted bandwidth audio lines is a problem that has already been solved for modems.
The 300 bit/s modems used audio frequency-shift keying to send data. In this system the stream of 1s and 0s in computer data is translated into sounds which can be easily sent on the phone lines. In the Bell 103 system the originating modem sends 0s by playing a 1,070 Hz tone, and 1s at 1,270 Hz, with the answering modem putting its 0s on 2,025 Hz and 1s on 2,225 Hz. These frequencies were chosen carefully: they are in the range that suffers minimum distortion on the phone system, and they are not harmonics of each other.
In the 1,200 bit/s and faster systems, phase-shift keying was used. In this system the two tones for any one side of the connection are sent at similar frequencies to those in the 300 bit/s systems, but slightly out of phase. ...
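To make the tone mapping concrete, here is a minimal sketch of the originating side of a Bell 103 style FSK modulator. The tone frequencies and the 300 bit/s rate come from the description above; the 8 kHz sample rate and the function name `fsk_modulate` are my own assumptions for illustration, not part of any standard.

```python
import math

SAMPLE_RATE = 8000   # assumption: output sample rate in Hz
BAUD = 300           # bits per second, as in the 300 bit/s modems
ORIGINATE_TONES = {0: 1070.0, 1: 1270.0}  # Hz, originating modem tones

def fsk_modulate(bits, tones=ORIGINATE_TONES, sample_rate=SAMPLE_RATE, baud=BAUD):
    """Turn a sequence of 0/1 bits into sine samples in [-1, 1].

    The phase is carried over from one bit to the next, so the waveform
    has no jump when the tone changes.
    """
    samples = []
    phase = 0.0
    for i, bit in enumerate(bits):
        # Bit boundaries are computed from the bit index so that rounding
        # errors do not accumulate (8000 / 300 is not a whole number).
        start = round(i * sample_rate / baud)
        end = round((i + 1) * sample_rate / baud)
        step = 2 * math.pi * tones[bit] / sample_rate
        for _ in range(end - start):
            samples.append(math.sin(phase))
            phase += step
    return samples

if __name__ == "__main__":
    audio = fsk_modulate([0, 1, 1, 0, 1])
    print(len(audio), "samples generated")
```

One second of data (300 bits) produces 8000 samples at this assumed rate. A real modem also needs start/stop framing and a demodulator, both of which are left out here.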
Before I get into details, let me say that I have probably designed more professional audio-over-Ethernet hardware than anyone else, both in terms of the number of different PCB designs and the number of PCBs manufactured and shipped to end customers. Odds are very high that you have heard products in which I designed the audio-over-Ethernet circuitry. (This is pro-audio only, and does not include VOIP or other non-pro products.)
Let's start with the issues:
Software: The hardware is honestly the easy part. The software is difficult. The closer you want to pro-audio performance the harder it is. Your application doesn't sound like pro-audio, but the software task is still not trivial.
Audio Clocking: Transmitting the audio data from point A to point B is relatively easy. Doing it in a way that keeps the two devices' audio clocks synchronized is difficult. Non-pro applications solve this by doing sample rate conversion, or by simply dropping or duplicating samples as the audio clocks drift. Both approaches have difficulties and side effects, which increase the software difficulty immensely. Just saving the data to a file on the PC side of things is easy; using it in real time is hard.
Low-latency: Latency is how long it takes the audio to go from the mic, over the network to the PC, and then be used by the PC. The shorter the latency, the harder things are. Just saving audio data to a file is a good example of super-long latency, which is one reason why it is also the easiest thing to do. A latency of <2.5 ms is damn hard to do correctly and robustly. The shorter the latency, the fewer issues there are with things like audio echo.
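To illustrate the drop/duplicate approach, here is a rough sketch of how a receiver might nudge its buffer back toward a target fill level. The function name, the parameters, and the policy of adjusting by one sample per block are all assumptions made for the example; real implementations vary a lot.

```python
def drift_compensate(block, fill_level, target_level, tolerance):
    """Return a copy of `block` with at most one sample dropped or repeated.

    fill_level   -- samples currently waiting in the receive buffer
    target_level -- fill level we would like to hover around
    tolerance    -- dead band around the target in which nothing is changed
    """
    out = list(block)
    if not out:
        return out
    if fill_level > target_level + tolerance:
        # Buffer is filling up: the sender's clock runs faster than ours,
        # so throw one sample away.
        del out[-1]
    elif fill_level < target_level - tolerance:
        # Buffer is draining: the sender's clock runs slower than ours,
        # so repeat the last sample.
        out.append(out[-1])
    return out

if __name__ == "__main__":
    print(drift_compensate([1, 2, 3, 4], fill_level=120, target_level=100, tolerance=10))
    print(drift_compensate([1, 2, 3, 4], fill_level=80, target_level=100, tolerance=10))
```

Every dropped or repeated sample is a small discontinuity in the waveform, which is one of the side effects mentioned above. Sample rate conversion avoids that but costs a lot more CPU and code.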
Bandwidth: Sending telephone quality audio with high latency is the easiest. Pro-audio quality with low latency is super hard. Using the mic, MCU, and Ethernet interface that you proposed is going to put you into the telephone quality side of things. There are many cases where raw bits-per-second of the Ethernet interface is not the only problem. Other issues like IRQ rate, packet transmit/receive time (not just overall bandwidth), and sometimes packet timing are super important.
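Some back-of-the-envelope arithmetic shows why raw bits per second is only part of the story. The sketch below is just arithmetic: the 8 kHz / 8-bit telephone figures and the packet sizes are assumptions I picked for illustration, and the helper names are made up.

```python
def audio_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw audio payload in bits per second (no packet headers)."""
    return sample_rate_hz * bits_per_sample * channels

def packets_per_second(sample_rate_hz, samples_per_packet):
    """How many packets (and roughly how many interrupts) per second."""
    return sample_rate_hz / samples_per_packet

def packetization_delay_ms(sample_rate_hz, samples_per_packet):
    """Time spent waiting to fill one packet, in milliseconds."""
    return 1000.0 * samples_per_packet / sample_rate_hz

if __name__ == "__main__":
    # Assumed telephone-style stream: 8 kHz, 8-bit, 160 samples per packet.
    print(audio_bit_rate(8000, 8),
          packets_per_second(8000, 160),
          packetization_delay_ms(8000, 160))
    # Assumed pro-audio-style stream: 48 kHz, 32-bit, 48 samples per packet.
    print(audio_bit_rate(48000, 32),
          packets_per_second(48000, 48),
          packetization_delay_ms(48000, 48))
```

With these assumed numbers the telephone-style stream is 64 kbit/s at 50 packets per second with 20 ms spent filling each packet, while the pro-audio-style stream is about 1.5 Mbit/s per channel at 1000 packets per second with 1 ms per packet. Pushing latency down means smaller packets, and smaller packets mean a higher packet and IRQ rate even when the link has bandwidth to spare.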
Network Topology: As the audio quality goes up (and latency goes down) your network topology becomes really important. I am talking about the number of Ethernet switches, the type of switches, how they are connected, and the number and type of non-audio Ethernet devices also on the network. For you this probably wouldn't be an issue, but you never know.
I think that your proposed solution would work for telephone audio quality with a high latency. You'll probably have to do sample dropping/repeating to deal with non-synchronized audio clocks. And it won't be all that great. You might be quite underwhelmed by the audio. I also think that you'll have a lot of software to write on the PC side of things. That being said, I would not do the project with that hardware.
If I were doing the project, I would look at one of the new-ish ARM Cortex-M3 or M4 devices by TI or Freescale that include a 100 Mbps or gigabit Ethernet controller. Many of these things are less than US$10 each and can run at up to 100 MHz. The amount of RAM and Flash makes the task of writing software much easier.
For amusement, my current Audio Over Ethernet project uses an 800 MHz ARM Cortex-A8 with dual gigabit Ethernet ports and runs a customized version of Linux. The system as a whole (not just this Cortex-A8 device) can handle over 2048 audio channels at 48 kHz, 32-bit, with an overall system latency of just 2.5 ms (including ADC, DAC, two times over the network, and lots of processing). Audio devices on the network have their sample clocks sync'd to less than 1 µs, even if there are 8+ switch hops in the middle.
Best Answer
Getting a little off the question, but I think we have an XY problem here.
Going by the comments, you don't need to get 4 analog signals into the audio input of a PC.
What you need to do is get 4 analog signals into a PC/phone as cheaply as possible.
That is a very different problem.
For the price of a 4-input analog multiplexer and the electronics to switch between the inputs at the appropriate rate, you could probably get a small microcontroller with 4 analog inputs. This would allow you to sample all 4 signals constantly at a sufficient rate and then transfer that data to the PC/phone over USB. It has the added advantage that, unlike the microphone socket, the USB socket would also be able to supply plenty of power to run the external electronics.
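On the PC side, the data from such a microcontroller would have to be split back into its 4 channels. Here is a minimal sketch, assuming a made-up wire format in which the microcontroller sends the channels interleaved as little-endian unsigned 16-bit values (channel 0, 1, 2, 3, then channel 0 again, and so on). The format and the `deinterleave` name are hypothetical; how the bytes actually arrive over USB is not shown.

```python
import struct

CHANNELS = 4
# Hypothetical frame layout: one 16-bit little-endian reading per channel.
FRAME = struct.Struct("<4H")

def deinterleave(payload):
    """Split interleaved 4-channel sample bytes into one list per channel.

    Any trailing bytes that do not make up a complete frame are ignored.
    """
    usable = len(payload) - len(payload) % FRAME.size
    channels = [[] for _ in range(CHANNELS)]
    for frame in FRAME.iter_unpack(payload[:usable]):
        for ch, value in enumerate(frame):
            channels[ch].append(value)
    return channels

if __name__ == "__main__":
    raw = FRAME.pack(1, 2, 3, 4) + FRAME.pack(5, 6, 7, 8)
    print(deinterleave(raw))
```

Because every channel is sampled in every frame, there is no need to work out which input the multiplexer was pointing at when a given sample was taken.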