Okay, I have some feedback, and mine is mostly directed at your circuit.
First, I see two discrete stages.
Stage 1: Averaging
Your first stage is the capacitor/diode circuit. This is your averaging circuit, and it is missing a very important component: a resistor. Right now, if you ignore the loading from the rest of the circuit, it is effectively a peak detector, because the input is connected to the capacitor through a roughly current-independent 0.7 V drop. If you add a resistor in series with the diode, the stage gets an RC time constant, so the capacitor holds an averaged value instead of the peak. You will need to tweak the resistance to get the response you want.
Stage 2: Gain
Your second stage is the amplifier circuit, or driving circuit. This is made up of a basic biasing resistor and the BJT. This gives the bias you need to then superimpose a small signal (welcome back to school, it is the small-signal model all over again). The issue I see is that you have your capacitor tied directly to the BJT base. Looking into the base, the BJT looks like a diode to ground. This means it will pull a lot of current once you try to go past 0.7 V, but the capacitor sees almost no load below that voltage. In other words, your input drives the transistor directly in the current circuit. This is bad.
If you place a resistance between the averaging circuit and the driving circuit, you can control how much a voltage change on the capacitor affects the base current. This gives the RC averaging circuit a more controlled load and lets it impose only a small increase in base current, so the averaged signal slowly affects your system without over-driving the input. Make sure the isolation resistance is large enough compared to your averaging resistance that it does not load the averaging circuit to the point where it just shows the peak.
Schematic
Choosing Resistance
Editing this in because it seems I did not give good feedback on how to pick values.
Averaging Resistance
First, Vin from the audio minus 0.7 V (the diode drop), divided by R, needs to be less than the maximum current your audio device can supply. This is the worst case, where the capacitor is sitting at 0 V.
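To put a number on that, here is a minimal sketch of the worst-case check. The 5 V peak and the 10 mA source limit are made-up example values, not figures from your circuit, and `min_averaging_resistance` is just a name I picked.

```python
DIODE_DROP_V = 0.7  # diode forward drop, same figure as used in the text


def min_averaging_resistance(v_in_peak, i_source_max):
    """Smallest series R that keeps the worst-case charging current
    (capacitor at 0 V) within what the audio source can supply."""
    if v_in_peak <= DIODE_DROP_V:
        raise ValueError("peak input never forward-biases the diode")
    return (v_in_peak - DIODE_DROP_V) / i_source_max


# Hypothetical example: 5 V peak input, source limited to 10 mA.
r_min = min_averaging_resistance(v_in_peak=5.0, i_source_max=0.010)
print(f"R_avg should be at least {r_min:.0f} ohm")
```

Anything above that value keeps the source within its limit; going larger slows the averaging down.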
Averaging Capacitance
Now, your averaging resistance times your capacitance needs to equal 1 over the maximum frequency you want it to respond to. This means that if you want it to "update" 20 times a second, you need RC to equal 1/20 s (50 ms).
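As a quick sketch of that rule, assuming you have already settled on an averaging resistance (the 10 kΩ below is only an example value, and the function name is my own):

```python
def averaging_capacitance(r_avg_ohm, update_rate_hz):
    """Capacitance that makes R * C equal to 1 / update_rate."""
    return 1.0 / (update_rate_hz * r_avg_ohm)


# Hypothetical example: 10 kohm averaging resistor, 20 updates per second.
c_avg = averaging_capacitance(r_avg_ohm=10_000, update_rate_hz=20)
print(f"C = {c_avg * 1e6:.1f} uF, RC = {10_000 * c_avg * 1e3:.0f} ms")
```

In practice you would round to the nearest standard capacitor value and tweak from there.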
Isolation Resistance
This is a more complex choice. If you would like 5 V to be your full drive from the audio, and this corresponds to needing 10 mA from the BJT, then you need to pick an isolation resistance that gives you 100 uA of base current (hfe = 100), where the base current is (5 V - 0.7 V)/R_isolation.
This gets messy if you end up loading your averaging circuit too much; in that case you will need a preamp stage that reduces the load your averaging circuit sees.
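Here is the isolation-resistance arithmetic as a small sketch. It follows the formula above as written (a single 0.7 V drop subtracted from the full-drive voltage) and uses the 5 V, 10 mA, hfe = 100 example. The 10 kΩ averaging resistance used for the ratio at the end is an assumed value, only there to show the loading comparison.

```python
def isolation_resistance(v_full_drive, i_collector, hfe, v_drop=0.7):
    """Return (R_isolation, base current) for the wanted collector current.

    Follows the text: one 0.7 V drop is subtracted from the drive voltage.
    """
    i_base = i_collector / hfe
    return (v_full_drive - v_drop) / i_base, i_base


r_iso, i_base = isolation_resistance(v_full_drive=5.0, i_collector=10e-3, hfe=100)
print(f"I_base = {i_base * 1e6:.0f} uA, R_isolation = {r_iso / 1e3:.1f} kohm")

# Assumed averaging resistance, to see how hard the base loads the RC stage.
r_avg = 10_000
print(f"R_isolation / R_avg = {r_iso / r_avg:.1f}")
```

If that ratio comes out small, the base is loading the averaging stage heavily, which is the situation where a buffer stage or a Darlington starts to make sense.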
Extra Information
Second, if you find you are discharging the capacitor too quickly, use a Darlington pair as your transistor instead. That is easy to do by just hooking up two of those transistors together.
Third, if you are having problems where your averaged current stays a bit too low, come back and post it here and I can draw a schematic of how to level shift your input.
Let me know if there is anything I was not clear on.
Relays have the best specs: (almost) zero resistance when on, infinite resistance when off. Use reed-relays, not power relays.
They're better suited for the low current, and often have a lifetime as high as 100 \$\times\$ 10\$^6\$ operations, which is forever. Reed relays won't give you an audible click either. If you use SPDT or DPDT relays you can switch between signal and ground, so that the input doesn't pick up noise when off.
If you want to go electronic there's the 74HC4066, like Pentium100 suggests, but you'll easily find switches with better specs. Analog Devices has a wide offering, the dual SPDT ADG1636 has very good figures: 1\$\Omega\$ on resistance, and 0.007% THD+N.
On the other hand, the 74HC4066 has 0.12% THD (at 4V\$_{PP}\$) and 50\$\Omega\$ on resistance. With a series resistor of 47k\$\Omega\$ that results in a 0.00012% THD. The ADG1636 will result in 0.00000015% :-). Keep in mind that the 4066 is SPST.
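The figures above come from scaling each switch's THD by the ratio of its on resistance to the series resistor. A short sketch of that arithmetic, using the datasheet numbers quoted in this answer (the function name is mine, and the simple R_on/R_series scaling is the approximation used here, not a full distortion model):

```python
def thd_in_circuit(thd_switch_percent, r_on_ohm, r_series_ohm):
    """Scale the switch's THD by R_on / R_series (approximation from the text)."""
    return thd_switch_percent * r_on_ohm / r_series_ohm


R_SERIES = 47_000  # series resistor from the example, in ohms

hc4066 = thd_in_circuit(0.12, 50, R_SERIES)
adg1636 = thd_in_circuit(0.007, 1, R_SERIES)

print(f"74HC4066: {hc4066:.5f} %")
print(f"ADG1636:  {adg1636:.8f} %")
```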
If you're a purist you go for the AD, otherwise the 4066 will do if you don't need the double throw feature.
The electronic switches can be controlled by a logic voltage and hardly need power (the ADG1636 consumes less than 1\$\mu\$A).
Best Answer
For a bargain in FET switches and precision analog resistor networks in the same package, take a look at MDACs like the AD7528. The signal path is just that: FET switches and resistors, nothing else.
They have been used in some quite high end audio products.
Downsides:
One design approach, mitigating the second and eliminating the third, is to use one AD7528 per audio channel. Use both its channels to extend the attenuation range, e.g. using one as a coarse (6dB steps) and the other as fine gain adjustment, probably with a buffer (emitter follower) between them.
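To illustrate the coarse/fine idea, here is a rough sketch that picks a code for each half. It assumes the ideal 8-bit MDAC transfer (gain = code/256), a unity-gain buffer between the two halves, and ignores the sign of the gain. The code tables and the `split_attenuation` helper are my own choices for illustration, not anything from the datasheet.

```python
import math

FULL_SCALE = 256  # ideal 8-bit MDAC: gain = code / 256 (assumption)

# Coarse half: power-of-two codes give roughly 6.02 dB per step;
# 255 is included as the closest available code to 0 dB.
COARSE_CODES = [255, 128, 64, 32, 16, 8, 4, 2, 1]
# Fine half: codes covering about 0 dB down to about -6 dB.
FINE_CODES = range(128, 256)


def gain_db(code):
    """Gain of one MDAC half in dB for a given code (ideal transfer)."""
    return 20 * math.log10(code / FULL_SCALE)


def split_attenuation(target_db):
    """Brute-force the (coarse, fine) code pair closest to target_db."""
    return min(
        ((c, f) for c in COARSE_CODES for f in FINE_CODES),
        key=lambda cf: abs(gain_db(cf[0]) + gain_db(cf[1]) - target_db),
    )


coarse, fine = split_attenuation(-20.0)
total = gain_db(coarse) + gain_db(fine)
print(f"coarse code {coarse}, fine code {fine}, total {total:.2f} dB")
```

The brute-force search is only there for clarity; with 9 coarse and 128 fine codes it is cheap enough not to matter.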