Hmm, let's start w/ power first:
3072 * 20mA * 2.1V ≈ 130 W!
Your power supply can only give you about 5W. You're very much short there.
If you can, greatly reduce the size of your matrix.
One solution would be to use a desktop ATX power supply. Those have 5V lines on their hard drive connectors and can give you lots of power. If you take the 5V you'll either need to design an LED driver or you'll be burning (5V - 2.1V) * 20mA * 3072 ≈ 180W in dropping resistors. That's 180 watts of heat, son! That's not gonna be good, or comfortable, or usable. So you really need to design an efficient LED driver.
The main issue with driving LEDs is that their voltage drop is not consistent for the same current between different batches, meaning one LED will drop 2.0V and another 2.1V. If you just line them up and give them 5V you'll get noticeable variations.
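The arithmetic above can be rechecked with a few lines of Python. This is only a rough sketch: the function name is made up for illustration, and it assumes every LED is on at once and that whatever voltage the LED doesn't drop is burned off as heat (resistor or linear driver). The exact figures come out slightly under the rounded ones quoted above.

```python
def led_power_budget(n_leds, led_current_a, led_vf, supply_v):
    """Return (led_watts, wasted_watts, total_watts) with every LED on."""
    total_current = n_leds * led_current_a
    led_watts = total_current * led_vf                   # power in the LEDs themselves
    wasted_watts = total_current * (supply_v - led_vf)   # heat dropped between supply and LED
    return led_watts, wasted_watts, led_watts + wasted_watts

led_w, wasted_w, total_w = led_power_budget(3072, 0.020, 2.1, 5.0)
print(f"LEDs: {led_w:.1f} W, wasted as heat: {wasted_w:.1f} W, from supply: {total_w:.1f} W")
```

Plug in a smaller matrix or a lower current to see how quickly the numbers become manageable.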
You can buy efficient LED drivers and google is your friend but not your best friend. Because it will find you drivers to drive a string of LEDs, not a single one. You want the LEDs to be individually controllable so you'll need to put a regulator on every single one of these -- that's 3072 individual copies of the same circuit -- and then enable/disable those. That is a lot of work.
Yes you can use shift registers to control each regulator. How you wire up the shift registers very much depends on the 'frame rate' that you want to get out of controlling these LEDs. I'm not gonna go there right now because as you can see the power/design to drive each individual LED is quite high and complex.
You won't find a chip that can drive 128 LEDs but there are a few that can drive 16 columns: -
There is another TI chip that can do 24 channels of LEDs, the TLC5947, and there's also a 32 channel chip from Linear Technology called the LT3746. It has integrated power management to suit multiple LEDs in series and varying supply voltages.
Best Answer
I will not go into specific ICs since you might change your design. Instead I will go into the theory. Driving LEDs is quite easy.
First:
There are different ways to drive LEDs. The two main methods are a constant current source and a constant voltage source. LEDs "operate" on current. Their intensity is roughly proportional to the current, while the current rises exponentially with the voltage. Therefore, a constant current driver is generally used. When you use a constant voltage source and a resistor (the basic way) you are creating a rudimentary constant current source. Most driver chips use a constant current source.
With such a driver you do not need to supply resistors. You tell the driver how much current it is supposed to supply and it will supply that amount to the LED. Usually you "program" the current by using a resistor, and the datasheet will tell you about how much current you get for a given resistor. (You can use a variable resistor to allow you to adjust the intensity of the LEDs after the fact.)
Second,
LEDs have specific current ratings that relate to how long they will live. Usually this number is around 1mA to 20mA for your average LEDs. You need to figure out how bright you want your LEDs to be to get some estimate of power usage. If you have a 64-channel driver chip, each channel using 10mA, then that is 640mA total. If the IC is 5V then 5V*640mA = 3.2W. This may be too much for the chip. Check the datasheet of the driver to find out. (And remember, the numbers listed there are generally absolute maximum ratings.)
Also, the more current you use, the more power the LEDs dissipate. This may or may not be an issue for the board. If you can't get rid of the heat then your LEDs could burn up.
Third,
You can "daisy chain" ICs that have the capability without issue, BUT this could potentially reduce the speed if you are doing fast updates (like graphics). Daisy chaining is very simple: the driver ICs usually have an SDI and SDO along with a clock. You simply send your clock to all the chips in parallel (star routing) and connect your uC to the SDI of one chip, then that chip's SDO to the SDI of the next chip, etc...
Using this method, you do have to worry about clock skew and such but for 3-4 chips it shouldn't be a problem.
Also, it will be faster, and possibly easier, if you use the uC to its full advantage. Most uPs can output at least 8 bits on a "port" at once. So you could drive up to 8 LED drivers at the same time, one per port pin. This, in theory, would be 8 times faster than if you daisy chained 8 LED drivers.
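To make the daisy-chain versus port-parallel comparison concrete, here is a rough sketch in Python. The 16 bits per driver is an assumed figure (check your driver's datasheet), and both function names are made up for illustration. The second function shows how you would pack one bit for each driver into every port write.

```python
def shift_clocks(n_drivers, bits_per_driver, parallel_lanes=1):
    """Clock pulses needed to load every driver once."""
    # Drivers are split across the lanes; the longest chain sets the time.
    chain_len = -(-n_drivers // parallel_lanes)  # ceiling division
    return chain_len * bits_per_driver

def port_bytes(patterns, bits_per_driver):
    """Build the port writes: bit `lane` of each value is the data pin of driver `lane`."""
    writes = []
    for bit in range(bits_per_driver - 1, -1, -1):  # shift out MSB first
        value = 0
        for lane, pattern in enumerate(patterns):
            value |= ((pattern >> bit) & 1) << lane
        writes.append(value)
    return writes

daisy = shift_clocks(8, 16)                        # 8 chips in one long chain
parallel = shift_clocks(8, 16, parallel_lanes=8)   # one chip per port pin
print(daisy, parallel, daisy // parallel)
```

With a shared clock, each value from `port_bytes` goes to the port followed by one clock pulse, so all 8 drivers fill up in the time it takes to load one.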
As far as your "voltage" question, it doesn't make a lot of sense. You use the voltage that is required. If the uP uses 5V and the LED drivers use 3V, then you use 5V for the uP and 3V for the drivers.
In most cases though, you want to use the lowest voltage you can get away with. This allows you to reduce the power consumption. Before I said 64 LEDs all using 10mA at 5V = 3.2W, but at 3V it is 1.92W. That is 40% less (since you cut the voltage by 40%). BUT the LEDs are just as bright (since they are still using 10mA)!!
So, if your uP can use 3V and your LED driver can use 3V then you drive it with 3V. (Note that if you want to drive your LEDs with a lot of current you might actually have to use a little more voltage. You do need some headroom, but generally 3V is plenty.)
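Here is a small sketch of where the power goes at each supply voltage. It assumes the LEDs hang between the supply and a constant-current sink driver, ignores the chip's own logic current, and borrows the 2.1V forward drop from the first answer in this thread; your LEDs may differ. The function name is made up for illustration.

```python
ASSUMED_LED_VF = 2.1  # volts; an assumption, check your LED datasheet

def driver_power_split(n_channels, amps_per_channel, supply_v, led_vf=ASSUMED_LED_VF):
    """Return (total_watts, watts_in_leds, watts_in_driver) with all channels on."""
    amps = n_channels * amps_per_channel
    total = amps * supply_v
    in_leds = amps * led_vf            # same at any supply voltage: same current
    return total, in_leds, total - in_leds

results = {}
for supply_v in (5.0, 3.0):
    results[supply_v] = driver_power_split(64, 0.010, supply_v)
    total, in_leds, in_driver = results[supply_v]
    print(f"{supply_v:.0f}V: total {total:.2f} W, LEDs {in_leds:.2f} W, driver {in_driver:.2f} W")
```

The LED power stays the same at both voltages, so everything saved by lowering the supply is heat the driver chip no longer has to get rid of.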
LEDs:
N LEDs all using a max of A amps at V volts will dissipate a total of N*A*V W. You can calculate this per driver to find out what each driver will dissipate. Make sure you have plenty of "room" below the absolute max values.
Obviously if you are only driving 2 LEDs out of 10 the value will be different, BUT we must calculate the worst case, with all 10 on, or else we will be sorry (unless we know that can never happen, in which case we have to find out how many can be on at once).
So you have 288 LEDs. If you use 10mA and 5V then that is about 14W. Pretty significant. That means your power supply has to supply 2.88A. (That's assuming you are only driving LEDs and there are no power losses.)
If you drop that down to 1mA then that cuts everything by an order of magnitude: only 1.4W and 0.288A, which is much more reasonable.
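The N*A*V rule is easy to wrap in a helper so you can try different currents. This is just a sketch: the function name and the optional `max_on` argument (for when you know only some of the LEDs can ever be lit together) are made up for illustration, and it ignores regulator and driver losses.

```python
def supply_requirement(n_leds, amps_per_led, volts, max_on=None):
    """Return (amps, watts) the supply must deliver in the worst case."""
    # Worst case is every LED on, unless a smaller limit is known for certain.
    on = n_leds if max_on is None else min(max_on, n_leds)
    amps = on * amps_per_led
    return amps, amps * volts

for milliamps in (10, 1):
    amps, watts = supply_requirement(288, milliamps / 1000, 5.0)
    print(f"{milliamps} mA per LED: {amps:.3f} A, {watts:.2f} W")
```

Size the supply with some margin above whatever this gives you.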
You shouldn't worry too much though. I've done a project with 500 LEDs using 20 driver chips without any major issues. I had to use both serial and parallel driving to get it all done, but it worked. I think I was using about 1mA or so per LED.
(I'm assuming you're not doing anything crazy like trying to make a torch (using high power LEDs) or something.)