To start with, I do not use Arduino and never have, but I am very familiar with AVR chips and the ATmega328p in particular. If I am understanding you correctly, you are trying to dim a 4x8 matrix of LEDs. The entire matrix should be dimmed all at once, but not every LED will always be on, meaning individual on/off control with collective dimming. This is actually a very simple thing to do. Let me start by explaining PWM control, since you mentioned you might not be doing it correctly.
If I have an LED and series resistor and connect 5V, it will shine at some brightness, which is a function of the current through the LED, set by the series resistor. If I lower the voltage, the current also drops and the LED dims. If I send a pulse train to the LED instead, the effective voltage is the average over the on and off states. The percentage of on time is known as the duty cycle, and the frequency of the pulse is how often it repeats. For example, to create a 100Hz pulse with a 50% duty cycle, I would turn a signal ON for 5ms and then OFF for 5ms. The total period is 10ms. Frequency = inverse of period = 1/10ms = 100Hz. The duty cycle is 5ms/10ms = 50%.
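The arithmetic above can be written out as a small helper. This is only a sketch in plain C meant to run on a PC rather than on the AVR; the `pulse_t` type and both function names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper type: describes one pulse train. */
typedef struct {
    double period_s;     /* on time + off time, in seconds */
    double frequency_hz; /* 1 / period */
    double duty;         /* fraction of the period spent ON, 0.0 to 1.0 */
} pulse_t;

/* Derive period, frequency and duty cycle from the ON and OFF times. */
pulse_t pulse_from_times(double on_s, double off_s)
{
    pulse_t p;
    assert(on_s >= 0.0 && off_s >= 0.0 && on_s + off_s > 0.0);
    p.period_s = on_s + off_s;
    p.frequency_hz = 1.0 / p.period_s;
    p.duty = on_s / p.period_s;
    return p;
}

/* Average voltage seen by the load for a given supply and duty cycle. */
double average_voltage(double supply_v, double duty)
{
    return supply_v * duty;
}
```

With 5ms on and 5ms off, `pulse_from_times(0.005, 0.005)` gives a 10ms period, 100Hz and a duty of 0.5, so a 5V pulse averages about 2.5V.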
LEDs can switch on and off very quickly, but the human eye cannot distinguish these changes above a certain frequency, which differs from person to person. Considering that a TV refreshes at 60Hz in the USA (50Hz elsewhere), we can safely say that 50Hz is a good minimum, although many studies have shown that with LEDs, certain frequencies can actually cause the LED to appear brighter with the same duty cycle. A common number is 100Hz.
Controlling a single LED and even groups in this manner is very simple using a timer. The following code will enable Timer 1 to run at 125Hz (8ms period) with Fcpu = 16MHz.
#ifndef _BV // avr-libc already defines _BV() when <avr/io.h> is included
#define _BV(bit) (1 << (bit))
#endif
// Set up Timer 1 for 125Hz LED pulse to control brightness
PRR &= ~_BV(PRTIM1); // Enable Timer1 Clock
TCCR1A = 0x00; // Outputs Disconnected
TCCR1B = _BV(WGM12) | // CTC Mode, Top = OCR1A
         _BV(CS11) | _BV(CS10); // Prescaler = 64
OCR1A = 1999; // Top = (16MHz * 8ms / 64)-1
OCR1B = LED_DUTY_CYCLE; // LED Pulse Width
TIMSK1 = _BV(OCIE1B) | _BV(OCIE1A); // Compare A and B Interrupts Enabled (global enable via sei() still required)
In this code, compare match A will happen every 8ms, creating a 125Hz pulse. Compare match B will happen at whatever value is defined as "LED_DUTY_CYCLE." For an 80% duty cycle, as you mentioned, set OCR1B to 1600. This value could also be changed in code, such as when a user presses a dimming function button.
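To check the register values for other clock speeds or PWM frequencies, the same arithmetic can be put into two small functions. This is a sketch in plain C that runs on a PC; the function names are hypothetical and are not part of avr-libc.

```c
#include <assert.h>
#include <stdint.h>

/* Counts per PWM period minus one: the value for OCR1A in CTC mode.
   The result must fit in Timer 1's 16-bit compare register. */
uint16_t timer1_top(uint32_t f_cpu_hz, uint16_t prescaler, uint16_t pwm_hz)
{
    assert(prescaler > 0 && pwm_hz > 0);
    uint32_t counts = f_cpu_hz / ((uint32_t)prescaler * pwm_hz);
    assert(counts >= 1 && counts <= 65536UL);
    return (uint16_t)(counts - 1);
}

/* Compare value for OCR1B so the LEDs are on for duty_percent of the period. */
uint16_t timer1_duty_counts(uint16_t top, uint8_t duty_percent)
{
    assert(duty_percent <= 100);
    return (uint16_t)(((uint32_t)top + 1) * duty_percent / 100);
}
```

For the setup above, `timer1_top(16000000UL, 64, 125)` returns 1999 and `timer1_duty_counts(1999, 80)` returns 1600, matching the values used in the text.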
The LED control will take place in the ISRs for the two compare matches. The variable "outputs" is updated in the main program whenever an LED should be on or off. Each bit of this variable maps to an LED. For example, to turn on LEDs 0 and 5, outputs should be set to 0b00100001 in main. The variable "brightness" can be updated in main to control the duty cycle of the LEDs. Because both variables are shared between main and the ISRs, they should be declared volatile. In the COMPA ISR, the LEDs that are enabled by "outputs" will be turned on. Then, all LEDs should be turned off in the COMPB ISR.
ISR(TIMER1_COMPA_vect){
PORTD = (outputs & 0xFF); // Turn On LEDs Q0 - Q7
OCR1B = brightness; // Set the pulse width
}
ISR(TIMER1_COMPB_vect){
PORTD = 0x00; // Turn Off LEDs Q0 - Q7
}
In this example, there are 8 LEDs, one on each of the 8 pins of PORTD. They could be put anywhere; this just makes the code example easier to read. If the LEDs are spread around, you would need to do something more like this:
if(outputs & 0x01) PORTD |= LED0;
if(outputs & 0x02) PORTD |= LED1;
if(outputs & 0x04) PORTC |= LED2;
//...
if(outputs & 0x40) PORTB |= LED6;
if(outputs & 0x80) PORTB |= LED7;
Note that each LED is mapped to a bit in "outputs", but the LEDs themselves reside in various IO ports. Whatever LEDs are enabled will be turned on.
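When the LEDs are scattered across ports, one alternative to a chain of if statements is a lookup table. The sketch below assumes a hypothetical `led_pin_t` table with one entry per bit of "outputs"; on the AVR the pointers would be things like `&PORTD`, but the code is plain C so it can be tried on a PC with ordinary variables standing in for the ports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per LED: which port register it lives on and which pin mask. */
typedef struct {
    volatile uint8_t *port; /* e.g. &PORTD on the AVR */
    uint8_t mask;           /* e.g. _BV(PD3) */
} led_pin_t;

/* Turn on every LED whose bit is set in 'outputs'; bit i maps to map[i]. */
void leds_on(uint8_t outputs, const led_pin_t *map, size_t count)
{
    assert(count <= 8); /* 'outputs' only carries 8 bits */
    for (size_t i = 0; i < count; i++) {
        if (outputs & (1u << i)) {
            *map[i].port |= map[i].mask;
        }
    }
}

/* Turn off every LED in the table, leaving other pins on those ports alone. */
void leds_off(const led_pin_t *map, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        *map[i].port &= (uint8_t)~map[i].mask;
    }
}
```

A loop like this is slower than writing a whole port at once, so for short PWM periods the unrolled if statements may still be the better choice inside an ISR.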
Controlling a matrix is a bit more complex, since only one column is on at a time. With four columns, the highest duty cycle you can achieve is 25%, even if the LED rows were all ON all the time, because each column is only driven for 1/4 of the total time. If more than one column is on at a time, you completely lose the ability to turn individual LEDs on and off.
Something else to keep in mind with a matrix is the current consumption. Depending on your definition of row and column, you will have either 4 banks of 8 or 8 banks of 4 parallel LEDs. If the cathodes of a bank are all tied together, then that IO pin is sinking the current of every lit LED in the bank. The ATmega328p has an absolute maximum rating of 40mA per IO pin and 200mA in total through its supply pins. The individual pin problem can be easily avoided by sinking the LEDs through a "logic level gate" MOSFET. Take a look at this schematic:
Of course, the whole thing could be rotated 90 degrees to suit your needs. In any case, each of the 8 "CTRL" lines will turn on an LED only while the appropriate "COLUMN" signal is also high. The column control should be simple and can be done in main or a timer interrupt, but the PWM frequency should be about 4 times faster than the column switching frequency to ensure the dimming of the LEDs still works correctly. In that case, each column would be pulsed four times before the next column turns on. But, like I said, with 4 columns, each LED will only be on 25% of the time at most, so if your PWM duty is set to 80%, the LED is really only on 0.8 * 0.25 = 20% of the time. Also, don't forget that as the active column switches, the control switches from one bank of LEDs to the next, so the "outputs" variable used above would need to be updated for the appropriate bank of LEDs whenever the active column switches.
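The column bookkeeping described above can be sketched as a small state machine that is stepped once per PWM period. This is plain C for trying on a PC, not tested firmware; `matrix_scan_t`, `matrix_scan_step` and the two constants are hypothetical names, and on the AVR the state would live in volatile globals updated from the compare match A ISR.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COLUMNS       4 /* columns in the matrix */
#define PULSES_PER_COLUMN 4 /* PWM periods spent on each column */

/* Hypothetical scan state. */
typedef struct {
    uint8_t bank[NUM_COLUMNS]; /* row on/off bits for each column */
    uint8_t column;            /* column currently being driven */
    uint8_t pulses;            /* PWM periods completed on this column */
} matrix_scan_t;

/* Call once per PWM period. Returns the row bits to drive and stores the
   one-hot column select in *column_mask. */
uint8_t matrix_scan_step(matrix_scan_t *s, uint8_t *column_mask)
{
    assert(s->column < NUM_COLUMNS);
    uint8_t rows = s->bank[s->column];
    *column_mask = (uint8_t)(1u << s->column);

    /* Move to the next column after PULSES_PER_COLUMN periods. */
    if (++s->pulses >= PULSES_PER_COLUMN) {
        s->pulses = 0;
        s->column = (uint8_t)((s->column + 1) % NUM_COLUMNS);
    }
    return rows;
}

/* Effective on-time in percent: PWM duty divided by the number of columns. */
uint8_t effective_duty_percent(uint8_t pwm_duty_percent)
{
    return (uint8_t)(pwm_duty_percent / NUM_COLUMNS);
}
```

The returned row bits play the role of "outputs" for the current bank, and `effective_duty_percent(80)` gives 20, the same figure as the 0.8 * 0.25 calculation.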
Also of note is that it doesn't matter what you pulse to dim the LEDs. In the above example, I was pulsing the rows because it's easy to enable a specific LED that way. Pulsing the column control signal to the FET gate instead would also work. Lastly, since only one column is on at a time, the LEDs in each row can share a resistor. The LEDs in a column cannot share a resistor, because the brightness of each LED would change depending on how many are turned on or off.
Best Answer
I've done this as an overall dimmer for panel lights:
[Schematic: PWM generator]
This is a relaxation oscillator (the bottom part) compared to an adjustable reference (the top part). The reference is a front-panel potentiometer that is padded with fixed resistors to match the voltage range of the oscillator. I would recommend 10k for all the resistors and for the pot, because they're easy to get. Choose the capacitor for a low-ish frequency that is still higher than you can see, like maybe 30Hz or above. A 555 calculator may be useful because the voltage range is similar.
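As a rough starting point for the capacitor, the sketch below assumes the oscillator behaves like a 555: the capacitor swings between 1/3 and 2/3 of the supply and the output swings rail to rail, so each half cycle lasts R*C*ln(2). Those thresholds are an assumption based on the 555 comparison, not something read off the schematic, and the function names are made up. It is plain C for a PC.

```c
#include <assert.h>

#define LN2 0.6931471805599453

/* Frequency of a relaxation oscillator whose capacitor swings between
   1/3 and 2/3 of the supply (ASSUMED thresholds, as in a 555) with a
   rail-to-rail output: period = 2 * R * C * ln(2). */
double relaxation_freq_hz(double r_ohms, double c_farads)
{
    assert(r_ohms > 0.0 && c_farads > 0.0);
    return 1.0 / (2.0 * LN2 * r_ohms * c_farads);
}

/* Capacitor needed for a target frequency with a given timing resistor,
   under the same assumptions. */
double relaxation_cap_farads(double r_ohms, double freq_hz)
{
    assert(r_ohms > 0.0 && freq_hz > 0.0);
    return 1.0 / (2.0 * LN2 * r_ohms * freq_hz);
}
```

Under those assumptions, 10k with 1uF comes out at roughly 72Hz, and 100Hz with 10k needs about 0.72uF. A real comparator or op-amp that does not reach the rails will shift the result, so expect to adjust the value on the bench.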
The PWM output then feeds one of these, depending on which is easier for you:
[Schematic: LED driver options]
As you can see, each LED is individually controlled from a digital output, and the common rail is controlled by the PWM'ed transistor. Thus, OFF is still off, and ON is whatever brightness you set with the PWM generator.
Just make sure that your LEDs can withstand the full supply voltage in reverse, because that is what they will see when the PWM is off and you're driving some LEDs on and others off. To avoid that problem, you could use the opposite type of transistor to control each LED, but I've never needed to.
This has the advantage (for me at least) of the processor not needing to care about the overall brightness. For panel indicators, all the software needs to know is ON/OFF, and this analog circuit defines how bright ON actually is.