Electronic – Why doesn’t the VGA implementation using an AVR microcontroller work

assembly, atmega328p, avr, microcontroller, vga

Problem

I'm trying to generate the appropriate signal output for interfacing an AVR ATmega328P microcontroller with an LCD monitor, via the VGA specification. The VGA specification I am trying to meet is the industry standard 640*480 with 60 Hz frame refresh rate (1).

Background information

The AVR is operating at a frequency of 20 MHz which is short of the VGA pixel clock frequency of 25.175 MHz. However, I believed, having considered similar projects online (2), that through some manipulation of the number of clocks per region, I could meet the timings stipulated in the VGA specification. My adjusted timings for both the vertical and horizontal specification can be found below:

Vertical timing (frame)

Visible area: 15.24 ms, 480 lines (304800 clk cycles @ 20 MHz)
Front porch: 0.3175 ms, 10 lines (6350 clk cycles @ 20 MHz)
Sync pulse: 0.0635 ms, 2 lines (1270 clk cycles @ 20 MHz)
Back porch: 1.04775 ms, 33 lines (20955 clk cycles @ 20 MHz)
Whole frame: 16.66875 ms, 525 lines (333375 cycles @ 20 MHz)

Horizontal timing (line)

Visible area: 25.4 us, 508 pixels (508 clk cycles @ 20 MHz)
Front porch: 0.65 us, 13 pixels (13 clk cycles @ 20 MHz)
Sync pulse: 3.8 us, 76 pixels (76 clk cycles @ 20 MHz)
Back porch: 1.9 us, 38 pixels (38 clk cycles @ 20 MHz)
Whole line: 31.75 us, 635 pixels (635 clk cycles @ 20 MHz)

Based upon these timings, the frame refresh rate is 59.9925 Hz. The visible region on screen has a resolution of 508*480 in contrast to the 640*480 of the specification.
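The arithmetic behind these tables can be reproduced with a short script. This is only a sketch: the names `timing_summary`, `H_CYCLES`, `V_LINES` and `STD_H` are made up for illustration, and the reference figures assume the standard 640x480 @ 60 Hz mode (25.175 MHz pixel clock, 800 clocks per line, 525 lines per frame).

```python
F_CPU_HZ = 20_000_000  # AVR clock used in the question

# Clock cycles per horizontal region at 20 MHz (one cycle per "pixel")
H_CYCLES = {"visible": 508, "front_porch": 13, "sync": 76, "back_porch": 38}
# Lines per vertical region (same as the standard mode)
V_LINES = {"visible": 480, "front_porch": 10, "sync": 2, "back_porch": 33}
# Standard 640x480 @ 60 Hz horizontal timing in pixel clocks (reference only)
STD_H = {"visible": 640, "front_porch": 16, "sync": 96, "back_porch": 48}


def timing_summary(clock_hz, h_cycles, v_lines):
    """Derive line and frame timing from per-region cycle and line counts."""
    cycles_per_line = sum(h_cycles.values())
    lines_per_frame = sum(v_lines.values())
    cycles_per_frame = cycles_per_line * lines_per_frame
    return {
        "cycles_per_line": cycles_per_line,
        "cycles_per_frame": cycles_per_frame,
        "line_us": cycles_per_line / clock_hz * 1e6,
        "frame_ms": cycles_per_frame / clock_hz * 1e3,
        "refresh_hz": clock_hz / cycles_per_frame,
    }


adjusted = timing_summary(F_CPU_HZ, H_CYCLES, V_LINES)
standard = timing_summary(25_175_000, STD_H, V_LINES)

for name, t in (("adjusted", adjusted), ("standard", standard)):
    print(f"{name}: {t['cycles_per_line']} clk/line, "
          f"{t['line_us']:.3f} us/line, {t['refresh_hz']:.4f} Hz")
```

Comparing the two summaries shows how close the adjusted line period and refresh rate are to the standard mode, which is the basis for the claim that the timings nearly match.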

I am aware that at a clock frequency of 20 MHz the timings cannot be met precisely, but if you compare my timings with the actual specification (3), they are very close.
The software that generates these outputs is written in AVR assembler, which allows me to count the number of clock cycles for each region; it can be found just below this paragraph. The software will indefinitely output a red frame on screen.
An image of the schematic for the hardware implementation can be found below the code.

;
; VGA_INTERFACE.asm
;
; Outputs the required signals with the correct timings for VGA output to a monitor.
;
;
;
; Created: 13/02/2018 
; Author : Tom
;









; COMPILER SETTINGS
.INCLUDE "M328pDEF.INC"


; INTERRUPT VECTORS
.org 0      ; defines absolute address for interrupt vector




; ****************************************************************************************
; **** IO PORT D SETUP
; ****************************************************************************************

    ; ddrd pin I/O direction configured

    sbi ddrd,0  ; RED BIT 0             
    sbi ddrd,1  ; RED BIT 1             
    sbi ddrd,2  ; GRN BIT 0
    sbi ddrd,3  ; GRN BIT 1
    sbi ddrd,4  ; BLU BIT 0
    sbi ddrd,5  ; BLU BIT 1
    sbi ddrd,6  ; HORIZONTAL SYNC       
    sbi ddrd,7  ; VERTICAL SYNC          



    ldi r20, 0xC0
    out portd, r20      ; clears the RGB bits and sets the sync pulses high





; ****************************************************************************************
; **** STARTUP SEQUENCE
; ****************************************************************************************

; INITIALIZE STACK POINTER

    ldi r16,low(ramend)     ; loads the lower byte of top stack address into register 16 
    out spl,r16             ; stack pointer lower byte is set to lower byte of the top 
                            ; stack address stored in register 16

    ldi r16,high(ramend)    ; loads the upper byte of top stack address into register 16
    out sph,r16             ; stack pointer upper byte is set to upper byte of the top 
                            ; stack address stored in register 16






; main program loop
main:



V_LOOP:

; ****************************************************************************************
; **** VERTICAL LOOP - BEGIN
; ****************************************************************************************


; **** V-SYNC DRIVE LOW (2 lines, 1,270 cycles)




    cbi portd,7     ;2 drives v-sync active low 


; ========================================================================================
    ; Delay 1268 cycles
    ldi  r18, 2
    ldi  r19, 165
L1: dec  r19
    brne L1
    dec  r18
    brne L1
; ========================================================================================


    sbi portd,7     ;2 drives v-sync high 









; **** VERTICAL BACK PORCH (33 lines, 20955 cycles)

    ; **NOTE: Only 20951 cycles required to be wasted as 4 cycles are used by Horizontal 
    ; loop. 2 are used when setting max loop value in r16 and r17. A further 2 are used 
    ; setting horizontal sync active low.


; ========================================================================================
    ; Delay 20951 cycles
    ldi  r18, 28
    ldi  r19, 52
L2: dec  r19
    brne L2
    dec  r18
    brne L2
    rjmp PC+1
; ========================================================================================









; ****************************************************************************************
; **** HORIZONTAL LOOP - BEGIN (LOOPS 480 times)
; ****************************************************************************************


    ldi r16,low(480)        ;1 holds LSB of loop value
    ldi r17,high(480)       ;1 hold MSB of loop value




H_LOOP:


; **** H-SYNC DRIVE LOW (76 cycles)


    cbi portd,6     ;2 drives h-sync active low 

; ========================================================================================
    ; Delay 74 cycles
    ldi  r18, 24
L3: dec  r18
    brne L3
    rjmp PC+1
; ========================================================================================


    sbi portd,6     ;2 drives h-sync high









; **** HORIZONTAL BACK PORCH (38 cycles)


    ; **NOTE: Only 36 cycles required to be wasted as 2 cycles are used by RGB for setting
    ; the red bit 0 high.



; ========================================================================================
    ; Delay 36 cycles
    ldi  r18, 12
L4: dec  r18
    brne L4
; ========================================================================================









; **** RGB (508 cycles)

    ldi r20, 0xC1       ;1
    out portd, r20      ;1 sets red bit 0 high, all other RGB low, sync pulses high 


; ========================================================================================  
    ; Delay 506 cycles
    ldi  r18, 168
L5: dec  r18
    brne L5
    rjmp PC+1
; ========================================================================================


    ldi r20, 0xC0       ;1
    out portd, r20      ;1 sets the RGB outputs low, sync pulses high 









; **** HORIZONTAL FRONT PORCH (13 cycles)


    ; **NOTE: Only 5 cycles required to be wasted as 8 cycles are used up already. 4 are
    ; used for subtracting one from the loop counter. A further 4 are used for
    ; jumping to start of horizontal loop and setting the Horizontal sync active low.


; ========================================================================================
    ; Delay 5 cycles
    lpm
    rjmp PC+1
; ========================================================================================


    ldi r18, low(1)     ;1
    ldi r19, high(1)    ;1


    sub r16,r18         ;1
    sbc r17,r19         ;1  


    brne H_LOOP     ; 2 cycles if true, 1 if false  









; ****************************************************************************************
; **** HORIZONTAL LOOP - END
; ****************************************************************************************











; **** VERTICAL FRONT PORCH (10 lines, 6350 cycles)

    ; **NOTE: Only 10 cycles have been used up for the Horizontal front porch, as a result a  
    ; further 3 must be added to the vertical front porch. 
    ; However 4 cycles are already being used, 2 to jump to start of vertical loop and a 
    ; further 2 to drive horizontal sync active low.
    ; As a result taking these two factors into account, the delay needs to be 6350+3-4 =
    ; 6349  cycles long.


; ========================================================================================
    ; Delay 6349 cycles
    ldi  r18, 9
    ldi  r19, 62
L6: dec  r19
    brne L6
    dec  r18
    brne L6
; ========================================================================================


    rjmp V_LOOP     ;2  relative jump to start of vertical loop









; ****************************************************************************************
; **** VERTICAL LOOP - END
; ****************************************************************************************
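The cycle counts quoted in the delay-loop comments of the listing can be double-checked off-target. The sketch below is not part of the original program; the helpers `single_delay_cycles` and `nested_delay_cycles` are hypothetical names. It assumes the usual AVR costs: `ldi` and `dec` take 1 cycle, `brne` takes 2 cycles when taken and 1 when not, and `rjmp` takes 2.

```python
def single_delay_cycles(count):
    """Cycles for: ldi r18,count / L: dec r18 / brne L."""
    cycles = 1                      # ldi
    reg = count
    while True:
        reg = (reg - 1) & 0xFF      # dec (8-bit wrap-around)
        cycles += 1
        if reg != 0:
            cycles += 2             # brne taken
        else:
            cycles += 1             # brne falls through
            return cycles


def nested_delay_cycles(outer, inner):
    """Cycles for: ldi r18,outer / ldi r19,inner /
    L: dec r19 / brne L / dec r18 / brne L."""
    cycles = 2                      # two ldi instructions
    r18, r19 = outer, inner
    while True:
        r19 = (r19 - 1) & 0xFF      # dec r19
        cycles += 1
        if r19 != 0:
            cycles += 2             # inner brne taken
            continue
        cycles += 1                 # inner brne falls through
        r18 = (r18 - 1) & 0xFF      # dec r18
        cycles += 1
        if r18 != 0:
            cycles += 2             # outer brne taken; r19 restarts from 0 (256 passes)
            continue
        cycles += 1                 # outer brne falls through
        return cycles


RJMP = 2  # cycles for "rjmp PC+1"

print(nested_delay_cycles(2, 165))          # v-sync delay
print(nested_delay_cycles(28, 52) + RJMP)   # vertical back porch delay
print(single_delay_cycles(24) + RJMP)       # h-sync delay
print(single_delay_cycles(12))              # horizontal back porch delay
print(single_delay_cycles(168) + RJMP)      # visible-area delay
print(nested_delay_cycles(9, 62))           # vertical front porch delay
```

Running the loops through a model like this checks only the arithmetic of each delay block against its comment; it says nothing about whether the overall signal structure is what a given monitor expects.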

ATmega328P hardware implementation

Results

Having tested the hardware and software implementation, I've found it works well on the VGA input of my living room TV, but does not work on any other monitor or TV I've tested.
It sort of works on a friend's monitor: it outputs a red screen, but with random black patches, and it periodically loses the signal. The other monitors I've tested do detect the input, but are unable to display anything on screen.

I believe the reason it works on some displays and not others, is simply down to the tolerances the manufacturers have stipulated in their devices.

Potential solutions

I've tried numerous tweaks in the code, mainly changing the number of clock cycles in each region, but this yielded no positive result. This leads me to believe one of two possible situations:

1) (MOST LIKELY) I've implemented the software or hardware incorrectly resulting in the timings being slightly out.

2) It is very tricky or impossible to implement the VGA specification with consistent performance across all VGA-capable devices using the ATmega328P operating at 20 MHz.

As a result of this, I'm planning either to overclock the ATmega with a 25.175 MHz crystal so that the timings can be met, or to use a more capable microcontroller with greater processing power, something like a PIC24EP128MC202.

If anybody has any thoughts as to why my current implementation doesn't work and how I could rectify this, it would be much appreciated!

If you've managed to read up to here, thanks anyway! 🙂

References

(1)(3) VGA Signal 640 x 480 @ 60 Hz Industry standard timing – http://tinyvga.com/vga-timing/640x480@60Hz

(2) Lucid Science VGA Video Generator (unfortunately site has been closed down now) https://web.archive.org/web/20141102012544/http://www.lucidscience.com:80/pro-vga%20video%20generator-6.aspx

Best Answer

First, there should not be any issues outputting VGA using an ATmega328. You can output VGA that works with everything from ancient CRTs to little mystery LCD modules off AliExpress to a modern LCD. I've never really heard of compatibility issues with VGA, especially not anything timing related.

There are dozens of projects that successfully output at least monochromatic 640x480 VGA (paralleling the RGB lines together) using much less than an ATmega, and you don't even need a 20MHz clock: several projects manage with a 16MHz clock. And if you are OK with less than 640x480, this project outputs valid VGA using an ATtiny15 running at 1.6MHz.

Now, I haven't double checked your assembly so I might be wrong, but it seems unlikely that the timing would work well on one display but not others. No, I think this is simply a problem of signal attenuation due to impedance mismatching.

The three color lines expect 0 to 0.7V. I see you've sized your resistors such that, with a 75Ω impedance, the result is either 0.696V or 0.348V.

I know that probably seems like a very nice, elegant way to get black plus two shades, but I am afraid it won't work very well. Your resistors are much too large, and this is causing a bad case of impedance mismatching. You would likely experience the exact problems you describe: some displays (a small minority, I would expect, with resistors that large) would be able to use the signal, but most would not, or would only read pixels correctly intermittently within each frame.

If you want to properly drive a 75Ω impedance while reducing a higher voltage to the correct voltage range, you need to use an impedance matching network, like an L-pad divider.

Discussing the theory behind L-pad dividers is kind of beyond the scope of this question, but you would do well to use transistors to properly set the voltage levels without having the high impedance of a divider.

Here, just to do a sanity check and see if my guess is correct, ditch the second IO line on the 3 color inputs. You can't correctly drive a VGA input using the 75 ohms as a participant in a voltage divider anyway, so you'll have to ditch that idea entirely I'm afraid. You can get away with using a proper DAC style ladder, but even then, all the resistors will be an order of magnitude smaller than 1K and 2K.

Just use one IO pin. I just realized this probably exceeds the single-pin current limit (40 mA absolute maximum per pin on the ATmega328P), but the ideal L-pad for 75Ω that attenuates 5V to 0.7V would be a 64Ω and a 12Ω resistor in series to ground, with the signal out to the display being the tap between the two.

The magic here is that, if you remove my rounding errors, this results in a resistance of 75Ω. That is what matched impedance means: each end has the equivalent of 75Ω from the signal pin to ground.

Basically, the pins of your ATmega aren't strong enough to drive a 75Ω impedance at full power when running from 5V, as that would draw more than 50mA. Instead, just use a divider with a similar ratio, but not as large as your current resistors. Try 220Ω and 56Ω; that should resolve your signal problems and bring the signal voltage down to roughly 0.7V without so much impedance mismatch that most displays won't work correctly.
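The divider figures above can be sanity-checked numerically. The following is a rough sketch under simplifying assumptions: the I/O pin is treated as an ideal 5 V source with no output resistance, and the monitor input as a plain 75Ω resistor to ground. The function name `loaded_divider` is made up for this example.

```python
def parallel(r_a, r_b):
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def loaded_divider(v_pin, r_series, r_shunt, r_load=75.0):
    """Tap voltage and pin current for a series/shunt divider driving r_load.

    Assumes an ideal voltage source at the pin (a simplification).
    Returns (tap voltage in volts, pin current in amperes).
    """
    r_bottom = parallel(r_shunt, r_load)   # shunt resistor in parallel with the monitor input
    i_pin = v_pin / (r_series + r_bottom)  # total current drawn from the pin
    v_tap = i_pin * r_bottom               # voltage seen by the monitor
    return v_tap, i_pin


for r_series, r_shunt in ((64.0, 12.0), (220.0, 56.0)):
    v_tap, i_pin = loaded_divider(5.0, r_series, r_shunt)
    print(f"{r_series:.0f}/{r_shunt:.0f} ohm: "
          f"{v_tap:.3f} V at the tap, {i_pin * 1e3:.1f} mA from the pin")
```

Under these assumptions the 64Ω/12Ω pair lands almost exactly on the 0.7V target but draws more current than a single pin should supply, while the 220Ω/56Ω pair gives a somewhat lower level at a far more comfortable current.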