Electronic – Wrong frequency with Timer1 on Atmega328p in CTC Mode

assemblyavrtimer

I want to make a LED blink at a frequency of 0.5 Hz. So, I use an Atmega328p, clocked at 16 MHz, and the Timer1 in CTC mode, which triggers an interrupt every milisecond. Yet, my program, in Atmel Assembly, looks like this:

.include "m328pdef.inc"

.def     WR1 = R16    ; the working registers
.def     WR2 = R17
.def     WR3 = R18

.def     T1 = r13     ; A 24 bits value used to hold the
.def     T2 = r14     ; number of milliseconds
.def     T3 = r15     ; In T1 is the LSB, and in T3 the MSB

.cseg
.org  $0000
    RJMP    Start

.org OC1Aaddr         ; The address of the Timer1A compare match interrupt
    JMP     Timer1comp
    NOP

.org INT_VECTORS_SIZE

;---------------;
;   Settings    ;
;---------------;
Start:
    ; Initialize the Stack Pointer
    LDI     WR1, HIGH(RAMEND)
    OUT     SPH, WR1
    LDI     WR1, LOW(RAMEND)
    OUT     SPH, WR1

    ; Timer1A setup
    CLR     T1        ; Reset the ms counter
    CLR     T2
    CLR     T3

    LDI     WR1, (1<<WGM12)|(1<<CS10)  ; Timer in CTC mode, TOP = OCR1A
    STS     TCCR1B, WR1

    LDI     WR1, HIGH(15999)  ; Must wait 16000 clock cycles in order to
    LDI     WR2, LOW(15999)   ; have a frequency of interrupt of 1 kHz.
    STS     OCR1AH, WR1       ; ISR doesn't interrupt timer, so have
    STS     OCR1AL, WR2       ; to set OCR1A to 16000 - 1

    LDI     WR1, 1 << OCIE1A  ; Enable Output Compare A match interrupt
    STS     TIMSK1, WR1

    ; I/O settings
    SER     WR1               ; The LED is plugged on any pin of port D
    OUT     DDRD, WR1
    OUT     PORTD, WR1

    ; Enable interrupts
    SEI

;---------------;
;  Main Loop    ;
;---------------;
Loop:
    CLI                    ; Reset the ms counter
    CLR    T1
    CLR    T2
    CLR    T3
    SEI

    IN     WR1, PORTD      ; Toggle port D
    SER    WR2
    EOR    WR1, WR2
    OUT    PORTD, WR1

    ; HERE WAS MY ERROR
    ;LDI    WR1, BYTE1(988)  ; 12 clock cycles are passed
    ;LDI    WR2, BYTE2(988)  ; We want to wait 1000 ms
    ;LDI    WR3, BYTE3(988)

    ; CORRECT WAY
    LDI    WR1, BYTE1(1000)  ; We have already spent 12 clock cycles (750 ns)
    LDI    WR2, BYTE2(1000)  ; which is totally negligible against
    LDI    WR3, BYTE3(1000)  ; our delay of 1000 ms

Wait:
    CLI
    CP     T1, WR1    ; Compare the time (ms) since start of loop
    CPC    T2, WR2    ; with 1000
    CPC    T3, WR3
    SEI

    BRLO   Wait       ; Go to Wait if time < 1000 ms

    RJMP   Loop

;-----------------;
;   Timer1A ISR   ;
;-----------------;
Timer1comp:
    ; Save the current state on the stack
    PUSH    WR1
    IN      WR1, SREG
    PUSH    WR1

    INC     T1          ; Increment the ms counter
     BRNE    Timer1compEnd
    INC     T2
     BRNE    Timer1compEnd
    INC     T3

Timer1compEnd:
    ; Reset the state
    POP     WR1
    OUT     SREG, WR1
    POP     WR1

    RETI

The problem is, when I measure the blink frequency of the LED, instead of finding 0.500 Hz, I see 0.506 Hz. I wonder where this error comes from.

Also, I'm really new to assembly programming, so I'll be glad to receive any kind of advise.

EDIT 1

Finally, I found the origin of my problem: in the main Loop, I load WR1:WR2:WR3 with 988 ms, because the main Loop takes 12 clock cycles. However, even if the main Loop takes 12 clock cycles, it only takes 12 / 16000000 = 0.75 µs, which is totally negligible against the 1 s period. We can assume that this Loop takes 0s: then, I wait 12 ms less than I'd expect to, which represent an error of 1.2 % as Michael Karas said. To correct this, I should set WR1:WR2:WR3 to (1000 – 0.00075), so 1000.

EDIT 2

There was a problem with my previous code: in the Wait loop, I tested whether the desired time was higher or equal to the elapsed time. Thus, If the elapsed time is 1000, then the micro controller would wait until the 1001'th millisecond to branch to the main Loop. So, to correct this, I must test whether the elapsed time is smaller or not than the desired time.

Best Answer

Try using a value of 15999 for TOP in OCR1.

You do not need to account for the time it takes to execute the ISR. The timer is free running. In CTC mode, it starts at 0 and counts up to the TOP value, then clears back to zero and starts over again. It can optionally generate an interrupt each time it clears, but this is a side effect - it will count like this completely on its own with or without an interrupt.

The timer continues counting while your ISR executes. Your ISR will get called again the next time the clear happens regardless of how long it takes to execute, as long as it is finished executing before the next clear and the next interrupt.

Therefore, you can also eliminate the lines...

LDI    WR1, BYTE1(988)  ; 12 clock cycles are passed
LDI    WR2, BYTE2(988)  ; We want to wait 1000 ms
LDI    WR3, BYTE3(988)

...since they have no effect.

Note that with the above code, your LED will not toggle exactly every 2 seconds because of jitter due to the fact that the foreground task can be interrupted at random times by the ISR. If it gets interrupted after the compare happens, the LED will not toggle until the next pass. You will still have a very solid 0.5Hz output frequency, there will just be a few dozen cycles of jitter on any given transition. If you want exactly the output to be jitter free +/- 1 cycle, you can enable one of the output compare pins to toggle mode and let the hardware blink the LED for you.

Make sense?

The final ASM code is very nice, but here are a few suggestions...

-I do not think you need the NOP after the Timer1comp vector. This NOP is actually landing in the next vector. While it does no harm here, it is at least superfluous.

-The CS01 bit in the line LDI WR1, (1<<WGM12)|(1<<CS10) ; Timer in CTC mode, TOP = OCR1A actually enables the timer and starts it counting. Since you have not yet assigned the OCR registers, they are set to thier initial values of 0, which means that a match occurs immediately. Again, this doesn't really cause any problems in the current program but once you start adding functionality it can make some hard to find bugs. Better to set up all the timer registers and then enable it as a final step when everything is ready and in a know state. This would involve changing the above line to LDI WR1, (1<<WGM12) and then adding an additional line LDI WR1, (1<<WGM12)|(1<<CS10) at the end of the setup.

-I would not manually set the output pins high at startup. Instead, let them get set at the first ms timer roll over. This will delay the start of the signal generation, but will make sure that the first pulse is the correct width. As written, the first pulse will be short by the time between when you start the timer and when you set the pins high. If you really must have the signal generation begin as soon as possible after reset, then you could set the pins high at the very beginning of the program, then initialize the TCNT to account for the lost time on the first pass.

-You can replace the lines...

IN     WR1, PORTD      ; Toggle port D
SER    WR2
EOR    WR1, WR2
OUT    PORTD, WR1

...with the slightly more efficient...

SER    WR2 
OUT    PIND, WR1

From the data sheet:

14.2.2 Toggling the Pin

Writing a logic one to PINxn toggles the value of PORTxn, independent on the value of DDRxn.

-Clearing the Tx registers in the main thread could cause a potential race condition and lost ticks. You do not know the value of Tx when you pass :Loop because they could in theory have been updated in the background by the ISR since the last time you looked at them. In practice, this is impossible with the current code but if you start adding stuff it would lead to hard to find bugs. Imagine, for instance, that the Tx increments from 1,000 to 1,001 between when you compare it and when you clear it. You have now lost this millisecond forever and then next pulse will be 1ms short. In general, it is considered bad practice to update a value from 2 different threads for this reason.

One way around this would be to subtract 1,000 from Tx at the top of loop rather than clearing it to 0. Another way around this would be to check for the match to 1,000 inside the ISR and instead just set a flag that the main loop then checks and clears (this is how the Ardunio timer code does it).

But I think for this code, the easiest and most efficient solution would be to move the increment, the test, the action, and the rest all into the ISR. While you generally want to do as little as possible in the ISR, by moving things around we can make the ISR still be very short. I see you appreciate carry bit juggling, so I instead of counting from 0 up to 1,000, I instead am counting from -1000 to 0. I think this makes slightly more efficient boundary checking inside the ISR. I also moved the Tx registers to be able to LDI them. Here is this final modified code...

.def     WR1 = R16    ; the working registers
.def     WR2 = R17
.def     WR3 = R18

.def     T1 = r19     ; A 24 bits value used to hold the
.def     T2 = r20     ; number of milliseconds
.def     T3 = r21     ; In T1 is the LSB, and in T3 the MSB

.cseg
.org  $0000
    RJMP    Start

.org OC1Aaddr         ; The address of the Timer1A compare match interrupt
    JMP     Timer1comp

.org INT_VECTORS_SIZE

;---------------;
;   Settings    ;
;---------------;
Start:
    ; Initialize the Stack Pointer
    LDI     WR1, HIGH(RAMEND)
    OUT     SPH, WR1
    LDI     WR1, LOW(RAMEND)
    OUT     SPH, WR1

    LDI     T1, BYTE1(-1000)   ; Init the ms counter to -1000
    LDI     T2, BYTE2(-1000)   ; we start a a negative number and count up becuase it
    LDI     T3, BYTE3(-1000)   ; is more efficient to test if we got to 0

    ; Timer1A setup
    LDI     WR1, (1<<WGM12)   ; Timer in CTC mode, TOP = OCR1A
    STS     TCCR1B, WR1

    LDI     WR1, HIGH(15999)  ; Must wait 16000 clock cycles in order to
    LDI     WR2, LOW(15999)   ; have a frequency of interrupt of 1 kHz.
    STS     OCR1AH, WR1       ; ISR doesn't interrupt timer, so have
    STS     OCR1AL, WR2       ; to set OCR1A to 16000 - 1

    LDI     WR1, 1 << OCIE1A  ; Enable Output Compare A match interrupt
    STS     TIMSK1, WR1

    ; I/O settings
    SER     WR1               ; The LED is plugged on any pin of port D
    OUT     DDRD, WR1

    ; Enable Timer  
    LDS     WR1, TCCR1B
    ORI     WR1, (1<<CS10)  ; clock prescaler=1
    STS     TCCR1B, WR1

    ; Enable interrupts
    SEI

;---------------;
;  Main Loop    ;
;---------------;
Loop:
    RJMP   Loop

;-----------------;
;   Timer1A ISR   ;
;-----------------;
Timer1comp:
    ; Save the current state on the stack
    PUSH    WR1
    IN      WR1, SREG
    PUSH    WR1

    LDI     WR1,1       ; Increment the ms counter LSB by 1

    ADD     T1,WR1      

    LDI     WR1,0       ; Increment the rest by zero so only the carry will propigate

    ADC     T2,WR1
    ADC     T3,WR1

    BRCC    Timer1compEnd   ; If carry is set, then we rolled the ms counter

    ;Reset the ms counter back to -1000

    LDI     T1, BYTE1(-1000)   ; Init the ms counter
    LDI     T2, BYTE2(-1000)        
    LDI     T3, BYTE3(-1000)        

    SER    WR1
    OUT    PIND, WR1         ;; Toggle all PORTD output pins


Timer1compEnd:
    ; Reset the state
    POP     WR1
    OUT     SREG, WR1
    POP     WR1

    RETI

This code is functionally equivalent but about 15% smaller, and eliminates some race conditions that could have turned into bugs in future versions. Let me know if I have missed something.

Again, let me reiterate that your original code was very nice and a lot cleaner than lots of commercial production code I've seen. Considering that you are new to asm programming, I think with a little experience you will quickly become an asm master!