Electronic – Strange issue with ATTiny10 + avr-gcc: counter used in ISR corrupted by global variables

attiny avr-gcc

I have a very strange issue that looks like memory corruption when using global variables together with a timer overflow interrupt on the ATTiny10 (using avr-gcc 4.9.2). I can't make any sense of it, but I managed to narrow it down to a very simple program that reproduces it:

#include <avr/io.h>
#include <avr/interrupt.h> /* sei() and ISR() */

/* Timer overflow counter */
volatile unsigned int ovrf = 0;

/* Some global variables used in main() */
/* (MOVING THESE INTO main() FIXES THE ISSUE) */
unsigned long foo;
unsigned int bar;

int main(void) {
  /* Fast PWM 8 Bit Mode */
  TCCR0A |= _BV(WGM00);
  TCCR0B |= _BV(WGM02);
  /* Enable Timer Overflow Interrupt */
  TIMSK0 |= _BV(TOIE0);
  /* /8 prescaler */
  TCCR0B |= _BV(CS01);
  /* PB0 as output */
  DDRB |= _BV(PB0);
  /* Enable interrupts */
  sei();

  for (;;) {
    /* Some random code that uses the global vars */
    /* (REMOVING THIS FIXES THE ISSUE) */
    if (foo > bar) {
      foo = 0;
    }
  }
}

ISR(TIM0_OVF_vect) {
  ovrf++;

  /* Toggle LED (about once per second) */
  if ((ovrf / 500) % 2 == 0) {
    PORTB &= ~(_BV(PB0));      
  } else {
    PORTB |= _BV(PB0);
  }
}

All it does is the following:

  • Sets up the timer and a timer overflow ISR that increments a counter (global variable ovrf) and toggles an LED on and off based on the value of this counter.
  • The main loop just reads two other global variables (the assignment to foo never actually runs, because both start at zero).

I'd expect the LED to blink periodically, proving that the interrupt works and the counter is incremented correctly. Instead it never turns on, or, after modifying the program slightly (e.g. adding some more code or more variables in main()), it flashes erratically or at a very fast rate. After a lot of testing to exclude any other explanation, I'm assuming that the counter (ovrf) is somehow getting corrupted from the main loop.

I found several changes that can make the issue disappear:

  • moving the two global variables (foo and bar) into main()
  • removing the code that accesses them
  • changing the type of foo to int
  • turning off all optimisations with -O0 (the default was -Os), but this makes the code ~1.6x bigger.

But I still can't see any explanation for the actual cause. Am I completely missing something obvious? I've run out of ideas and can't think of anything other than a compiler bug, but that seems very unlikely, given that this example is so simple.

UPDATE

Based on the suggestion from @MarkU I tried to play with various optimisation settings, trying to find the exact option that might cause the problem:

  • Also tried -O1, but it doesn't help either

  • I found that -Os -fno-toplevel-reorder also fixes the issue! — However, I suspect this might just be an accidental effect:

  • In my original program (very similar to the above simplified example), where I found the issue, none of the above helps (not even -O0). There I have one more global variable (a bool), and the only thing that seems to help is to remove an initial assignment (e.g. "bool ledOn = true;" --> "bool ledOn;").

So it's definitely something to do with how variables are allocated, but not simply about their overall size. (There are no other dependencies, no function calls, etc.)

UPDATE 2

Following the advice from @Curd, I also tried replacing ovrf / 500 with ovrf >> 9 (roughly the same; I don't care about exact timing here anyway). This reduced the code by 74 bytes(!), and it also happens to fix the issue!

I had a look at the disassembled code for the ISR: this change reduces the number of bytes pushed at the beginning from 13 to 7, which could explain why it helps!

(This ovrf / 500 was just meant to be a quick and simple test to verify the ISR was working, but I didn't realize that in actual implementation it's not that simple at all! In my original program there is no division, I maintain an approximate millis count by simply multiplying ovrf by 2.)
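For illustration, the two LED tests can be compared on a host machine. This is only a minimal sketch, not the original code: the function names led_on_div and led_on_shift are made up here, and uint16_t stands in for avr-gcc's 16-bit unsigned int.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original test: the LED is off while (ovrf / 500) is even, so it changes
   state every 500 overflows. The AVR core has no divide instruction, so
   the compiler typically has to call a software division routine here. */
static bool led_on_div(uint16_t ovrf) {
  return (ovrf / 500) % 2 != 0;
}

/* Replacement test: bit 9 of the counter flips every 512 overflows.
   This needs only a shift and a mask, no helper routine. */
static bool led_on_shift(uint16_t ovrf) {
  return ((ovrf >> 9) & 1) != 0;
}
```

Both versions give roughly the same blink rate (a state change every 500 versus every 512 overflows), but only the first one pulls a division into the ISR, which fits the drop in pushed registers seen in the disassembly.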

I also compared the disassembled code for -Os ("bad") and -Os -fno-toplevel-reorder ("good"), but apart from code being reordered at the beginning, both the contents of main() and the ISR seem the same (same number of pop/pushes, etc.)

It appears that I can fix the problem in this concrete example with one of the above workarounds, but I still feel uncomfortable with not really understanding the actual cause and not knowing how to avoid this in the general case. And I don't know enough about assembly to analyze the generated code.

Maybe I should also ask some of these questions:

  • Is this kind of trial-and-error process "normal" when using C for the ATTiny10? (I mean: not nearly enough resources and/or insufficient compiler support to make this reliable – so don't expect it to work and just revert to assembly if it doesn't?)

  • Is there something that should generally be avoided (e.g. not using global vars, or optimisation)?

UPDATE 3

Thanks for all the comments and answers. There are a lot of useful suggestions in them, all worth checking for anyone running into a similar problem!

I had one more "mystery" remaining with my original program where replacing a bool ledOn = true; with bool ledOn; was the only fix.

Now that I understand more, I had another look at the generated assembly and memory usage. It turns out that the initialisation makes the compiler produce a .data segment, so one more byte is allocated in memory, which is just enough to cause a collision with the stack. In the end (I think) the compiler keeps this variable in a register anyway, just like in the "no explicit initialisation" case, so the extra allocation shouldn't be necessary. I guess the compiler simply has no optimisation for this extreme case with such a small amount of RAM.

Best Answer

With only 32 bytes of memory (as mentioned by MarkU in a comment), memory on the ATtiny10 is incredibly tight. The AVR-GCC compiler does not provide any tools for stack checking, and will happily generate code which will overrun the stack. For example, here's what it generated for the prologue to your ISR:

000000ba <__vector_4>:
  ba:   1f 93           push    r17
  bc:   0f 93           push    r16
  be:   0f b7           in      r16, 0x3f       ; 63
  c0:   0f 93           push    r16
  c2:   10 e0           ldi     r17, 0x00       ; 0
  c4:   4f 93           push    r20
  c6:   5f 93           push    r21
  c8:   6f 93           push    r22
  ca:   7f 93           push    r23
  cc:   8f 93           push    r24
  ce:   9f 93           push    r25
  d0:   af 93           push    r26
  d2:   bf 93           push    r27
  d4:   ef 93           push    r30
  d6:   ff 93           push    r31

I count 13 pushes in there. That alone will expand the stack to nearly half of your device's memory. Combined with another rcall in the body of the ISR, as well as a couple of pushes and rcalls in the prologue to main, the ISR stack will end up colliding with the memory used to store your global variables, overwriting them with unexpected data.
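A rough budget makes the collision plausible. The sketch below is an illustration under stated assumptions, not output from any tool: the helper names are hypothetical, the 8 bytes of static data assume avr-gcc's default 2-byte int and 4-byte long (ovrf, foo and bar), and the 2-byte return address is what the AVR core pushes on interrupt entry on a device this small. It deliberately ignores whatever main() and the routine called from the ISR put on the stack, since the listing above does not show that.

```c
#include <assert.h>

enum { ATTINY10_SRAM_BYTES = 32 }; /* figure quoted above */

/* Stack used on entry to an ISR: the return address pushed by the
   hardware (assumed 2 bytes) plus one byte per push in the prologue. */
static int isr_entry_bytes(int pushes) {
  return 2 + pushes;
}

/* Bytes left once the static variables and the given stack usage are
   subtracted from SRAM; a negative result would mean they overlap. */
static int ram_headroom(int static_bytes, int stack_bytes) {
  assert(static_bytes >= 0 && stack_bytes >= 0);
  return ATTINY10_SRAM_BYTES - static_bytes - stack_bytes;
}
```

With the 13 pushes above, ram_headroom(8, isr_entry_bytes(13)) leaves only 9 bytes for everything else on the stack. With the 7 pushes reported for the ovrf >> 9 version, the same calculation leaves 15.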

The ATtiny10 is not a good target for a C compiler. If your application can support a slightly larger microcontroller, upgrading to the tiny25/45/85 family might be warranted. Otherwise, I would recommend targeting this device with assembly.