Electrical – STM32F0 – interrupt/breakpoint not working on certain hardware

gdb stm32 stm32f0 swd

OK I realise this sounds like a dumb/noob question but please do read it through before calling me an idiot – at this stage I'll gladly take it if you can spot where I've gone wrong!

The scenario:

We have an existing working PCB design using an STM32F051K4, multiple boards have been built & programmed and all have performed as expected.

After populating a new bare PCB with just the micro, its associated smoothing/decoupling caps, and the BMP/SWD (debug/ICP) header, the SysTick interrupt would not fire and breakpoints set in the debugger would not trigger (even break main would never trigger!).

I have tried several fresh boards, micros from two different batches, getting a more skilled colleague to solder the board… and none of them work!

The complete environment:

The boards are a fairly basic PCB with an STM32F051K4U6 / K6U7 (UFQFPN32 package) microcontroller plus a couple of 100n / 4u7 capacitors around the 3v3 rail.

In the test setup, the board is powered, programmed, and debugged by a Black Magic Probe (BMP) via GDB.

The firmware is based on generated code from STM32CubeMX using low-level (LL) libraries, compiled in SW4STM32, initialising only the basic system clock / systick and setting all GPIO pins to analogue (floating / High Z).

The pseudo-code of the entire thing is as follows – standard init purely as-generated by CubeMX:

LL_Init();
SystemClockConfig(); // Include SysTick IRQ
LL_SYSTICK_EnableIT(); // Enable the SysTick interrupt
MX_GPIO_Init();
while(1)
{
    count++;
    //optionally toggle a pin here to prove it's running
}

The SysTick_Handler simply increments a global ticks variable.
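For reference, a handler of the kind described might look like the minimal sketch below. This is not the actual firmware: the counter's type and the volatile qualifier are assumptions. volatile only matters if the main code reads ticks; it has no bearing on whether the interrupt fires.

```c
#include <stdint.h>

/* Global tick counter. 'volatile' and uint32_t are assumptions here;
 * the original post only says "a global ticks variable". */
volatile uint32_t ticks = 0;

/* The name must match the SysTick entry in the startup file's vector
 * table, otherwise the default (weak) handler is linked in instead. */
void SysTick_Handler(void)
{
    ticks++;
}
```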

Then the main loop literally just increments a count while we wait for the SysTick that never comes.

The code compiles, downloads via GDB/BMP, and runs. On the "good" board I can set break main and break SysTick_Handler and both will be triggered as soon as the board is run.

On the "bad" boards the breakpoints are never tripped and the interrupt never fires (ticks remains 0) despite the main loop counting upwards (and, if I toggle a pin in main(), I can see it toggle). However, the micro can be stopped and single-stepped, and GDB reports no errors when setting breakpoints, programming, etc.
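Since the micro can be halted, one thing that could be compared between a good and a bad board is the SysTick control and status register, which on Cortex-M cores sits at address 0xE000E010 and can be read from GDB with x/wx 0xE000E010. The sketch below is a host-side helper for decoding the value read back. The macro and function names are hypothetical; the bit positions are those of the standard Cortex-M SysTick register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the Cortex-M SysTick control and status register
 * (read at address 0xE000E010). Macro names are illustrative. */
#define SYST_CSR_ENABLE    (1u << 0)   /* counter is running */
#define SYST_CSR_TICKINT   (1u << 1)   /* reaching zero pends the SysTick exception */
#define SYST_CSR_CLKSOURCE (1u << 2)   /* 1 = processor clock, 0 = external reference */
#define SYST_CSR_COUNTFLAG (1u << 16)  /* counter reached zero since the last read */

/* Hypothetical helper: true only if the counter is enabled AND its
 * interrupt request is enabled, so the handler could be expected to run. */
static inline bool systick_irq_armed(uint32_t csr)
{
    const uint32_t needed = SYST_CSR_ENABLE | SYST_CSR_TICKINT;
    return (csr & needed) == needed;
}
```

A value of 0x00000007, for instance, would mean the counter is enabled, its interrupt is enabled, and it is clocked from the processor clock. If both boards show the same armed value, the difference presumably lies elsewhere.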

I have checked the interrupt vectors in the startup code and they are as expected.
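The same check could also be made on the running target rather than in the source. A Cortex-M0 has no vector table offset register, so it always fetches its vectors starting at address 0x00000000, and on STM32F0 parts what appears at that address depends on the boot configuration. The sketch below is a host-side helper for judging a word read back with the debugger (for example with x/16wx 0 in GDB). The helper names are hypothetical, and the flash size constant is an assumption for the 16 KB K4 part.

```c
#include <stdbool.h>
#include <stdint.h>

/* On Cortex-M, the vector for exception number N is the 32-bit word at
 * byte offset 4*N from the start of the table. SysTick is exception 15. */
#define SYSTICK_EXCEPTION_NUMBER 15u

/* Main flash on STM32F0 starts at 0x08000000. The size is an assumption
 * (16 KB for a "K4" part); adjust it for the device actually fitted. */
#define FLASH_BASE_ADDR  0x08000000u
#define FLASH_SIZE_BYTES (16u * 1024u)

/* Hypothetical helper: byte offset of a vector within the table. */
static inline uint32_t vector_offset(uint32_t exception_number)
{
    return 4u * exception_number;
}

/* Hypothetical helper: a handler vector should have bit 0 set (Thumb
 * state) and, for code linked to run from flash, point into main flash. */
static inline bool vector_looks_valid(uint32_t vector_word)
{
    const bool thumb_bit_set = (vector_word & 1u) != 0u;
    const uint32_t target = vector_word & ~1u;
    const bool in_flash = target >= FLASH_BASE_ADDR &&
                          target < FLASH_BASE_ADDR + FLASH_SIZE_BYTES;
    return thumb_bit_set && in_flash;
}
```

With these definitions the SysTick vector is the word at offset 0x3C. If that word, read at address 0x0000003C on a bad board, does not match the one at 0x0800003C, the core is not using the flash vector table.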

I am really stumped by this. My best guess is that some pin needs to be pulled up or down, or perhaps that the fully-populated board adds some extra power rail smoothing that the bare board lacks, but I would NOT expect the micro/debugger to be that sensitive.

I've checked and re-checked the code, the boards, the components fitted, the soldering job, and my sanity (TBC) to no avail.

Any ideas gratefully received at this stage!

Edit to address questions in comments (in order & as I can):

@ChrisStratton:

The binary is identical every time. Starting from the working, running board, I detach GDB, issue BMP "power disable", swap boards, then "power enable", scan, re-attach, load the binary, and run. I am not touching the IDE at all.

Hard to be 100% sure the ground pad is soldered as it's under the chip in the UFQFPN32. All power pins (2x VDD + VDDA) are connected and have smoothing caps per the datasheet.

I have not (yet) tried other interrupts as I would have to write new code to interrupt from a GPIO pin for example.

I don't have a discovery board with this micro on it, so I can't try that.

Linker & startup are generated by CubeMX for this part and are obviously working on the "good" board with no modification.

@SamGibson:

The "bare" board is minus peripherals such as the MAX232, I2C EEPROM, external voltage regulator, etc.; it just has the micro, debug header, and power supply caps / relevant pullups (BOOT0, NRST) fitted.

I'm not currently in a position to post photos/schematics, I will if I can.

Best Answer

Thanks for the update. Due to the restrictions which prevent supplying photos & schematics, I would follow the plan outlined in my earlier comment, in order to make progress, including:

Make a list of all the physical differences between the working "full" and non-working "minimal" boards, and their component parts including PCB, their sourcing, history/age etc.

Also, measure the voltages at all possible nodes (even better to use a 'scope to view the waveforms - DMM measurements can be misleading) and compare them between the working and non-working boards.

Take a non-working "minimal" board and add components to it (testing after each step) until it is a "full" board. At some point along that process, it might start to work (e.g. breakpoints start to trigger) - since your "full" boards work, and that's what you are progressing towards.

Investigate the last change made just before that point, since whatever it was caused the change in the board's behaviour (in this case, from non-working to working).

It's possible that the above step of converting a non-working "minimal" board, step-by-step, into a "full" board might not result in a "full" board which works.

In that case, there must be one or more differences between the original "full" working boards, and the newly-made "full" but non-working board (otherwise, the newly-made board would also work - so there must be differences). Your challenge is to find them, using knowledge that only you have about the hardware of the two boards, the design, the sourcing of the components etc.

Another approach to consider, is to reverse the direction of the investigation:

Start with a known-working "full" board. Then remove components from it, step-by-step, testing after each change, until it has only the same components as your current non-working "minimal" boards.

Did that board stop working during this process? If so then, as with the forward process described above, whatever you just changed to cause that difference in behaviour (here, changing from working to non-working) is the point to investigate, using your knowledge of the hardware and the design etc.

If you start with a working "full" board, and remove all the components to make it a "minimal" board and it still works, then there are one or more differences between your working "minimal" board, and your non-working "minimal" boards. Again, since you have them in your hands, you have more knowledge than we do about the differences.

In that situation, I would also consider moving components, one at a time, from a non-working "minimal" board, to a working "minimal" board, testing after each change. Again, if/when the working board stops working, the last component that was moved from the non-working board, will tell you something about the cause of that change in behaviour.

Of course there are risks of getting misled during this process, due to practical issues like damaging components during transfer from one board to another, soldering problems (shorts or bad joints) etc. etc. Therefore after any change in behaviour during testing, it's important to consider the (hopefully small, but non-zero) possibility of an "own goal" during the troubleshooting.
