Are thermal effects on an FPGA different depending on how it is programmed/configured?

circuit-design, fpga, stress-testing, testing, thermal

I am currently on the fence in a debate regarding FPGA environmental testing (thermal cycling).

The question is: can the same FPGA, stressed to the same thermal limits, yield different results depending on how it is configured/programmed?

For example, data rates, data corruption, etc.

With something like a microcontroller I would say that, regardless of how it is programmed, the hardware remains the same, so its thermal properties will remain the same. FPGAs are a bit of a grey area: although the configuration is described in code (i.e. HDL), that code creates a configuration of hardware within the FPGA.

Thoughts and opinions welcome, but if you have any relevant research papers or articles that would be great.

Thanks

Edit: To add a bit of clarity, I will be more specific about the use case. The item under test is a system control unit, where the main processor is an i.MX6 Arm CPU and the main signal processing unit is an IGLOO2 FPGA. There are two builds of software available for it:

  1. Test control software (TCS), designed to stress the unit for design-proving activities but serving no purpose in the real world.
  2. Operational software, the software that will be shipped with the item.

The argument is that, because the environmental stress testing has been performed with the TCS, there is a gap in our proving evidence: thermal cycling has not been performed using the software with which the item will actually ship.
The testing is performed in an environmental chamber which cycles between -40 °C and +70 °C.
The counter-argument is that the TCS is deliberately designed to over-stress the item, so the operational software should be fine.

The third argument thrown into the mix is that, regardless of the build of software, the hardware is the same, so the effects of external thermal stimuli should be the same.

Edit 2: Clearly I'm still struggling to explain myself. The product was designed internally but is being manufactured by a contractor in another country. The main qualification testing (environmental and EMC) is being performed by the contractor; however, we are unable to ship them the 'real' software because it is restricted and cannot leave the country. With that in mind, we provided the contractor with technical requirements such as the thermal environment and bus speeds, plus a 'sanitized' set of functional requirements designed to be representative without breaking the restrictions. From these, they created their own software (a.k.a. the TCS). The question I am asking is: if the code is functionally representative but simply 'not the same' as the software that will ship with the product, how much of a gap do we really have in our evidence?

Best Answer

Sorry, but both a microcontroller and an FPGA will have different thermal behavior depending on how they are used. If you intend to do thermal cycling or life testing you must run the application software (or similar) on a microcontroller and apply the application configuration (or similar) to an FPGA.
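Purely as an illustration of the microcontroller half of that claim (this sketch is mine, not from the answer, and the function names are hypothetical): even on a fixed MCU the die temperature tracks what the firmware does, because a core spinning in a compute loop dissipates far more than one parked in a sleep state. Assuming a CMSIS-based Arm Cortex-M toolchain where `__WFI()` is available:

```c
#include <stdint.h>
#include "cmsis_compiler.h"   /* provides __WFI(); the exact header varies by vendor pack */

volatile uint32_t sink;        /* stops the busy loop being optimised away */

/* Build A: worst-case self-heating - ALU and bus toggling continuously. */
static void stress_loop(void)
{
    uint32_t x = 0xA5A5A5A5u;
    for (;;) {
        x = (x << 1) ^ (x >> 3) ^ 0x01234567u;  /* keep the ALU busy */
        sink = x;                               /* keep the bus busy */
    }
}

/* Build B: mostly idle - the core sleeps until an interrupt and dissipates far less. */
static void idle_loop(void)
{
    for (;;) {
        __WFI();               /* wait-for-interrupt: core clock gated */
    }
}
```

Two firmware images built around these two loops would pass through exactly the same chamber profile with very different junction temperatures, which is the point being made above.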

You need to be careful to differentiate between the thermal properties (which depend only on the physical construction and packaging) and the thermal behavior (which depends on usage as well). For thermal cycling and life test you want to be sure that the die itself is as hot, and in the same places, as it will be in use.
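As a rough worked example (the numbers here are invented for illustration, not taken from the question): the dynamic power of CMOS logic scales with switching activity,

$$P_{\text{dyn}} \approx \alpha \, C \, V^2 \, f$$

where α is the fraction of the configured logic toggling each cycle. An almost-idle bitstream might dissipate tens of milliwatts, while a configuration that keeps most of the fabric and the transceivers busy can dissipate a couple of watts. With a junction-to-ambient thermal resistance of, say, 30 °C/W, a 2 W difference in dissipation corresponds to roughly 60 °C of extra junction temperature rise at the same chamber ambient, and the hot spots sit wherever the busy logic happens to have been placed.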

EDIT: It sounds like your real problem is that you haven't convinced the customer that your "TCS" really is an overstress. If the customer remains skeptical then you have no choice but to run the actual application software. If you are trying to demonstrate something about reliability or failure rate, then running a test without the application code (or something demonstrably more stressful than it) is no test at all.