Electronic – How is quality/damage control done in large circuits

gpu, mosfet, transistors

There are apparently more than 28 billion transistors in the recently released Nvidia GeForce RTX 3080.

First, when you manufacture such a circuit there must be a decent chance of botching some of these. How do the plants make sure that they all work?

Second, after prolonged usage surely at least one of these transistors must fail. Does the circuit somehow recognize this and redirect the workflow?

Best Answer

We don't "botch" them by the time they are sold. We've botched them long before you ever saw an IC with those features. I was working on 14nm SOI in 2012, and the single biggest reason that things don't get out into the wild is yield, but this does not necessarily mean that a transistor did not work. I was making FPGAs because that allowed me to change the routing graph when things failed due to fabrication issues. I was making them asynchronous because, at that point, the timing was too difficult to predict on a large scale. Even by the time I was on the process, we weren't seeing dead transistors, only mismatch issues. I have had very few dead transistors in my career, and if they were dead, it was often because I killed them myself by playing around with hot carriers. (As an aside, when you raise the voltage on ICs for overclocking, that's the usual failure.)

There are many reasons that ICs can "fail", but they are usually not outright failures. It's usually due to timing, because of a threshold offset or, in the case of FinFETs, because you have two devices per gate and the mismatch between those can cause timing violations (I have an XOR here). This is the origin of "binning" of ICs. Some are faster, some are slower, but they are all made from the same masks. You have test structures that go around different regions of the IC, and you get timing information from those.
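To make the binning idea concrete, here is a minimal sketch of how timing readings from per-region test structures could be turned into a speed bin. Everything in it is an assumption for illustration: the bin names, the MHz thresholds, and the `assign_bin` function are made up, and a real flow involves far more than one number per region.

```python
# Hypothetical speed bins: (label, minimum clock in MHz that every region must sustain).
# These names and thresholds are invented for illustration only.
BINS = [("fast", 1900), ("standard", 1700), ("slow", 1500)]


def assign_bin(region_fmax_mhz):
    """Pick a bin from the max passing frequency (MHz) measured in each region.

    The slowest region limits the whole die, so the bin is chosen from the
    minimum reading. Dies that miss even the lowest bin are rejected.
    """
    worst = min(region_fmax_mhz)
    for label, min_mhz in BINS:  # BINS is ordered from fastest to slowest
        if worst >= min_mhz:
            return label
    return "reject"


if __name__ == "__main__":
    # Same masks, different dies: only the measured timing differs.
    print(assign_bin([1950, 1920, 2010]))  # every region is quick
    print(assign_bin([1950, 1720, 2010]))  # one slower region drags the die down
    print(assign_bin([1950, 1400, 2010]))  # one region misses every bin
```

The point of taking the minimum is that no transistor needs to be dead for a die to land in a lower bin. One region with a bit more mismatch is enough.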

How do you fix these little nuances? Let's say that you have a cache bank that just doesn't pass timing: you'll just blow a few fuses, and your 2MiB cache becomes a 1MiB cache.
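As a rough sketch of that fuse decision, assume a 2MiB cache built from four 512KiB banks, one fuse bit per bank, and a rule that the enabled bank count must be a power of two. All of those are assumptions for illustration, as is the `plan_fuses` name; real parts use their own redundancy and fuse schemes.

```python
BANK_SIZE_KIB = 512  # hypothetical: 4 banks x 512 KiB = 2 MiB


def plan_fuses(bank_passes_timing):
    """Return (fuse_mask, usable_kib) for a list of per-bank pass/fail results.

    Bit i of fuse_mask is set when bank i gets fused off. Assumption: the
    part only ships with a power-of-two number of banks enabled, so some
    good banks may be fused off along with the bad ones.
    """
    good = [i for i, ok in enumerate(bank_passes_timing) if ok]

    # Largest power of two that fits in the number of good banks (0 if none).
    keep = 0
    if good:
        keep = 1
        while keep * 2 <= len(good):
            keep *= 2

    kept = set(good[:keep])
    mask = 0
    for i in range(len(bank_passes_timing)):
        if i not in kept:
            mask |= 1 << i  # blow the fuse for this bank

    return mask, keep * BANK_SIZE_KIB


if __name__ == "__main__":
    # Bank 2 fails timing, so banks 2 and 3 are fused off and 1 MiB remains.
    mask, size = plan_fuses([True, True, False, True])
    print(f"fuse mask = {mask:04b}, usable cache = {size} KiB")
```

With a single failing bank out of four, this leaves 1024KiB enabled, which is the 2MiB-becomes-1MiB case described above.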

If you want the real scoop on this, the IEDM conference is where we talk about how everything is terrible most of the time.
