Quality, Robustness and Reliability - Breaking Down the Bathtub

Gallium nitride (GaN) offers superior efficiency and power density, with reliability assessed using the "bathtub curve" to predict and prevent failures from defects, overstress, and aging.

28 Nov, 2024. 6 min read

Gallium nitride (GaN) offers significant gains in efficiency and power density, and allows designers to meet far more challenging power supply specifications than they can with silicon MOSFETs. One reasonable concern about any new technology that offers game-changing benefits is whether it is robust and reliable. To address any doubts that potential users may have, let’s discuss the robustness, reliability and quality of GaN.

GaN, as a power conversion technology, does not have as long a pedigree as silicon. Even so, the bathtub curve as a representation of the failure rate of a device over time is certainly very well established and multiple companies have contributed to fleshing out the details as they pertain to the new GaN devices. It is a useful visualization tool for reliability engineering and deterioration modeling, and serves as a clear mechanism to discuss why failures occur and how to predict and prevent them. 

‘Bathtub’, of course, refers to the shape of the curve. The x axis in Figure 1 is the log of time, and the y axis is the probability of failure. The green line - our bathtub curve - is the rate of failure of a typical product over time. In the early phase of a product's life, failures are mainly quality-related and caused by manufacturing defects. In the center zone are devices that were well manufactured and haven't yet worn out, but that failed anyway, typically because they were overstressed. Then, after some time, parts wear out, so failures are due to lifetime issues.
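One common textbook way to reproduce this shape is to add three Weibull hazard rates: a decreasing one for early defects, a constant one for random overstress failures, and an increasing one for wear-out. The sketch below illustrates that idea only. The shape and scale values are arbitrary assumptions chosen to make the curve visible, not measured GaN data, and the function names are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three hazards; all parameters are illustrative assumptions."""
    early = weibull_hazard(t, shape=0.5, scale=1e7)    # shape < 1: rate falls with time
    random_ = weibull_hazard(t, shape=1.0, scale=1e7)  # shape = 1: constant rate
    wearout = weibull_hazard(t, shape=5.0, scale=1e6)  # shape > 1: rate rises with time
    return early + random_ + wearout


# Sample the curve on a log time axis, as in Figure 1.
for hours in (1, 10, 100, 1_000, 10_000, 100_000, 1_000_000):
    print(f"{hours:>9} h  hazard = {bathtub_hazard(hours):.3e} per hour")
```

With these assumed parameters the printed rate falls over the first decades of time, flattens, and then rises again, which is the bathtub profile described above.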

Figure 1

Lifetime

If we consider failures in the far distant future, the challenge is that users can’t wait 100,000 hours (about 11 years), let alone one million hours (more than a century), for a device to fail before the wear-out mechanisms can be investigated. To accelerate time for the purposes of IC testing, a test called High Temperature Operational Life (HTOL) is employed. This test exploits the tendency of physical systems to change faster at high temperature, under strong electric fields and in the presence of free ions (humidity and metal migration), to accelerate the effects of time and use on the devices under test. Surprisingly, relatively few parts (48 for commercial and 77 for automotive), tested under accelerated conditions for 1000 hours, provide the information that we need to determine the probability of failure over many years under normal conditions. The test is run on multiple batches of devices to expose any process dependency. Another test, called HALT (highly accelerated life test), applies similar stress under increased humidity but at a slightly lower temperature.
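To see why a 1000-hour test can stand in for years of use, it helps to work through the temperature part of the acceleration. The sketch below uses the Arrhenius model, a widely used model for thermally activated failure mechanisms. The article does not say which model or conditions are applied, so the activation energy (0.7 eV), the use temperature (55 °C) and the stress temperature (125 °C) are all assumptions for illustration. Voltage and humidity acceleration are not modeled here.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K
HOURS_PER_YEAR = 8760             # 365 days x 24 hours


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between use and stress temperatures."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1 / t_use_k - 1 / t_stress_k))


# Assumed values: Ea = 0.7 eV, 55 degC in use, 125 degC under stress.
af = arrhenius_acceleration(ea_ev=0.7, t_use_c=55, t_stress_c=125)

# Sample size and duration quoted in the article for automotive HTOL.
parts, test_hours = 77, 1000
equivalent_device_hours = parts * test_hours * af

print(f"Acceleration factor: {af:.1f}")
print(f"Equivalent device-hours at use conditions: {equivalent_device_hours:.3e}")
print(f"Each part sees about {test_hours * af / HOURS_PER_YEAR:.1f} equivalent years")
print(f"100,000 hours is about {100_000 / HOURS_PER_YEAR:.1f} years")
```

The result is very sensitive to the assumed activation energy and temperatures, so the numbers are useful only as an order-of-magnitude illustration.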

These tests are standard procedure for analog integrated circuits. But because GaN is a new technology, users and manufacturers must make doubly sure that the GaN switch inside the IC is robust and will have a long life. Two tests specifically designed for discrete transistors address this: HTRB (high temperature reverse bias) and HTGB (high temperature gate bias). Both stress the device for 1000 hours at elevated temperature under a DC voltage to make sure there are no migration issues or long-term degradation mechanisms associated with high voltage operation at high temperature. Hot carrier injection (HCI) tests are also performed to find any places where electrons in the channel can get diverted into oxide layers and become trapped. Electromigration testing makes sure that the metal conductors on the top of the die are appropriately sized for the current that they carry. And gate oxide integrity tests prove the strength of the gate. All these tests are performed on devices intended for industrial and commercial customers.

Of course, GaN is also very suitable for the automotive market, which is even more demanding. So in addition to the usual tests just described, for automotive devices we run more tests, primarily H3TRB, which increases the humidity and temperature demands above the standard HTRB. Then we run a series of tests that are designed to switch the part on and off multiple times to prove that the devices are immune to the thermal shock of being switched on. These tests are called Power Temperature Cycling and Intermittent Op Life - the latter was originally a military test. As well as stressing the devices to a higher level, automotive tests are carried out on a larger sample size than for commercial and industrial parts. 

Manufacturing quality

So much for the long term; we also need to address manufacturing quality problems that can cause parts to fail in early life. Power Integrations’ quality policy is to prevent any of these parts from entering the product stream. You might think that a robust final test of finished products would catch all of the mis-manufactured devices, and that is certainly an important part of it. However, in order to create an essentially perfect product stream, we work hard to screen out substandard material in prior manufacturing steps – using so-called “process monitors”. These monitor tests reduce the number of parts with latent defects that make it into final test and are critical for improving final product quality. 

Power Integrations regards epitaxially grown GaN (EPI) as the critical stage in GaN transistor design and production. Because of this, we make our own EPI; we buy neither EPI nor finished wafers from a third party. This gives us complete control over the process, and we can check it continuously, so only the best wafers make it through to patterning. After a wafer has been patterned into transistors, we run stress tests to identify parts that are faulty or at risk of a quality problem.

To validate that our process monitors, our yield improvements and our final test regime deliver a perfect product stream, a test called ELFR (Early Life Failure Rate) is used. ELFR is analogous to the High Temperature Op Life (HTOL) test. But instead of testing a small number of parts for a long time, we test a large number of parts for a relatively short time - 800 parts for 48 hours, across multiple IC lots to ensure that any process dependencies are exposed.
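The reason ELFR favors a large sample is statistical: the more parts that pass with zero failures, the lower the upper bound that can be placed on the fraction of early-life defects. A minimal sketch of that reasoning follows, using the standard zero-failure binomial bound. The 60% confidence level and the zero-failure outcome are assumptions for illustration, not figures from the article, and the function name is hypothetical.

```python
def zero_failure_upper_bound(sample_size, confidence):
    """Upper bound on the defective fraction when no failures are observed.

    Solves (1 - p) ** sample_size = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)


# 800 parts is the ELFR sample quoted in the article; 2400 assumes three
# such samples (for example, one per lot), which is an illustrative choice.
for n in (800, 2400):
    p = zero_failure_upper_bound(n, confidence=0.60)  # assumed confidence level
    print(f"{n} parts, 0 failures: defective fraction below {p * 1e6:.0f} ppm")
```

Tripling the sample size cuts the bound to roughly a third, which is why testing many parts for a short time suits early-life screening better than testing a few parts for a long time.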

Overstress

We've spoken about reliability and wear-out mechanisms and how to guarantee initial quality. But what happens in the middle section of our bathtub curve? These are parts that have been manufactured perfectly and haven't worn out yet, but that fail anyway due to overstress.

Figure 2

To ensure that our devices are robust under the temperature, voltage and humidity conditions that they will be exposed to during their operational life, we run more tests. These are:

• MSL - moisture sensitivity level;

• UHAST - also a moisture test, but under pressure;

• TMCL - a temperature cycling test which tests for differential heating and cooling; and

• HTSL - high temperature storage life, which is the absolute high temperature test.

One of the most important attributes of a power semiconductor is its ability to withstand voltage. Power Integrations specifies three types of GaN, with BVs (breakdown voltages) of 750 V, 900 V and 1250 V. However, even though we talk about BV for convenience, GaN transistors are not specified in the same way as a silicon device. GaN exhibits an effect called dynamic RDS(ON): the on-resistance increases when high voltage is applied. It recovers later, but while the voltage is applied, and for a short while afterwards, it is higher than the typical value.

Power Integrations has chosen 5% as being the limit that we can accept as an increase in RDS(ON). This determines the datasheet ‘BV’ of our GaN devices. We choose the conservative values noted above for the datasheet specification, but actual breakdown does not occur until around 1400 V for the lower voltage families and 2200 V for the 1250 V product line – in all cases, a massive margin with respect to the datasheet BV limits. It’s unwise to design systems to use the devices above the datasheet BV, but this headroom is very useful insurance against infrequent line swells and surges, such as lightning strikes.
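The size of that margin is easy to quantify from the figures quoted above. The short calculation below expresses the headroom between the datasheet BV and the approximate actual breakdown voltage as a percentage. The function name is hypothetical, and the breakdown values are the approximate ones given in the text.

```python
# Datasheet BV (V) -> approximate actual breakdown voltage (V), from the text.
BREAKDOWN_BY_RATING = {750: 1400, 900: 1400, 1250: 2200}


def headroom_percent(datasheet_bv, actual_bv):
    """Margin of the actual breakdown above the datasheet BV, in percent."""
    return 100.0 * (actual_bv - datasheet_bv) / datasheet_bv


for datasheet_bv, actual_bv in BREAKDOWN_BY_RATING.items():
    margin = headroom_percent(datasheet_bv, actual_bv)
    print(f"{datasheet_bv} V rating: breakdown near {actual_bv} V, "
          f"headroom about {margin:.0f}%")
```

This headroom is surge insurance rather than usable operating range, since the datasheet BV is set by the 5% RDS(ON) limit and not by breakdown.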

Conclusion

With this battery of tests (Figure 2) - process monitors for initial quality, voltage margin to allow for overstress, wear-out tests to ensure long-term reliability, plus tests related to temperature and moisture sensitivity - whether you're making a smartphone adapter or an automobile, you can have confidence in PowiGaN™, Power Integrations’ GaN technology.