BY MANFRED REICHE AND GERALD WAGNER
As electronics applications shrink in size, integrated circuit (IC) packaged devices must be reduced both in footprint and thickness. The main motivation for the development of smaller packages is the demand for portable communications devices, such as memory cards, smart cards, cellular telephones and portable computers.¹
One of many crucial aspects in developing ultra-thin packages is die thickness. The push toward thinner chips, however, coincides with increasing wafer diameters, and larger wafers require thicker silicon to withstand wafer manufacturing. The need for thicker wafers during processing and the contrasting demand for thinner die make thinning techniques more and more important.
Because thinning the whole wafer at the back end, i.e., after complete device processing on the front side, is the most effective way to prepare ultra-thin chips, new or improved thinning techniques are necessary. Time- and cost-efficient processes are required, and the thickness tolerance should be ≤1 µm even at a final wafer thickness of 20 µm.
There are four primary methods for wafer thinning: mechanical grinding, chemical mechanical polishing (CMP), wet etching and atmospheric downstream plasma (ADP) dry chemical etching (DCE).
Because of its high thinning rate, mechanical grinding currently is the most common technique for wafer thinning. All commercially available grinding systems use a two-step process consisting of coarse grinding (with thinning rates of about 5 µm/sec) and subsequent fine grinding (thinning rate ≤1 µm/sec). The second step is necessary to remove most of the damage layer created by the coarse grinding step and to reduce surface roughness. Damage induced by mechanical grinding has been analyzed by techniques such as interference contrast microscopy and X-ray topography. X-ray topography has shown that most of the damage is located within a region about 20 µm deep. Defects below this depth probably are point damage not readily resolved by topography.² Transmission electron microscopy (TEM) can give more details. After rough grinding, a complex structure of surface cracks (oriented parallel to <111> directions and about 1 to 2 µm deep) and dislocations was observed in cross-sectional TEM samples. Fine grinding removes most of this layer if standard conditions are applied (i.e., removal of an additional 20 µm). However, a defect band remains near the surface (Figure 1).
Figure 1. Defect structure near the surface after fine grinding, as shown by cross-sectional TEM.
The thickness of the defect band is strongly affected by the grinding conditions and lies between 0.1 µm and about 1 µm. The residual defects cause stress in the thinned wafer, leading to additional bow and often to wafer breakage during handling or further processing. Additional thinning therefore is necessary to remove the defect layer.
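The two-step grinding sequence described above can be turned into a rough time budget. The following is a minimal sketch, not a process recipe: the rates (about 5 µm/sec coarse, at most 1 µm/sec fine) and the 20 µm fine-grinding allowance come from the text, while the 725 µm starting thickness, the 100 µm target and the function name `grinding_plan` are assumptions chosen only for illustration.

```python
def grinding_plan(start_um, target_um, fine_removal_um=20.0,
                  coarse_rate_um_s=5.0, fine_rate_um_s=1.0):
    """Split the total removal into a coarse and a fine step and estimate times.

    fine_rate_um_s is the upper bound quoted in the text (<= 1 um/sec),
    so the returned fine-grinding time is a lower bound.
    """
    total_um = start_um - target_um
    if total_um < fine_removal_um:
        raise ValueError("total removal is smaller than the fine-grinding allowance")
    coarse_um = total_um - fine_removal_um
    return {
        "coarse_removal_um": coarse_um,
        "fine_removal_um": fine_removal_um,
        "coarse_time_s": coarse_um / coarse_rate_um_s,
        "fine_time_s": fine_removal_um / fine_rate_um_s,
    }


# Hypothetical example: a 725 um wafer ground down to 100 um.
plan = grinding_plan(start_um=725.0, target_um=100.0)
for key, value in plan.items():
    print(f"{key}: {value:.1f}")
```

Under these assumptions, coarse grinding dominates the removed depth, while the fine step exists mainly to take off the damaged layer. The residual defect band of 0.1 µm to about 1 µm still has to be removed by a subsequent process.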
Furthermore, the high removal rate, especially in the first grinding step (rough grinding), causes rough surfaces. Typical values are on the order of 2 µm (Rms), measured by atomic force microscopy (AFM). During the second grinding step, the roughness is reduced to a few nanometers, depending on the wheel combination applied. For instance, fine grinding using a typical wheel (mesh size 2,000) results in Rms ≈ 3 nm, which is about 10 times larger than for a polished bare silicon wafer.
The remaining defect layer and surface roughness are the reasons for an additional thinning process after mechanical grinding. This can be done by CMP, dry etching or wet chemical etching.
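The Rms roughness values quoted here are root-mean-square deviations of the measured surface height from its mean. A minimal sketch of that calculation follows. The function name `rms_roughness` and the sample profile are illustrative assumptions, and the plane leveling that AFM software normally applies before computing roughness is omitted.

```python
import math


def rms_roughness(heights):
    """Root-mean-square deviation of the heights from their mean value."""
    mean = sum(heights) / len(heights)
    return math.sqrt(sum((z - mean) ** 2 for z in heights) / len(heights))


# Hypothetical height samples in nanometers (not measured data).
profile_nm = [1.0, -1.0, 1.0, -1.0]
print(f"Rms = {rms_roughness(profile_nm):.2f} nm")
```

Because the deviations are squared, isolated tall features raise the Rms value more strongly than the same height change spread across the surface.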
CMP based on buffered silica slurries generally is used for the polishing of silicon wafers. CMP results in very flat surfaces and low total thickness variation (TTV) values. The thinning rate, however, reaches only a few micrometers per minute. In large-volume production, the CMP process is optimized for sufficiently thick wafers (about 200 µm or more, depending on the wafer size); for thinner wafers it is available only for laboratory use.
Figure 2. Pseudo 3-D AFM image of the silicon surface after dry etching (ADP-DCE). The dimension of the vertical axis (Z-axis) is 5 nm.
Dry etching is a new method for wafer thinning. The ADP-DCE process introduced by Tru-Si³ uses an Ar/CF₄ plasma. The thinning rate is about 20 µm/minute, with a uniformity of <2 percent for the removal of 20 µm. Conventional dry etch processes using fluorine- or chlorine-containing plasmas also have been applied. A surface roughness (Rms) of about 0.3 nm was measured for wafers thinned by ADP-DCE. However, AFM measurements also identified numerous needle-like hillocks with heights of several nanometers, resulting in a nonuniform smoothness (or locally higher roughness) of the etched surfaces (Figure 2). The reason may be anisotropic etching effects caused by the rougher surface after grinding; the ground (100) surface is characterized by (111) and (100) surface steps, which can etch differently.
As described above, ADP-DCE uses an Ar/CF₄ plasma. Here, chemical sputtering is the main process: radicals of the reactive gas, produced in the plasma, react chemically with the silicon surface and form a reaction layer. Cross-sectional TEM has revealed an amorphous layer about 0.2 µm thick.⁴ Furthermore, photoluminescence analysis showed that electrically active defects form near the plasma-etched surface. The effect of the amorphous layer and the observed defects on further process steps is not yet understood.
Figure 3. Principle of the SEZ spin etch processor.
Wet chemical etching is one of the most common thinning techniques. To etch one side of the wafer, one approach is spin etching, in which a thin stream of an etching agent is moved periodically over the surface of the rotating wafer (Figure 3). The front surface of the wafer is protected either by additional layers or by special chucks that allow the processing of thin wafers without surface protection layers or tapes.⁵ The etching agents for silicon are mostly mixtures of HF and HNO₃. Different mixtures give different etching rates and different selectivities, which may be important if different layers are involved. A common etching rate for spin etching of silicon is about 10 µm per minute. The TTV value obtained for Si etching depends on the etching time and is strongly affected by the flow of the etching agent across the wafer surface. The flow in turn depends on parameters such as wafer rotation speed and the motion of the agent stream over the surface. The roughness of spin-etched silicon surfaces is below 1 nm (Rms) and, therefore, almost comparable to that of CMP processes (Figure 4).
Figure 4. AFM image of a silicon surface after wet chemical etching (spin etching). The surface roughness (Rms) is below 1 nm.
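The removal rates quoted for the three post-grinding methods can be compared with a short calculation. The sketch below uses the rates given in the text for ADP-DCE (about 20 µm/minute) and spin etching (about 10 µm/minute). The CMP value of 3 µm/minute is an assumption standing in for "a few micrometers per minute", and the 5 µm removal depth is a hypothetical allowance, not a recommended figure.

```python
# Removal rates in micrometers per minute.
RATES_UM_PER_MIN = {
    "CMP": 3.0,         # assumption: the text only says "a few micrometers per minute"
    "ADP-DCE": 20.0,    # from the text
    "spin etch": 10.0,  # from the text
}


def removal_time_s(method, depth_um):
    """Time in seconds to remove depth_um at the nominal rate of the method."""
    return depth_um / RATES_UM_PER_MIN[method] * 60.0


# Hypothetical allowance covering the residual defect band plus some margin.
DEPTH_UM = 5.0
for method in RATES_UM_PER_MIN:
    print(f"{method}: {removal_time_s(method, DEPTH_UM):.0f} s")
```

Removal time is only one criterion. The surface roughness, the TTV and the defects left behind by each method, as discussed in the text, matter as well.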
An important parameter when specifying highly pure Si is the minority-charge carrier lifetime (MCL). Minority-charge carriers in Si recombine at crystal defects, surface damage or impurities before reaching the contacts. Thus, the MCL is a quantitative measure of the quality of the crystalline Si material. It is the most important quality parameter for Si substrates used in highly integrated memory devices and many other applications, ranging from smart cards to photovoltaics and MEMS. In spin-etched silicon substrates, the MCL reaches high values that are comparable to those of CMP processes and significantly higher than those of dry-etched specimens.
The production of ultra-thin wafers combines different steps, including not only thinning but also handling and further production steps such as backside contact formation, separation and packaging. All these steps are affected by the wafer history, i.e., the processes that result in a sequence of layers on the wafer front side that generate different internal stresses in the wafer.
There are no technologies available for the production of ultra-thin wafers. The original thinning concept used for thicker wafers assumes the application of tapes to protect the front side during thinning. Tapes no longer can be used for ultra-thin wafers. The main reasons are the nonuniformity of the tapes, their adhesion and their extreme flexibility. Thin wafers themselves are very flexible, which is the biggest problem for conventional handling tools. Therefore, nonflexible supports are preferred. The first approach was the application of double-sided adhesive tapes mounted on a support wafer and on the wafer to be thinned.⁶ This application, however, is limited by the nonuniformity of the tape. Another approach is the temporary mounting of the thinned wafer on a support using glues. Improved technologies allow mounting with very low TTV values (TTV ≈ 1 µm).⁷ The temporary mounting process allows the use of conventional handling tools during further processing. Most of the available glues, however, are thermally stable only up to about 120°C, which is too low for additional thermal processes.
Further approaches based on this mounting technique integrate the chip separation. The most common examples are dicing before grinding (DBG)⁸ and dicing by thinning (DbyT).⁹ DBG uses sawing from the front side before thinning from the backside. The depth of the sawing lines must be greater than the thickness of the final chips. The sawing process using diamond blades, however, results in additional damage. DbyT, on the other hand, applies a trench-etching process instead of mechanical sawing, which reduces the formation of additional damage.
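The DBG condition stated above (the saw lines must be deeper than the final chip thickness) is what makes the chips separate during backside thinning. A minimal sketch of that check follows. The function name `dbg_check` and the example numbers are hypothetical and serve only to illustrate the geometry.

```python
def dbg_check(start_um, kerf_depth_um, final_um):
    """Return (separated, margin_um) for dicing before grinding.

    separated is True when backside thinning to final_um reaches the
    bottom of the front-side saw lines; margin_um is how far the saw
    lines extend below the final chip thickness.
    """
    if not 0.0 < final_um < start_um:
        raise ValueError("final thickness must be between 0 and the starting thickness")
    if not 0.0 < kerf_depth_um < start_um:
        raise ValueError("saw depth must be between 0 and the starting thickness")
    margin_um = kerf_depth_um - final_um
    return margin_um > 0.0, margin_um


# Hypothetical numbers: 725 um wafer, 60 um saw depth, 50 um final chips.
print(dbg_check(725.0, 60.0, 50.0))  # saw lines deeper than the final chips
print(dbg_check(725.0, 40.0, 50.0))  # saw lines too shallow: chips stay connected
```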
The temporary mounting of the wafer allows for handling during processing, but there are no applicable concepts today for the final handling of the isolated thin, flexible wafer. Assuming, for instance, that only the wafer edge is supported, the deflection y of a wafer is proportional to the square of the radius (a) and inversely proportional to the third power of the thickness (t). The maximum deflection under these conditions is given by an expression in which W denotes the total applied load, m (= 1/ν) is the reciprocal of Poisson's ratio ν and E is Young's modulus.¹⁰ Using the data for silicon, i.e., E = 1.30208 × 10¹² dyn/cm² and ν = 0.2786, the maximum deflection of 100 µm thick wafers is between 1 and about 10 µm, depending on their diameter and whether an additional load is applied. Reducing the thickness to 10 µm increases the deflection to values of 1 mm to 1 cm. The most critical case is wafer handling, in which a vacuum chuck, usually smaller in diameter than the wafer, is attached to the front or the backside. Here, the deflection increases not only with the applied load W (caused by the vacuum) but also with the ratio of the wafer diameter to the diameter of the chuck. The deflection increases with the fourth power of both diameters.¹⁰ Therefore, handling systems other than the conventional ones are strongly required.
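The scaling of the deflection with radius and thickness can be checked numerically. The sketch below assumes the classical thin-plate result for a circular plate that is simply supported at its edge and carries a total load W distributed uniformly over its surface, y = 3W(m − 1)(5m + 1)a² / (16πEm²t³). The text does not state the load distribution, so this choice is an assumption, as are the 1 × 10⁻⁴ N load and the function name `max_deflection_m`. The material data are those quoted in the text, converted to SI units.

```python
import math

E_SI_PA = 1.30208e11  # 1.30208e12 dyn/cm^2 from the text (1 dyn/cm^2 = 0.1 Pa)
NU_SI = 0.2786        # Poisson's ratio from the text


def max_deflection_m(load_n, radius_m, thickness_m, e_pa=E_SI_PA, nu=NU_SI):
    """Center deflection of an edge-supported circular plate.

    Assumes a uniformly distributed total load (an assumption; the
    source does not specify how the load W is applied).
    """
    m = 1.0 / nu  # reciprocal of Poisson's ratio, as in the text
    return (3.0 * load_n * (m - 1.0) * (5.0 * m + 1.0) * radius_m ** 2
            / (16.0 * math.pi * e_pa * m ** 2 * thickness_m ** 3))


LOAD_N = 1e-4    # hypothetical total load in newtons
RADIUS_M = 0.1   # 200 mm wafer

# Thinning from 100 um to 10 um at fixed load and radius.
thickness_ratio = (max_deflection_m(LOAD_N, RADIUS_M, 10e-6)
                   / max_deflection_m(LOAD_N, RADIUS_M, 100e-6))

# Doubling the radius at fixed load and thickness.
radius_ratio = (max_deflection_m(LOAD_N, 2 * RADIUS_M, 100e-6)
                / max_deflection_m(LOAD_N, RADIUS_M, 100e-6))

print(f"deflection ratio, 10 um vs 100 um: {thickness_ratio:.0f}")
print(f"deflection ratio, doubled radius:  {radius_ratio:.0f}")
```

At fixed load, a tenfold reduction in thickness raises the deflection by a factor of 1,000, which matches the step from micrometers to millimeters described in the text. The absolute values depend on the load assumed and are not meant to reproduce the quoted figures.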
Thin chips are necessary for the next generation of ICs. The current chip thickness of about 100 µm for volume production is expected to decrease to about 50 µm in the near future and 20 µm within the next decade. At the same time, wafer diameters will increase, which consequently leads to increased wafer thickness. Therefore, a thinning procedure is required. The most effective way is the thinning of the whole wafer. This can be achieved through mechanical grinding, followed by an additional polishing step to remove the damage layer.
For a complete list of references, Manfred Reiche, Ph.D., scientific staff member, may be contacted at Max-Planck Institut für Mikrostrukturphysik, Weinberg 2, D-06120 Halle, Germany, 49 345 558250. Gerald Wagner, Ph.D., process application manager, may be contacted at SEZ AG, Draubodenweg 29, A-9500 Villach, Austria, 43 4242 204; E-mail: firstname.lastname@example.org.