Package level integration: challenges and opportunities A wide array of package level integration technologies now available to chip and system designers are reviewed. As technical challenges to shrink transistors per Moore’s Law become increasingly harder and costlier to overcome, fewer semiconductor manufacturers are able to upgrade to the next lower process nodes (e.g., 20nm). Therefore various alternative schemes to cram more transistors within a given footprint without having to shrink individual devices are being pursued actively. Many of these involve 3D stacking to reduce both footprint and the length of interconnect between the devices. A leading memory manufacturer has just announced 3D NAND products where circuitry are fabricated one over the other on the same wafer resulting in higher device density on an area basis without having to develop smaller transistors. However such integration may not be readily feasible when irregular non-memory structures, such as sensors and CPUs, are to be integrated in 3D. Similar limits would also apply for 3D integration of devices that require very different process flows, such as analog with digital processor and memory. For applications where integration of chips with such heterogeneous designs and processes are required, integration at the package level becomes a viable alternative. For package level integration, 3D stacking of individual chips is the ultimate configuration in terms of reducing footprint and improving performance by shrinking interconnect length between individual chips in the stack. Such packages are already in mass production for camera modules that require tight coupling of the image sensor to a signal processor. Other applications, such as 3D stacks of DRAM chips and CPU/memory stacks, are under development. For these applications 3D modules have been chosen so as to reduce not just the form factor but also the length of interconnects between individual chips. Figure 1: Equivalent circuit for interconnect between DRAM and SoC chips in a PoP package. Interconnects a necessary evil To a chip or system designer the interconnect between transistors or the wiring between chips is a necessary evil. They introduce parasitic R, L and C into the signal path. For die level interconnects this problem became recognized at least two decades ago as RC delay in such interconnects for CPUs became a roadblock to operation over 2GHz. This prompted major changes in materials for wafer level interconnects. For the conductors, the shift was from aluminum to lower resistance copper which enabled a shrink in geometries. For the surrounding interlayer dielectric that affect the parasitic capacitance, silicon dioxide was replaced by various low and even ultra low k ( dielectric constant ) materials, in spite of their poorer mechanical properties. Similar changes were made even earlier in the chip packaging arena when ceramic substrates were replaced by lower– k organic substrates that also reduced costs. Interconnects in packages and PCBs too introduce parasitic capacitance that contributes to signal distortion and may limit the maximum bandwidth possible. Power lost to parasitic capacitance of interconnects while transmitting digital signals through them depend linearly on the capacitance as well as the bandwidth. With the rise in bandwidth even in battery driven consumer electronics, such as smart phones, power loss in the package or PCBs becomes ever more significant (30%) as losses in chips themselves are reduced through better design (e.g., ESD structures with lower capacitance ). Improving the performance of package level interconnects Over a decade ago the chip packaging world went through a round of reducing the interconnect length and increasing interconnect density when for high performance chips such as CPUs, traditional peripheral wirebond technology was replaced by solder-bumped area-array flip chip technology. The interconnect length was reduced by at least an order of magnitude with a corresponding reduction in the parasitics and rise in the bandwidth for data transfer to adjacent chips, such as the DRAM cache. However, this improvement in electrical performance came at the expense of mechanical complications as the tighter coupling of the silicon chip to a substrate with a much larger coefficient of thermal expansion (6-10X of Si ) exposed the solder bump interconnects between them to cyclic stress and transmitted some stress to the chip itself. The resulting Chip Package Interaction (CPI) gets worse with larger chips and weaker low-k dielectrics on the chip. The latest innovation in chip packaging technology is 3D stacking with through silicon vias (TSVs) where numerous vias (5µm in diameter and getting smaller) are etched in the silicon wafer and filled with a conductive metal, such as Cu or W. The wafers or singulated chips are then stacked vertically and bonded to one another. 3D stacking with TSVs provides the shortest interconnect length between chips in the stack, with improvements in bandwidth, efficiency of power required to transmit data, and footprint. However, as we shall see later, the 3D TSV technology is delayed not only because of complex logistics issues that are often discussed, but actual technical issues rooted in choices made for the most common variant: TSVs filled by Cu, with parallel wafer thinning. Figure 2: Breakdown of capacitance contributions from various elements of intra-package interconnect in a PoP. The total may exceed 2 pF. Equivalent circuit for packages PoP (package-on-package) is a pseudo-3D package using current non-TSV technologies and are ubiquitous in SmartPhones. In a PoP, two packages (DRAM and SoC) are stacked over one another and connected vertically by peripheral solder balls or columns. The PoP package is often talked about as a target for replacement by TSV-based 3D stacks. The SoC to DRAM interconnect in the PoP has 4 separate elements (wirebond in DRAM package, vertical interconnect between the top and bottom packages, substrate trace and flip chip in bottom package for SoC) in series. The equivalent circuit for package level interconnect in a typical PoP is shown in FIGURE 1. From FIGURE 2 it is seen that interconnect capacitance in a PoP package is dominated by not just wire bonds (DRAM) but the lateral traces in the substrate of the flip chip package (SoC) as well. Both of these large contributions are eliminated in a TSV based 3D stack. In a 3D package using TSVs the elimination of substrate traces and wire bonds between the CPU and DRAM leads to a 75% reduction in interconnect capacitance (FIGURE 3) with consequent improvement in maximum bandwidth and power efficiency. Effect of parasitics Not only do interconnect parasitics cause power loss during data transmission but they also affect the waveform of the digital signal. For chips with a given input/output buffer characteristics, higher capacitance slows down the rise and falling edges [1,2]. Inductance causes more noise and constricts the eye diagram. So higher interconnect parasitics limit the maximum bandwidth for error free data transmission through a package or PCB. TSV-based 3D stacking As has been previously stated, a major reason for developing TSV technology is to use it to improve data transmission – measured by bandwidth and power efficiency — between chips and go beyond bandwidth limits imposed by conventional interconnect. Recently a national Lab in western Europe has reported results  of stacking a single DRAM chip to a purpose-designed SoC with TSVs in a 4 x 128 bit wide I/O format and at a clock rate of just 200MHz. They were able to demonstrate a bandwidth of 12.8 MB/sec (2X that in a PoP with LP DDR3 running at 800MHz). Not surprisingly the power efficiency for data transfer reported (0.9 pJ/bit) was only a quarter of that for the PoP case. Despite a string of encouraging results over the last three years from several such test vehicles, TSV-based 3D stacking technology is not yet mature for volume production. This is true for the TSV and manufacturing technology chosen by a majority of developers, namely filling the TSVs with copper and thinning the wafers in parallel but separately which requires bonding/debonding to carrier wafers. The problems with filling the TSVs with copper have been apparent for several years and affect electrical design . The problem arises from the large thermal expansion mismatch between copper and silicon and the stress caused by it in the area surrounding copper-filled TSVs, which alters electron mobility and circuit performance. The immediate solution is to maintain keep-out zones around the TSVs, however this affects routing and the length of on-die interconnect. Since the stress field around copper-filled TSVs depend on the square of the via diameter, smaller diameter TSVs are now being developed to shrink the keep out zone. Only now the problems of debonding thinned wafers with TSVs, such as fracturing, and subsequent handling are being addressed by development of new adhesive materials that can be depolymerized by laser and thinned wafers removed from the back-up without stress. The above problems were studied and avoided by the pioneering manufacturer of 3D memory stacks. They changed via fill material from copper to tungsten, which has a small CTE mismatch with copper, and opted for a sequential bond/thin process for stacked wafers thereby totally avoiding any issues from bond/debond or thin wafer handling. It is baffling why such alternative materials and process flows for TSVs are not being pursued even by U.S. based foundries that seem to take their technical cues instead from a national laboratory in a small European nation with no commercial production of semiconductors! Figure 3: When TSVs (labeled VI) replace the conventional interconnect in a PoP package, the parasitic capacitance of interconnect between chips, such as SoC and DRAM, is reduced by 75%. Options for CPU to memory integration Given the delay in getting 3D TSV technology ready at foundries, it is normal that alternatives like 2.5D, such as planar MCMs on high density silicon substrates with TSVs, have garnered a lot of attention. However the additional cost of the silicon substrate in 2.5D must be justified from a performance and/or foot-print standpoint. Interconnect parasitics due to wiring between two adjacent chips in a 2.5D module are significantly smaller than that in a system built on PCBs with packaged chips. But they are orders of magnitude larger than what is possible in a true 3D stack with TSVs. Therefore building a 2.5D module of CPU and an adjacent stack of memory chips with TSVs would reduce the size and cost of the silicon substrate but won’t deliver performance anywhere near an all TSV 3D stack of CPU and memory. Alternatives to TSVs for package level integration Integrating a non-custom CPU to memory chips in a 3D stack would require the addition of redistribution layers with consequent increase in interconnection length and degradation of performance. In such cases it may be preferable to avoid adding TSVs to the CPU chips altogether and integrate the CPU to a 3D memory stack via a substrate in a double-sided package configuration. The substrate used is silicon with TSVs and high-density interconnects. Test vehicles for such an integration scheme have been built and electrical parameters evaluated [5,6]. For cost driven applications e,g. Smart Phones the cost of large silicon substrates used above may be prohibitive and the conventional PoP package may need to be upgraded. One approach to do so is to shrink the pitch of the vertical interconnects between the top and bottom packages and quadruple the number of these interconnects and the width of the memory bus [7,8]. While this mechanical approach would allow an increase in the bandwidth, unlike TSV based solutions they would not reduce the I/O power consumption as nothing is done to reduce the parasitic capacitance of the interconnect previously discussed (FIGURE 3). A novel concept of “Active Interconnects” has been proposed and developed at APSTL. This concept employs a more electrical approach to equal the performance of TSVs  and replace these mechanically complex intrusions into live silicon chips. Compensation circuits on additional ICs are inserted into the interconnect path of a conventional PoP package for a Smart Phone (FIGURE 4) to create the SuperPoP package with Bandwidth and Power efficiency to approach that of TSV-based 3D stacks without having to insert any troublesome TSVs into the active chips themselves. Figure 4: Cross-section of a APSTL Super POP package under development to equal performance of TSV based 3D stacks. Integrated circuit with compensation circuits for ea. interconnect is inserted between the two layers of a PoP for SmartPhones. This chip contains through vias and avoids insertion of TSVs in high value dice for SoC or DRAM. Conclusion A wide array of package level integration technologies now available to chip and system designers have been discussed. The performance of package level interconnect has become ever more important for system performance in terms of bandwidth and power efficiency. The traditional approach of improving package electrical performance by shrinking interconnect length and increasing their density continues with the latest iteration, namely TSVs. Like previous innovations, TSVs too suffer from mechanical complications, only now more magnified due to stress effects of TSVs on device performance. Further development of TSV technology must not only solve all remaining problems of the current mainstream technology – including Cu-filled vias and parallel thinning of wafers — but also simplify the process where possible. This includes adopting more successful material (Cu-capped W vias) and process choices (sequential wafer bond and thin) already in production. In the meantime innovative concepts like Active Interconnect that altogether avoids using TSVs and APSTL SuperPoP using this concept show promise for cost-driven power-sensitive applications like smart phones. • References Gupta, D., “A novel non-TSV approach to enhancing the bandwidth in 3D packages for processor- memory modules “, IEEE ECTC 2013, pp 124 – 128. Karim, M. et al , “Power Comparison of 2D, 3D and 2.5D Interconnect Solutions and Power Optimization of Interposer Interconnects,” IEEE ECTC 2013, pp 860 – 866. Dutoit, D. et al, “A 0.9 pJ/bit, 12.8 GByte/s WideIO Memory Interface in a 3D-IC NoC-based MPSoC,” 2013 Symposium on VLSI Circuits Digest of Technical Papers. Yang, J-S et al, “TSV Stress Aware Timing Analysis with Applications to 3D-IC Layout Optimization,” Design Automation Conference (DAC), 2010 47th ACM/IEEE , June 2010. Tzeng, P-J. et al, “Process Integration of 3D Si Interposer with Double-Sided Active Chip Attachments,” IEEE ECTC 2013, pp 86 – 93. Beyene, W. et al, “Signal and Power Integrity Analysis of a 256-GB/s Double-Sided IC Package with a Memory Controller and 3D Stacked DRAM,” IEEE ECTC 2013, pp 13 – 21. Mohammed, I. et al, “Package-on-Package with Very Fine Pitch Interconnects for High Bandwidth,” IEEE ECTC 2013, pp 923 – 928 Hu, D.C., “A PoP Structure to Support I/O over 1000,” ECTC IEEE 2013, pp 412 – 416 DEV GUPTA is the CTO of APSTL, Scottsdale, AZ (firstname.lastname@example.org).