Chipworks


Intel clarifies 32nm NMOS stress mechanism at IEDM 2011

I was browsing through the advance program for the upcoming IEDM conference when, almost at the end, I came across paper number 34.4, "Modeling of NMOS Performance Gains from Edge Dislocation Stress," by Weber et al. of Intel. According to the abstract: "Simulations show stress from edge dislocations introduced by solid phase epitaxial regrowth increases as gate pitch is scaled, reaching over 1GPa. This makes edge dislocations attractive, as stress from epitaxial and deposited film stressors reduces as pitch is scaled. We show dislocation stress varies with layout and topography."

The abstract doesn’t have much detail, but it does reinforce the teachings from a Samsung paper at last year’s IEDM conference, paper 10.1, "Novel Stress-Memorization-Technology (SMT) for High Electron Mobility Enhancement of Gate Last High-k/Metal Gate Devices" (Lim et al.).

The essence of this paper is that if you give the source/drains a deep amorphization implant, and then anneal to create solid-phase epitaxial re-growth with a tensile stress liner in place, then crystalline dislocations are formed adjacent to the gate edge, which apply tensile stress to the channel.

Source: IEDM/Samsung

Like the embedded SiGe stress for PMOS, this works better with a gate-last process, since the surface is not locked by a polysilicon gate. Samsung claimed ~1% lattice distortion, verified by nano-beam diffraction measurements. A vertical slice was taken below the center of the gate and the color coding shows strain of ~1%:

Source: IEDM/Samsung
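As a rough cross-check, the ~1% strain Samsung measured is consistent with the "over 1GPa" channel stress quoted in the Intel abstract. Here is a back-of-envelope sketch, assuming simple uniaxial Hooke's law and a textbook range for the Young's modulus of silicon (the modulus values are my assumption, not from either paper):

```python
# Back-of-envelope check: does ~1% lattice strain correspond to >1 GPa of stress?
# Assumes uniaxial Hooke's law (sigma = E * epsilon) and a textbook range for the
# Young's modulus of silicon -- neither number comes from the Samsung or Intel papers.
E_SI_GPA = (130.0, 170.0)   # assumed range for silicon's Young's modulus, GPa
STRAIN = 0.01               # ~1% lattice distortion reported by Samsung

for e_gpa in E_SI_GPA:
    stress_gpa = e_gpa * STRAIN
    print(f"E = {e_gpa:.0f} GPa, strain = 1% -> stress ~ {stress_gpa:.1f} GPa")

# Both ends of the range land above 1 GPa, in line with the "over 1GPa"
# figure in the Intel abstract.
```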

When I saw this paper it made me wonder if this mechanism was what we had been seeing in the Intel 32nm parts; none of the earlier stress mechanisms seemed to be in use. Intel were the first to apply stress to transistor channels at the 90nm node, using (for NMOS) the contact etch-stop layer (CESL) silicon nitride; and then at the 45nm node they evolved to using the contact plug itself and the gate-fill metal, since the CESL is almost gone.

But in the 32nm process the contact plugs have been polished away, and there is less gate metal (since it’s a smaller gate) — so what is supplying Intel’s fourth-generation strain?

In the light of these two papers we can now take a good guess when we see what Intel’s 32nm NMOS transistor looks like:

And as we can see, there are stacking faults on both sides of the gate, and they look similar to the ones in the image from the Samsung paper:

Source: IEDM/Samsung

Samsung claimed an increase in electron mobility of 40%-60% and drive current improvement of over 10%.

Stacking faults are not normally what we want to see in transistors, because they can be leaky if they go through a junction, but as long as they are contained within the source/drain diffusions, they should not be a problem. They are certainly present in every NMOS transistor that we imaged (though given the billions of transistors in the millions of processors shipped, we cannot exactly claim a large sample).

At an intuitive level it makes sense that this mechanism should work — a stacking fault is a missing layer of atoms within the crystalline lattice, and we are now working with channel lengths of a hundred atomic spacings or less. So if a couple of atomic layers are missing at opposing ends of the channel, it seems logical that tensile stress would be induced in the channel.
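To put a rough number on that, here is a quick count assuming a ~30 nm gate length for this generation (my assumption, for illustration) and the standard silicon lattice constant:

```python
# Quick count of how many lattice repeats fit under a 32-nm-generation gate.
# The ~30 nm gate length is an illustrative assumption; 0.543 nm is the
# standard silicon lattice constant.
GATE_LENGTH_NM = 30.0
SI_LATTICE_CONSTANT_NM = 0.543

unit_cells = GATE_LENGTH_NM / SI_LATTICE_CONSTANT_NM
print(f"~{unit_cells:.0f} silicon unit cells across the channel")
# Only ~55 unit cells, i.e. of the order of a hundred atomic spacings, so losing
# a couple of atomic layers at each end of the channel is a significant perturbation.
```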

Like the other stress techniques, this only works now that we are down in the nanometer range, but the good thing about this one is that the applied strain should increase as the channel length gets shorter.

So it seems that we have finally deduced at least some of what Intel are doing in their 32nm NMOS transistors. Now, of course, the question will be — can it be transferred to the trigate structures we’re looking forward to in the 22nm process?

Looking forward to IEDM, in addition to the conference program, ASM will be holding a lunchtime seminar on the Wednesday, Dec. 7th, at 12 noon, with Ivo Raaijmakers hosting; and I have the privilege of speaking on "High-k/Metal Gate in Leading-Edge Silicon Devices." To register, email Roseanne de Vries at rosanne.de.vries@asm.com. We hope to see you there!

GLOBALFOUNDRIES Takes a Different Approach to HKMG in AMD’s Llano CPU/GPU

After much anticipation, and with quite a few design wins, AMD’s Llano CPU/GPU chip arrived on the scene a couple of months ago. Fabricated by GlobalFoundries (more easily known as GloFo) in their 32nm SHP process, it was the first foundry-based gate-first HKMG product to come on the market.

As a processor, it garnered pretty favorable reviews, but of course we were keen to get it into the lab and see how it had been put together. When we did, it became a bit of a mystery — we couldn’t see any significant differences in gate stack between NMOS and PMOS! It’s common wisdom that you need different work function materials in the NMOS and PMOS gates to differentiate them and make up the CMOS circuitry.

For example, Panasonic uses lanthanum to tweak the work function of their NMOS transistor and distinguish it from the PMOS stack in their HKMG Uniphier chip that we looked at back in the spring.

Fig. 1  Panasonic 32-nm HKMG Transistor

As we can see in Fig. 1 above, the gate metal is titanium nitride under the polysilicon, and the hafnium-based high-k layer is below that, over the interface oxide. There was no apparent physical difference between NMOS and PMOS until we started looking in detail, when we found just a tickle of lanthanum in the NMOS stack, but presumably enough to move the work function into the NMOS regime.

When we look at the Llano, it also uses a gate-first transistor style, with TiN as the gate metal, but there the resemblance stops. Below (Fig. 2) is a composite image of the Llano NMOS/PMOS transistors, and you can see that they are more complex.

Fig. 2  AMD/GloFo 32-nm HKMG NMOS and PMOS Transistors

Dual-stress liners are used to add tensile and compressive stress; we can see from the above that the PMOS (compressive) nitride is twice as thick as the NMOS (tensile) layer. The PMOS device also has embedded SiGe in the source/drains to add more compressive stress, whilst there is possible evidence of stress memorization (SMT) for NMOS. And if we look carefully, the PMOS SOI layer is also a little thicker than the NMOS SOI.

The NMOS and PMOS gate stacks shown in Fig. 3 appear to be the same — highly silicided poly on a thin AlO barrier layer, on TiN gate metal, which is on the Hf-based hi-k layer with a SiO interfacial layer on the substrate. The AlO layer in the PMOS stack is more diffuse, and some of the aluminum has migrated into the TiN, and arsenic is present as expected in the NMOS, but essentially they are the same.

Fig. 3  AMD/GloFo Transistor Gate Stacks

So now we have a bit of a mystery; how are the NMOS and PMOS transistors differentiated? We looked long and hard in both NMOS and PMOS for a dopant such as the lanthanum used by Panasonic, something other than hafnium, silicon, or titanium, but if it’s there, it’s below the detection limits. Aluminum is known as a dopant for PMOS, but to be effective it has to be present at the Hf/SiO interface to create Vt-shifting electrical dipoles, and we see no evidence of migration that far.

The extra SOI thickness is the clue to what we think is going on in this part: it is actually a layer of epitaxial SiGe, which changes the relationship with the gate metal and shifts the Vt, instead of using a dopant in the hi-k. Some work was done on this topic at SEMATECH a few years ago [1], and of course AMD and IBM were members and would have received the results.

The schematic in Fig. 4 shows conceptually what happens; the valence band of the substrate is shifted because of the Ge, and also due to the compressive strain applied by the embedded SiGe source/drain and the nitride stress layer.

Fig. 4  Schematic of Band Diagram for Transistor with SiGe Channel [1]

Fig. 5 illustrates the drive current improvement for a >10% SiGe channel in the SEMATECH device, which will also include the effect of the inherent improved hole mobility in the SiGe.

Fig. 5  Drive Current Improvement in SiGe-Channel Device

That accounts for the PMOS; the NMOS was still a bit of a mystery, since one would still expect a dopant at the hi-k/oxide interface, and we see none. All we see is TiN, and Intel uses that as their PMOS work-function metal, which on the face of it doesn’t make sense. However, more SEMATECH work [2, 3] indicates that the work function of TiN can be manipulated by adjusting the growth conditions and thickness, enough to shift it from the NMOS to the PMOS regime.

In fact, SEMATECH’s ESSDERC paper from 2005 [3] agrees nicely with what we see in the AMD and Intel parts. The Llano has a ~2nm TiN layer in the NMOS, whereas Intel uses a ~2nm layer plus a 1nm Ta-based cap and another ~4nm TiN on top of that in their PMOS. Fig. 6 indicates that this extra material could be enough to move the work function in Intel’s transistor from NMOS to PMOS.

Fig. 6   Effective Work Function of TiN electrode when 10-nm thick ALD TiN and TaN Films are Used as Overlayers on ~3.6 nm TiN Layer [3]

We actually had a clue a couple of years ago, if we had known what we were looking at. In a CICC paper [4] GlobalFoundries showed an image (Fig. 7) of a transistor that looks as though it had a SiGe channel — but of course they didn’t say so!

Fig. 7  Experimental  GLOBALFOUNDRIES Transistors [4]

Of course all of the above is pure speculation, but if the literature is correct, it does hang together and account for the difference between this latest HKMG product and the others we have seen. Now, will IBM, Samsung, and the other alliance members do the same thing?

References

1. H.R. Harris et al., Band-Engineered Low PMOS VT with High-K-Metal Gates Featured in a Dual Channel CMOS Integration Scheme, Symp. VLSI Technology 2007, pp 154-155

2. K. Choi et al., Growth Mechanism of ALD-TiN and the Thickness Dependence of Work Function, Symp. VLSI Technology 2005, pp 103-104

3. K. Choi et al., The Effect of Metal Thickness, Overlayer and High-k Surface Treatment on the Effective Work Function of Metal Electrode, ESSDERC 2005, pp 101-104

4. S. Krishnan et al., Advanced SOI CMOS Transistor Technologies for High-Performance Microprocessor Applications, CICC 2009

Intel Enlarges Process Lead over Their Competition

22-nm Trigate Transistors Discussed

At a morning session at the Intel Developer Forum Tuesday, Mark Bohr tooted the Intel trumpet and put a slide up to emphasise their lead over the other leading semiconductor companies:

Intel Process Evolution Since 90-nm

One can quibble a bit about the odd month here or there for the dates, but essentially things have been as they say — they were the first with embedded SiGe for PMOS strain, they were a node ahead of everyone else at HKMG, and if the trigate launch comes to pass as planned at the end of this year, they will be years ahead with their version of the FinFET.

The main focus of the talk was Intel’s upcoming 22-nm trigate transistor technology to be used for the Ivy Bridge processors due out in the New Year. Essentially it was a re-run of the May announcement, with a little more about the SoC version and a look forward to 14-nm in (presumably) 2013.

Intel Schematic of Trigate Transistor in Inversion
Transistor Delay vs Voltage (pale grey line is planar 22-nm)
Source: Intel

Mark said that they made the choice for trigate back in 2008, when it became clear that the performance benefit from the fully depleted triple-gate structure (compared to 22-nm planar) was significant enough to justify the additional effort and cost of another step-function change in process architecture.

Compared with the 32-nm equivalent, the trigate gives a 37% performance increase at a lower voltage or a 50% power reduction at constant performance. Somehow Intel does this with no extra mask levels and only 2-3% additional cost (although extra litho steps are used, because of the need for double patterning).
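For context, the 50% power saving at matched performance is roughly what dynamic-power scaling alone would predict. A minimal sketch, assuming active power goes as CV²f and an illustrative supply-voltage drop (the voltages are my assumption, not Intel's figures):

```python
# Rough illustration of why a modest supply-voltage drop buys a large power saving.
# Dynamic power scales as P ~ C * V^2 * f; the voltages below are illustrative
# assumptions, not figures quoted by Intel.
def dynamic_power_ratio(v_new: float, v_old: float) -> float:
    """Ratio of dynamic power at the same switched capacitance and frequency."""
    return (v_new / v_old) ** 2

v_32nm, v_22nm = 1.0, 0.7   # hypothetical operating voltages, volts
ratio = dynamic_power_ratio(v_22nm, v_32nm)
print(f"Power at {v_22nm} V is {ratio:.0%} of power at {v_32nm} V "
      f"-> ~{1 - ratio:.0%} reduction at constant frequency")
# A ~0.3 V drop alone roughly halves the active power, before counting any
# leakage improvement from the fully depleted fins.
```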

Of course, I was keen to hear when we’ll be able to get hold of some of these chips; after all, they’re going to be fascinating to take apart! According to Mark, they are "just about ready to start production" in Q4, with public availability in the first half of next year. They are definitely sampling, since Ivy Bridge Ultrabooks are on show here. The strict two-year clock appears to have slipped slightly, since previous launches have been in November; but that is a quibble, since at least Intel has a clock: their competitors make an announcement, and then we wait!

Which brings us to the roadmap; as you can see in the first graphic above, 14 nm is predicted in 4Q13 (which is itself a subtle change, since it was 15-nm a couple of years ago; Intel seems to be aligning itself with the other companies which have gone the 28-20-14 nm route).

Intel is also continuing the parallel development of SoC processes down to 14 nm:

New Process Roadmap  Source: Intel

Talking to the guys on the floor here, Cedar Trail (32-nm SoC) netbooks and mini-desktops will be out for the Christmas market, and I gather the intent is to reduce the gap between the CPU and SoC processes to a year or so from the current two to three.

Given the extension to 14 nm, Intel must have already verified that the transistor-related SoC features (low leakage and high-voltage transistors, and the different varieties of SRAM) work with trigates; the rest are all back-end related, so they should just suffer the normal scaling problems.

Unfortunately it appears that there will not be a paper on the 22-nm process at IEDM this year, so we will have to wait for Ivy Bridge chips to come on to the shelves to get a few more clues — it should be an interesting spring!

A SEMICON West snippet: AMAT launches new products, prepares for 450mm

SEMICON West is usually taken as a barometer for the industry, and my subjective impression is that we are steaming along nicely, but with no record-breaking years coming up! According to Tom Morrow of SEMI, this year’s preregistrations were flat, but there are about 10% more booths than last year.

I kicked off the show by sitting in at the Applied Materials (AMAT) press and analysts breakfast. As usual AMAT had a flurry of press releases preceding the show, and eight new products and product updates are being launched. A couple of years ago AMAT was putting more emphasis on their solar and display divisions, but this year silicon processing is again getting a high profile.

We had a series of presentations from Mike Splinter, Randhir Thakur, Steve Ghanayem, and Bill McClintock, and then Q-and-A from the analysts present.

Mike S. did the corporate overview: he saw the industry outlook as soft in the short term, but was basically upbeat since the industry drivers are still there — Moore’s law scaling, 3D transistors (in logic, flash and DRAM), and pushing them all, the mobile revolution. On the solar side, he predicted that solar modules will cross the $1/Watt threshold sometime this year, and hit $0.80/W next year, so cost reductions will help drive that end of the business.

Randhir Thakur then reviewed the product launches at the show, putting them into the context of the recent and upcoming changes in chip processing. Rather than list the new products, here’s the slide:

Steve Ghanayem focused on the Centura gate stack tool — essentially an ALD chamber has been added into the Centura system to give it high-k capability, all within vacuum:

He put a lot of emphasis on the cluster nature of the tool, so that the wafers only see vacuum between the process steps, claiming that exposure to atmosphere reduces mobility and increases threshold voltage spread.

The last technical presentation (Bill McClintock) covered off the new Black Diamond 3 (BD3) and Nanocure 3 extreme low-k dielectric and curing combination, giving a dielectric constant (k) of 2.2, down from k=2.5 in the previous generation. One of the things he pointed out (that I hadn’t thought about) was that the pre-metal dielectric layer at the bottom of the metal stack has to survive more than 150 process steps before wafer out in today’s 10-12 metal-layer processes, never mind the stresses of the packaging and assembly sequence.

So the challenges are formidable as the k-value is pushed down, to get both physical and material integrity; AMAT claims that by going to a closed-pore structure, with tighter pore size distribution, they can achieve k=2.2.
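To put the k reduction in perspective: in a simple parallel-plate picture, line capacitance scales linearly with the dielectric constant, so the wire RC delay and the energy burned charging those lines fall in proportion. A minimal sketch of the arithmetic (the parallel-plate assumption is mine, not AMAT's):

```python
# Rough effect of moving the inter-metal dielectric from k = 2.5 to k = 2.2.
# In a parallel-plate approximation, line-to-line capacitance (and hence RC delay
# and the CV^2 energy per switched line) scales linearly with k, geometry unchanged.
K_OLD, K_NEW = 2.5, 2.2
cap_ratio = K_NEW / K_OLD
print(f"Capacitance ratio: {cap_ratio:.2f} "
      f"-> ~{1 - cap_ratio:.0%} less wire capacitance, RC delay and wire power")
# ~12% per dielectric generation; the real benefit depends on how much of the total
# line capacitance the bulk dielectric actually contributes.
```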

According to Bill, we can expect to see BD3 at the 22/15 nm nodes, so a couple of years yet before we see it in high-volume products.

Then we got to the Q-and-A session. Ironically, the first question was not about any of the product launches — it was about the spend on 450mm next year! Mike Splinter was reluctant to give a specific number, but he did say it would be "well over $100 million," mostly on early test systems in-house. Not exactly small change, all the same. A later question prompted the statements that "450 is going to happen," and that they are closely linked to the leading customers that will drive the move there. They are clearly now viewing 450mm as a strategic way of gaining market share when it does come.

Other questions covered off potential product expansion, and of course the future demand from foundries in what seems to be a softening market.

Randhir Thakur identified AMAT’s flowable CVD, Siconi clean and the Raider copper deposition tools as having found more applications than originally intended. The flowable CVD was targeted on one application, but ended up replacing CVD fill for STI, and other CVD steps with high conformality requirements. Siconi clean has evolved from a PVD clean, but has now moved into CVD and epi areas, and anywhere else that interfaces are critical. The Raider copper tool was developed from a Semitool product for packaging, but now has potential for damascene copper on die.

When it comes to the foundries, it appears that the fab shells are ready, and the message for the equipment companies is to be ready — things may be soft at the moment, but they could come back very quickly. Demand is controlled by the consumer market, and that has proved remarkably resilient considering some of the economic challenges in the last year or so.

All in all, an interesting session, in both the industry and technical senses. AMAT has the webcasts and presentations up on their investor website until August 12, 2011.

TSMC HKMG is Out There!

I have to apologise for a hiatus in posting due to pressure from the day job, but this week is Semicon West week, so it seems appropriate to announce that we’ve started analysing TSMC’s 28-nm gate-last HKMG product, in this case a Xilinx Kintex-7 FPGA, fabbed in TSMC’s HPL process.

Having seen two generations of Intel’s HKMG parts (the 45-nm Xeon and 32-nm Westmere) using gate-last technology, it’s inevitable that we’ll compare those with the TSMC process.

The Kintex family is the mid-range group in the latest 28-nm generation 7-series of FPGAs from the company. These are optimised for the highest price/performance benefit, giving the performance of the previous Virtex-6 parts at half the price.
The Kintex-7 has eleven layers of metal (Fig. 1); the 1x layers run from metals 1-4, with a pitch of ~96 nm, the smallest we have ever seen.
Fig. 1 General Structure of Xilinx Kintex-7

Contacted gate pitch is ~118 nm in our initial analysis, with minimum gate length of ~33 nm, though since this is replacement gate there is no way of knowing absolutely the original poly gate width, which defines the source/drain engineering.

Plan-view imaging (Fig. 2) indicates that TSMC has implemented the restricted design rules that have been much discussed in the gate-first/gate-last debate. Regular, uni-directional patterning of functional gate and dummy gate lines helps out the lithography, but inevitably reduces packing density compared with Manhattan layout schemes.
Fig.2 Plan-View Image of Gates and Active Silicon
By the look of it, double patterning with a gate plus a cut mask has been used. FPGAs are usually laid out in a more relaxed manner than dense logic, so here we can see lots of dummy gates, and also dummy active regions.

The gate structure itself definitely has some similarities with Intel’s 45-nm, as we can see from figures 3 and 4.
Fig.3 Intel 45-nm (left) and TSMC/Xilinx 28-nm NMOS Transistors
Fig.4  Intel 45-nm (left) and TSMC/Xilinx 28-nm PMOS Transistors

In both it appears that the buffer oxide, the high-k layer and a common work-function material are put down before the sacrificial polysilicon gate. Then the source/drain engineering is performed, the dielectric stack is deposited and planarized back to the polysilicon, the sacrificial gate is removed, and the NMOS/PMOS gate stacks are put in and planarized.

Of course there are also differences – TSMC is not using embedded SiGe for PMOS strain, and there is an additional high-density metal layer in the PMOS gate. There is also no distinct dielectric capping layer in the TSMC structure, and there is an extra sidewall spacer (likely part of the source/drain tuning). The wafers are also rotated to give a <100> channel direction.
Intel stated that they applied stress to NMOS devices using the gate metal stack and the contacts; TSMC could be doing the same, although the contacts are spaced further from the gate edge. If there is PMOS stress, the mechanism is unclear, though it is possible that the extra high-density layer in the gate could be for that purpose. However, this part is fabbed in the HPL low-power process, and typically we do not see e-SiGe in such processes.
Analysis is ongoing – more details to come, and possibly a comparison with the AMD Llano gate-first HKMG part, which is in our labs at the moment.

N.B. We are at Semicon West, at Booth 2337 – drop by and get a coupon for a free die photo!

Intel Goes Tri-Gate at 22-nm!

In a pair of press and analyst briefings this morning, Mark Bohr and Steve Smith announced that Intel will indeed be using a 3D transistor structure for their 22-nm product, settling one of the big questions about Intel’s process development over the last few years – do they stay planar or not? (And, incidentally, settling a bet between me and Scott Thompson – Scott wins!)

The big debate at IEDM last year about advanced CMOS was whether transistor structures would move to a 3D structure (finFET, tri-gate, whatever label you choose), or use ultra-thin SOI layers to attain fully depleted operation. The debate was not resolved – I was definitely left with the impression that the adherents in both camps held to their opinions, which probably means we will have two process groupings, much as we have with the gate-first/gate-last high-k/metal gate (HKMG) structures.

Intel have come down on the side of tri-gate – apparently the decision was taken in 2008, after their researchers had shown that the gate-last HKMG gate structure would work in 3D, and that the planar version could not give enough of a performance boost. So for the last three years they’ve been developing the process and getting it manufacturable for the production of the Ivy Bridge product line later this year.



Intel’s Research and Development Sequence to Reach the Tri-Gate 22-nm Node


I may have lost the bet about planar, but my gut feel that their HKMG process could be extended to 22-nm seemed to be right, since Mark confirmed that they are using gate-last (replacement gate) technology, with evolutions of existing NMOS and PMOS strain technology. Immersion lithography and double patterning will be used where necessary, and no extra mask layers are needed so the additional cost is only 2-3%. And apparently it’s scalable to 14 nm!

The schematic below shows a gate formed on three sides of three fins, to give more drive strength than available from one fin:

Schematic of Tri-Gate Across Three Fins (Source – Intel)

When translated to gate-last HKMG, this is how it looks in an Intel image from 2007 (the section is through three gates, with three fins buried under oxide running across the field of view):

Gate-Last HKMG Tri-Gate Transistors (Source – Intel)

And now a new image from today’s briefing, showing an array of transistors with six fins in the centre, and some with two fins at the top right and bottom left:

Intel Tri-Gate Transistors (with STI and gate mold oxide removed) (Source – Intel)

Clearly this means a whole new set of design and layout paradigms, and we can see evidence here of double patterning using fin and gate masks, with cut masks to define the individual fins and gates.

During the briefings, Mark also scotched the rumour that appeared a few weeks ago about a hybrid process, where the SRAM is tri-gate and other areas are planar – all of the chip area will be tri-gate. In addition a parallel SoC process is being developed so that the Atom line of products can be extended to 22-nm.

For us commentators, going tri-gate was always a possibility for Intel; they have been publishing papers on the topic for almost ten years, with a flurry of them five years ago – here’s an image from a press briefing in 2006:

TEM Image of HKMG Tri-Gate Transistor, Sectioned Through the Fin (Source – Intel)

Their R-D-M (Research-Development-Manufacturing) methodology has been well established for quite a while now, and enabled them to keep to their schedule of a new process generation every two years. Based on comments today, we can expect to see 22-nm production in the second half of this year, and product on the shelves in the New Year.

Then we’ll see what it really looks like!

A Shameless Plug for ASMC

Winter is finally starting to fade in Ottawa, and the early signs of spring are showing. The maple sap is running, the first migrant birds have arrived, the frogs are peeping, and we have evening daylight. On the conference calendar, spring means that ASMC (IEEE/SEMI Advanced Semiconductor Manufacturing Conference) is on the horizon, this year in Saratoga Springs, New York on May 16 -18. There, spring should be well advanced, and it will be a great time of year to visit the Empire State.

As the name says, ASMC is an annual conference focused on the manufacturing of semiconductor devices – in this it differs from other conferences, since the emphasis is on what goes on in the wafer fab, not the R&D labs, and the papers are not exclusively research papers.

I’m plugging ASMC because it seems to be one of the more under-rated conferences, unlike IEDM and the VLSI symposia, which get the media attention for leading-edge R&D and processes. However, it’s the nitty-gritty of manufacturing in the fab that gets the chips out of the door, and this meeting discusses the work that pushes the yield and volumes up and keeps them there.

I always come away impressed by the quality of the engineering involved; not being a fab person myself any more, it’s easy to get disconnected from the density of effort required to equip a fab, keep it running and bring new products/processes into production. Usually the guys in the fab only get publicity if something goes wrong!

This year, in addition to the 50-plus papers, there are keynotes from Norm Armour of GLOBALFOUNDRIES (GloFo), Gary Patton of IBM, and Peter Wright of Tradition Equities, as well as a panel discussion on partnerships in semiconductor manufacturing, moderated by Dave Lammers. There are also tutorials, on 3D (by James Lu of Rensselaer Poly), and EUV (by Obert Wood of GloFo), and an invited session of ISMI papers.

The technical sessions include:

  • Factory Optimization
  • Advanced Metrology
  • Advanced Equipment, Materials and Processes
  • Advanced Process Development and Control
  • Advanced Lithography
  • Defect Inspection and Yield Optimization
  • Data Management

Of course, I’m biased to some extent because we’ll be giving a paper there again. I can’t make it this year, but a colleague of mine, Ray Fontaine, is presenting on "Recent Innovations in CMOS Image Sensors". This will be the seventh year running that we’ve given a paper; the manufacturing and equipment engineers that attend seem to like seeing what their competitors are doing. In this case Ray will run through some of the changes in the camera chips that we all take for granted in our phones these days.

Other papers that caught my eye may give us some clues as to what to expect in the lithographic field; the IBM/GloFo/Toshiba alliance has one on contact patterning strategies (paper 6.3), another cooperative paper by IBM/JSR/KLA Tencor/Tokyo Electron covers double patterning (6.1), and an IBM/ASML contribution addresses advanced overlay control (2.5). On the materials processing side, there are three papers on low-k dielectrics from GloFo/KLA Tencor (2.4), UAlbany/Air Liquide (3.5), and Novellus (poster in session 4); a couple on nickel silicide by GloFo (5.3) and Ultratech (poster in session 4); and a clue to the mysteries of high-k dielectrics from UMC/National Cheng Kung U (3.4).

More strategically aimed discussions include one by Infineon (1.1) on the challenges of a global supply chain, a talk by Sumita Bas of Intel on sustainable/green practices in the chip business (1.3), and two talks by SEMATECH, one on 450 mm manufacturing (ISMI session) and the other on 3D/TSV manufacturing (3.1).

Out of the conference room, there’s a poster session and reception on the Monday evening, and on the Tuesday, Dave Lammers’ panel session, "Models for Successful Partnerships in Semiconductor Manufacturing". Partnership is one of the buzzwords in chipmaking these days, and the panelists should know it well: Ari Komeran from the industry development side of Intel; Michael Fancher from Albany; Olivier Demolliens, head of LETI-NANOTEC in France; and Dr. Walid Ali from ATIC in Abu Dhabi.

After the panel session comes what could be a highlight of the conference: a tour of the Luther Forest Technology Campus, including a look at GLOBALFOUNDRIES’ (Norm Armour’s) new Fab 8, followed by a reception at the Canfield Casino.

Register soon – rates go up on May 8th!

Panasonic Gate-First HKMG also First Out of the Gate

As I suggested a few months ago, we put some credence in Panasonic’s press release last September that they would be shipping their first 32-nm HKMG parts last October. Samsung had announced their Saratoga chip, and both Altera and Xilinx had displayed silicon from TSMC, but until last Friday (18 March), none had said that they were shipping product. On Friday Xilinx announced that they were shipping their Kintex-7 product, the first of their 7-series of FPGAs.

Earlier this month our faith in Panasonic was rewarded, and we found the chip! It took a few false starts buying Panasonic products that we tore down and threw away, but now we have a verified 32-nm, gate-first, high-k metal-gate (HKMG) product. The supply chain was a bit longer than we had hoped, but as promised the chip was shipped with a week 41 date code, in October.

So, for the curious, this is what a transistor looks like:

Panasonic’s 32-nm HKMG NMOS Transistor

We can see the TiN metal gate at the base of the polysilicon, and the thin line of high-k at the base of the TiN. Also noticeable are a dual-spacer technology (sometimes referred to as differential offset spacers), and a thin line of nitride over the source/drain extension regions (possibly indicating a nitrided oxide under the high-k). The salicide is the usual platinum-doped nickel silicide. Less visible are mechanisms of applying strain, other than the nitride layer over the gate; embedded SiGe and dual-stress liners are not used.

All of which is typical for Panasonic – their 45-nm product did not appear to use any enhanced strain techniques, and the only concession to PMOS enhancement was wafer rotation to give a <100> channel direction. The emphasis is different from Intel; rather than raw performance, the targets are increased integration, die size reduction/reduced cost, and now, with high-k, reduced leakage/lower power. The September press release does say that transistor performance is improved by 40%, but it also claims 40% power reduction and a 30% smaller footprint.

Here’s a 45-nm transistor for comparison:

Panasonic’s 45-nm Generation Transistor

And, for good measure, Intel’s 32-nm device:

Intel 32-nm NMOS Transistor

The part itself uses a nine-metal (eight Cu, one Al) stack with hybrid low-k/extra-low-k dielectrics. Die size is ~45 mm2 in a conventional FC-BGA package. Minimum metal pitch is specified as 120 nm [1], and we have found 125 nm in our early investigations.

Panasonic 32 nm General Structure

Analysis is ongoing – stay tuned for more details, and of course we’ll be doing reports!

[1] S. Matsumoto et al., "Highly Manufacturable ELK Integration Technology with Metal Hard Mask Process for High Performance 32nm-node Interconnect and Beyond", IITC 2010

Apple’s A5 Processor is by Samsung, not TSMC

Forty-eight hours ago we obtained an iPad 2, brought it back to the lab, and took it apart to have a look at Apple’s A5 processor chip. We’ve come to the conclusion that the main innovation in the new iPad is the A5 chip. Flash memory is flash memory (multi-sourced from Samsung and Toshiba in the iPads we’ve seen), the DRAM in the A5 package is 512 MB instead of 256 MB, and the touchscreen control uses the same trio of chips as the iPad 1 – not even a single-chip solution as we’ve seen in the later iPhones. And the 3G version uses the same chipset as the Verizon iPhone launched a few weeks ago. This is the motherboard from a 32-GB WiFi-only iPad 2:

Motherboard from 32-GB iPad 2




The A5 can be seen in the centre of the board. If we look at the package we can identify Apple’s APL0498 marking for the A5 (the A4 is APL0398), and also 4 Gb of Elpida mobile DRAM. Date codes are 1107 for the A5 and 1103 for the memory – only a few weeks in the supply chain here!


Apple A5 from iPad 2



The x-ray images show us that we have the usual package-on-package (PoP) structure, with two memory chips in the top part of the PoP, and the APL0498 processor on the lower half.


X-Ray Image of A5 Package-on-Package

The two rows of dense black dots on the outside of the image are the solder balls from the memory chips in the top half of the package (connecting with the bottom half), and the less dense dots are the solder balls on the bottom half of the package connecting the A5 chip to the iPad board below. If you squint really hard you can see smaller dots about five rows in from the edge which are the flip-chip solder balls on the A5 die – and they take up quite a large proportion of the area, showing that this is a good-sized die.
The die photo and die mark are shown here:
Die Photo of Apple’s A5 Chip from the iPad 2
APL0498E01 Die Mark of Apple A5 Chip


The x-ray is right – the A5 die is more than twice as large as the A4, at 10.1 x 12.1 mm (122.2 mm2), vs 7.3 x 7.3 mm (53.3 mm2) – here’s the A4 chip for comparison:
Apple A4 Die Photo


Given that the A5 has dual ARM cores and more graphics capability than the A4, more than doubling the size is to be expected, but it’s also a clue that this is still made in 45-nm technology.
So after the web speculation that TSMC might be fabbing the A5 rather than Samsung, we had to take a look, and the quickest way is to do a cross-section and compare it with the A4 from last year’s iPad.
So here’s the A5:
SEM Cross-Section of Apple A5
It’s a nine-metal layer part, with eight levels of copper and one aluminum. Zooming into the transistor level:
SEM Cross-Section of Transistors and M1 in A5 Processor
And now the A4:

SEM Cross-Section of Transistors and M1 – M4 in A4 Processor

At this scale even electron microscopes start to run out of steam, so these are not the clearest of images in either case, but they are good enough to see the similar shape of the transistor gates and the dielectric layers. So at least this sample of the A5 is fabbed by Samsung, just as all Apple’s processor chips have been for the last while.

Many thanks to the guys in the lab who’ve worked through the weekend to get this information – Chipworks is not really in the media business, but there’s always a buzz when a hot new consumer part comes out.

And on a different note, commiserations and condolences to our Japanese colleagues; they have much more important things to be concerned about than the details of the iPad 2.

How to Get 5 Gbps Out of a Samsung Graphics DRAM

It’s well known that electronic games buffs like their image creation as realistic (or at least as cinema-like) as possible, which in image-processing terms means handling more and more fine-grained pixel data as fast as possible. That means more and more stream processors and texture units in the graphics processor to handle parallel data streams, and faster and faster memory to funnel the data in and out of the GPU.

We recently pulled apart a Sapphire Radeon HD5750 graphics board, containing an AMD/ATI RV840 40-nm GPU running at 700 MHz, supported by eight Gb (1 GB) of Samsung GDDR5 memory. This is a budget card, but the ATI chip still boasts 1.04 billion transistors, 720 stream processors and 36 texture units, can compute at ~1 TFLOPS with a pixel fill rate of 11 Gpixel/s, and can run memory at 1150 MHz with 74 GB/sec of memory bandwidth. I’m not a gamer, but those numbers are impressive to me!
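Those memory numbers hang together: GDDR5 transfers data at four times its command clock, so 1150 MHz corresponds to 4.6 Gbps per pin, and across a 128-bit memory interface (the bus width is assumed from the card's published specification, not from our teardown) that works out to the quoted ~74 GB/s:

```python
# Sanity check on the quoted graphics-memory bandwidth.
# GDDR5 moves data at 4x the memory command clock; the 128-bit bus width is
# assumed from the HD5750's published specification, not from our teardown.
MEM_CLOCK_MHZ = 1150
BUS_WIDTH_BITS = 128

gbps_per_pin = MEM_CLOCK_MHZ * 4 / 1000           # 4.6 Gbps per pin
bandwidth_gb_s = gbps_per_pin * BUS_WIDTH_BITS / 8
print(f"{gbps_per_pin:.1f} Gbps/pin x {BUS_WIDTH_BITS} bits / 8 "
      f"= {bandwidth_gb_s:.1f} GB/s")              # ~73.6 GB/s, i.e. the quoted ~74 GB/s
```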

When we started looking at the memory chips, and decoded the part number, we found that we had Samsung’s fastest graphics memory part, claimed to run at 5 Gbps. Graphics DRAMs are designed to run faster anyway, but 5 Gbps is three times faster than the fastest regular DDR3 (Double-Data Rate, 3rd Generation) SDRAM, which can do 1.6 Gbps.*

So what makes this one so blazing fast? Beginning with the x-ray, the differences between a GDDR5 part and a 1 Gb DDR3 (K4B1G0846F-HCF8) part start to show up. If we look at an x-ray of the DDR3 chip, we can see that it has the conventional wire-bonding down the central spine:

Plan-View X-ray of Samsung 1 Gb DDR3 SDRAM

When we compare the K4G10325FE-HC04 GDDR5 we can see first that it’s a flip-chip device (no wires), and if we squint hard enough we can also see that the bumps are distributed across the die as well as along the spine.

Plan-view X-ray of Samsung 1 Gb GDDR5 Part from ATI Radeon

This is confirmed in the die photograph:

Die Photo of Samsung 1 Gb GDDR5 SGRAM

Which compares with the die photo of the 1-Gb DDR3:

Die Photo of Samsung 1 Gb DDR3 SDRAM

The die layout is clearly optimized to reduce RC delays from the memory blocks to the outside world. The next question for me is the nature of the flip-chip bonding; are they regular solder bumps or gold stud bumps? A cross-section solves that problem – solder, on plated-up copper lands.

Cross-sectional Images of Samsung GDDR5 Chip in Package

A quick x-ray spectroscopy analysis tells us that the solder is a lead-free silver-tin alloy, confirming the package marking.

So the answer to our question is actually fairly obvious – lay out the die to reduce input/output line lengths, and thereby RC delays on the chip, and replace bond wires with bumps to minimize RC delays in the package. A nice exposition of basic principles used to optimize performance.
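To put some very rough numbers on the packaging side of that argument, here is a sketch using generic textbook rules of thumb (not measurements on this Samsung part) for the parasitics of a wire bond versus a flip-chip bump:

```python
# Very rough comparison of per-I/O package parasitics: wire bond vs flip-chip bump.
# All values are generic textbook rules of thumb, not measured on this Samsung part.
WIRE_BOND_LENGTH_MM = 1.5          # assumed typical bond-wire length
WIRE_BOND_NH_PER_MM = 1.0          # ~1 nH per mm of bond wire
WIRE_BOND_R_OHM = 0.1              # typical bond-wire resistance
BUMP_L_NH = 0.05                   # typical flip-chip bump inductance
BUMP_R_OHM = 0.01                  # typical flip-chip bump resistance

wire_l_nh = WIRE_BOND_LENGTH_MM * WIRE_BOND_NH_PER_MM
print(f"Wire bond:      ~{wire_l_nh:.1f} nH, ~{WIRE_BOND_R_OHM} ohm")
print(f"Flip-chip bump: ~{BUMP_L_NH} nH, ~{BUMP_R_OHM} ohm")
# Roughly an order of magnitude less inductance and resistance per I/O, which is
# a big part of why a 5 Gbps graphics DRAM moves to flip-chip packaging.
```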

The next step would be to co-package the memory chips with the GPU to reduce lateral board delays, and we have seen that in products such as the Sony RSX chip in the PS3 gaming system. And after that, lay out the GPU for through-silicon vias – but that will be another story...

For those with an interest in the memory interface circuitry in the RV840, my colleague Randy Torrance has posted a discussion on the Chipworks blog.

* At the time of writing!