Intel's missteps with the 10nm process saw it lose its process tech leadership to TSMC, not to mention cede performance leadership to AMD in the CPU market. As such, all eyes are on the company as its 'Intel 4' process, which we'll refer to as 'I4,' comes to market in 2023.
With its 10nm node, Intel tried to scale too aggressively, chasing a 2.7X density improvement over the prior node. That led to constant delays as the company incorporated multiple new technologies simultaneously, some of which clearly didn't meet development targets. For I4, Intel is taking a more modular approach, introducing new technologies step by step as it progresses from node to node, thus establishing a more gradual cadence that it hopes will avoid the delays we've seen in the past.
Intel is developing multiple nodes in parallel to deliver on its promise of five nodes in four years, and Intel 4 is the second step in that journey. First, let's take a closer look at the Meteor Lake die, then dive into the details of the I4 presentation.
Meteor Lake will use Intel's Foveros 3D packaging tech, just as we saw with the Lakefield processors. However, this will be Intel's first foray into high-volume manufacturing with the leading-edge packaging technique.
Intel will connect the four die (called 'tiles' in Intel parlance) to an interposer with through-silicon via (TSV) connections. Intel hasn't disclosed whether this interposer will be active or passive, or whether it will hold caches or other control circuitry. Intel will mount four tiles atop this interposer: the compute tile, I/O tile, SoC tile, and graphics tile.
Intel has specified that the compute tile will use I4 but hasn't said which nodes it will use for the other tiles. During its Analyst Day earlier this year, the company shared the slide in the above album that lists TSMC's N3 (3nm) node alongside the Meteor and Arrow Lake processors. That node is largely thought to be destined for the graphics tile. Time will tell.
As with Alder Lake, the Meteor Lake chips have an x86 hybrid architecture. In this case, we have six p-cores and eight e-cores. The exploded view of the compute die shows six blue-colored performance cores (p-cores), used for latency- and performance-sensitive work, on the left of the die. To the right, we see two four-core clusters of efficiency cores (e-cores) in purple, which step in for background and multi-threaded tasks. The center of the chip contains the L3 caches and interconnect circuitry. Intel has yet to further describe the differences between the SoC and I/O tiles, but the former is a likely candidate for memory controllers and PCIe interfaces, while the latter could handle Thunderbolt and other PCH-type interfaces.
Intel isn't giving us much to work with here, but the company shared far more expansive details on the I4 process node that makes the compute die tick.
Intel 4 Process Node
Intel, like its competitors, usually bakes two versions of each process node: a high-density library that looks to squeeze in as many transistors as possible at the expense of performance, and a high-performance library that trades off some transistor density to provide more performance. Naturally, Intel and its competitors always cite the high-density library for the density metrics they use in marketing. Still, most of the flagship high-performance chips you see on the market actually use the less-dense library.
Quite surprisingly, Intel isn't creating a high-density library for its I4 node. The company hasn't explained the omission or offered any technical reason for it; it simply said it will focus solely on performance products for I4, which will likely invite some speculation. Notably, Intel recently announced that it would delay its Granite Rapids Xeons from 2023 to 2024 due to switching the design from I4 to I3, perhaps to take advantage of a working high-density library for some of those products.
The I4 node is forward compatible with I3, so designs can be moved between the two without going through the usual time-consuming steps of porting an architecture. Intel says that I4's successor, 'Intel 3,' will come with both high-performance and high-density libraries. The I3 process will also have enhanced transistors and interconnects, along with more EUV layers to further simplify the design. The I3 node will be the first offered to Intel's customers through its Intel Foundry Services (IFS).
After I3, Intel will move to the angstrom era with the 20A and 18A nodes, both of which will introduce even more exotic technologies, like RibbonFET (gate-all-around/nanosheet) transistors and PowerVia backside power delivery.
The I4 node is Intel's first to make extensive use of EUV lithography to simplify manufacturing, and we can see the results in the second and third slides in the above album. Intel's previous-gen process requires multiple immersion lithography steps to pattern some layers of the stack, but EUV lets the company print a layer with a single exposure. This reduces the number of steps in the process flow by 3 to 5x for that portion of manufacturing.
Fewer steps naturally mean fewer defects and, in turn, higher yields, and EUV also speeds up processing significantly. There are other benefits, too. Each layer must be aligned to the underlying metal stack at every patterning step, so printing a layer with one EUV exposure instead of several means the alignment only happens once for that section of the manufacturing flow rather than multiple times, removing a source of misalignment-related yield loss.
Intel uses EUV in both the front and back end of the manufacturing flow. As seen in the third slide, the result is that I4 has 5% fewer process steps and a 20% lower total mask count than I7. As you can see by the extrapolated result in the center of the charts, without EUV, I4 would require more steps than I7. Unfortunately, Intel hasn't divulged the exact number of layers it etches with EUV lithography.
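To get a feel for why fewer exposures and alignment steps pay off in yield, the textbook Poisson defect model is a useful back-of-the-envelope tool. Everything in the sketch below (die area, defect densities) is a hypothetical figure chosen only to illustrate the direction of the effect, not data Intel has shared:

```python
import math

def die_yield(defect_density_per_cm2: float, die_area_cm2: float) -> float:
    """Textbook Poisson yield model: Y = exp(-A * D0).

    A is die area in cm^2, D0 is defect density in defects/cm^2.
    This is a generic approximation, not Intel's internal model.
    """
    return math.exp(-die_area_cm2 * defect_density_per_cm2)

# Hypothetical numbers purely for illustration: a 1 cm^2 compute tile, with
# multi-patterning contributing extra defects versus a single EUV exposure.
area = 1.0                  # die area in cm^2 (assumed)
d0_multi_pattern = 0.20     # defects/cm^2 with multi-patterning (assumed)
d0_euv = 0.12               # defects/cm^2 with single-exposure EUV (assumed)

print(f"Yield, multi-patterned flow: {die_yield(d0_multi_pattern, area):.1%}")
print(f"Yield, single-exposure EUV:  {die_yield(d0_euv, area):.1%}")
```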
Intel's Contact-Over-Active-Gate (COAG) technique debuted with the I7 process and increased density by moving the contact from the edge of the gate to directly on top of it. A second generation of this tech further improves density in the I4 process. Likewise, Intel removed dummy gates with I7 and refines that technique with I4 by removing a diffusion grid between the arrays. Intel also went from four fins to three.
The I4 node has 18 metal layers compared to the I7 node's 17 metal layers, with enhanced copper introduced into the lower metal layers to improve electromigration/reliability while maintaining performance (more on that below). We also see reduced pitch throughout the entire stack. (The two thick metal layers are for power routing.)
The I4 process has two different types of SRAM cells. It's well known that SRAM doesn't scale as fast as logic: Intel has disclosed 0.77x scaling for its High Density Cell (HDC) but hasn't disclosed the scaling factor for the High Current Cell (HCC) that will be used in performance-oriented designs.
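To put that 0.77x figure in perspective, here's the simple arithmetic it implies for bit density. The baseline cell area below is an assumed placeholder rather than a number Intel has confirmed; the ratio is what matters:

```python
# Illustrative arithmetic only: the baseline cell area is an assumed
# placeholder, not a figure confirmed for Intel 7's HDC SRAM cell.
i7_hdc_cell_area_um2 = 0.0312        # um^2, assumed baseline
scaling_factor = 0.77                # disclosed HDC area scaling, I7 -> I4

i4_hdc_cell_area_um2 = i7_hdc_cell_area_um2 * scaling_factor
density_gain = 1 / scaling_factor    # more bits in the same area

print(f"I4 HDC cell area:  {i4_hdc_cell_area_um2:.4f} um^2")
print(f"Bit density gain:  {density_gain:.2f}x")
```

In other words, a 0.77x cell-area shrink works out to roughly 1.3x more SRAM bits per unit of area, noticeably less than the gains logic typically sees from a node transition.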
Interconnects, the tiny wires that connect transistors, continue to shrink over time and now measure just tens of nanometers across. As such, they've become one of the main barriers to increasing transistor density, as smaller transistors simply require smaller wires. Intel switched to cobalt instead of copper with its I7 process node, which came at a cost to performance. The move was also rumored to be part of the reason for the incessant delays that cost the company its leadership position.
Intel disclosed that it uses enhanced copper in the M0 to M4 layers to improve interconnect performance and shared slides (second and third in the above album) that show the improvements it has made with its wire designs in the critical lower layers. Here we can see two of the approaches used with the I7 node: one with pure cobalt and a tantalum barrier, and another with a copper alloy and a tantalum nitride barrier. Each approach carries a significant tradeoff in either resistance (performance) or reliability (electromigration).
The I4 process uses an 'enhanced copper' design that pairs a tantalum barrier and cobalt cladding with pure copper. This design aims to deliver the best of both: low resistance for performance and resistance to electromigration for reliability.
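A rough, first-order resistance calculation helps illustrate why the metal and barrier choice matters so much at these geometries. The line dimensions, barrier thicknesses, and bulk resistivities below are ballpark values for illustration only, not Intel process parameters, and real nanoscale wires suffer additional resistance from surface and grain-boundary scattering that this sketch ignores:

```python
# Rough, first-order comparison of the two I7-era wire options described
# above. Dimensions, barrier thicknesses, and bulk resistivities are ballpark
# textbook values for illustration, not Intel process parameters.
RHO_CU = 1.7e-8   # ohm*m, bulk copper resistivity
RHO_CO = 6.2e-8   # ohm*m, bulk cobalt resistivity

def line_resistance(rho, width_nm, height_nm, barrier_nm, length_um=1.0):
    """Resistance of a wire, treating the barrier/liner as non-conducting.

    The barrier eats into the conductor on both sidewalls and the bottom,
    so the effective cross-section shrinks fast as lines get narrower.
    """
    w_eff = (width_nm - 2 * barrier_nm) * 1e-9   # meters
    h_eff = (height_nm - barrier_nm) * 1e-9      # meters
    return rho * (length_um * 1e-6) / (w_eff * h_eff)

# Hypothetical 20 nm-wide, 40 nm-tall lower-layer line, 1 um long
r_cobalt = line_resistance(RHO_CO, 20, 40, barrier_nm=0.5)  # thin liner
r_copper = line_resistance(RHO_CU, 20, 40, barrier_nm=2.0)  # thicker barrier

print(f"Cobalt line:        {r_cobalt:.0f} ohms")
print(f"Barriered Cu line:  {r_copper:.0f} ohms")
```

Even in this simplified picture, the barriered copper line wins on resistance while cobalt wins on electromigration resilience, which is exactly the tradeoff the enhanced-copper stack is designed to resolve.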
Finally, Intel divulged that it has doubled its MIM capacitance density over the I7 process. As a reminder, this is the Metal-Insulator-Metal (MIM) capacitor that Intel branded as 'SuperMIM' with the then-10nm process. The extra capacitance helps combat Vdroop by eliminating localized chip brownouts during power-intensive work, like SIMD instructions. As a result, sustained clock speeds should improve dramatically.
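A first-order charge-balance model shows why doubling the on-die capacitance helps. Until the voltage regulator reacts, a sudden current spike is served from the on-die capacitors, so the droop is roughly dV ≈ I·dt/C. Every number in the sketch below is hypothetical and only meant to show the trend:

```python
# First-order illustration of why more on-die capacitance reduces Vdroop.
# During a sudden current spike (say, an AVX burst), the on-die MIM caps
# supply charge before the voltage regulator can react: dV ~= I * dt / C.
# Every value below is hypothetical, chosen only to show the trend.
def vdroop_mv(current_step_a, duration_ns, capacitance_nf):
    charge = current_step_a * duration_ns * 1e-9      # coulombs
    return charge / (capacitance_nf * 1e-9) * 1e3     # millivolts

i_step = 10.0      # amps of sudden extra load (assumed)
t_resp = 2.0       # nanoseconds before the regulator responds (assumed)
c_i7   = 300.0     # nF of on-die MIM capacitance (assumed baseline)
c_i4   = 2 * c_i7  # Intel says MIM capacitance density doubles with I4

print(f"Droop with baseline MIM capacitance: {vdroop_mv(i_step, t_resp, c_i7):.0f} mV")
print(f"Droop with doubled MIM capacitance:  {vdroop_mv(i_step, t_resp, c_i4):.0f} mV")
```

Halving the transient droop for the same workload leaves more voltage margin, which is what lets the chip hold higher clocks during heavy vector work.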
Intel's Hillsboro facility will be the first to produce I4 devices, and the Ireland campus is the obvious next candidate because it's the only other Intel site known to have an EUV machine. We'll learn more about Intel 4 as Meteor Lake nears its launch, which Intel says will come in 2023.