Another Intel 7nm Chip Delay – What Does it Mean for Aurora Exascale?

The saga of Intel’s inability to deliver a 7nm process chip and a supercomputer called Aurora to Argonne National Laboratory opened a new chapter yesterday with Intel CEO Bob Swan’s statement that the company’s 7nm “Ponte Vecchio” GPU, integral to its Aurora exascale system scheduled for delivery next year, will be delayed at least six months.

More than two years ago, Intel announced it had scrapped plans to build a pre-exascale system called Aurora (A18) for Argonne – a system scheduled for delivery in 2018 that was to have incorporated Cray’s “Shasta” supercomputing architecture and delivered peak performance of 180 petaflops.

Instead, Intel, as prime contractor, moved on to a new version of Aurora (A21), with full exascale (a billion billion operations per second) compute power combining Intel 10nm CPUs and “Ponte Vecchio” 7nm GPUs, along with the Cray Shasta architecture for 2021, a centerpiece of the U.S. exascale strategy.

But yesterday’s news is not only another in a long line of Intel chip delays (both 10nm and 7nm) in recent years – as reflected in an immediate drop of more than 15 percent in Intel’s share price despite an otherwise stellar earnings report – it also includes the first admission by Intel that it may outsource production of 7nm chips to a third-party semiconductor foundry. This would likely be TSMC (Samsung is another possibility), which already produces 7nm CPUs and GPUs for AMD, whose processor innovations and overall price/performance have enabled it to take HPC and data center processor market share from Intel.

Among the Wall Street analysts to downgrade Intel stock was Cowen’s Matthew Ramsay, whose Intel commentary, entitled “Yes, It’s That Bad,” includes this statement, as reported by MarketWatch: “Road map uncertainty on 7-nm (both timing and in/out-sourcing manufacturing strategy) will leave Intel’s customers and investors frustrated and simultaneously ensure that the company cannot catch TSMC-enabled competitors’ (namely AMD and Nvidia) silicon process leadership in any bounded time frame as it stands now.”

But is Ramsay taking too gloomy a view, and do Intel’s renewed 7nm difficulties necessarily mean a repeat of the first Aurora (A18) failure?

HPC industry analyst Steve Conway, senior adviser, HPC market dynamics, Hyperion Research, sees the latest developments as a relatively short-term pause, not a crippling blow.

“This would move it (Aurora) to late 2021, but possibly push it into early 2022, so it’s not a major delay – but a delay all the same,” Conway told us. “I don’t know what that does to contractual terms, but it looks as if what’s happened is that they’ve figured out the design pretty well but there’s some defect that they need to correct before it can go into full production – not just that GPU, but all of their seven nanometer parts.”

“It looks very much as though Intel is basically on top of it, with the design and so forth,” Conway added. “They have to just correct a defect. These are very, very complicated parts and it’s not unusual to have either some kind of a defect that’s probably short of a re-spin.”

Swan said as much yesterday.

“We have identified a defect mode in our 7-nanometer process that resulted in yield degradation,” said Swan. “We’ve root-caused the issue and believe there are no fundamental roadblocks, but we have also invested in contingency plans to hedge against further schedule uncertainty.”

As for outsourcing fabrication, Conway said it’s most likely a temporary measure until Intel can iron out its own production problems.

“Intel will presumably manufacture part of the output and then the rest will be outsourced,” he said. “And that’s just for an initial period of time.”

Still, Conway sees this as enhancing AMD’s position relative to Intel in the HPC market, along with AMD’s efforts to build the 2-exaflop El Capitan system with HPE/Cray for Lawrence Livermore National Lab, and the 1.5-exaflop Frontier system, also in partnership with HPE/Cray, for Oak Ridge National Lab.

“It’s another silver dollar falling into AMD’s hat,” said Conway. “At this point, AMD with Epyc (7nm CPU) has really done well in part because of memory bandwidth, but it’s also a combination of the pricing for similar SKUs, the AMD pricing is often lower. So the combination of that is giving AMD a boost.”

Both Intel and Argonne National Lab had yet to respond to inquiries as of press time.

Comments

  1. Leo Kuan says

    I am not sure how much Steve Conway knows about chip process engineering or how a foundry like TSMC works. He took the words of Intel’s CEO, who was trying to put a good face on the issue for Wall Street. If this is just a simple problem with a good solution and a six-month delay, why would they bother to seriously consider a third party? Even if TSMC has the capacity, they will have to co-redesign and tape out based on the foundry’s 5nm process. That takes time. And why would TSMC make a huge investment in capacity for a temporary Intel solution unless Intel keeps doing business with TSMC and pays for something like TSMC’s 5nm fab in Arizona?

    Also, insider info indicates that the current status of Intel’s 7nm is worse than that of its 10nm, which suffered several years of delays from 2016.