Intel’s delays in shipping a major new microprocessor

Last May, Sandra Rivera, a senior executive at chip giant Intel, received some terrible news.

Engineers had worked for more than five years to develop a powerful new microprocessor for data center computing tasks, and were confident they finally had the product right. But during a regular morning meeting to discuss the project, signs emerged of a potentially serious technical problem.

The problem was so severe that the microprocessor, codenamed Sapphire Rapids, was delayed – the latest in a series of setbacks for one of Intel’s most important products in recent years.

“We are disappointed,” said Ms. Rivera, executive vice president of Intel’s data center and artificial intelligence group. “It was a tough decision.”

The launch of Sapphire Rapids was ultimately pushed back from mid-2022 to Tuesday, two years later than originally planned. The lengthy development of the product, which combines four chips in a single package, underscores some of the challenges facing Intel’s turnaround effort as the United States seeks to dominate mainstream computing.

Since the 1970s, Intel has been a leading maker of the small slices of silicon that run many electronic devices, and it is best known for the microprocessors that act as the electronic brains of many computers. But the Silicon Valley company has in recent years lost its long-standing leadership in the manufacturing technology that helps determine how fast the chips can compute.

Patrick Gelsinger, who became Intel’s CEO in 2021, promised to restore its manufacturing capabilities and build new factories in the United States. He was a leading figure last summer when Congress debated and passed legislation aimed at reducing US dependence on chip manufacturing in Taiwan, which China claims as its territory.

The rough development of Sapphire Rapids raises questions about whether Intel can bounce back and deliver future chips on time. It’s a problem that could affect dozens of PC manufacturers and cloud service providers, not to mention the millions of customers who use online services powered by Intel technology.

“We need a steady and predictable cadence,” said Kirk Skaugen, an executive vice president who oversees server sales at Lenovo, the Chinese company, which plans 25 new systems based on the new processor. “Sapphire Rapids is the beginning of this journey.”

The pressure is on for Intel. Along with declining demand for chips used in personal computers, the company faces fierce competition in its most profitable business, server chips. Those problems have worried Wall Street, where Intel’s market value has dropped more than $120 billion since Mr. Gelsinger took office.

Intel plans to hold an online event Tuesday to discuss Sapphire Rapids, named after a section of the Colorado River. Officially, this product is called the 4th generation Intel Xeon Scalable processor.

In an interview, Mr. Gelsinger said Sapphire Rapids was a hit, despite the delays. In 2021, he tapped Ms. Rivera to lead the division developing it, where she is using lessons learned to transform the way Intel’s products are built and tested. He said Intel has conducted several internal reviews of what happened to Sapphire Rapids and “we’re not done yet.”

Sapphire Rapids began in 2015 with discussions among a small group of Intel engineers. The product was the company’s first attempt at a new approach to chip design. Companies now typically pack tens of billions of tiny transistors into each piece of silicon, but rivals such as Advanced Micro Devices have begun making processors from several chips bundled together in a plastic package.

Intel engineers came up with a four-chip design with 15 processor “cores,” each acting as a separate calculator for general-purpose computing tasks. The company also decided to add circuit blocks for special tasks, including artificial intelligence and encryption, and for communicating with other components such as data storage chips.

The interactions between the many elements are “extremely complex,” said Shlomit Weiss, who leads Intel’s engineering group. “Complexity usually begets complications.”

The Sapphire Rapids team struggled with bugs, flaws caused by design errors or manufacturing issues that can make a chip miscalculate, run slowly or stop working. The team was also affected by delays in the manufacturing process.

But in December 2019, engineers reached a milestone called “tape-out.” This is when the electronic files containing a completed design are sent to a factory to manufacture sample chips.

The sample chips arrived in early 2020, during the Covid-19 lockdowns. Engineers quickly got the Sapphire Rapids computing cores communicating with one another, said project leader Nevin Nassif. But more work remained than expected.

One of the key tasks was “validation,” a testing process in which Intel and its customers run software on the sample chips to simulate computing tasks and detect errors. After defects are found and repaired, the designs can be returned to the factory to make new test chips, which typically takes more than a month.

Repeating this process resulted in missed deadlines. Ms. Nassif said Sapphire Rapids was designed to compete with AMD’s Milan processor, which was introduced in March 2021. But Sapphire Rapids wasn’t ready that June, when Intel announced a delay until the following year to allow for further testing.

That’s when Ms. Rivera stepped in. The longtime Intel executive built a successful business in networking products before being named chief human resources officer in 2019.

“We needed to get the execution mojo back,” Mr. Gelsinger said. “I needed someone to step into the fire and fix this thing for me.”

In October 2021, Ms. Rivera and a senior designer set up weekly meetings on Mondays at 7 a.m. about the state of Sapphire Rapids. Those meetings showed steady progress in finding and fixing bugs, she said, adding to confidence that production would begin in the second quarter of 2022.

Then last May, the problem was discovered. Ms. Rivera did not describe it in detail, but said it affected the processor’s performance. In June, she used an investor event to announce a delay of at least a quarter, which put Sapphire Rapids behind the November launch of a rival AMD chip.

“We were ready to ship,” Ms. Nassif said. The latest delay was “tragic given all the effort.”

Ms. Rivera learned several lessons from the setbacks. The first is that Intel crammed too much innovation into Sapphire Rapids rather than delivering a less ambitious product sooner.

She also concluded that the team should have spent more time refining and testing its design using computer simulations. Finding errors in simulated chip models is cheaper than finding them in sample chips, and might have led the team to remove features to simplify the product, Ms. Rivera said. Since then, she has moved to strengthen Intel’s modeling and validation capabilities.

“We had a lot of these muscles that we allowed to atrophy,” Ms. Rivera said. “Now we’re rebuilding.”

She also found that Intel had more product plans than its engineers and customers could easily handle. As a result, the company has streamlined its product roadmap, including pushing the Sapphire Rapids successor from 2023 to 2024.

More broadly, Ms. Rivera and other Intel executives have pushed the organization to develop better processes for documenting technical issues and sharing that information inside and outside the company.

Some Intel customers report improved communication.

“Did everything go well? No,” said Lenovo’s Mr. Skaugen, who once ran Intel’s server chip business. “But we’re a lot less surprised than we used to be.”
