3x lower latency than Meteor Lake, with MT and ST performance gains of over 20%

As detailed at Hot Chips, Intel's Lunar Lake processors deliver 3x lower latency and higher throughput than Meteor Lake.

Intel Lunar Lake CPU Cache and Core Latency Optimizations Improve Single- and Multi-Threaded Performance by Over 20%

Intel's Lunar Lake processors will be available on September 3, and the company already detailed them at its Tech Day during Computex 2024. The next-generation SoCs combine several core IPs, including an entirely new CPU architecture with Lion Cove as the P-Core design and Skymont as the E-Core design.

Unlike the earlier Meteor Lake processors, Intel's Lunar Lake processors contain only two types of cores. The Lion Cove cores come in a standard P-Core configuration, while the Skymont E-Cores come in a Low-Power (LP-E) configuration. The two core types also sit in different places, with the P-Cores on the Compute tile and the LP-E cores on a dedicated Low-Power island. Meteor Lake instead used three core types: Redwood Cove P-Cores and Crestmont E-Cores on the Compute tile, plus 2 additional Crestmont LP-E cores on the SOC tile.

On Meteor Lake, traffic crossing between the two tiles caused higher latencies when using the LP-E cores. With Lunar Lake, Intel also increased the cache and the bandwidth between the tiles, which resulted in lower latency.

In the Load-to-Use Memory Latency test, Lunar Lake P-Cores show lower latency across the full range of buffer sizes, from 2KB to 100MB. The LP-E cores tell a different story: Meteor Lake and Lunar Lake offer similar latency up to 32KB, but as the buffer size increases, Lunar Lake offers much lower latency, with the delta growing to 150%.
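Latency-versus-buffer-size curves of this kind are typically produced with a pointer-chasing loop, where each load depends on the result of the previous one. The following is a minimal sketch of that access pattern, not Intel's test harness: the function names `build_chain` and `chase` and the buffer sizes are assumptions chosen for illustration, and in Python the interpreter overhead dominates the timing, so the printed numbers only show the shape of the method rather than real hardware latency.

```python
import random
import time
from array import array


def build_chain(n_slots: int, seed: int = 0) -> array:
    """Build a random single-cycle permutation (Sattolo's algorithm).

    Following chain[i] repeatedly visits every slot exactly once before
    returning to the start, which defeats simple hardware prefetching.
    """
    rng = random.Random(seed)
    chain = array("q", range(n_slots))
    for i in range(n_slots - 1, 0, -1):
        j = rng.randrange(i)  # j < i is what guarantees a single cycle
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain: array, steps: int) -> tuple[int, float]:
    """Follow the chain for `steps` dependent loads.

    Returns the final index and the average time per step in nanoseconds.
    """
    idx = 0
    start = time.perf_counter()
    for _ in range(steps):
        idx = chain[idx]  # the next load depends on this one
    elapsed = time.perf_counter() - start
    return idx, elapsed / steps * 1e9


if __name__ == "__main__":
    # Hypothetical buffer sizes; each slot is an 8-byte integer.
    for size_kb in (2, 32, 512, 4096):
        slots = size_kb * 1024 // 8
        _, ns_per_load = chase(build_chain(slots), 200_000)
        print(f"{size_kb:>5} KB buffer: {ns_per_load:.1f} ns per dependent load")
```

A real measurement would use the same idea in C or assembly, pinned to a specific P-Core or LP-E core, so that the time per step reflects cache and memory latency rather than interpreter cost.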

One of the main reasons latency has improved so much is the overall throughput increase, which is up to 2.8x for Lunar Lake's low-power cores compared to Meteor Lake. Intel also shared a cluster-to-cluster coherency latency chart, showing an average latency of about 25ns for the Lunar Lake P-Cores and 55ns for the LP-E cores. That’s a 3x improvement over Meteor Lake processors.

In terms of bandwidth, Intel Lunar Lake delivers over 128GB/s of main memory bandwidth and scales to 16GB/s, while Meteor Lake processors deliver just over 64GB/s of bandwidth and scale down to less than 8GB/s.
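To put the peak figures in perspective, the sketch below estimates how long it takes to stream a buffer at each quoted bandwidth. The 1 GB buffer is a hypothetical workload chosen for illustration, and because the article gives the figures as "over 128GB/s" and "just over 64GB/s", the results are approximations rather than measurements.

```python
def stream_time_ms(buffer_bytes: float, bandwidth_gb_s: float) -> float:
    """Time in milliseconds to move `buffer_bytes` at `bandwidth_gb_s` (decimal GB/s)."""
    if bandwidth_gb_s <= 0:
        raise ValueError("bandwidth must be positive")
    return buffer_bytes / (bandwidth_gb_s * 1e9) * 1e3


BUFFER_BYTES = 1e9  # hypothetical 1 GB buffer, not a figure from Intel

# Peak figures quoted above, rounded down to the stated thresholds.
lunar_lake_ms = stream_time_ms(BUFFER_BYTES, 128)
meteor_lake_ms = stream_time_ms(BUFFER_BYTES, 64)

print(f"Lunar Lake:  {lunar_lake_ms:.1f} ms")
print(f"Meteor Lake: {meteor_lake_ms:.1f} ms")
print(f"Ratio:       {meteor_lake_ms / lunar_lake_ms:.1f}x")
```

Under these assumptions the same buffer takes roughly half as long to stream on Lunar Lake, which is simply the 2x ratio between the two peak bandwidth figures.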

The lower overhead allows better use of multiple threads via Intel Thread Director Technology, which in turn makes more efficient use of the core clusters and speeds up multi-threaded applications. Lunar Lake is advertised as delivering over 20% higher multi-threaded performance than the lower-core-count Meteor Lake, over 20% higher single-threaded performance at Fmax, and the same ST performance as Meteor Lake at half the power. Other improvements for Lunar Lake processors include:

  • More than twice the performance per watt
  • More than 3x the AI throughput across NPU, GPU and CPU
  • For graphics applications, Lunar Lake delivers up to 50% performance gains, significantly improving the user experience for gamers and content creators.
  • To save battery life, Lunar Lake reduces SOC power by up to 40%, an important step for mobile devices that users should notice.
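The claim of the same single-threaded performance at half the power and the headline figure of more than twice the performance per watt are two views of the same ratio. A minimal sketch of that arithmetic follows; the helper name `perf_per_watt_ratio` is an assumption for illustration, and the inputs are the relative figures quoted in the article, not measured data.

```python
def perf_per_watt_ratio(relative_perf: float, relative_power: float) -> float:
    """Performance per watt of a new chip relative to a baseline chip.

    Both arguments are ratios against the baseline (1.0 means unchanged).
    """
    if relative_power <= 0:
        raise ValueError("relative_power must be positive")
    return relative_perf / relative_power


# Same ST performance as Meteor Lake (1.0x) at half the power (0.5x).
iso_performance = perf_per_watt_ratio(relative_perf=1.0, relative_power=0.5)
print(f"Iso-performance efficiency gain: {iso_performance:.1f}x")
```

Equal performance at half the power works out to a 2x efficiency gain, so the "more than twice" figure implies that at some operating points the power drops by more than half or the performance also rises.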

Expect more details on Lunar Lake next month as Intel prepares to officially launch its thin and light client platform.
