Intel's Newest Itanium - Increasing the Itanium Cache
Essentially this is a trial run for the new Itanium 2 bus, but we’ll have to see in real-world settings just how reliable the faster system clock is. Given that this is Intel's flagship enterprise product, we can assume that it has proven quite capable in pre-release testing. However, nothing beats real-world applications as a test bed. The new bus should also ease chipset and memory costs down the road, since designs can appear now that should be able to support both processor designs (single and dual core) in the future.
As I mentioned at the start of this article, this release is more about the future than it is the "now." It all comes back to the next true bump in the Itanium design. Today's addition to the lineup is merely the last stepping stone to the real deal. The monster known as Montecito is to carry 1.72 billion transistors on a 90nm process, a nearly 3x increase in transistor count. Moore's Law is still in effect without a doubt.
Despite the change in process technology, the die manages to be even larger than the Madison 9M core, which was already pushing the limit of how large a single chip can physically be made. The new one is to have a surface area of 596 mm², where the Madison 9M is "only" 480 mm².
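To put those figures in perspective, here is a minimal Python sketch that works out the growth in die area and Montecito's average transistor density. It uses only the numbers quoted above; the variable names and the `percent_increase` helper are illustrative, not anything from Intel.

```python
# Figures as quoted in the article; names are illustrative only.
MADISON_9M_AREA_MM2 = 480        # Madison 9M die area
MONTECITO_AREA_MM2 = 596         # Montecito die area
MONTECITO_TRANSISTORS = 1.72e9   # Montecito transistor count


def percent_increase(old, new):
    """Return the growth from old to new as a percentage of old."""
    return (new - old) / old * 100


area_growth = percent_increase(MADISON_9M_AREA_MM2, MONTECITO_AREA_MM2)

# Average density across the whole die (cache and logic lumped together).
density_per_mm2 = MONTECITO_TRANSISTORS / MONTECITO_AREA_MM2

print(f"Die area growth: {area_growth:.1f}%")
print(f"Montecito density: {density_per_mm2 / 1e6:.2f} million transistors per mm^2")
```

In other words, the die grows by roughly a quarter while the transistor count nearly triples, which is where the process shrink shows up.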
What do you gain with such mind-numbing numbers to describe the physical dimensions? Montecito will bring changes to the cache structure to further improve performance, starting with the L2 cache. The current design (Madison) is what's known as a "unified" cache, with both instructions and data fitting anywhere in the 256KB L2 memory map. The new design changes to a dedicated 256KB area for data and a 1MB area for instructions in each core (two of each on the whole die). The L2 cache has lower latency than the L3, so the larger L2 should improve performance on its own (there is now 1.25MB per core instead of the previous 256KB). Non-unified caches have also been shown to perform better across a wide variety of code. What is quite interesting is that the instruction portion of the cache is much larger than the data portion, the opposite of typical x86 CPUs and of the Itanium's own L1 cache structure.
In addition to the second execution core, the other part of the design that takes up a huge share of the transistor count (the largest share, numerically) is the 24MB of L3 cache on the die. That's 12MB per core if it were to be split evenly. At this point it has been assumed, based on what information Intel has released, that there will be two L3 caches, each separate from the other from a design and software standpoint, even though physically they will be packed side by side.
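The cache arithmetic above is easy to tally. The sketch below is a rough budget in Python built from the sizes quoted in this article; the constant names are made up for illustration, and the even L3 split is the same assumption made in the text, not a confirmed Intel specification.

```python
KB = 1024
MB = 1024 * KB

CORES = 2                      # Montecito is dual core
L2_DATA_PER_CORE = 256 * KB    # dedicated L2 data cache per core
L2_INSTR_PER_CORE = 1 * MB     # dedicated L2 instruction cache per core
L3_TOTAL = 24 * MB             # total L3 on the die
MADISON_L2 = 256 * KB          # Madison's unified L2, for comparison

# L2 per core and across the whole die.
l2_per_core = L2_DATA_PER_CORE + L2_INSTR_PER_CORE
l2_on_die = l2_per_core * CORES

# Assumption: the L3 is split evenly between the two cores.
l3_per_core = L3_TOTAL // CORES

# How much larger each core's L2 is than Madison's unified L2.
l2_growth = l2_per_core / MADISON_L2

print(f"L2 per core: {l2_per_core / MB:.2f} MB")
print(f"L2 on die:   {l2_on_die / MB:.2f} MB")
print(f"L3 per core: {l3_per_core / MB:.0f} MB")
print(f"L2 growth per core vs. Madison: {l2_growth:.0f}x")
```

Each core ends up with five times the L2 of Madison, and four fifths of that L2 is reserved for instructions.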