The "Madison" Itanium2s, like their McKinley predecessor, have full-speed L3 cache on the die, unlike the original Merced core, which had it on the card but not as part of the die itself. Looking at the diagram of the chip, that cache takes up an awfully large portion of the die area, and especially of the transistor count. The cache is the area surrounding the core of the chip, all along the bottom and right side of the die.
This is a diagram of McKinley. You can see the obvious difference in cache size between the two. However, since this one is so nicely detailed already, I'm going to use it to map the architecture for you.
Data comes in through the front side bus from the outside world, that is, from RAM and permanent storage accessed through the chipset. The current Itanium2 bus is 128 bits wide and runs at 400MHz. While this is certainly sufficient for one CPU, the design calls for up to four separate Itaniums to share it, which makes it a possible choke point for any multi-processor system based on this architecture.
From the bus, the logic can route incoming data to any address in the L3 or L2 cache. Looking at the diagram, you'll notice that the L3 is a "unified" cache, as is the L2. This means that both data and instructions can reside there, whether they relate to integer, floating point, or x86 work. The L1 cache, by contrast, is split: it holds data and instructions separately, according to the type of information stored. The L1 holds only data for integer operations; floating point data goes directly from the L2 to the 128 dedicated floating point registers.
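To put a rough number on the bus choke point mentioned above, here is a back-of-the-envelope sketch using only the 128 bit and 400MHz figures from the text. It assumes "400MHz" means 400 million transfers per second, and that the bus is split evenly when every CPU is pulling data at once; real arbitration overhead is ignored, and the function names are just for illustration.

```python
# Rough peak bandwidth of the shared front side bus.
# Assumptions (not from the original text): "400MHz" is treated as
# 400 million transfers per second, and the bus is divided evenly
# among CPUs with no arbitration or protocol overhead.

BUS_WIDTH_BITS = 128             # bus width quoted in the text
TRANSFERS_PER_SEC = 400_000_000  # "400MHz" from the text
MAX_CPUS = 4                     # CPUs the design allows on one bus


def peak_bandwidth_bytes(width_bits, transfers_per_sec):
    """Peak bytes per second the bus can move."""
    return (width_bits // 8) * transfers_per_sec


def per_cpu_share(total_bytes_per_sec, cpus):
    """Even split of the bus when all CPUs are busy at once."""
    return total_bytes_per_sec / cpus


total = peak_bandwidth_bytes(BUS_WIDTH_BITS, TRANSFERS_PER_SEC)
for cpus in range(1, MAX_CPUS + 1):
    share = per_cpu_share(total, cpus)
    print(f"{cpus} CPU(s): {share / 1e9:.1f} GB/s each")
```

Under those assumptions the bus peaks at 6.4 GB/s, which leaves about 1.6 GB/s per processor when four of them are hammering it at the same time.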