
Into the Itanium, Part 2

    Table of Contents:
  • Into the Itanium, Part 2
  • Effects of Bundling
  • Additions Specific to IA-64
  • Software Pipelining and Register Stacking



    Into the Itanium, Part 2 - Software Pipelining and Register Stacking

    (Page 4 of 4)

    The last part of the Itanium architecture I'm going to discuss is of more use to programmers than to hardware guys, but it is another example of how the architecture is optimized to speed up the execution of modern code. Those of you who have done a fair amount of programming know that loops are a standard structure in code: a block of instructions that is executed over and over again until an exit condition is met. Thanks to the large number of registers and execution units available in hardware, software pipelining can substantially cut the number of cycles needed to complete a loop.

    [Figure: software-pipelined versus non-pipelined loop execution]

    Software pipelining works much like hardware pipelining. Instead of finishing one pass through the loop before starting the next, the processor keeps three iterations of the loop "in flight" at once, each at a different stage of completion. As can be seen in the graphic, the "software pipelined" version completes all three iterations in the same time it takes the non-pipelined version to complete two. A normal x86 processor simply doesn't have enough registers or execution units for something like this to occur.
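
    The overlap described above can be sketched with a small cycle-counting model. This is only an illustration under stated assumptions, not a description of how the Itanium hardware or its compilers actually schedule a loop: the three stage names, the one-cycle-per-stage timing, and the rule that a new iteration starts every cycle are all hypothetical, as are the function names.

```python
def sequential_cycles(iterations, stages):
    """Cycles needed when each iteration finishes before the next one starts."""
    return iterations * len(stages)


def pipelined_schedule(iterations, stages):
    """Build a cycle-by-cycle schedule for an idealized software pipeline.

    Assumes every stage takes one cycle, a new iteration starts each cycle,
    and there are enough registers and execution units to overlap them.
    Returns a list with one entry per cycle; each entry lists the
    (iteration, stage) pairs that are active in that cycle.
    """
    total_cycles = iterations + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        active = []
        for iteration in range(iterations):
            stage_index = cycle - iteration  # iteration i starts at cycle i
            if 0 <= stage_index < len(stages):
                active.append((iteration, stages[stage_index]))
        schedule.append(active)
    return schedule


if __name__ == "__main__":
    stages = ["load", "add", "store"]  # hypothetical loop body
    schedule = pipelined_schedule(3, stages)
    print("non-pipelined cycles:", sequential_cycles(3, stages))
    print("pipelined cycles:", len(schedule))
    for cycle, active in enumerate(schedule):
        print(cycle, active)
```

    In this model the middle cycle has all three iterations active at once, each in a different stage, which is the "in flight" situation the paragraph describes. The exact savings depend on how many stages the real loop body has, so the numbers here will not necessarily match the graphic.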

    Register stacking is another feature that you can't really do effectively when only 8 GPRs (general-purpose registers) are visible to the programmer. The first 32 GPRs, 0-31, are considered "global," and variables saved there are available to all procedures. Above that, a window or "frame" is created for variables that are specific to one procedure: its local variables and its outputs. When you call a further nested procedure, the caller's output registers are renamed to become the inputs of the next procedure, which then adds its own local variables and outputs on top. In a normal x86 situation, you would have to save registers to the stack in memory before beginning the next procedure, because there are not enough resources available. After the top procedure has completed, you simply save its outputs to memory, then rename the registers back without having to restore the previous state from memory. Renaming registers is obviously a much faster method than going out to memory.
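
    The renaming idea can be illustrated with a toy model. This is a simplified sketch, not the real mechanism: the class and method names are hypothetical, the model raises an error when the stacked registers run out instead of spilling to memory, and it ignores how return values are passed back. It only shows that a call moves the frame base so the caller's outputs become the callee's inputs, and that a return moves it back without copying anything.

```python
from dataclasses import dataclass

FIRST_STACKED = 32   # r0-r31 are the global registers; frames start at r32
NUM_STACKED = 96     # r32-r127 in the IA-64 general register file


@dataclass
class Frame:
    base: int      # physical slot that this frame sees as r32
    locals_: int   # number of input + local registers
    outputs: int   # number of output registers


class RegisterStack:
    """Toy model of stacked registers (hypothetical, for illustration only)."""

    def __init__(self):
        self.physical = [0] * NUM_STACKED
        self.frames = [Frame(base=0, locals_=0, outputs=0)]

    def alloc(self, n_locals, n_outputs):
        """Resize the current frame; inputs count as part of n_locals."""
        frame = self.frames[-1]
        if frame.base + n_locals + n_outputs > NUM_STACKED:
            raise OverflowError("toy model does not spill to memory")
        frame.locals_, frame.outputs = n_locals, n_outputs

    def call(self):
        """Start the callee's frame at the caller's first output register."""
        caller = self.frames[-1]
        self.frames.append(
            Frame(base=caller.base + caller.locals_,
                  locals_=caller.outputs, outputs=0))

    def ret(self):
        """Return to the caller by restoring its frame; no data is copied."""
        if len(self.frames) == 1:
            raise RuntimeError("no caller frame to return to")
        self.frames.pop()

    def _slot(self, reg):
        frame = self.frames[-1]
        offset = reg - FIRST_STACKED
        if not 0 <= offset < frame.locals_ + frame.outputs:
            raise IndexError(f"r{reg} is outside the current frame")
        return frame.base + offset

    def write(self, reg, value):
        self.physical[self._slot(reg)] = value

    def read(self, reg):
        return self.physical[self._slot(reg)]


if __name__ == "__main__":
    rs = RegisterStack()
    rs.alloc(n_locals=2, n_outputs=1)  # caller: r32-r33 local, r34 output
    rs.write(32, 111)                  # a caller-local value
    rs.write(34, 7)                    # argument for the callee
    rs.call()
    rs.alloc(n_locals=2, n_outputs=0)  # callee: r32 input, r33 local
    print("callee sees argument in r32:", rs.read(32))
    rs.write(33, 99)                   # callee-local value
    rs.ret()
    print("caller's r32 after return:", rs.read(32))
```

    Note that the caller's r34 and the callee's r32 are the same physical slot, so passing the argument costs nothing, and the caller's locals are still in place after the return even though they were never written to memory.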

    [Figure: register stacking]


    I hope you've gained some insight into how the IA-64 architecture differs from the x86 (IA-32) architecture, whose lineage goes back to the very first PCs. With clock speed increases on current processors running into the limits of process technology, it's well past time we looked to other methods of adding performance. Itanium, while meant mostly for the "big tin" of servers and gigantic number-crunching machines, certainly possesses many advantages over its x86 (IA-32) predecessor.

    At the moment, the hardware itself is far from being something that can be put into desktops, but the basic architecture is a step in the right direction: it goes "wide" and adds features that specifically speed up the code programmers write and remove memory bottlenecks. In our next article on Itanium, we'll look at the hardware that is available right now in the form of the Madison core.

    © 2003-2019 by Developer Shed. All rights reserved.