Into the Itanium, Part 2

    Table of Contents:
  • Into the Itanium, Part 2
  • Effects of Bundling
  • Additions Specific to IA-64
  • Software Pipelining and Register Stacking


    Into the Itanium, Part 2 - Additions Specific to IA-64

    (Page 3 of 4)

    Most people have a rough idea of how branch handling works on modern CPUs. As a brief recap: if/else statements are used when you can't know ahead of time which of two or more possible paths will be taken. For example, something like this might show up at some point in your code:

    if (a > b)
      c = c + 1
    else
      d = d*e + f

    This is what's called a "control" dependency. Until the compare resolves, the processor doesn't know which path to fetch and execute next, so it essentially has to guess. Modern CPUs like the Pentium 4 have become very accurate at guessing which path will be taken. When they guess wrong, however, the pipeline has to be flushed and refilled from the correct path. With a pipeline as deep as the one in the Prescott core, that's a real hit to performance.

    What IA-64 does instead is turn a "control" dependency into a "data" dependency. This is done with something called "predication." The core has a separate set of 64 one-bit predicate registers (p0 through p63), each of which can be set independently. A compare instruction writes these predicate bits, and other instructions name a predicate that decides whether they take effect. Instead of having branches and jumps depend on the compare, both paths are loaded into the processor in parallel. While they are in the pipeline, the predicate bits determine which one has its results kept, and which one is treated as a "NOP" or "no operation." For example, our code above instead becomes:

    p1, p2 = compare (a>b)
    if (p1) c = c + 1
    if (p2) d = d*e + f

    In that first line, if the comparison is true then p1 (the predicate bit in position 1) is set to 1, while p2 is set to 0. The opposite happens if the comparison is false. The two instructions that follow now depend only on the values of p1 and p2, so they can be issued in parallel with no jumps or branches. Best of all, there is no branch to mispredict, so there is no pipeline stall or flush from guessing the wrong path.
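
    The same transformation can be modeled in plain C. This is only a sketch of the idea, not IA-64 code: the `state_t` struct and the function names are made up for illustration, and a C compiler is free to emit branches for either version. Note that this model always computes both results, whereas on real hardware an instruction with a false predicate simply has no effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the variables in the article's example. */
typedef struct {
    int c;
    int d;
} state_t;

/* Branching version: the CPU has to predict which way (a > b) goes. */
state_t update_branching(int a, int b, state_t s, int e, int f)
{
    if (a > b)
        s.c = s.c + 1;
    else
        s.d = s.d * e + f;
    return s;
}

/* If-converted version: one compare sets two complementary predicates,
 * both candidate results are computed, and each predicate decides
 * whether its result is committed. */
state_t update_predicated(int a, int b, state_t s, int e, int f)
{
    bool p1 = (a > b);          /* p1, p2 = compare (a > b) */
    bool p2 = !p1;

    int c_then = s.c + 1;       /* work guarded by p1 */
    int d_else = s.d * e + f;   /* work guarded by p2 */

    s.c = p1 ? c_then : s.c;    /* if (p1) c = c + 1   */
    s.d = p2 ? d_else : s.d;    /* if (p2) d = d*e + f */
    return s;
}

/* Self-check: both versions must agree on either side of the compare. */
void check_versions_agree(void)
{
    state_t s = { 10, 3 };

    state_t taken_b = update_branching(5, 2, s, 4, 1);
    state_t taken_p = update_predicated(5, 2, s, 4, 1);
    assert(taken_b.c == taken_p.c && taken_b.d == taken_p.d);

    state_t other_b = update_branching(2, 5, s, 4, 1);
    state_t other_p = update_predicated(2, 5, s, 4, 1);
    assert(other_b.c == other_p.c && other_b.d == other_p.d);
}
```

    The trade-off is visible even in this toy version: the predicated form does the work for both paths every time, which only pays off when that extra work is cheaper than an occasional pipeline flush.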


    Speculation is another feature added in the Itanium architecture. Control and data speculation are ways to help hide memory latency. Waiting on memory is a good way to kill performance. On every current processor, main memory is considerably slower than the CPU itself. So when the CPU asks for something from memory, it has to sit and wait for that data to come back before it can do anything that depends on it. Since the CPU runs faster, it wastes multiple cycles waiting. Even when the data is in one of the higher cache levels, there is still latency involved in finding it and bringing it into the register file.
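
    The cost of waiting can be put in rough numbers with a deliberately simple model, sketched here in C. The function name and the idea of measuring everything in whole cycles are assumptions made for illustration; real memory systems are far less tidy. The model says that if a load takes `latency` cycles and was issued `distance` cycles before the instruction that needs its result, the CPU stalls for whatever part of the latency has not yet elapsed.

```c
#include <assert.h>

/* Cycles the consuming instruction waits, under a simplified model:
 * the load takes `latency` cycles and was issued `distance` cycles
 * before the instruction that uses its result. */
int stall_cycles(int latency, int distance)
{
    assert(latency >= 0 && distance >= 0);
    int remaining = latency - distance;
    return remaining > 0 ? remaining : 0;
}
```

    With a hypothetical 12-cycle load, `stall_cycles(12, 1)` gives 11 wasted cycles, while `stall_cycles(12, 12)` gives 0. The further ahead of its use a load is issued, the less of its latency is left to wait on.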

    To get around that, Itanium allows the compiler to move load instructions earlier in the code to "hide" their latency. I mentioned above that for some bundles, the compiler simply isn't able to find enough work to execute in parallel, and inserts "nops" to fill up the bundle. One option is to move a load that happens later in the code into that empty slot. This way, when the data is needed later, it's already available instead of having to be waited on.

    "Data" speculation is where a load is moved ahead of a store that originally preceded it. Since that store might have written to the same address and made the preloaded data stale, a check is needed to make sure that has not occurred. Where the load originally sat, a check is inserted in its place to see whether the data is still valid. If it's fine, the load's latency has already been hidden. If the store did change the data, a recovery load is performed, and you just don't gain any performance. Control speculation is similar, except that the load is moved ahead of a branch that originally guarded it, and the check confirms that the early load completed without a problem before its result is used.

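    The load, store, and check sequence used for data speculation can be simulated in C. This is a software sketch under stated assumptions, not how the hardware is programmed: the one-entry `adv_entry_t` table is a stand-in for the structure IA-64 uses to track advanced loads (the Advanced Load Address Table), and every function name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One-entry stand-in for the table that tracks advanced loads. */
typedef struct {
    const int *addr;   /* address the early load read from */
    bool valid;        /* cleared when a later store hits that address */
} adv_entry_t;

/* Advanced load: read early and remember the address. */
int load_advanced(adv_entry_t *t, const int *addr)
{
    t->addr = addr;
    t->valid = true;
    return *addr;
}

/* Store: write the value and invalidate a matching advanced load. */
void store_tracked(adv_entry_t *t, int *addr, int value)
{
    *addr = value;
    if (t->valid && t->addr == addr)
        t->valid = false;
}

/* Check: keep the early value if it is still valid, otherwise reload. */
int load_check(const adv_entry_t *t, const int *addr, int early,
               bool *reloaded)
{
    if (t->valid && t->addr == addr) {
        *reloaded = false;
        return early;
    }
    *reloaded = true;
    return *addr;      /* recovery load */
}

/* Original program order:  *p = v;  r = *q;  return r + 1;
 * Here the load of *q has been hoisted above the store to *p. */
int store_then_load_speculative(int *p, const int *q, int v,
                                bool *reloaded)
{
    assert(p != NULL && q != NULL && reloaded != NULL);

    adv_entry_t t = { NULL, false };
    int r = load_advanced(&t, q);        /* hoisted load */
    store_tracked(&t, p, v);             /* store that originally came first */
    r = load_check(&t, q, r, reloaded);  /* check sits where the load was */
    return r + 1;
}
```

    When `p` and `q` point at different variables, the check passes and `*reloaded` comes back false. When they point at the same variable, the store invalidates the entry, the recovery load picks up the new value, and the result matches what the original program order would have produced.
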
    More By DMOS
