COMPUTER PROCESSORS

x86-64: The Golden Handcuffs
By: DMOS
    2005-01-26

    Table of Contents:
  • x86-64: The Golden Handcuffs
  • CISC vs RISC
  • x86-32, IA-64, And Now x86-64
  • Conclusion

    x86-64: The Golden Handcuffs - x86-32, IA-64, And Now x86-64


    (Page 3 of 4 )

    As I mentioned before, x86 is considered a CISC-type design. Having experience with much simpler RISC architectures, as well as more organized CISC ones like the Motorola 6800, I must admit that Intel’s design is very intimidating. It’s also cursed with having to stay backward compatible with code from the late 1970s. As noted in a fantastic lecture at Stanford by one of Intel’s former Chief Architects (http://stanford-online.stanford.edu/courses/ee380/040218-ee380-100.asx), it’s a pain to carry along old parts of the design that you know no one uses. However, removing them from a mainstream processor is a gamble: on the off chance that a popular program still uses one of those old instructions, the consequences are not worth it.

    Almost 10 years ago, Intel and Hewlett-Packard (HP) caught on to the fact that it was time to start from scratch and go in another direction. What came out of this was Itanium. It was meant to be a 64-bit replacement for high-end computing, and to eventually work its way downwards to become a complete line replacement for x86.

    Itanium, in its ISA design, fixes many of the shortcomings of x86 that have come to light in the last 30 years. Intel’s mistake, it seems, was designing for the high end: at this point it looks impossible to shift Itanium (some call it the Itanic) to a friendlier price point for those who don’t need to study weather patterns or tectonic plate shifts in the Earth’s crust.

    Additionally, Intel misjudged how strongly consumers want to keep running old code on their new machines. As Mr. Colwell mentioned in the lecture, people are weird. The general public is very attached to the programs it is used to, and steadfastly refuses to move on to something new (regardless of how good it might be).

    Along came AMD, needing a gimmick to make up for yet another Joe Six-pack myth: “MHz RULE!” Since Intel has decided to devalue MHz, and march along at ever increasing speed regardless of what that does to efficiency, AMD needed a “blue crystal” to sell their product: 64-bit. So they went on and extended x86’s registers, much like what happened before in going from 16 to 32 bits, and from 8 to 16 before that. And in the process they have extended our suffering with the “golden handcuffs” of x86 for another 10 years.

    In the workstation market, and the server one as well, 64-bit is certainly needed. The reason? 32-bit addresses only allow for up to 4GB of memory (2^32 bytes). To those of you who now think that 1GB is the least you’ll buy for a new home computer, keep in mind that this was surpassed long ago by systems doing CAD work, simulation, and other development. The fact of the matter is that any machine based on Xeons or Athlon MPs had to use hacks to get around the 4GB limit for the system. Even then, 4GB was the most that could be dedicated to a single program. For a large database, for example, that’s still not sufficient.

    The solution? Make a move to 64-bit addressing and pointers. Sun, IBM and others made this move a long time ago. That allows RAM addressing on the order of terabytes, both for the system as a whole and for individual programs.
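
    The arithmetic behind the 4GB ceiling is easy to check. The following is only a minimal sketch: the helper name addressable_bytes is made up for illustration, and the 40-bit width is an arbitrary intermediate example, not a claim about any particular chip.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with a pointer of this width."""
    return 2 ** address_bits

GIB = 2 ** 30  # one gibibyte, the "GB" used for RAM sizes
TIB = 2 ** 40  # one tebibyte

print(addressable_bytes(32) // GIB, "GiB")  # 4 GiB: the 32-bit ceiling
print(addressable_bytes(40) // TIB, "TiB")  # 1 TiB: illustrative 40-bit width
print(addressable_bytes(64) // TIB, "TiB")  # 16777216 TiB: full 64-bit pointer
```

    Every extra address bit doubles the reachable memory, which is why going from 32 to 64 bits moves the limit from gigabytes to far beyond terabytes.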

    “But isn’t 64 twice as fast as 32?” Not quite. Fielding questions like this makes me wonder how Joe Six-pack can invest thousands of dollars in something and not have a clue how it works. Looking at it from a current standpoint, floating point (FP) operations, which can make use of the extra precision, already have 64-bit. As you probably know, FP numbers are numbers with a decimal point in them. Computers are by design inexact when working with these numbers, since a fixed number of binary digits cannot represent most fractions exactly. The more precision (more bits) you give them, the less of a problem “rounding errors” become.
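
    To make the rounding point concrete, here is a small sketch that stores the fraction 1/10 in a 32-bit and a 64-bit float and measures how far each lands from the true value. The helper to_float32 is a hypothetical name, and the sketch assumes the usual IEEE 754 formats that Python’s struct module packs.

```python
import struct
from fractions import Fraction

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

exact = Fraction(1, 10)  # the true value of one tenth

# Fraction(float) recovers the exact binary value the float really holds.
err64 = abs(Fraction(0.1) - exact)
err32 = abs(Fraction(to_float32(0.1)) - exact)

print("64-bit error:", float(err64))
print("32-bit error:", float(err32))
print("64-bit is closer:", err64 < err32)
```

    Neither width stores one tenth exactly; the wider format simply pushes the error further down.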

    FP was not originally in the x86 architecture (which gives another indication of its age), and was added later as a separate co-processor with its own instruction set called x87. Since that time, the FP unit has been moved on die, and it is now an integrated part of any x86-based processor.

    x87 is 80 bits wide and stack based. It also suffers from terrible performance relative to other FP implementations. SSE (Streaming SIMD [Single Instruction Multiple Data] Extensions) and SSE2 were added later to bump up multimedia application performance. As a bonus, they also do a much better job of FP operations, both performance and precision wise. SSE2 registers are already 128 bits wide, as shown in figure 1, and each holds two 64-bit floating point values or four 32-bit ones. So going to a full 64-bit architecture gains you nothing that way.
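
    The register layout described above can be imitated in a few lines. This is only a software model of the idea, not how the hardware is programmed: the function packed_add is a made-up name, and it mimics a single SIMD add by unpacking, adding lane by lane, and repacking a 16-byte value.

```python
import struct

REGISTER_BYTES = 16  # one 128-bit SSE2 register

# The same 128 bits viewed as two doubles or as four single-precision floats.
two_doubles = struct.pack("<2d", 1.5, -2.25)
four_floats = struct.pack("<4f", 1.5, -2.25, 3.0, 0.5)
assert len(two_doubles) == REGISTER_BYTES
assert len(four_floats) == REGISTER_BYTES

def packed_add(fmt: str, a: bytes, b: bytes) -> bytes:
    """Add two packed registers lane by lane, as one SIMD instruction would."""
    lanes = [x + y for x, y in zip(struct.unpack(fmt, a), struct.unpack(fmt, b))]
    return struct.pack(fmt, *lanes)

print(struct.unpack("<2d", packed_add("<2d", two_doubles, two_doubles)))
print(struct.unpack("<4f", packed_add("<4f", four_floats, four_floats)))
```

    The point to notice is that the 64-bit floating point lanes exist regardless of whether the integer side of the processor is 32 or 64 bits wide.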


    Figure 1 SSE2


    Lately, both AMD and Intel have encouraged compilers and coders to make use of SSE/SSE2 instead of x87, for the obvious reasons. But what about integer math? At this point, most operations fit nicely into 32 bits. There will of course be some advantages that are gained automatically: work on a value that does not fit into 32 bits (any number greater than 4,294,967,295) previously took two operations and a result saved across two 32-bit locations, and can now be done in one shot. For now, those cases are rare, but to quote Field of Dreams, “build it, and they will come.” Eventually there will be commercial programs to exploit this. Just not now.

    What AMD did with their upgrade to x86 is add more visible registers. As you can see in the next picture, in addition to extending each register to 64 bits, they also doubled the number of SSE registers and GPRs (general-purpose registers).

    Figure 2 AMD's SSE registers and GPRs
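
    The two-step integer arithmetic mentioned earlier, for values too large for 32 bits, can be sketched as follows. The function name add64_with_32bit_ops is hypothetical; it imitates what a 32-bit processor does with an add followed by an add-with-carry, where a 64-bit processor needs a single add.

```python
MASK32 = 0xFFFFFFFF  # largest value a 32-bit register can hold (4,294,967,295)

def add64_with_32bit_ops(a: int, b: int) -> int:
    """Add two 64-bit values using only 32-bit pieces, as a 32-bit CPU must."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Step 1: add the low halves and note whether the sum overflowed 32 bits.
    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32

    # Step 2: add the high halves plus the carry from step 1.
    hi = (a_hi + b_hi + carry) & MASK32

    # The result lives in two 32-bit locations; join them for display.
    return (hi << 32) | lo

print(add64_with_32bit_ops(4294967295, 1))  # 4294967296
```

    On a 64-bit register the same sum is one instruction and one destination, which is the “one shot” advantage described above.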

     

    As I said before, having more registers available to the compiler can help lower the number of load/store instructions, which are slow. Fewer of those instructions also brings down the total instruction count. However, because an extra prefix byte (the REX prefix) is needed on each instruction that uses the 64-bit extensions, and because 64-bit values take more space, optimized code ends up being slightly larger. According to AMD, code size is up on average about 5%, while the number of instructions is down by 15% when recompiled for x86-64. With the larger cache sizes found on processors these days, as well as cheap storage, this isn’t much of an issue.
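
    The extra byte is easy to see in the machine code itself. The sketch below builds the prefix byte (named REX in AMD’s documentation) from its four flag bits and compares a 32-bit register move with its 64-bit counterpart. The helper rex_prefix is a made-up name, and the byte values are the standard encodings of these two moves as I understand them, so treat this as an illustration rather than a reference.

```python
def rex_prefix(w: int = 0, r: int = 0, x: int = 0, b: int = 0) -> int:
    """Build the prefix byte: fixed high bits 0100, then the W, R, X, B flags."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

# mov eax, ebx: opcode 0x89, ModRM 0xD8 (register to register).
mov_eax_ebx = bytes([0x89, 0xD8])

# mov rax, rbx: same opcode and ModRM, with W=1 selecting 64-bit operands.
mov_rax_rbx = bytes([rex_prefix(w=1), 0x89, 0xD8])

print(mov_eax_ebx.hex(), len(mov_eax_ebx), "bytes")
print(mov_rax_rbx.hex(), len(mov_rax_rbx), "bytes")
```

    One more byte per affected instruction, spread over fewer instructions overall, is consistent with AMD’s figures of slightly larger code and a lower instruction count.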

    © 2003-2009 by Developer Shed. All rights reserved.