
LOSTCIRCUITS

Intel Developer Forum Spring 2002
A Phoenix From The Ashes Of The Recession
(March 4, 2002, by MS)
AMD Strikes Back

While things were going fine and dandy at IDF, the second main event took place in the Palomar Hotel across the street, where AMD was forging the next rams to breach Intel's dominance in the processor market using Claw- and SledgeHammers. What AMD has accomplished is nothing short of stunning. Even though the approach to the Hammer family is evolutionary, meaning that the actual processor core was carried over from the existing K7 design, it is a rare accomplishment that the very first silicon, Rev. A0, is already fully functional, at least to the point where the CPU is capable of running a standard operating system and not just doing one addition and one multiplication at a time. The most novel feature of the Hammer family is the integration of the memory controller into the CPU itself, which allows command and address generation to run at CPU clock speed. This means that memory access latencies can be reduced by as much as a factor of 20 compared to a chipset-based memory controller (depending on the final clock speed of the CPU, which we estimate to start above 2.5 GHz).
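One plausible way to read the "factor of 20" is as a simple ratio of clock periods. The sketch below is only a back-of-the-envelope illustration: the 2.5 GHz figure is the estimate quoted above, while the 133 MHz chipset clock is our own assumption and not a number AMD provided. It covers the command and address generation step only, not the access time of the DRAM itself.

```python
# Sketch: how much shorter one clock cycle gets when command/address
# generation moves from a chipset clock to the CPU clock.

def clock_period_ns(freq_mhz: float) -> float:
    """Duration of one clock cycle in nanoseconds."""
    return 1000.0 / freq_mhz


def cycle_time_ratio(cpu_mhz: float, chipset_mhz: float) -> float:
    """How many CPU cycles fit into one chipset cycle."""
    return clock_period_ns(chipset_mhz) / clock_period_ns(cpu_mhz)


CPU_MHZ = 2500.0     # estimated starting clock quoted in the text
CHIPSET_MHZ = 133.0  # ASSUMPTION: hypothetical chipset memory controller clock

ratio = cycle_time_ratio(CPU_MHZ, CHIPSET_MHZ)
print(f"CPU cycle:     {clock_period_ns(CPU_MHZ):.2f} ns")
print(f"Chipset cycle: {clock_period_ns(CHIPSET_MHZ):.2f} ns")
print(f"Ratio:         {ratio:.1f}x")
```

With these inputs the ratio lands just under 19, which is in the neighborhood of the quoted factor; a slower assumed chipset clock or a faster CPU pushes it higher.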


AMD showed two systems: one running standard 32 bit Windows XP out of the box and doing MS Word and Excel operations, the second running 64 bit Linux with simultaneous execution of a 64 bit and a 32 bit bouncing ball graphics operation while running webserver applications in the background. Pretty impressive, to say the least, despite the fact that the AGP tunnel was not fully functional and a PCI graphics card was needed as a fallback solution.


The Hammer Family
On the left is the ClawHammer, on the right the SledgeHammer with its higher pin count, achieved by filling the center and adding another row of pins on the periphery to increase the count from 754 to 940 pins. Finally, AMD has also gone back to integrated heatspreaders to put an end to all those cracked dies and to facilitate the mounting of cooling devices of all sorts.

Under the hood of the Hammer family is the first mainstream implementation of Silicon-On-Insulator (SOI) using a 130 nm copper interconnect process. As already mentioned, nothing has changed in terms of core architecture: we are still looking at six integer units and three FPUs. However, the translation lookaside buffers (TLBs) have been further enlarged and optimized, and the pipeline depth has been increased by two stages each to 12 (integer) and 17 (FPU) to enable the higher targeted clock speeds. Needless to say, the integrated memory controller needs additional pins; the same goes for the HyperTransport I/O interface with 6.4 GB/sec bandwidth, which also serves as the interface for the inter-processor links. The ClawHammer is targeted towards the desktop market which, by extension, means that the processor will be used in single or dual configurations. This is further reflected in the memory bus width and memory support: ClawHammer will support unbuffered memory on a 64 bit wide bus.

The SledgeHammer will become the 4-8 way server equivalent, using a dual channel DDR controller (128 bit data, 16 bit ECC) and requiring Registered DDR. Unlike the nVidia solution, the dual channel DDR interface does not require an arbitrator (Crossbar Controller) since the CPU bus is 128 bit wide, meaning that the point-to-point connection runs without additional latencies. The SledgeHammer configuration requires additional I/O connections, which explains the higher pin count (940 as opposed to the 754 pins of ClawHammer). An increasingly important feature for servers is "Chip Kill", meaning that even in the event of total failure of one memory chip, the system contains enough redundancy to "kill" this particular DIMM and move the data to the next available module. The defective unit can then be hot-swapped and the system returns to normal operation without any downtime.

The AMD reference board a.k.a. Solo: Note the position of the CPU close to the DIMM slots, with the AGP tunnel on the far side of the memory slots, which is possible because the memory controller is an integral part of the CPU in the Hammer family.

The beauty of the architecture is not only the versatility of the interconnect but also the fact that, in order to implement the 64 bit mode, only the front end of the processor needed to undergo some major respinning, that is, the instruction decoders and control units, whereas everything further downstream remains fairly unaltered. The side effect is that in 32 bit operation, the only parts of the CPU not fully utilized are the additional decoders, while the core itself remains 100% usable for 32 bit operations. In other words, there is no performance penalty involved when running 32 bit operations. On a clock-for-clock basis, the estimated performance gain through the enhanced TLBs is on the order of 5% compared to the Athlon XP. Moreover, the reduced memory controller latencies result in an additional 15-20% performance gain, so that the net improvement over the Athlon XP is on the order of 20-25%.
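The 20-25% net figure is what you get by simply adding the two estimates. As a quick sanity check, the sketch below compares that additive reading with a multiplicative one, where each gain applies on top of the other. The function names are ours, and which way the two effects actually combine in real workloads is an assumption either way; the inputs are the estimates quoted above.

```python
# Sketch: two ways of combining independent performance gain estimates.

def combine_additive(gains):
    """Sum the fractional gains, e.g. 0.05 + 0.15 = 0.20."""
    return sum(gains)


def combine_multiplicative(gains):
    """Compound the fractional gains, e.g. 1.05 * 1.15 - 1."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total - 1.0


TLB_GAIN = 0.05              # enhanced TLBs, estimate from the text
MEMORY_GAINS = (0.15, 0.20)  # low and high estimate for reduced latencies

for memory_gain in MEMORY_GAINS:
    gains = [TLB_GAIN, memory_gain]
    print(f"memory gain {memory_gain:.0%}: "
          f"additive {combine_additive(gains):.1%}, "
          f"multiplicative {combine_multiplicative(gains):.1%}")
```

At gains this small the two readings differ by about one percentage point, so the quoted 20-25% holds up under either one.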

From Bridges to Tunnels

To showcase the Hammer family processors in action, a CPU alone is not enough, and what is even more impressive than delivering a CPU is the fact that the chipset, or one of its possible future iterations, was ready in time (as far as we know, silicon A1) for the counterstrike. Briefly, the 8000 series of chipset components comprises the desktop version a.k.a. 8151 AGP graphics tunnel featuring AGP 1x-8x compatibility (32 bit; 533 MHz = 2.1 GB/s bandwidth), a 6.4 GB/s upstream HyperTransport link to the CPU and a reduced pin count 800 MB/s downstream link to the AMD 8111 I/O hub. For server applications, AMD offers the 8131 PCI-X tunnel with both PCI-X 64 bit/133 MHz and PCI-X 64 bit/100 MHz interfaces, supporting Gbit Ethernet as well as Ultra 320 SCSI. AGP is not supported by the PCI-X tunnel, which makes sense.
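For the parallel buses in this lineup, peak bandwidth is just bus width times transfer rate. The sketch below runs that arithmetic on the widths and rates quoted above; the helper name is ours, and the results are theoretical peaks that ignore protocol overhead. Taking 32 bit at 533 MHz, AGP 8x comes out at about 2.1 GB/s.

```python
# Sketch: theoretical peak bandwidth of a parallel bus.

def peak_bandwidth_mb_s(width_bits: int, mega_transfers_per_s: float) -> float:
    """Bytes per transfer times millions of transfers per second."""
    return width_bits / 8 * mega_transfers_per_s


# Widths and rates as quoted in the text.
buses = {
    "AGP 8x (32 bit, 533 MHz)": (32, 533),
    "PCI-X (64 bit, 133 MHz)": (64, 133),
    "PCI-X (64 bit, 100 MHz)": (64, 100),
}

for name, (width_bits, rate) in buses.items():
    mb_s = peak_bandwidth_mb_s(width_bits, rate)
    print(f"{name}: {mb_s:.0f} MB/s ({mb_s / 1000:.2f} GB/s)")
```

The HyperTransport links are left out on purpose, since the text gives only their bandwidth and not their width and clock, and those would have to be guessed.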

Certainly interesting is the verbal transition from a North Bridge to an AGP Tunnel. Granted, bridges are always at risk of being hit by, e.g., the Titanic, which makes the tunnel solution somewhat safer, but what if the Titanic sinks or runs aground .... ?

So much for the main event and the show-stealer; on the exhibit floor, life went on.



General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.
All contents of this site are protected by international copyright laws. Reproduction of the contents even in parts is not allowed