LOSTCIRCUITS

 HP PA-8800 RISC Processor   
SMP On One Chip
(Review by MS, October 19, 2001)
Summary

At last week's Microprocessor Forum, HP's David J. C. Johnson unveiled the details of HP's latest RISC processor, which is destined to redefine performance in server-class processors. Following a relatively simple strategy, the PA-8800 combines two PA-8700 cores on a single chip to enable symmetric multiprocessing (SMP) on a single processor. Aside from bumping the core speed up to an initial 1 GHz, enhancements include a combined 35 MB of L1+L2 cache. Each core has its own L1 cache, split into separate 750 KByte instruction and data caches, for a total of 3 MB across both cores. A huge 32 MB L2 cache is placed off-chip on the same cartridge in the form of four 72 Mbit chips using "1 Transistor SRAM" technology from Enhanced Memory Systems (EMS). Conservative performance estimates for the new PA-8800 are in the range of 900/1000 SPEC 2000 int/fp units and 800,000 transactions per minute in server applications.
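The cache totals quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that "750 KByte" is a rounded ¾ MB per block, and all names in it are made up for this example.

```python
from fractions import Fraction

# Figures taken from the text; "750 KByte" is treated as 3/4 MB (assumption).
CORES = 2
L1_BLOCKS_PER_CORE = 2          # one instruction cache and one data cache
L1_BLOCK_MB = Fraction(3, 4)    # 3/4 MB per block
L2_MB = 32                      # off-chip L2, shared by both cores


def l1_total_mb():
    """Total L1 capacity in MB across both cores."""
    return CORES * L1_BLOCKS_PER_CORE * L1_BLOCK_MB


def combined_cache_mb():
    """Combined L1 + L2 capacity in MB."""
    return l1_total_mb() + L2_MB


if __name__ == "__main__":
    print(f"L1 total: {l1_total_mb()} MB")
    print(f"L1 + L2:  {combined_cache_mb()} MB")
```

Under that assumption, four L1 blocks add up to the 3 MB stated in the text, and adding the 32 MB L2 gives the combined 35 MB.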


Block diagram of the HP PA-8800 dual-core RISC processor, featuring a 128-bit bus interface with a 400 MHz data rate and 6.4 GB/sec bandwidth. Each core has separate 750 KByte instruction and data L1 caches. The 32 MB L2 cache is off-chip and shared by both cores. In detail, the L2 cache is made up of four 72 Mbit "1 Transistor SRAM" (ESRAM) chips using clamshell mounting (detailed explanation below).
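The bus and L2 figures in the caption follow from simple arithmetic, sketched here with hypothetical helper names. The text does not say what the difference between the raw chip capacity and the usable 32 MB is used for; error correction is one plausible explanation, but that is an assumption rather than something stated in the article.

```python
def bus_bandwidth_gb_per_s(width_bits, data_rate_mhz):
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes) for a parallel bus."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * data_rate_mhz * 10**6 / 10**9


def raw_capacity_mbytes(chips, mbit_per_chip):
    """Raw capacity in MB of several memory chips (8 bits per byte)."""
    return chips * mbit_per_chip / 8


if __name__ == "__main__":
    # 128-bit interface at a 400 MHz data rate, as quoted in the caption.
    print(f"Bus bandwidth: {bus_bandwidth_gb_per_s(128, 400)} GB/s")
    # Four 72 Mbit chips; the usable L2 size quoted in the text is 32 MB.
    print(f"Raw L2 capacity: {raw_capacity_mbytes(4, 72)} MB")
```

The first call reproduces the 6.4 GB/sec quoted for the bus interface. The second shows that four 72 Mbit chips hold 36 MB raw, slightly more than the 32 MB quoted as L2 capacity.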


Symmetric multiprocessing (SMP) is in most cases associated with the physical presence of several CPUs within a given system; otherwise, multiprocessing would not be possible. Multiprocessor systems have some problems, though, the most critical being cache coherency: each CPU needs to make sure that the copy of data or instructions in its cache is valid. The commonly used protocols for maintaining cache coherency are MESI and AMD's variant, MOESI, as described recently in this MP article. The drawback remains that no CPU can use the valid data held in the other CPU's cache. One possible workaround is the addition of a backside shared cache, as proposed by Multi Node Microsystems; however, this concept has not yet been implemented in real designs.
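To make the coherency problem more concrete, here is a minimal sketch of the MESI state transitions for a single cache line shared between several CPUs. It is a toy model under simplifying assumptions, not a description of any real implementation: it tracks states only, does not model data transfer or write-back timing, and leaves out the Owned state that MOESI adds. The class and method names are invented for this example.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class SnoopingBus:
    """Toy model: MESI state of ONE cache line in each CPU's cache."""

    def __init__(self, n_cpus):
        self.states = [State.INVALID] * n_cpus

    def read(self, cpu):
        if self.states[cpu] is not State.INVALID:
            return  # read hit: the local copy is valid, no state change
        # Read miss: every cache holding the line snoops the request.
        holders = [i for i, s in enumerate(self.states)
                   if s is not State.INVALID]
        for i in holders:
            # A Modified copy would be written back here; all holders
            # then keep the line in the Shared state.
            self.states[i] = State.SHARED
        self.states[cpu] = State.SHARED if holders else State.EXCLUSIVE

    def write(self, cpu):
        # The writer gains ownership; all other copies are invalidated.
        for i in range(len(self.states)):
            self.states[i] = State.MODIFIED if i == cpu else State.INVALID

    def snapshot(self):
        """States of all caches as a string, e.g. 'EI' for two CPUs."""
        return "".join(s.value for s in self.states)


if __name__ == "__main__":
    bus = SnoopingBus(2)
    for op, cpu in [("read", 0), ("read", 1), ("write", 1), ("read", 0)]:
        getattr(bus, op)(cpu)
        print(f"CPU{cpu} {op:5s} -> {bus.snapshot()}")
```

In this model, a write by one CPU always invalidates the copy held by the other, which is the bookkeeping each CPU must perform to know whether its cached copy is still valid.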

With the new PA-8800 RISC processor, HP is taking a different route: instead of using physically separate processors, the new concept places two entire PA-8700 CPU cores in the same package. This sacrifices some flexibility, since a single CPU cannot be purchased. On the other hand, since nobody uses a single CPU in a dual system anyway, it is actually a smart move that solves a variety of problems. In particular, it enables a shared L2 cache with high-bandwidth access from both cores, which eliminates the system/memory bus as a bottleneck for access to valid data. IBM took a similar approach with its Power4 processor, which also places two cores on a single die.

There are some similarities between the IBM Power4 and the HP PA-8800 RISC processor. Both run at a clock frequency of 1 GHz or greater and are built on SOI (Silicon-on-Insulator) technology. Minor differences between the two processors relate to the manufacturing process (180 nm SOI with copper interconnects and 7 metal layers for the Power4; 130 nm SOI with copper interconnects and 8 metal layers for the HP PA-8800).

Major differences between the IBM Power4 and the HP PA-8800 RISC processor are in the cache architecture. The Power4 uses a 64 kB L1 cache per core and a shared 1.5 MB L2 cache with a processor-to-L2 bandwidth of over 100 GB/sec. In addition, the IBM Power4 features an off-chip L3 cache of 32 MB.

The HP PA-8800's L1 cache is probably the largest L1 built so far, with separate 750 KByte data and instruction caches for each core. This results in no fewer than four blocks of ¾ MB each, for an unprecedented total of 3 MB of L1 cache, roughly twice as much as the combined L1+L2 on IBM's Power4. Accordingly, at 300 million transistors, the transistor count of the PA-8800 is almost twice as high as the 170 million of the IBM Power4, resulting in a die that measures 23.6 x 15.5 mm, quoted as 361 mm2. The L2 cache of the PA-8800 is off-chip and consists of four 72 Mbit "1 Transistor SRAM" chips developed by Enhanced Memory Systems.
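The comparisons in this section can be reproduced from the quoted figures. The sketch below uses only numbers given in the text; the function names are hypothetical, and the Power4 L1 size is taken as the 64 kB per core stated above.

```python
# Figures as quoted in the text.
POWER4_L1_KB_PER_CORE = 64
POWER4_CORES = 2
POWER4_L2_KB = 1536            # 1.5 MB shared L2
POWER4_TRANSISTORS_M = 170

PA8800_L1_TOTAL_KB = 3 * 1024  # 3 MB L1 across both cores
PA8800_TRANSISTORS_M = 300
PA8800_DIE_MM = (23.6, 15.5)


def power4_on_chip_cache_kb():
    """Combined L1 + L2 on the Power4 die, in kB."""
    return POWER4_L1_KB_PER_CORE * POWER4_CORES + POWER4_L2_KB


def l1_vs_power4_ratio():
    """PA-8800 L1 capacity relative to the Power4's combined L1 + L2."""
    return PA8800_L1_TOTAL_KB / power4_on_chip_cache_kb()


def transistor_ratio():
    """PA-8800 transistor count relative to the Power4."""
    return PA8800_TRANSISTORS_M / POWER4_TRANSISTORS_M


def die_area_mm2():
    """Product of the stated die dimensions, in square millimetres."""
    width, height = PA8800_DIE_MM
    return width * height


if __name__ == "__main__":
    print(f"L1 vs. Power4 L1+L2: {l1_vs_power4_ratio():.2f}x")
    print(f"Transistor ratio:    {transistor_ratio():.2f}x")
    print(f"Die area:            {die_area_mm2():.1f} mm2")
```

Both ratios come out somewhat below 2, consistent with "roughly" and "almost" twice. The product of the stated die dimensions lands a few square millimetres above the quoted 361 mm2, so one of the quoted figures is presumably rounded or slightly off.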



General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.
All contents of this site are protected by international copyright laws. Reproduction of the contents even in parts is not allowed except after written permission by the author and referral to this site.
Copyright 2002 - 2008 LostCircuits