LOSTCIRCUITS


Registered ECC DDR400 for the Athlon64 FX and Opteron
(Review by MS, October 3, 2003)


A Need For Registers

What is the purpose of a registered module, and what sets it apart from an unregistered module? It is actually quite simple: as the name says, a registered DIMM has a register chip, which is a pass-through buffer for the address and command signals. The purpose is that the chipset does not see the entire phalanx of memory chips on the module; in addition, the memory clock signal does not have to drive all those chips. Rather, there is a single chip per physical bank (henceforth called "rank") that takes the address and command signals, translates them and reroutes them to the respective memory chips. At the same time, a PLL (phase-locked loop) chip on the registered DIMM uses the original clock input to generate a second clock signal that runs synchronously with the original clock.


For the clock, it does not matter when the signal is generated. Whether it is output with a delay of one or even several cycles does not matter either, as long as it is phase-locked with the original clock, that is, the edges are matched up so that they lie on top of each other. The reason is very simple: the clock is just a square wave with a rising and a falling edge and no further information content. For the address and command signals, on the other hand, timing is critical, since the commands need to coincide with the clock boundaries. That is, they need to fall onto the rising edges of the clock signal in order to meet the setup and hold times and to ensure proper signal integrity.
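The two conditions described above can be sketched in a few lines of Python. This is a minimal illustration, not a timing model of any real module: the 200 MHz clock follows from DDR400, but the 0.6 ns setup and hold windows and the function names are assumptions chosen for the example.

```python
# Illustrative sketch only: setup/hold values are assumed, not taken from a datasheet.

CLOCK_MHZ = 200.0                  # DDR400 runs on a 200 MHz clock
PERIOD_NS = 1000.0 / CLOCK_MHZ     # 5 ns per clock cycle


def edges_aligned(delay_ns, period_ns, tol_ns=1e-9):
    """A delayed copy of the clock is phase-locked if the delay is a whole number of periods."""
    remainder = delay_ns % period_ns
    return min(remainder, period_ns - remainder) <= tol_ns


def meets_setup_hold(valid_from_ns, valid_until_ns, edge_ns, setup_ns, hold_ns):
    """The signal must be stable from setup_ns before the rising edge until hold_ns after it."""
    return valid_from_ns <= edge_ns - setup_ns and valid_until_ns >= edge_ns + hold_ns


# A clock regenerated two full cycles late still has its edges on top of the original.
print(edges_aligned(2 * PERIOD_NS, PERIOD_NS))   # delay of 10 ns
print(edges_aligned(7.5, PERIOD_NS))             # delay of 1.5 cycles

# Address valid from 4.0 ns to 6.0 ns around a rising edge at 5.0 ns.
print(meets_setup_hold(4.0, 6.0, 5.0, setup_ns=0.6, hold_ns=0.6))
# Address arriving late at 4.7 ns violates the assumed 0.6 ns setup window.
print(meets_setup_hold(4.7, 6.0, 5.0, setup_ns=0.6, hold_ns=0.6))
```

The point of the sketch is the asymmetry: the clock only has to satisfy the first check, whereas address and command signals have to satisfy the second at every rising edge.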

Buffered "Address Translation"

There are two ways to reroute addresses and commands. The first is a buffering scheme, that is, the use of temporary buffers or translation schemes that convert addresses on the fly, meaning within the same clock cycle. Since buffered DIMMs were never really fashionable, some mainboard manufacturers, foremost of all ABIT, placed the buffers in the memory path from the chipset to the DIMM slots instead. It should be clear that even the so-called on-the-fly conversion requires a minimum amount of time, which means that there will be a signal skew of some sort. It is possible to compensate for this skew by adding a "negative delay" to the signals, meaning that the addresses are generated a little ahead of the clock so that the delay and the "negative delay" cancel each other out. However, there is a limit to how much can be done with this technique, and, therefore, it is suitable only for lower operating frequencies.
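Why the buffered approach runs out of room at higher frequencies can be shown with simple arithmetic: a fixed propagation delay takes up a growing share of a shrinking clock cycle. The sketch below assumes a buffer delay of 1.5 ns purely for illustration; the figure and the function names are not taken from any real part.

```python
# Illustrative sketch only: the 1.5 ns buffer delay is an assumed value.

def clock_period_ns(clock_mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def delay_fraction(buffer_delay_ns, clock_mhz):
    """Share of one clock cycle consumed by a fixed buffer delay."""
    return buffer_delay_ns / clock_period_ns(clock_mhz)


ASSUMED_BUFFER_DELAY_NS = 1.5

for mhz in (100, 133, 166, 200):
    share = delay_fraction(ASSUMED_BUFFER_DELAY_NS, mhz)
    print(f"{mhz} MHz: period {clock_period_ns(mhz):.2f} ns, "
          f"buffer delay uses {share:.0%} of the cycle")
```

Under this assumption the same delay that costs 15% of a cycle at 100 MHz costs 30% at 200 MHz, which leaves correspondingly less margin for the "negative delay" to absorb.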

Registers Instead Of Buffers

The second technique is much cleaner; however, it has the disadvantage of inserting an additional wait state into the signaling scheme. In short, instead of trying to squeeze the signals into the same clock cycle, the register holds its output until the next rising clock edge. This way, there is no skew at all, and the single clock cycle of delay on the address and command bus only comes into play in a random access situation, where no pipelined outputs are under way simultaneously. Pipelined outputs, by contrast, are able to mask some of the initial access latencies.

In terms of application-specific performance, this means that some applications will see no difference at all, as long as streaming data transfers dominate. Other applications will show a more pronounced performance hit: anything that relies on more random data accesses or on a high proportion of writes among the transfers will sacrifice more performance. In other words, there is no rule of thumb; the performance hit needs to be evaluated on a case-by-case basis.
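The difference between streaming and random access can be made concrete with a deliberately simplified cycle count. The model below assumes that a pipelined stream pays the initial latency and the register delay only once, while every random access pays both in full. The latency of 6 cycles, the 4-cycle burst and the function names are illustrative assumptions, not measured values.

```python
# Simplified model with assumed timings; real controllers overlap and reorder accesses.

def total_cycles(n_accesses, first_latency, burst_cycles, register_delay, pipelined):
    """Clock cycles needed for n_accesses bursts under the simplified model."""
    if pipelined:
        # Streaming: latency and register delay are paid once, bursts follow back to back.
        return first_latency + register_delay + n_accesses * burst_cycles
    # Random access: every access pays the full latency plus the register delay.
    return n_accesses * (first_latency + register_delay + burst_cycles)


def register_overhead(n_accesses, first_latency, burst_cycles, pipelined):
    """Relative slowdown caused by a one-cycle register delay."""
    base = total_cycles(n_accesses, first_latency, burst_cycles, 0, pipelined)
    registered = total_cycles(n_accesses, first_latency, burst_cycles, 1, pipelined)
    return (registered - base) / base


# Assumed example: 100 accesses, 6 cycles initial latency, 4 cycles per burst.
streaming = register_overhead(100, first_latency=6, burst_cycles=4, pipelined=True)
random_access = register_overhead(100, first_latency=6, burst_cycles=4, pipelined=False)
print(f"streaming overhead: {streaming:.2%}")
print(f"random access overhead: {random_access:.2%}")
```

With these assumed numbers the streaming case loses a fraction of a percent while the purely random case loses 10%. Real workloads fall somewhere in between, which is why the hit has to be measured per application.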

On the other hand, the more precisely controlled signals may make it possible to compensate for the clock skew that naturally occurs as a function of load in higher-density system memory configurations. Therefore, it may be possible to run at tighter memory latencies by sacrificing this one single cycle of access latency. All in all, yes, there is a drawback to registered DIMMs, but there are hidden benefits as well, so what it comes down to is weighing one issue against the others and taking the lesser evil or the best of both worlds, depending on the point of view.


General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.
All contents of this site are protected by international copyright laws. Reproduction of the contents even in parts is not allowed except after written permission by the author and referral to this site.
Copyright 1998 - 2007 LostCircuits