LOSTCIRCUITS

Registered ECC DDR400 for the Athlon64 FX and Opteron
(Review by MS, October 3, 2003)


Streaming Sandra

The usual first memory test is SiSoft Sandra with Buffering and Block Prefetching enabled, which gives ballpark figures and also puts a little stress on the memory. Since the default settings use prefetching to maximize bus utilization and, moreover, stay in page as long as possible, access latencies play only a minor role in the results. It is still possible to see the impact of timing changes, mostly in tRP and tRCD, whenever a page miss occurs.


Buffering Enabled

All modules perform close to expectations, with the Kingston and Samsung trailing marginally behind the competition.

Something rather remarkable happens here. The Kingston and Samsung DIMMs take a severe beating with an almost 20% performance loss compared to the TSOP-based Legacy and Mushkin modules.

Theoretically, this should not be possible, since a memory module is merely a passive device that can either function or fail but cannot run slower than another DIMM at the same setting. Keep in mind, however, that we are looking at registered DIMMs, and the register itself can do a number of strange things. One possibility is that the register skips one beat and forwards the address and control signals not on the following but on the second rising clock edge. (Commands and addresses are only issued on the rising edge; in other words, they run in single data rate mode, as opposed to the DDR protocol employed by the data bus.)

Without a scope, there is no way to find out what the real story is. However, it is still possible to get some idea by looking at the access latencies.

Cachemem 2.65

Essentially, what we need to do here is compare three different modes of operation, or rather, three different modules, in a single 3D access latency plot. This is easier said than done, but we found a workaround: we use the Mushkin stride vs. block size matrix at 2:3:2 as the baseline and subtract it from the matrices for the Kingston and the Legacy DIMMs, both running at the same latencies, that is, 2.5:3:3 (3:3:3 was qualitatively indistinguishable from the 2.5:3:3 setting and is not shown here). In other words, the graph below shows the delta, that is, the extra CPU cycles (rather than ns intervals) of each module (Kingston vs. Legacy) at the same setting compared to the baseline established using the Mushkin DIMMs.
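The subtraction described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used for the graph: the function name `latency_delta` and all numbers are hypothetical, and the matrices are assumed to be plain lists of rows (one row per stride, one column per block size) holding latencies in CPU cycles.

```python
def latency_delta(module, baseline):
    """Return the element-wise difference (module - baseline).

    Both arguments are stride x block-size matrices of access
    latencies in CPU cycles and must have the same shape.
    """
    if len(module) != len(baseline):
        raise ValueError("matrices differ in number of rows")
    delta = []
    for row_m, row_b in zip(module, baseline):
        if len(row_m) != len(row_b):
            raise ValueError("matrices differ in number of columns")
        delta.append([m - b for m, b in zip(row_m, row_b)])
    return delta


# Hypothetical latencies, NOT measurements from this review.
baseline_cycles = [[100, 102], [110, 121]]  # stands in for the reference DIMM
module_cycles = [[100, 104], [115, 141]]    # stands in for a DIMM under test

extra_cycles = latency_delta(module_cycles, baseline_cycles)
for row in extra_cycles:
    print(row)
```

A positive entry means the module under test needs that many more CPU cycles than the baseline at the same stride and block size.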

Cachemem 2.65: Latency differences relative to the reference, the Mushkin DIMM. Lower is better.

Even though it is not 100% conclusive, it appears as if there is one full extra memory cycle of latency on every page miss. At a page size of 4096 addresses, a stride length of 256 will allow up to sixteen in-page accesses, a stride length of 512 up to eight, and so on, meaning that with every doubling of stride length, the relative number of page misses will also double. At the same time, the block size needs to be factored in with respect to hitting the page boundaries. The bottom line is that the graph shows an exponential increase of the access latency difference with longer strides, which hits roughly 20 CPU cycles at its maximum. Keep in mind that 22 CPU cycles equal one memory clock cycle here, which means that it looks like one full clock is wasted at every page miss. Since both the BIOS GUI and CPUz report the exact same latency settings, this extra cycle has to originate or disappear somewhere else. The prime candidates are either the PLL or the registers, but it is beyond our capabilities to either verify or dispel these possibilities.
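The page arithmetic in this paragraph can be checked with a short Python sketch. It assumes that stride lengths are expressed in the same units as the page size of 4096 and takes the figure of 22 CPU cycles per memory clock directly from the text; the function names are made up for illustration.

```python
PAGE_SIZE = 4096                  # page size quoted in the text
CPU_CYCLES_PER_MEMORY_CLOCK = 22  # ratio quoted in the text


def in_page_accesses(page_size, stride):
    """Number of accesses that fit into one page at a given stride."""
    return max(1, page_size // stride)


def relative_page_misses(page_size, stride):
    """Fraction of accesses that cross into a new page."""
    return 1 / in_page_accesses(page_size, stride)


def delta_in_memory_clocks(delta_cpu_cycles,
                           ratio=CPU_CYCLES_PER_MEMORY_CLOCK):
    """Convert a latency difference in CPU cycles to memory clocks."""
    return delta_cpu_cycles / ratio


for stride in (256, 512, 1024, 2048, 4096):
    print(stride,
          in_page_accesses(PAGE_SIZE, stride),
          relative_page_misses(PAGE_SIZE, stride))

# The roughly 20 CPU cycles read off the graph, in memory clocks.
print(round(delta_in_memory_clocks(20), 2))
```

Each doubling of the stride halves the number of in-page accesses and therefore doubles the share of page misses, and a difference of about 20 CPU cycles comes out at a little under one memory clock.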


General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.
All contents of this site are protected by international copyright laws. Reproduction of the contents even in parts is not allowed except after written permission by the author and referral to this site.
Copyright 1998 - 2007 LostCircuits