LOSTCIRCUITS
Registered ECC DDR400 for the Athlon64 FX and Opteron
(Review by MS, October 3, 2003)
Streaming Sandra
The usual first memory test is SiSoft Sandra with Buffering and Block Prefetching enabled, which gives ballpark figures and also puts a little stress on the memory. Because the default mode uses prefetching to maximize bus utilization and stays in page as long as possible, access latencies play only a minor role in the results. It is still possible to see the impact of changes, mostly in tRP and tRCD, whenever a page miss occurs.
Buffering Enabled

All modules perform close to expectations, with the Kingston and Samsung trailing marginally behind the competition.

Something rather remarkable happens here. The Kingston and Samsung DIMMs take a severe beating, with an almost 20% performance loss compared to the TSOP-based Legacy and Mushkin modules.

Theoretically, this should not be possible, since a memory module is merely a passive device that can either function or fail, but cannot run slower than another DIMM at the same setting. Keep in mind, however, that we are looking at registered DIMMs, and the register itself can do a number of strange things. One possibility is that the register skips one beat and forwards the address and control signals not on the next rising clock edge but on the second one. (Commands and addresses are only issued on the rising edge; in other words, they run in single data rate mode, as opposed to the DDR protocol employed by the data bus.)
Without a scope, there is no way to find out what the real story is. It is still possible, however, to get some idea by looking at the access latencies.
Cachemem 2.65
Essentially, what we need to do here is compare three different modules in a single 3D access latency plot. This is easier said than done, but we found a workaround: we use the Mushkin stride vs. block size matrix at 2:3:2 as the baseline and subtract it from the matrices for the Kingston and the Legacy DIMMs, both running at the same latencies, that is, 2.5:3:3 (3:3:3 was qualitatively indistinguishable from the 2.5:3:3 setting and is not shown here). In other words, the graph below shows the delta, that is, the extra CPU cycles (rather than ns intervals) of each module (Kingston vs. Legacy) at the same setting compared to the baseline established using the Mushkin DIMMs.
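The subtraction described above can be sketched in a few lines of Python. This is only an illustration, not Cachemem's own output format: the function name `latency_delta` and the sample numbers are hypothetical, and each matrix is assumed to hold latencies in CPU cycles indexed by block size (rows) and stride length (columns).

```python
def latency_delta(module, baseline):
    """Element-wise difference (module - baseline) of two latency matrices.

    Rows are block sizes, columns are stride lengths; values are CPU cycles.
    A positive entry means the module is slower than the baseline there.
    """
    if len(module) != len(baseline):
        raise ValueError("matrices must have the same number of rows")
    delta = []
    for row_m, row_b in zip(module, baseline):
        if len(row_m) != len(row_b):
            raise ValueError("rows must have the same length")
        delta.append([m - b for m, b in zip(row_m, row_b)])
    return delta


# Made-up numbers for illustration only (not measured data).
baseline = [[10, 12, 15], [11, 14, 20]]   # hypothetical reference module
module = [[10, 13, 18], [11, 16, 26]]     # hypothetical module under test

for row in latency_delta(module, baseline):
    print(row)
```

Plotting the resulting delta matrix over stride and block size would give a surface of the kind discussed here, with zero wherever the module matches the baseline.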

Cachemem 2.65: latency differences relative to the Mushkin DIMM reference. Lower is better.

Even though it is not 100% conclusive, it appears that there is one full extra memory cycle of latency on every page miss. At a page size of 4096 addresses, a stride length of 256 allows up to sixteen in-page accesses, a stride length of 512 up to eight, and so on, meaning that with every doubling of the stride length, the relative number of page misses also doubles. At the same time, the block size needs to be factored in with respect to hitting the page boundaries. The bottom line is that the graph shows an exponential increase of the access latency difference with longer strides, reaching roughly 20 CPU cycles at its maximum. Keep in mind that 22 CPU cycles equal one memory clock cycle here, which means that it looks as if one full clock is wasted on every page miss. Since both the BIOS GUI and CPUz report the exact same latency settings, this extra cycle has to originate somewhere else. The prime candidates are the PLL or the registers, but it is beyond our capabilities to verify or dispel either possibility.
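The stride arithmetic can be checked with a rough model. The sketch below assumes that exactly one access per page is a miss and that each miss costs one extra memory clock; it ignores block size, prefetching, and page-boundary effects, so it is a simplification rather than a reproduction of the measurement. The function names are hypothetical; the page size of 4096 and the 22 CPU cycles per memory clock are taken from the text.

```python
PAGE_SIZE = 4096           # addresses per page, as stated in the text
CYCLES_PER_MEM_CLOCK = 22  # CPU cycles per memory clock, as stated in the text


def in_page_accesses(stride, page_size=PAGE_SIZE):
    """Number of accesses that land in one page at a given stride (at least 1)."""
    return max(page_size // stride, 1)


def average_extra_cycles(stride, penalty_clocks=1):
    """Average extra CPU cycles per access under the simplified model.

    Assumption: one access per page is a miss, and each miss costs
    `penalty_clocks` additional memory clocks.
    """
    miss_fraction = 1 / in_page_accesses(stride)
    return miss_fraction * penalty_clocks * CYCLES_PER_MEM_CLOCK


for stride in (256, 512, 1024, 2048, 4096):
    print(stride, in_page_accesses(stride), average_extra_cycles(stride))
```

Under these assumptions the average penalty doubles with every doubling of the stride and tops out at 22 CPU cycles once every access is a page miss, which is in the same range as the roughly 20 cycles seen at the maximum of the graph.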