LOSTCIRCUITS
High Performance DDR DIMMs: Ups and Downs or "how do I keep my stick happy?"
(Review by MS, July 17, 2001)
DQS, DDR, SDRAM, Trace Delay and CAS latency
The big question is why DDR can clock higher than SDRAM and still maintain a low CAS latency. It is easiest to explain by looking back at old-fashioned SDRAM technology. What does CAS-2 mean? It means that the chipset issues a read command and, on the second rising edge of the clock, the data have to be ready for output from the SDRAM DIMMs. So far so good, but there is a central system clock that simultaneously addresses all peripherals, and the command can reach the DIMM with some delay relative to that clock.
At a 166 MHz bus speed, each clock cycle (tCK) is only 6 ns. Take a CAS-2 part again: the chips have to be ready to start data output after 2 x tCK minus the setup time (taken here as 2 ns). This means that the CAS has to complete within 10 ns.
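Written out as a worked equation, using the article's rounded 6 ns cycle time and its illustrative 2 ns setup time (the subscripted symbol names are ours, not from a datasheet):

```latex
\[
t_{CK} = \frac{1}{166\ \mathrm{MHz}} \approx 6\ \mathrm{ns}, \qquad
t_{\mathrm{CAS,max}} = 2\,t_{CK} - t_{\mathrm{setup}}
                     = 2 \times 6\ \mathrm{ns} - 2\ \mathrm{ns}
                     = 10\ \mathrm{ns}
\]
```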
DDR and clock forwarding
The difference in the DDR platform is subtle but very important. DDR uses a clock forwarding signal. That is, instead of relying on an independent timer, the strobe signal is forwarded along with the command bus, and for a CAS-2 part the data again have to be ready at the second rising clock edge. The key point is that the command is not delayed relative to the DIMM clock signal, because the two are synchronized with each other.
What does this mean in real life? Looking again at 166 MHz (tCK = 6 ns), a CAS-2 part needs to output data on the second rising clock edge after the command. In this case, however, the command arrives together with the clock, so there is no delay that would shave off time. In other words, the CAS needs to be faster than 12 ns, rather than faster than 10 ns as in an SDRAM part. Keep in mind that the 2 ns may be an exaggeration; real-world trace delays will vary from one board to another. For simplicity, we'll stick with this number for now.
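The comparison can be sketched in a few lines of Python. This is only an illustration of the arithmetic above: the function name `cas_window_ns` and its parameters are hypothetical, and the 2 ns command delay is the article's simplifying assumption, not a measured value.

```python
def cas_window_ns(tck_ns, cas_latency, command_delay_ns=0.0):
    """Time available for the column access, in nanoseconds.

    command_delay_ns models the delay of the command relative to the
    clock (assumed 2 ns for SDRAM in the text, 0 for clock-forwarded DDR).
    """
    return cas_latency * tck_ns - command_delay_ns


TCK_NS = 6.0  # rounded cycle time at 166 MHz, as used in the text

sdram_window = cas_window_ns(TCK_NS, 2, command_delay_ns=2.0)
ddr_window = cas_window_ns(TCK_NS, 2)

print(f"SDRAM CAS-2 window: {sdram_window:.1f} ns")  # 10.0 ns
print(f"DDR   CAS-2 window: {ddr_window:.1f} ns")    # 12.0 ns
```

With a smaller assumed delay the two windows move closer together, which is why the 2 ns figure should be read as an upper-end illustration.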
Clock for Clock: SDRAM vs. DDR
It all comes down to the lapidary statement that, clock for clock, SDRAM DIMMs need to select rows and columns faster than DDR DIMMs. Going back to the 166 MHz memory frequency, taking into account the advantage of the clock forwarding protocol and doing the math, a 166 MHz DDR DIMM needs to be roughly as fast as a 143 MHz SDRAM DIMM. 143 MHz, in turn, is by now almost a commodity speed; that is, even cheap standard parts can run at that frequency. Therefore, the high frequencies achieved are not as surprising as they originally seemed.
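The 143 MHz figure follows from asking which SDRAM clock leaves the same access window as DDR at 166 MHz. A minimal sketch, assuming the same CAS-2 setting on both sides and the 2 ns delay used earlier (the function name `equivalent_sdram_mhz` is hypothetical):

```python
def equivalent_sdram_mhz(ddr_tck_ns, cas_latency=2, command_delay_ns=2.0):
    """SDRAM clock (MHz) whose CAS window matches a DDR part at ddr_tck_ns.

    DDR window:   cas_latency * ddr_tck_ns
    SDRAM window: cas_latency * sdram_tck_ns - command_delay_ns
    Setting the two equal and solving for sdram_tck_ns gives the line below.
    """
    ddr_window_ns = cas_latency * ddr_tck_ns
    sdram_tck_ns = (ddr_window_ns + command_delay_ns) / cas_latency
    return 1000.0 / sdram_tck_ns  # convert a period in ns to MHz


# 6 ns DDR cycle -> 12 ns window -> 7 ns SDRAM cycle
print(round(equivalent_sdram_mhz(6.0)))  # 143
```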
Early Command reduces effective CAS Latency on Page Hits
On a page hit, the latency that decides how early the critical word is output is the CAS latency. On standard SDRAM controllers, the read command can usually be issued after the 3rd word is out, meaning that in a burst of 4 words, a CAS latency of 2 can be hidden behind the output of the 4th word. To rephrase, while the output buffers are releasing the 4th word, the command bus can already issue the next read command, which will result in data output on the second or third rising edge of the memory clock signal for a CAS latency of 2 or 3, respectively. This early read command can result in uninterrupted data output (at CAS-2), since the one penalty cycle is hidden by the output of the last word of the previous burst.
With DDR, the situation is similar, since the critical factor is not the number of words but the trashing of information in the output buffers. In plain terms, this means that the read command can be issued early, that is, exactly one CAS latency before the end of the burst, without trashing the data in the output buffers. This, however, also means that the only difference between a CAS-2.5 and a CAS-2 part in DDR is 1/2 additional penalty cycle on a random access, whereas on in-page accesses there is no additional penalty, since the extra 1/2 clock of latency is hidden behind the early command.
To reiterate: in an SDRAM situation, the additional latency of a CAS-3 part cannot be hidden, even on page hits. In a DDR scenario, there is no real impact of CAS-2 vs. CAS-2.5 while staying in page. Staying in page with DDR, however, is subject to somewhat different rules than with SDRAM, but we'll get there shortly.
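To put numbers on that summary, here is a rough sketch of the bookkeeping described above. It is not a controller model: the function name `extra_latency_ns` and its parameters are hypothetical, it simply encodes the claim that DDR hides the CAS difference on page hits while SDRAM does not, and it reuses the rounded 6 ns cycle time at 166 MHz.

```python
def extra_latency_ns(memory_type, cl_fast, cl_slow, tck_ns, page_hit):
    """Added delay of the slower CAS grade versus the faster one, in ns.

    Simplifying assumption taken from the text: on DDR page hits the early
    read command hides the difference completely; in every other case the
    full difference in CAS cycles is paid.
    """
    if memory_type not in ("SDRAM", "DDR"):
        raise ValueError("memory_type must be 'SDRAM' or 'DDR'")
    extra_cycles = cl_slow - cl_fast
    if memory_type == "DDR" and page_hit:
        extra_cycles = 0.0  # hidden behind the early read command
    return extra_cycles * tck_ns


TCK_NS = 6.0  # rounded cycle time at 166 MHz

# DDR, CAS-2 vs. CAS-2.5
print(extra_latency_ns("DDR", 2, 2.5, TCK_NS, page_hit=False))  # 3.0
print(extra_latency_ns("DDR", 2, 2.5, TCK_NS, page_hit=True))   # 0.0

# SDRAM, CAS-2 vs. CAS-3: the extra cycle is paid even on a page hit
print(extra_latency_ns("SDRAM", 2, 3, TCK_NS, page_hit=True))   # 6.0
```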