 LostCircuits BIOS guide    

What You Never Wanted To Know But Constantly Dared To Ask

(by MS, Timeless)


Kyle over at [H]ardOCP was the first to draw attention to the DRAM command rate and its potential importance for performance. His initial observation, in the review of the MSI KT266, was that setting the command latency to 1T increased memory bandwidth in SiSoft Sandra by approximately 25%. Business applications gain relatively little in performance because they run mainly from the cache. The situation changes with gaming applications, however. First, though, let's look at the technical background of this parameter:


DRAM Command Rate

DRAM Command Rate is synonymous with the address and command decode latency of the chipset. Briefly, this parameter governs the selection of the proper physical bank within the entire memory space of a given system. Any DIMM can have two physical banks, and only one bank can be accessed at any time. Choosing which bank is opened requires decoding the address. If only one single-sided DIMM is present in the system, the choice is made by default since there are no other possibilities.
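To make the decoding step concrete, here is a minimal sketch of how an address could be mapped to a physical bank. It assumes the banks are laid out back to back in one flat address space, which is a simplification for illustration only; the function name `chip_select` and the bank sizes are hypothetical and do not describe any particular chipset.

```python
MIB = 2 ** 20  # bytes per MiB


def chip_select(address, bank_sizes):
    """Return the index of the physical bank that holds `address`.

    Assumption: banks are mapped back to back, in order, with no
    interleaving. Real chipsets may use other mappings.
    """
    if address < 0:
        raise ValueError("address must be non-negative")
    base = 0
    for index, size in enumerate(bank_sizes):
        if address < base + size:
            return index  # this bank's chip select line would be asserted
        base += size
    raise ValueError("address beyond installed memory")


# One single-sided DIMM: a single bank, so there is nothing to decide.
single = [256 * MIB]
# Two double-sided DIMMs: four candidate banks, only one is correct.
four_banks = [256 * MIB] * 4

print(chip_select(100 * MIB, single))      # bank 0
print(chip_select(300 * MIB, four_banks))  # bank 1
```

The more banks are installed, the more comparisons the decode has to resolve before a chip select can be issued, which is the intuition behind allowing it an extra cycle.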

If a double-sided DIMM is used instead, the controller has to make an intelligent choice when issuing the so-called chip select (CS) command to select the bank in which the information is stored. If there are two double-sided DIMMs in the system, there are four possible choices and only one is correct. Decoding a larger memory space takes more time, which is why most chipsets currently offer a variable CMD rate with a choice of 1T or 2T. Since the command sequence (Bank Activate, Read) is issued in a fixed timing sequence, this additional latency applies only to the initial access, whereas all subsequent commands are queued according to the latencies set in the BIOS. Therefore, the CMD latency only affects random accesses or the time until the first word is output (tRAC). Streaming memory accesses, especially with prefetching enabled, are hardly affected.
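The difference between random and streaming accesses can be illustrated with a toy cycle count. This is only a sketch under stated assumptions: the command rate delay is paid once before the command sequence, followed by an assumed RAS-to-CAS delay (`trcd`) and CAS latency (`cas`), after which one word arrives per cycle. The default timing values are placeholders, and refinements such as DDR's two transfers per clock are ignored.

```python
def access_cycles(words, cmd_rate, trcd=2, cas=2):
    """Cycles needed to read `words` consecutive words after one activate.

    Toy model: cmd_rate (1 or 2) + tRCD + CAS cycles until the first
    word, then one additional cycle per following word.
    """
    if cmd_rate not in (1, 2):
        raise ValueError("cmd_rate must be 1 (1T) or 2 (2T)")
    if words < 1:
        raise ValueError("words must be at least 1")
    first_word = cmd_rate + trcd + cas
    return first_word + (words - 1)


def penalty_2t(words, trcd=2, cas=2):
    """Relative slowdown of 2T versus 1T for a burst of `words` words."""
    slow = access_cycles(words, 2, trcd, cas)
    fast = access_cycles(words, 1, trcd, cas)
    return slow / fast - 1


for burst in (1, 8, 64):
    print(burst, f"{penalty_2t(burst):.1%}")
```

With the assumed tRCD = 2 and CAS = 2, a single-word access costs 5 cycles at 1T and 6 cycles at 2T (20% longer), while a 64-word stream costs 68 versus 69 cycles (about 1.5% longer). The extra cycle is the same in both cases; it simply matters less the longer the stream it is spread over.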

We found, though, that the CMD rate is the most important performance factor in unified memory architecture designs, that is, where integrated graphics are used without the addition of any dedicated display caches like those used in the i810E chipset. The reason is rather obvious: graphics depend on sustained memory bandwidth rather than peak bandwidth, and the bandwidth required increases in proportion to the pixel space and the color depth. In addition, there is a certain locality to memory allocated as graphics memory, but there is also a lot of randomness. As a result, the initial access latency (tRAC) becomes a critical factor for the overall performance of integrated graphics, and the performance delta grows with the resolution. To give an example, at 1024 x 768 x 32 bpp, we have seen as much as a 60% increase in frame rates when running the Quake3 Arena Timedemo as a benchmark (SiS 650 chipset).
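The proportionality between bandwidth, pixel space and color depth is simple arithmetic, sketched below. The function names are illustrative, and the 60 Hz refresh rate is an assumption chosen only to turn a frame size into a rate. The figure covers display scan-out alone; rendering traffic comes on top of it.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size of one frame buffer in bytes."""
    return width * height * bits_per_pixel // 8


def scanout_bandwidth(width, height, bits_per_pixel, refresh_hz):
    """Bytes per second read from memory just to refresh the display."""
    return frame_bytes(width, height, bits_per_pixel) * refresh_hz


high = frame_bytes(1024, 768, 32)  # the resolution quoted in the text
low = frame_bytes(640, 480, 16)    # a smaller mode, for comparison

print(high)        # 3145728 bytes, i.e. 3 MiB per frame
print(high / low)  # 5.12 times the memory traffic per frame
print(scanout_bandwidth(1024, 768, 32, 60))  # assumed 60 Hz refresh
```

In a unified memory design all of this traffic goes over the same memory bus the CPU uses, so a fixed per-access penalty is paid more often as the resolution and color depth go up.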

One notable exception to variable CMD rate latencies is the Intel i845 (DDR) family, which does not offer a 2T CMD rate. As a consequence, the chipset can accommodate only two double-sided DIMMs for a maximum of 2 GB total system memory.

One scenario where the CMD rate is critical is the use of Registered DIMMs. A Registered DIMM uses a so-called register chip that translates the addresses and redistributes them to all the memory chips present on that particular module. The benefit is that instead of, e.g., eight physical banks with 9 chips each, the memory controller only has to drive four register chips. The disadvantage is that a register runs on the same clock edges as the system bus, but since decoding takes a certain amount of time, the address and command signals are propagated with a one-cycle delay, on the next rising clock edge. For this reason, the chipset has to specify a CMD rate of 2T whenever registered DIMMs are used; otherwise, the controller will expect the data output one cycle too early and eventually crash.
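The one-cycle delay can be pictured by treating the register as a single pipeline stage. The sketch below is a deliberately simplified model, not a description of real register silicon: whatever is latched on one rising edge is presented to the memory chips on the next. The name `through_register` and the command strings are illustrative.

```python
def through_register(commands):
    """Yield what the memory chips see on each rising clock edge.

    Simplified model: the register latches its input on every edge and
    drives the previously latched value, so each command arrives one
    cycle late. None stands for an idle cycle.
    """
    latched = None
    for command in commands:
        yield latched      # chips still see the previous cycle's value
        latched = command  # new command is captured on this edge
    yield latched          # the last command emerges one cycle later


issued = ["ACTIVATE", "READ"]
seen = list(through_register(issued))
print(seen)  # [None, 'ACTIVATE', 'READ']
```

A controller that counted its latencies from the cycle in which it issued `READ`, rather than from the cycle in which the chips actually saw it, would sample the data bus one cycle before the data is valid.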

Whether or not it is possible to run the Command Rate at 1T depends on a variety of factors, though.

As an example, we take the EPoX 8KHA to look at the impact of setting the Command Rate to either 1T or 2T:

Additional settings evaluated are the impact of changing the CAS latency from 2.5 to 2 cycles and bank interleaving from disabled to 4-bank. There are some rather interesting performance patterns:

next page:    => Benchmarks and the KT333 / KT400 chipsets =>


General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.
All contents of this site are protected by international copyright laws. Reproduction of the contents even in parts is not allowed except after written permission by the author and referral to this site.
Copyright 2002 - 2008 LostCircuits