LOSTCIRCUITS

Inside the EDDR Chip
Combining DRAM storage and SRAM speed
(Review by MS, November 27, 2000)


Summary

Currently, the memory bus constitutes the most severe performance bottleneck in personal computers. Conventional DRAM architecture can be revamped to operate at higher clock frequencies and to include Double Data Rate transfer protocols, both of which increase the peak bandwidth of the memory bus. The most crucial performance handicap, latency, is not addressed by simply making faster chips, however. Implementing a small amount of SRAM that functions as a row cache, while leaving the actual storage to the DRAM array carried over from traditional designs, offers the best of both worlds. This article looks into some design issues and illustrates, using stop-action pictures, how increased real-world bandwidth can be achieved. The performance analysis and some predictions of how average bandwidth affects system performance are posted on HardOCP.


Processor speed is growing almost exponentially. The raw power of the CPU itself closely follows the increments in clock speed, at least in synthetic benchmarks that run more or less independently of the rest of the system components, particularly the memory subsystem. In real-life situations, however, a different picture emerges: a performance ceiling caused by insufficient data availability. The main bottleneck is the memory bus. The main approach to overcoming this problem has been to increase the bandwidth. Unfortunately, increasing bandwidth provides only a temporary solution, and it benefits only specific applications that have a high locality of data and rely on consecutive page hits. In reality, page hits constitute only somewhere around 30% of all read requests. The majority of transfers still originate with page misses, which cause several penalty cycles, or latencies, before the correct data can be transferred to the CPU.
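To make the effect of the page-hit rate concrete, the following minimal sketch computes the average number of bus cycles per read as a weighted mix of page hits and page misses. The cycle counts for a hit and a miss are hypothetical placeholders chosen for illustration, not measurements from this article; only the hit rate of around 30% comes from the text above.

```python
def average_access_cycles(hit_rate, hit_cycles, miss_cycles):
    """Expected bus cycles per read for a given page-hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


# Hypothetical cycle counts, for illustration only.
HIT_CYCLES = 2    # assumed cost of a read that hits the open page
MISS_CYCLES = 6   # assumed cost of a read that misses the open page

for rate in (0.3, 0.6, 0.9):
    avg = average_access_cycles(rate, HIT_CYCLES, MISS_CYCLES)
    print(f"page-hit rate {rate:.0%}: {avg:.1f} cycles on average")
```

Under these assumptions, the average cost at a 30% hit rate stays close to the cost of a miss, which is why raising peak bandwidth alone does little for such access patterns.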

In all DRAM operations, there are three different kinds of latency. Briefly, after a bank activate command has been issued, a row within the DRAM array is selected by the Row Address Strobe (RAS) and activated. This process requires a certain amount of time, and a read command or column select command via the Column Address Strobe (CAS) cannot be issued before the entire row is ready to release the data to the adjacent sense amplifiers. Therefore, the time until the CAS can be activated is called the RAS-to-CAS Delay time (tRCD).

The next step involves the selection of a specific column address. As already mentioned, this is done by the column address strobe (CAS), which is essentially a small switch for selecting the correct column. This selection of a specific column, once again, takes a certain time which also includes setting the column select line high, latching the data into the sense amplifiers and moving the data out of the array to the global data lines. The signal strength needs to be kept as low as possible to avoid electrical crosstalk between neighboring wires. In turn, weaker signals travel more slowly and have limited reach. Therefore, in most cases, a secondary sense amplifier is embedded into the pathway to avoid deterioration of the signal integrity. All these processes require a certain amount of time which is called CAS latency.

In case a page miss is encountered while a page is still open, the DRAM array has to be restored to its native state which involves moving the data back to the cells of origin and precharging the entire array before any new command is accepted. The time required is called Precharge time or tRP.
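The three latencies described above can be sketched as a toy model of a single DRAM bank. This is an illustration under simplifying assumptions, not a description of any real controller: the class name `DramBank`, its method `read`, and the default timing of 2 cycles per step are all hypothetical, and the model ignores burst length, refresh, and the write-back details of precharge.

```python
class DramBank:
    """Toy model of one DRAM bank that tracks which row (page) is open."""

    def __init__(self, t_rcd=2, cas_latency=2, t_rp=2):
        self.t_rcd = t_rcd              # RAS-to-CAS delay, in bus cycles
        self.cas_latency = cas_latency  # CAS latency, in bus cycles
        self.t_rp = t_rp                # precharge time, in bus cycles
        self.open_row = None            # None means the bank is precharged

    def read(self, row):
        """Return the latency in bus cycles for a read from the given row."""
        if self.open_row == row:
            # Page hit: the row is already open, only CAS latency applies.
            return self.cas_latency
        cycles = self.t_rcd + self.cas_latency
        if self.open_row is not None:
            # Page miss with another row open: precharge first.
            cycles += self.t_rp
        self.open_row = row
        return cycles


bank = DramBank()
for row in (5, 5, 9):
    print(f"read row {row}: {bank.read(row)} cycles")
```

The first read opens a row in an idle bank (tRCD plus CAS latency), the second read hits the open page (CAS latency only), and the third read misses it and pays all three latencies.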

In summary, we are looking at three independent latency categories: tRCD, CAS latency and tRP. Each of these can consume either 2 or 3 bus cycles, at least in SDRAM. In DDR DIMMs, the situation is slightly different in that latencies can also be specified in half bus cycles, since data transfer occurs at both the rising and the falling edge of the clock.
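Because these latencies are counted in bus cycles, their cost in absolute time depends on the bus clock. The short sketch below converts whole and half-cycle latencies into nanoseconds. The 133 MHz bus clock is an assumed example value, not a figure taken from this article, and the function name `cycles_to_ns` is made up for illustration.

```python
def cycles_to_ns(cycles, bus_clock_mhz):
    """Convert a latency in bus cycles to nanoseconds."""
    return cycles * 1000.0 / bus_clock_mhz


BUS_CLOCK_MHZ = 133.0  # assumed example bus clock

# Whole-cycle latencies as in SDRAM, plus a half-cycle value as allowed by DDR.
for cas_latency in (2, 2.5, 3):
    ns = cycles_to_ns(cas_latency, BUS_CLOCK_MHZ)
    print(f"CAS latency {cas_latency} at {BUS_CLOCK_MHZ:.0f} MHz: {ns:.1f} ns")
```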


Copyright 1998 - 2007 LostCircuits