LOSTCIRCUITS

ASUS A7N266E
Chipset Wars
(Review by MS, February 18, 2002)
TwinBank Memory Architecture

For nVidia, the solution to speeding up the integrated graphics comes in the form of the crossbar memory controller known from the GeForce3, in a somewhat simplified form. Briefly, the TwinBank architecture means that the chipset contains two separate memory controllers (MC0 and MC1) linked together by a Single Intelligent Arbiter, which distributes the bandwidth provided by each controller to each device that requests data. In contrast to the ServerWorks chipset, the TwinBank memory architecture consists of two independently operating 64 bit buses. Each can be used as a single device offering the theoretical bandwidth of 2.1 GB/sec known from every other current DDR memory controller (at 133 MHz clock). However, they can also be combined for 4.2 GB/sec of total memory bandwidth funneled through a combined 128 bit bus width.
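The bandwidth figures quoted above follow from simple arithmetic: bus width in bytes, times the clock, times two transfers per clock for DDR. The following is a minimal sketch of that calculation; the function name and the decimal definition of a gigabyte (10^9 bytes) are my own choices for illustration, not anything taken from nVidia documentation.

```python
def ddr_peak_bandwidth(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/sec (1 GB = 1e9 bytes).

    transfers_per_clock=2 models DDR, which moves data on both clock edges.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# One 64 bit controller versus both controllers combined to 128 bit,
# at a nominal 133 MHz memory clock.
single = ddr_peak_bandwidth(64, 133)
combined = ddr_peak_bandwidth(128, 133)

print(f"single controller: {single:.3f} GB/sec")
print(f"combined:          {combined:.3f} GB/sec")
```

With a nominal 133 MHz clock this gives 2.128 and 4.256 GB/sec, which the text rounds down to 2.1 and 4.2 GB/sec. These are theoretical peaks only and say nothing about sustained throughput.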


Schematic presentation of the TwinBank Architecture with its two independent memory controllers that are managed by a so-called Single Intelligent Arbiter to distribute the memory bandwidth to the respective targets.


The two independent controllers are also visible to the operating system, which loads them as separate devices together with a dedicated master controller, as shown below.

Screenshot from the Device Manager in Win2K. The first controller is highlighted and the second one as well as the parent device are listed below.


Measuring Memory Bandwidth and Double-Redundancy Systematic Errors

Having a wide memory bus will certainly help with memory bandwidth. However, there is an inherent problem in the form of the arbiter: no matter how good and fast the arbiter is, it will introduce additional latencies at the chipset level. Combined with the fact that the front side bus to the CPU is only 64 bits wide, we can expect that neither the combined TwinBank bandwidth nor a single-bank approach will give the CPU the same bandwidth that we see in simpler controllers relying on a point-to-point protocol. This a priori handicap of the NV420 chipset will show in standard memory bandwidth benchmarks such as StreamD or SiSoft Sandra.

On second thought, however, what is measured in this case also contains a systematic error, for the simple reason that SiSoft Sandra and StreamD only measure data transfer between the CPU and system memory, whereas the entire nForce architecture is laid out as a distributed memory bus, if one can call it that. In other words, the emphasis in the nForce design has not been to provide a 128 bit memory interface to the CPU only. Rather, the design might be described as a double redundant bus with enough headroom for other devices to plug into without sacrificing CPU memory bandwidth. Needless to say, a standard memory bandwidth benchmark is not capable of addressing redundancy. Keep this in mind when looking at raw bandwidth benchmark results.
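To put rough numbers on this argument, here is a back-of-the-envelope sketch. It assumes the 64 bit front side bus runs double-pumped at 133 MHz, which is my assumption for this example and is not stated above. It also ignores arbiter latency entirely, so the results are upper bounds rather than predictions. All function and variable names are made up for illustration.

```python
def peak_gb_per_sec(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/sec (1 GB = 1e9 bytes)."""
    return bus_width_bits / 8 * clock_mhz * 1e6 * transfers_per_clock / 1e9


def cpu_visible(memory_bw, fsb_bw):
    """A CPU-only benchmark can never see more than the narrower link."""
    return min(memory_bw, fsb_bw)


def headroom(memory_bw, fsb_bw):
    """Memory bandwidth left for other devices while the CPU saturates its bus."""
    return memory_bw - cpu_visible(memory_bw, fsb_bw)


# Assumption: 64 bit front side bus, 133 MHz, two transfers per clock.
fsb = peak_gb_per_sec(64, 133)
single_bank = peak_gb_per_sec(64, 133)
twin_bank = peak_gb_per_sec(128, 133)

for name, mem in (("single bank", single_bank), ("TwinBank", twin_bank)):
    print(f"{name}: CPU sees at most {cpu_visible(mem, fsb):.3f} GB/sec, "
          f"{headroom(mem, fsb):.3f} GB/sec left for other devices")
```

Under these assumptions the CPU-visible ceiling is identical in both configurations. The difference shows up only in the headroom column, which is exactly the part a CPU-to-memory benchmark does not measure.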

Dynamic Adaptive Speculative Pre-Processor (DASP)

The DASP is yet another interesting feature built into the IGP. Basically, the DASP is a small area of the die dedicated to predicting the next set of data that will be requested, on the basis of either locality or speculative coherency, and to issuing prefetches for that data. I don't have any details on the exact parameters, but it looks pretty much like a garden-variety version of the known prefetch buffer designs. According to nVidia, a performance increase of as much as 30% can be expected in streaming applications such as StreamD or SiSoft Sandra. In other applications that rely more heavily on the CPU cache, which by extension generates more randomness in main memory accesses, the gains will naturally drop to some 5%. We'll have some data to corroborate these claims later.
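Since the actual DASP parameters are unknown, the following is only a toy model of a generic next-line prefetch buffer, not nVidia's design. The line size, buffer depth, address range and all names are hypothetical values picked for illustration. It shows why a prefetcher of this kind helps streaming access patterns far more than scattered ones.

```python
import random
from collections import deque


def prefetch_hit_rate(addresses, line_size=64, buffer_lines=4):
    """Fraction of accesses served from a toy next-line prefetch buffer.

    On every access the line after the requested one is prefetched.
    The buffer keeps only the most recently prefetched lines.
    """
    if not addresses:
        return 0.0
    buffer = deque(maxlen=buffer_lines)  # lines fetched ahead of demand
    hits = 0
    for addr in addresses:
        line = addr // line_size
        if line in buffer:
            hits += 1
        next_line = line + 1
        if next_line not in buffer:
            buffer.append(next_line)  # speculative prefetch of the next line
    return hits / len(addresses)


n = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Streaming: walk memory one line at a time, as a bandwidth benchmark does.
streaming = [i * 64 for i in range(n)]
# Scattered: addresses spread at random over a 1 GB range.
scattered = [rng.randrange(0, 1 << 30) for _ in range(n)]

stream_rate = prefetch_hit_rate(streaming)
scattered_rate = prefetch_hit_rate(scattered)

print(f"streaming hit rate: {stream_rate:.4f}")
print(f"scattered hit rate: {scattered_rate:.4f}")
```

The streaming pattern hits the buffer on every access except the first, while the scattered pattern almost never does. Note that hit rates are not performance gains, so these numbers do not map onto the 30% and 5% figures quoted by nVidia; real workloads fall somewhere between the two extremes.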



General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.
All contents of this site are protected by international copyright laws. Reproduction of the contents, even in part, is not allowed.