LOSTCIRCUITS
ASUS A7N266E Chipset Wars
(Review by MS, February 18, 2002)
For nVidia, the way to speed up the integrated graphics is the crossbar memory controller known from the GeForce3, here in a somewhat simplified form. Briefly, the TwinBank architecture means that the chipset contains two separate memory controllers (MC0 and MC1), linked by a Single Intelligent Arbiter that distributes the bandwidth provided by each controller to whichever device requests data. In contrast to the ServerWorks chipset, the TwinBank memory architecture consists of two independently operating 64-bit buses. Each can be used on its own, offering the theoretical bandwidth of 2.1 GB/sec (single controller) known from every other current DDR memory controller at 133 MHz clock. Combined, they provide 4.2 GB/sec of total memory bandwidth, funneled through an effective 128-bit bus.
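The bandwidth figures above follow from simple arithmetic: clock rate times two transfers per clock (DDR) times bus width in bytes. A minimal sketch of that calculation, where the function name and the decimal definition of a gigabyte are assumptions made for illustration:

```python
def peak_bandwidth_gb_s(clock_mhz, bus_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/s (assuming 1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9

# One 64-bit DDR controller at the nominal 133 MHz clock.
single = peak_bandwidth_gb_s(133, 64)
# Both controllers combined, i.e. an effective 128-bit bus.
combined = peak_bandwidth_gb_s(133, 128)

print(f"single controller: {single:.2f} GB/s")
print(f"combined:          {combined:.2f} GB/s")
```

The results land slightly above the rounded 2.1 and 4.2 GB/sec quoted in the text. These are theoretical peaks only and ignore any latency added by the arbiter.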

Schematic presentation of the TwinBank Architecture with its two independent memory controllers that are managed by a so-called Single Intelligent Arbiter to distribute the memory bandwidth to the respective targets.
The two independent controllers are also visible to the operating system: they are loaded as separate devices under a dedicated master controller, as shown below.

Screenshot from the Device Manager in Win2K. The first controller is highlighted and the second one as well as the parent device are listed below.
Measuring Memory Bandwidth and Double-Redundancy Systematic Errors
A wide memory bus will certainly help with memory bandwidth. However, there is an inherent problem in the form of the Arbiter: no matter how well and how fast the arbiter works, it introduces additional latencies at the chipset level. Combined with the fact that the front side bus to the CPU is only 64 bits wide, we can expect that neither the combined TwinBank configuration nor a single-bank approach will deliver the bandwidth that we see in simpler controllers relying on a point-to-point protocol. This a priori handicap of the NV420 chipset will show in standard memory bandwidth benchmarks such as StreamD or SiSoft Sandra.
On second thought, however, what is measured in this case also contains a systematic error, for the simple reason that SiSoft Sandra and Stream only measure data transfer between the CPU and the system memory, whereas the entire nForce architecture is laid out as a distributed memory bus, if one can call it that. In other words, the emphasis in the nForce design has not been to provide a 128-bit memory interface to the CPU alone. Rather, the design might be described as a double-redundant bus with enough headroom for other devices to plug into without sacrificing CPU memory bandwidth. Needless to say, a standard memory bandwidth benchmark is not capable of addressing this redundancy. Keep this in mind when looking at raw bandwidth benchmark results.
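The argument can be put into numbers with a rough budget: the CPU can never pull more than the front side bus allows, so whatever the memory controllers deliver beyond that is headroom for other devices. The sketch below is only an illustration. The FSB peak of 2.1 GB/sec is an assumption (a 64-bit bus at 133 MHz DDR), the function names are made up, and arbiter latency is ignored entirely.

```python
MEMORY_PEAK_GB_S = 4.2  # combined TwinBank figure quoted in the text
SINGLE_BANK_GB_S = 2.1  # single controller figure quoted in the text
FSB_PEAK_GB_S = 2.1     # assumption: 64-bit FSB at 133 MHz DDR

def cpu_visible_bandwidth(memory_peak, fsb_peak):
    """Upper bound on what a CPU-only benchmark can ever measure."""
    return min(memory_peak, fsb_peak)

def headroom_for_other_devices(memory_peak, cpu_demand, fsb_peak):
    """Bandwidth left for other devices once the CPU has taken its share."""
    cpu_share = min(cpu_demand, fsb_peak, memory_peak)
    return memory_peak - cpu_share

# A CPU-only benchmark sees the same ceiling with one bank or two...
print(cpu_visible_bandwidth(SINGLE_BANK_GB_S, FSB_PEAK_GB_S))
print(cpu_visible_bandwidth(MEMORY_PEAK_GB_S, FSB_PEAK_GB_S))

# ...but only the dual-bank setup leaves anything for other devices
# while the CPU is saturating its bus.
print(headroom_for_other_devices(SINGLE_BANK_GB_S, 2.1, FSB_PEAK_GB_S))
print(headroom_for_other_devices(MEMORY_PEAK_GB_S, 2.1, FSB_PEAK_GB_S))
```

Under these assumptions a benchmark that exercises only the CPU-to-memory path cannot distinguish the two configurations, which is the systematic error described above.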
Dynamic Adaptive Speculative Pre-Processor (DASP)
The DASP is yet another interesting feature built into the IGP. Basically, the DASP is a small area of the die dedicated to predicting the next set of data that will be requested, on the basis of either locality or speculative coherency, and to issuing a prefetch for that data. I don't have any details on the exact parameters, but it pretty much looks like a garden-variety prefetch buffer design. According to nVidia, a performance increase of as much as 30% can be expected in streaming applications such as StreamD or SiSoft Sandra. In other applications that rely more heavily on the CPU cache and, by extension, generate more randomness in main memory accesses, the gains will naturally drop to some 5%. We'll have some data to corroborate these claims later.
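Why streaming access profits so much more than scattered access can be seen with a toy model. The sketch below is not nVidia's DASP, whose parameters are not known here. It is a hypothetical one-entry next-line prefetch buffer, and the 64-byte line size, address range and access patterns are all assumptions chosen for illustration.

```python
import random

def hit_rate(addresses, line_size=64):
    """Fraction of accesses served by a one-entry next-line prefetch buffer."""
    prefetched_line = None
    hits = 0
    for addr in addresses:
        line = addr // line_size
        if line == prefetched_line:
            hits += 1
        # After every access, speculatively fetch the line that follows.
        prefetched_line = line + 1
    return hits / len(addresses)

# Streaming pattern: walk through memory one line at a time.
streaming = [i * 64 for i in range(10_000)]

# Scattered pattern: random addresses within a 64 MB window.
rng = random.Random(0)
scattered = [rng.randrange(0, 1 << 26) for _ in range(10_000)]

print(f"streaming hit rate: {hit_rate(streaming):.4f}")
print(f"scattered hit rate: {hit_rate(scattered):.4f}")
```

In this model the streaming walk hits the buffer on every access after the first, while the scattered pattern almost never does. The hit rates are not performance gains and should not be compared directly with the 30% and 5% figures quoted by nVidia. They only show the direction of the effect.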