LOSTCIRCUITS
Sapphire RADEON X1900 XTX: Arguably the fastest out there
(Review by MS, February 20, 2006)
The Ring Bus Memory Architecture
As mentioned earlier, cache is expensive. At the same time, the demands on the local frame buffer, that is, the on-board graphics memory, have increased dramatically over the past few years. The increasing complexity of scenes, which finally brings some approximation of photorealism to the latest games, requires more on-board memory to hold texture and geometry data. Meanwhile, the average screen resolution in high-end gaming applications has increased from 800 x 600 only about four years ago to a minimum of 1280 x 1024. At the top end, resolutions of 2000 x 1500 are no longer an exception. Screen resolution plays an important role with respect to memory requirements since a 32-bit color depth uses 4 bytes per pixel per frame. For example, at a resolution of 1600 x 1200 pixels, each frame generates roughly 8 MB (7.68 MB, to be exact) of memory traffic. At 100 frames per second, this means roughly 800 MB/sec, which is far below the capabilities of current graphics cards, but keep in mind that this is only the raw fill rate and does not even take antialiasing into account. At 4x antialiasing, this number increases to roughly 3.2 GB/sec. Keep in mind, too, that we are only looking at 100 fps here, whereas peak frame rates can easily exceed this value.
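The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes that every pixel (or every antialiasing sample) is written exactly once per frame, which ignores overdraw, texture reads and Z-buffer traffic.

```python
BYTES_PER_PIXEL = 4  # 32-bit color depth


def frame_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes needed to write every pixel of one frame exactly once."""
    return width * height * bytes_per_pixel


def write_traffic(width, height, fps, aa_samples=1):
    """Bytes per second, assuming one write per AA sample per frame."""
    return frame_bytes(width, height) * fps * aa_samples


if __name__ == "__main__":
    for aa in (1, 4):
        rate = write_traffic(1600, 1200, fps=100, aa_samples=aa)
        print(f"{aa}x sampling: {rate / 1e6:.0f} MB/sec")
```

The exact figures come out at 768 MB/sec without antialiasing and 3072 MB/sec with four samples per pixel, which the text rounds to 800 MB/sec and 3.2 GB/sec.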
All of the above is essentially old news, but the bottom line is that there is a huge requirement for memory bandwidth, if for no other reason than that the GPU can only process data that are available from memory. Admittedly, there are additional caches on the GPU that speed things up, but in the grand scheme of things, every memory architecture used so far has been pushed to its limits. In other words, there was a dire need for a new architecture that would not only plug the existing holes but also leave some headroom for future improvements. Enter the Ring Bus.

Channel granularity at 256-bit total bus width: Ring Bus (top) vs. 4-channel Crossbar switch (bottom)
(Illustration courtesy of ATI)
The alternative is the use of separate discrete channels. In system memory configurations, dual-channel memory architectures were introduced with the nForce2 and Intel's Granite Bay chipset. The difference between the two is primarily that the Granite Bay chipset and its descendants use a direct mapping approach, in which each of the memory controller's channels can access one specific area of physical memory, similar to the mapping shown above for the direct mapped cache.
The nForce2 controller, on the other hand, used a crossbar switch that allowed either channel to access each area of the memory space, albeit not simultaneously. This type of mapping is also called associative mapping (an illustration of associative mapping was shown on the last page for the caching strategies, but the same principle applies here as well).
In the world of graphics controllers, crossbar switches and associative mapping have been around for several years, but there are limits on how many channels can be implemented without incurring unreasonable arbitration latencies. As a consequence, the width of each channel had to increase along with the width of the entire memory bus interface. In the case of a 256-bit interface, each of the four channels finally grew to a width of 64 bits.
Once again, 64-bit wide channels are acceptable but come at the cost of coarser granularity and, by extension, less efficient bus utilization. On the other hand, more than four channels are unacceptable because of the level of associativity and the complexity of the design. The bottom line is that there is a need for an entirely new architecture. A new design also offers the possibility of implementing a new signalling protocol, specifically designed to work with the new hardware.
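To put a number on the granularity argument, here is a small sketch in Python. It rests on a simplifying assumption that is not taken from ATI's documentation: a request always occupies whole channels, so any unused part of a channel's width is wasted for that transfer. The split into eight 32-bit channels serves only as an example of finer granularity at the same 256-bit total width, and the function name is hypothetical.

```python
import math


def channel_utilization(request_bits, channel_bits):
    """Fraction of the occupied channel width that carries requested data.

    Assumes a request always occupies whole channels (illustrative model).
    """
    channels_used = math.ceil(request_bits / channel_bits)
    return request_bits / (channels_used * channel_bits)


if __name__ == "__main__":
    BUS_BITS = 256
    for num_channels in (4, 8):  # 4 x 64-bit vs. an assumed 8 x 32-bit split
        width = BUS_BITS // num_channels
        for request in (32, 96):
            share = channel_utilization(request, width)
            print(f"{num_channels} x {width}-bit, "
                  f"{request}-bit request: {share:.0%} used")
```

Under this model, a 32-bit request fills only half of a 64-bit channel but all of a 32-bit channel, and a 96-bit request uses 75% of two 64-bit channels versus 100% of three 32-bit channels.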