LOSTCIRCUITS

 ATI All-In-Wonder Radeon 32 MB DDR
The Return of the ATi-Knights
(Review by MS)

PIXEL TAPESTRY™ architecture

One feature setting the Charisma engine apart from the rest of the bunch is that it possesses three texture units per rendering pipeline. Most graphics processors can apply one or two textures to any given pixel per clock cycle. If multiple textures are overlaid on the same pixel, e.g. the base texture plus gloss and specular maps to give the rendered object a lifelike appearance, the number of passes required to generate and combine the individual textures depends on the capabilities of the engine. More advanced GPUs have more than one pipeline; for example, the GeForce has four pipelines, each receiving input from two texture units. The Radeon Charisma engine has two pipelines fed by three texture units each. In other words, it is capable of applying up to three textures in a single pass. Most current games cannot take advantage of this feature yet; similarly, most video benchmarks are optimized for multiples of two textures. This needs to be considered when looking at, e.g., 3DMark2000, which uses four textures for the fill rate test. In this case, the Radeon can do either 3 + 1 or 2 + 2 textures, but in either scenario it will need two passes, just like the GeForce, which, with its two texture units per pipeline, is less sophisticated but can handle the MadOnion benchmarks with greater efficiency.
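The pass counts in the paragraph above follow from a simple ceiling division. The short Python sketch below is only an illustration of that arithmetic; the function names are made up for this example, and it assumes every texture layer costs exactly one texture unit for one pass.

```python
import math

def passes_needed(textures_per_pixel: int, units_per_pipeline: int) -> int:
    """Rendering passes needed to apply all texture layers to one pixel."""
    return math.ceil(textures_per_pixel / units_per_pipeline)

def unit_utilization(textures_per_pixel: int, units_per_pipeline: int) -> float:
    """Fraction of the available texture-unit slots that do useful work."""
    passes = passes_needed(textures_per_pixel, units_per_pipeline)
    return textures_per_pixel / (passes * units_per_pipeline)

# Texture units per pipeline as given in the text above.
chips = {"Radeon": 3, "GeForce": 2}

for name, units in chips.items():
    for textures in (2, 3, 4):
        print(f"{name}: {textures} textures -> "
              f"{passes_needed(textures, units)} pass(es), "
              f"{unit_utilization(textures, units):.0%} of unit slots used")
```

With four textures both chips need two passes, but the three-unit design leaves a third of its texture slots idle, which is one way to read the "greater efficiency" remark about the four-texture fill rate test.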


Hyper Z-buffer

Depth is the feature that makes 3D applications three-dimensional. It is therefore not surprising that the Z plane (depth) plays a most important role in the generation of three-dimensional scenes. Overlapping layers do the rest to make the Z-buffer probably the busiest part of the memory subsystem and, consequently, the consumer of the majority of the bandwidth. There are several ways to reduce this traffic, that is, to conserve bandwidth.

Z-Buffer compression

Z-buffers can be divided into two separate units, the first being the internal Z-buffer integrated into the graphics engine itself, the second being the external Z-buffer, which is stored in the local frame buffer. An analogy would probably be the L1 and backside L2 cache of the Slot CPUs, not in terms of clock speed but regarding the overall speed. That is, the external Z-buffer is substantially larger than the integrated internal Z-buffer, but it is also much slower. This is compensated for by subdividing the Z-buffer into 8 or 64 pixel blocks, which are stored in compressed format. The benefits are twofold: compressed data take up much less space and also occupy less memory bandwidth during transfer to the internal Z-buffer, where they are decompressed if needed. The compression factor is somewhere between ½ and ¼, resulting in some 2-4 fold better usage of the external Z-buffer and similarly faster data transfer to the internal Z-buffer. At low resolutions, Z-buffer compression may not yield that much benefit; however, with increasing resolution, video data exceed the space of the local frame buffer and need to be stored within the AGP aperture of the system memory. Under these conditions, any compression will speed up the data transfer to the internal Z-buffer and help offset the drop in fill rate usually caused by falling back to a unified memory architecture.
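To put the compression factors above into numbers, here is a rough back-of-the-envelope sketch in Python. The 1024 x 768 resolution and the 32-bit depth value are assumptions chosen for illustration and are not taken from the review; only the factors ½ and ¼ come from the text.

```python
def z_buffer_bytes(width: int, height: int, bytes_per_z: int = 4) -> int:
    """Uncompressed size of a Z-buffer (assumes one depth value per pixel)."""
    return width * height * bytes_per_z

def compressed_bytes(raw_bytes: int, compression_factor: float) -> float:
    """Size after compression; the factor is compressed size / raw size."""
    return raw_bytes * compression_factor

# Assumed example: 1024 x 768 with 32-bit depth values.
raw = z_buffer_bytes(1024, 768)

for factor in (0.5, 0.25):  # the range quoted in the text
    small = compressed_bytes(raw, factor)
    print(f"factor {factor}: {raw / 2**20:.2f} MiB -> {small / 2**20:.2f} MiB "
          f"({raw / small:.0f}x less data to move)")
```

The same ratio applies to every transfer between the external and internal Z-buffer, which is where the 2-4 fold figure comes from.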

Fast Z-buffer clear and Hierarchical Z-buffer

Before any new frame can be rendered, the Z-buffer needs to be cleared. Clearing every single block of the internal and external Z-buffer (zero-fill of all blocks) requires an additional write step that takes up precious time and bandwidth. ATi has devised a smart workaround for this problem. Similar to formatting a hard drive the conventional way as opposed to a low-level format, the data in the external Z-buffer are merely tagged as erased, so that only the internal Z-buffer has to zero-fill its blocks. This process saves substantial time and conserves bandwidth; however, only a few games are currently demanding enough to take advantage of fast Z-buffer clear.
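The idea of tagging blocks as erased instead of overwriting them can be modelled in a few lines of Python. This is a toy sketch under stated assumptions, not ATi's implementation: the class name, the 64-value block size and the `clear_writes` counter are all invented for illustration.

```python
CLEAR_VALUE = 0.0  # the article describes the clear as a zero-fill

class BlockedZBuffer:
    """Toy external Z-buffer split into blocks, each with an 'erased' tag."""

    def __init__(self, num_blocks: int, block_size: int = 64):
        self.block_size = block_size
        self.blocks = [[CLEAR_VALUE] * block_size for _ in range(num_blocks)]
        self.erased = [True] * num_blocks
        self.clear_writes = 0  # depth values written by clear operations

    def full_clear(self) -> None:
        """Conventional clear: overwrite every single depth value."""
        for block in self.blocks:
            for i in range(self.block_size):
                block[i] = CLEAR_VALUE
                self.clear_writes += 1
        self.erased = [True] * len(self.blocks)

    def fast_clear(self) -> None:
        """Fast clear: only tag each block as erased; stale data stays put."""
        self.erased = [True] * len(self.blocks)

    def read(self, block: int, index: int) -> float:
        # An erased block reads as cleared, whatever is physically stored.
        if self.erased[block]:
            return CLEAR_VALUE
        return self.blocks[block][index]

    def write(self, block: int, index: int, depth: float) -> None:
        if self.erased[block]:
            # First write after a clear: reset the block lazily, then untag it.
            self.blocks[block] = [CLEAR_VALUE] * self.block_size
            self.erased[block] = False
        self.blocks[block][index] = depth

zbuf = BlockedZBuffer(num_blocks=4)
zbuf.write(2, 5, 0.7)
zbuf.fast_clear()
print(zbuf.read(2, 5), zbuf.clear_writes)  # stale 0.7 is hidden by the tag
zbuf.full_clear()
print(zbuf.clear_writes)                   # every value was rewritten
```

In this model the fast clear touches one tag per block, while the conventional clear touches every depth value, which is the saving the paragraph above refers to.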

Another way to reduce bandwidth is the use of a hierarchical Z-buffer. A hierarchical Z-buffer is basically a low-resolution matrix of visible pixels: the depth of any pixel is compared against this hierarchical map before it is rendered, and whatever does not match is thrown out. The hierarchical Z-buffer employs 64 pixel tiles (8 x 8), i.e., a very low resolution, which can lead to visual artifacts if such a tile coincides with the edges of objects at different Z-planes, since the wrong pixels may be rejected. A hierarchical Z-buffer can only be used to compare 3D maps after the triangle setup has been completed, but it prevents rendering of invisible pixels.
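A coarse depth test of this kind can be sketched as follows. The Python model below uses a conservative variant, assumed here for illustration, in which each 8 x 8 tile stores the farthest depth of its pixels, so that a pixel is only rejected early when it is certainly hidden; the review does not say how the Radeon computes its tile values, and a less conservative scheme could produce the edge artifacts mentioned above. All names are hypothetical, and smaller depth values are taken to mean "closer to the viewer".

```python
TILE = 8   # 8 x 8 = 64 pixel tiles, as in the text
FAR = 1.0  # assumed depth of an empty pixel (smaller values are closer)

class HierarchicalZ:
    """Full-resolution Z-buffer plus one 'farthest depth' value per tile."""

    def __init__(self, width: int, height: int):
        self.depth = [[FAR] * width for _ in range(height)]
        tiles_x = (width + TILE - 1) // TILE
        tiles_y = (height + TILE - 1) // TILE
        self.tile_far = [[FAR] * tiles_x for _ in range(tiles_y)]
        self.full_res_reads = 0  # accesses to the full-resolution buffer

    def _update_tile(self, tx: int, ty: int) -> None:
        rows = self.depth[ty * TILE:(ty + 1) * TILE]
        self.tile_far[ty][tx] = max(
            max(row[tx * TILE:(tx + 1) * TILE]) for row in rows
        )

    def draw_pixel(self, x: int, y: int, z: float) -> bool:
        """Return True if the pixel is visible and was written."""
        tx, ty = x // TILE, y // TILE
        if z > self.tile_far[ty][tx]:
            return False  # coarse reject: behind everything in this tile
        self.full_res_reads += 1
        if z >= self.depth[y][x]:
            return False  # hidden by the pixel already stored here
        self.depth[y][x] = z
        self._update_tile(tx, ty)
        return True

hz = HierarchicalZ(16, 16)
for y in range(TILE):
    for x in range(TILE):
        hz.draw_pixel(x, y, 0.3)  # cover the first tile with a near surface

reads_before = hz.full_res_reads
hidden = hz.draw_pixel(3, 3, 0.8)  # lies behind that surface
print(hidden, hz.full_res_reads - reads_before)
```

The hidden pixel is thrown out by the tile comparison alone, without touching the full-resolution buffer, which is where the bandwidth saving comes from.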


General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.
All contents of this site are protected by international copyright laws. Reproduction of the contents even in parts is not allowed except after written permission by the author and referral to this site.
Copyright 1998 - 2008 LostCircuits