
L O S T C I R C U I T S

 ASUS P4T (Intel i850 chipset, dual channel Rambus)
Heavyweight Championship Material (Review by MS)


i850 overview

The i850 chipset is, essentially, the most advanced member of the i8XX chipset family. Like its lesser family members, it features Intel's Hub architecture, meaning that the entire chipset is split into three parts:

  • the i82850 memory controller hub (MCH)
  • the i82801BA I/O controller hub (ICH2)
  • the firmware hub (FWH), an extended BIOS with integrated random number generator
As stated in numerous articles before, the term Hub architecture was coined to distinguish Intel's patented 266 MB/sec pathway between the MCH and the ICH2 from the still-prevailing PCI bus used in a standard North Bridge / South Bridge combination. The ICH2 and FWH are identical to those in the rest of the Intel chipset family and, therefore, don't deserve much attention here.


Intel i850 chipset components comprise the i82850 MCH and the 82801BA ICH with integrated UATA/100 and network controller

The main difference is the i82850 MCH, which features a dual Rambus channel interface like the i840 chipset but triples the bandwidth of the chipset-to-CPU interface by using a quad data rate (QDR) protocol at 100 MHz clock speed, giving an effective data rate of 400 Mbit/sec per pin. Based on a 64-bit wide data path, the quad data rate interface therefore has a peak bandwidth of 3.2 GB/sec.
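The arithmetic behind these figures is easy to check. The following is a minimal Python sketch; the function and variable names are ours and purely illustrative, while the 100 MHz clock, the four transfers per clock and the 64-bit data path are the figures quoted above.

```python
def peak_bandwidth_bytes_per_s(clock_hz, transfers_per_clock, bus_width_bits):
    """Theoretical peak: transfers per second times bytes per transfer."""
    transfers_per_s = clock_hz * transfers_per_clock
    return transfers_per_s * bus_width_bits // 8


# i850 CPU interface: 100 MHz clock, quad data rate, 64-bit data path.
per_pin_bits_per_s = 100_000_000 * 4
fsb_bytes_per_s = peak_bandwidth_bytes_per_s(100_000_000, 4, 64)

print(per_pin_bits_per_s / 1e6, "Mbit/s per pin")
print(fsb_bytes_per_s / 1e9, "GB/s peak")
```

Note that this is a theoretical ceiling only; it says nothing about the latencies that determine real-world throughput.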

We all know by now that peak bandwidth is rather irrelevant when it comes to actual performance. More critical are latencies, especially at higher bandwidth. Rambus memory is, unfortunately, stricken with very high latencies and, therefore, either the chipset itself or the CPU has to provide a means of compensating for the performance penalty. Intel is certainly not blind: the latency issues may be downplayed in public but are certainly taken seriously in the background. One way to compensate for access penalties is to prefetch anticipated data and to incorporate a so-called in-order queue (IOQ) into the chipset.

The IOQ

What hides behind the term in-order queue is a small amount of cache within the chipset, serving as a pipeline to buffer outstanding transactions. In other words, we are looking at a prefetch buffer, the efficiency of which (since the data have to be in order) highly depends on the locality and order of the data arriving.

The 4-level IOQ in VIA 694X-based boards was the first instance where performance measurements were possible. Even with a relatively low-latency SDRAM interface, the performance delta between a four-transaction prefetch and no prefetch (IOQ level = 1) was on the order of 10% in graphics applications (see the Tyan Trinity review).

With the i850 MCH, Intel has taken the IOQ one step further by extending the depth to 8 levels, thereby allowing eight outstanding transactions to be prefetched. Through this buffer, initial latencies can be masked, since the data have already been prefetched into the buffer and are therefore available without penalty cycles.
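How a deeper queue masks initial latency can be illustrated with a toy model. The Python sketch below is not a description of the actual i850 logic: it simply assumes that up to `depth` transactions may be outstanding at once, that each one incurs a fixed initial latency and that the data bus then transfers them one after another in order. The cycle counts (20 cycles latency, 4 cycles per transfer) and the function name are hypothetical values chosen for illustration, not measurements.

```python
def total_cycles(n_transactions, depth, latency, transfer):
    """Cycles to complete n in-order transactions with `depth` slots in flight."""
    done = []  # done[i] = cycle at which transaction i has delivered its data
    for i in range(n_transactions):
        # A slot frees up once the transaction `depth` places earlier is done.
        issue = done[i - depth] if i >= depth else 0
        # Data can flow only after the initial latency has elapsed and the
        # previous transaction has released the data bus (in-order delivery).
        previous_done = done[i - 1] if i > 0 else 0
        start = max(issue + latency, previous_done)
        done.append(start + transfer)
    return done[-1]


# Hypothetical numbers: 64 transactions, 20-cycle latency, 4-cycle transfer.
for depth in (1, 4, 8):
    print(depth, total_cycles(64, depth, latency=20, transfer=4))
```

In this model a depth of 1 pays the full latency on every transaction, whereas a sufficiently deep queue pays it only once and is limited by the data bus afterwards. The model assumes every access is a page hit, which is exactly the caveat discussed next.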

The drawback of an IOQ is that it depends on the locality of the data stored in system memory. Prefetching can be done only for data stored within the same page (or memory row), since it involves adjacent column addresses that can only be captured if the row is already open. On the other hand, since the P4 also has an extremely deep pipeline of 20 stages, the thought of complementing it with another prefetch buffer makes sense. If there is a page miss, the entire scheme is hosed anyway, so it doesn't matter too much if the IOQ needs to be cleared along with the CPU pipelines. Intel, however, claims approximately 80% page hits (of all memory accesses), depending on how well its branch prediction algorithms are working.

Whether this number is correct or not largely depends on the application. Unless software is specifically written or optimized for the SSE2 instructions of the P4, real-life numbers are certainly much lower, which is reason enough for some caution about the overall system performance of the P4. In other words, we are looking at a highly optimized system capable of delivering outstanding performance under optimized conditions, but as soon as anything goes wrong, the recovery latencies are quite brutal. There are a few design tricks to ameliorate the problems associated with the hyperpipelined architecture, which Intel refers to as the NetBurst Microarchitecture: Advanced Dynamic Execution (out-of-order execution), the Rapid Execution Engine (integer units running at double clock speed), a 400 MHz data bus and so on.
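To see how sensitive the scheme is to the page-hit rate, consider a simple weighted average of hit and miss latencies. The sketch below is a back-of-the-envelope model in Python, not a measurement: the 80% figure is Intel's claim quoted above, whereas the latencies of 10 cycles for a page hit and 40 cycles for a page miss, as well as the lower hit rates, are hypothetical values picked only to show the trend.

```python
def average_latency_cycles(hit_percent, hit_cycles, miss_cycles):
    """Expected access latency for a given page-hit rate (in percent)."""
    miss_percent = 100 - hit_percent
    return (hit_percent * hit_cycles + miss_percent * miss_cycles) / 100


HIT_CYCLES = 10   # hypothetical latency of a page hit
MISS_CYCLES = 40  # hypothetical latency of a page miss

for hit_percent in (80, 65, 50):
    average = average_latency_cycles(hit_percent, HIT_CYCLES, MISS_CYCLES)
    print(hit_percent, "% page hits:", average, "cycles on average")
```

Under these assumed numbers, dropping from 80% to 50% page hits raises the average latency from 16 to 25 cycles, which is why the real-world hit rate matters so much for a design of this kind.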

More articles on this subject from competent contemporaries, such as Paul DeMone's column, can be found on RealWorldTechnologies, so we will leave it at that. Instead, let's jump right over to the candidate in question: the ASUS P4T.



General disclaimer: This page only reflects the author's personal opinion and assumes no responsibility whatsoever regarding any of the contents or any damages that may occur explicitly or implicitly from reading the contents of this site. All names and trademarks mentioned in this review are the exclusive property of the respective parent companies.

All contents of this site are protected by international copyright laws. Reproduction of the contents even in parts is not allowed except after written permission by the author and referral to this site.
Copyright 2000 - 2001 LostCircuits