LOSTCIRCUITS
Pentium4 3.06 GHz HyperThreading and the Non-Parallel Universe
(Review by MS, Nov. 18, 2002)
A cynical way of looking at HyperThreading would be as follows: HyperThreading is an excellent tool for pooling processor resources by creating logical processors. The aim is to simultaneously execute several independent threads of poorly written code that would otherwise leave large chunks of CPU resources idle.
A glorifying point of view would be that it is possible to run several applications at the same time, and they will all execute without a performance hit as long as their resource requirements are complementary. In addition, most software is already multithreaded, so it is not necessary to have several applications running at the same time to see a benefit. The simple task of home-video editing, that is, working simultaneously on video and audio components, is enough to bring out the full advantage of HT.
A realistic point of view is that one needs to choose very carefully which applications run simultaneously, for example Photoshop 7.0 plus the McAfee virus scanner, or WinZip plus Comanche4, to see an advantage. In other words, don't try to be smart and run WinZip and McAfee at the same time. Likewise, Photoshop 7.0 and Comanche4 don't show any real performance edge, and Comanche4 and Quake3 simply don't run simultaneously, with or without HT. That leaves us with multithreaded applications where, by definition, the internal threads complement each other. In reality, it is still somewhat different in that the average video encoding job takes up more resources than its audio counterpart, so the audio portion fits comfortably into the resource gaps of the video workload. It's like eating Swiss cheese: the holes don't require any additional munching, as long as you eat them with the cheese.
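The "resource gaps" idea can be made concrete with a toy model. The sketch below is not how the Pentium 4 schedules anything; it assumes a made-up core where each thread asks for at most one execution unit per cycle (the names "fp" and "int" and the traces are hypothetical), and where two logical processors can both advance in a cycle only if they do not want the same unit.

```python
def run_smt(trace_a, trace_b):
    """Count cycles for two threads sharing one toy core.

    Each trace is a list with one entry per cycle of work: a unit
    name such as "fp" or "int", or None for a stall (a "hole").
    Both threads advance each cycle unless they request the same
    unit, in which case only one advances and priority alternates.
    """
    ia = ib = cycles = 0
    prefer_a = True
    while ia < len(trace_a) or ib < len(trace_b):
        a_active = ia < len(trace_a)
        b_active = ib < len(trace_b)
        a = trace_a[ia] if a_active else None
        b = trace_b[ib] if b_active else None
        conflict = a_active and b_active and a is not None and a == b
        if conflict:
            # Same unit wanted by both: only one thread makes progress.
            if prefer_a:
                ia += 1
            else:
                ib += 1
            prefer_a = not prefer_a
        else:
            # Different units (or a stall): both threads make progress.
            if a_active:
                ia += 1
            if b_active:
                ib += 1
        cycles += 1
    return cycles


def speedup(trace_a, trace_b):
    """Cycles when run back to back, divided by cycles when sharing."""
    return (len(trace_a) + len(trace_b)) / run_smt(trace_a, trace_b)


# Hypothetical traces: "video" leaves holes, "audio" uses another unit.
video = ["fp", None, "fp", None]
audio = ["int", "int", "int", "int"]
print(f"complementary: {speedup(video, audio):.2f}")  # 2.00

# Two threads fighting over the same unit gain nothing.
zip_a = ["fp"] * 4
zip_b = ["fp"] * 4
print(f"competing:     {speedup(zip_a, zip_b):.2f}")  # 1.00
```

The 2.00 figure is the best case of this idealized model only; it says nothing about the gains a real processor will show, merely that the benefit depends on how well the two workloads complement each other.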
Additional structures had to be added to the die to accommodate the additional instructions and to determine and track the status of each instruction within the general concept of thread-level parallelism (TLP). The additions result in a total of about 5% die overhead, which does not drive up cost unreasonably.
On the design or die level, HT requires a few but not too many changes, mostly the addition of a few buffers such as the Trace Cache Fill Buffers, Trace Cache Next IP, Instruction Streaming Buffers and Instruction TLB. Other additions include the Register Alias Tables, the Return Stack Predictor and, finally, the Next Instruction Pointer. I could try to explain everything in detail here but, first of all, it would get rather long and, second, it would probably be wrong anyway. Essentially, the additional instructions used for digesting several threads simultaneously, a.k.a. HT, need to be loaded and held in temporary buffers, and that is what the additional structures accomplish. At the same time, the processor has to tag what has already been executed and what has not, along with the exact address location in the cache by means of TLBs. The additions result in approximately 5% die overhead and some redesign but add hardly anything in terms of continuous operating expenses, that is, additional wafer costs.
Theory is theory and practice is practice. How much of a performance increase will we see in standard benchmarks? How much of a performance hit will we see in other benchmarks, and will it be necessary to turn HT on or off in the BIOS depending on what applications are on the agenda of the day? The last part of the question can be answered with a simple "No"; the first two items aren't that simple, but we have some answers there, too.