LOSTCIRCUITS
AMD Athlon64 "Venice": May Low Power be with you!
(Review by MS, May 2, 2005)
AMD Athlon 64 4000+
Decoupling Capacitors
Taking a trip down memory lane, one of the disappointments in the history of AMD processors was the release of the original Thoroughbred cores, which drew a lot of power and ran awfully hot. The entire situation has always been somewhat reminiscent of Alfred Neubauer, chief of the Mercedes F1 racing team, walking up to the “Commendatore” Enzo Ferrari in 1955, pointing at an exhaust manifold comparable to a hi-tech torch and commenting that a flamethrower was the wrong way to burn all that power, which should rather be delivered to the tires. That was in the days of Juan Manuel Fangio; now we have Michael Schumacher, and instead of the Thoroughbred revisions we are looking at the Winchester vs. Venice core. In this case, no additional metal layer was added as with the Thoroughbred; however, we know that additional decoupling capacitors were added to clean up the signals as well as the power and ground planes.
One issue that has plagued dual channel memory operation wherever it was implemented (except for the nForce2) is that, within one channel, both memory slots need to be populated with identical modules. In the case of the P4, this makes perfect sense in that the banks are simply combined to make a single 128-bit wide memory unit out of two 64-bit wide subunits; the appeal lies in the simplicity of the idea. On the other hand, regardless of which way one looks at it, the integrated AMD memory controller is a bit more sophisticated and can do more than a simple addition of 64+64.
Keep in mind also that the memory management unit, which is part of the actual CPU core, performs virtual address mapping, meaning that virtual addresses that are unique on a per-program basis can be translated to any physical address within the memory array. Therefore, all that is necessary for dual channel operation is that the controller knows which memory devices are available where. Arguably, this is a bit more complex than simply mirroring the physical addresses à la Intel over two separate channels, but it also offers a much more flexible architecture with less redundancy, because data are not read based solely on where they are in the array but rather based on whether they are needed or not. In practice, it depends on the application at hand whether there is a performance difference between the two methods. If there is a high locality of data, the difference should be negligible.
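The virtual address mapping mentioned above can be illustrated with a minimal sketch. It is a toy model, not a description of the actual Athlon64 hardware: the single-level page table, the 4 KiB page size and the names `translate`, `table_a` and `table_b` are all assumptions made for illustration. It only shows that the same virtual address in two programs can land at unrelated physical addresses.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, the common x86 default


def translate(page_table, vaddr):
    """Translate a virtual address using a toy single-level page table.

    page_table maps a virtual page number to a physical frame number.
    Real hardware walks multi-level tables; this is a simplification.
    """
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[vpn]  # raises KeyError for an unmapped page
    return frame * PAGE_SIZE + offset


# Two hypothetical programs, each with its own private mapping.
table_a = {0: 7, 1: 2}
table_b = {0: 3, 1: 9}

# The same virtual address resolves to different physical addresses.
phys_a = translate(table_a, 0x10)
phys_b = translate(table_b, 0x10)
print(hex(phys_a), hex(phys_b))
```

Because the translation is arbitrary, the physical frames backing a program do not have to sit at matching positions in two channels, which is the flexibility the paragraph above refers to.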
SSE3 instructions minus MONITOR and MWAIT (Hyperthreading only)
As we mentioned in earlier reviews, SSE3 can provide elegant ways to reduce instructions, particularly those that deal with the conversion of x87 floating point values into integer numbers. The trick in this case is that, instead of trying to determine whether the relevant value needs to be rounded up or rounded down, the SSE3 FISTTP instruction ignores the rounding mode in the floating point control word (FCW) and simply chops off the decimals. The result is that the average of all converted FPU numbers is offset by -0.5 relative to rounding to nearest, which has no relevance at all for most practical applications. In addition, SSE3 instructions allow processing of Structure of Arrays (SOA) type data for 3D geometry calculations, as opposed to the arrays of structures that are used to define Vector4 structures (xyz-alphablending) in most current graphics applications. What that means is that, instead of compiling differential vertices, the individual components are processed: all x values are added or subtracted, the same goes for the y, z and alpha values, and only the final result is used to generate the vertex. Missing from the SSE3 instruction set as implemented in the Venice core are the MONITOR and MWAIT instructions, which are relevant only for HyperThreading.
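Both points above can be checked with a small sketch. It is plain Python and does not execute any SSE3 instructions; it only mimics the arithmetic. The sample values, the vertex data and the helper names (`mean_conversion_error`, `aos_to_soa`, `sum_components`) are made up for illustration, and the truncation part assumes positive inputs (for negative values, chopping moves the result toward zero instead).

```python
import math


def mean_conversion_error(values, convert):
    """Average of (converted - original) over all values."""
    return sum(convert(v) - v for v in values) / len(values)


# Evenly spaced positive samples from 0.00 to 9.99 (hypothetical data).
samples = [i / 100 for i in range(1000)]

# Chopping the decimals versus rounding to the nearest integer.
trunc_bias = mean_conversion_error(samples, math.trunc)
nearest_bias = mean_conversion_error(samples, round)
print(trunc_bias, nearest_bias)

# Array of structures: one (x, y, z, alpha) tuple per vertex.
vertices_aos = [
    (1.0, 2.0, 3.0, 1.0),
    (0.5, 0.5, 0.5, 0.0),
    (2.0, 0.0, -1.0, 0.5),
]


def aos_to_soa(vertices):
    """Regroup per-vertex tuples into one list per component."""
    xs, ys, zs, ws = zip(*vertices)
    return {"x": list(xs), "y": list(ys), "z": list(zs), "w": list(ws)}


def sum_components(soa):
    """Add up each component stream on its own, as in the SOA layout."""
    return {name: sum(vals) for name, vals in soa.items()}


print(sum_components(aos_to_soa(vertices_aos)))
```

With the SOA layout, each component list is a contiguous stream, which is the arrangement that lends itself to processing several values per instruction.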
Athlon64-3500+ (Venice Core)