Danny and I were invited to a session given by Intel in San Francisco during the Intel Developer Forum happening this week at the Moscone Center. While I cannot currently comment on the exact nature of that session, as some of it was under NDA, I did make a couple of interesting observations. I'm sure the roadmap stuff is off-limits here, but some other things are probably OK (he said while glancing over his shoulder)... however, if asked, I'll deny ever saying this ;-)...
The talks were rather dry, but I've come to expect that from Intel. Multi-core was the hot-button topic. They kept reiterating that Moore's Law was still a long way from finally hitting the wall. So how are they doubling the transistor count? Simple: double the number of CPU cores on the die. This tells me that CPUs themselves have probably reached the top as far as their complexity goes. Part of this is because of power consumption. As the oxide layers get thinner and thinner, the amount of current leakage increases. Strained insulator technology has helped mitigate this somewhat, but it is still getting worse. So by doubling the number of CPUs on the die, they can now better manage power because whole CPU cores can be shut down to save power.
Other things of note were the 64bit items. They had an independent industry analyst get up and do a spiel that basically threw a whole bucket of cold water on the 64bit stuff and concentrated on the fact that multi-core is the hot item of the year. He touted the fact that multi-core brings the cost of multi-CPU systems down far enough for them to become a common reality in the consumer market. Unlike 64bit, which requires a complete rebuild of the applications (sans the whole 64bit .NET issue, in which .NET was only given a cursory glance), multi-core can benefit existing applications that already take advantage of multi-threading and multiple CPUs. They kept touting video transcoding, gaming, collaboration, and VoIP as applications that can immediately see the benefits of multi-core systems being ubiquitous.
As for the 64bit stuff, the Intel folks actually said that they have seen many cases where 64bit applications were slower than their 32bit counterparts! They attributed this to overburdened caches and overall code bloat, especially in pointer-intensive applications. They also stated that the advantage of 64bit was mainly increased memory addressability and not some mythical boost in performance. Sure, for memory-intensive applications (read: SQL database servers, video transcoding servers, etc.), the increased memory address space can lead to better overall performance, but that is not where the vast majority of applications spend their time. In fact, they are pushing multi-core along with better multi-threaded applications as being the path to performance, irrespective of “bitness.”
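To make the cache argument a bit more concrete, here is a rough back-of-the-envelope sketch (mine, not something Intel showed). It assumes a hypothetical linked-list node holding one 4-byte integer and two pointers, natural alignment, and a 64-byte cache line; real compilers and CPUs may lay things out differently.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def node_size(pointer_bytes, payload_bytes=4, pointer_count=2):
    """Size of a hypothetical node: one integer payload followed by pointers.

    Assumes natural alignment: each field is aligned to its own size and the
    whole node is padded to the size of its largest field.
    """
    offset = payload_bytes
    offset = align_up(offset, pointer_bytes)  # padding before the pointers
    offset += pointer_count * pointer_bytes
    return align_up(offset, max(pointer_bytes, payload_bytes))


CACHE_LINE_BYTES = 64  # assumed cache line size

for bits, pointer_bytes in ((32, 4), (64, 8)):
    size = node_size(pointer_bytes)
    print(f"{bits}-bit: {size} bytes per node, "
          f"{CACHE_LINE_BYTES // size} whole nodes per cache line")
```

Under these assumptions the same node grows from 12 to 24 bytes, so a cache of a given size holds roughly half as many of them. That is the kind of pressure a pointer-intensive application would feel after a 64bit rebuild.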
Opteron, the chip from AMD, was mentioned several times, but AMD, the company, was only mentioned once, by the analyst dude. The Athlon 64 and Athlon 64 FX chips were never mentioned, I suppose because they are the mainstream competitors to the new EM64T stuff from Intel. Actually, Danny and I were quite surprised that they acknowledged that AMD even existed. They did this mainly to head off the obvious questions that are bound to come up. In fact, they said that is why they mentioned it.
What does this mean for Delphi and Borland? Well, first of all, it certainly gives us more ammunition with which to go to management with proposals to schedule time for alternative memory managers that better take advantage of these newer architectures. As for 64bit, we are still looking at the market as to when the best time to jump in would be. Clearly it is coming... how fast and how soon... well... even Intel can't predict that. The talk from the analyst dude was quite interesting from a market trend point of view. It was what he didn't say that was most interesting. He didn't tell folks to jump right now on the 64bit bandwagon, but rather to jump on the multi-core, multi-threaded bandwagon! In fact, he played the whole “rebuild your application for 64bit” as a negative.
My web servers and db servers are already multi-threaded. Getting multiple CPUs in the same footprint as before would be welcome. What I haven't seen yet are 64-bit dual-core chips hitting the market. And so I find it more than a coincidence that 64-bit is downplayed. When dual-core 64-bit CPUs are released, we may see a new "spin" come from industry analysts.
As for pointer-intensive applications taking a performance hit, yes, we have already been getting reports that this is the case. However, the performance hit is within the 64-bit context; thus, performance gains made by the 64-bit architecture overall are offset by the "intensive use of pointers" hit in some applications. What this most probably means in real terms is that for some applications there will not be a performance increase, but I see no reason to expect slower performance. Herb Sutter found this to be the case when compiling his VC++ compiler - the same code base - in Win32 and Win64 mode. Have a look at his blog. A pointer-intensive app, to be sure.
-d
Dennis,
According to Intel, the multi-core Pentium Ds (Smithfield) will all be 64-bit enabled (EM64T), as will the multi-core Xeons.
-Peter
Dennis,
As for the pointer-intensive applications taking a hit, I think that those problems will eventually be mitigated by tuning the caches or making them more adaptive to the mode in which they are running.
Also, as Peter mentioned, all the multi-core CPUs will be EM64T enabled. Another thing to note about this is that these are going to be HT cores, so to the OS one chip will appear to be four CPUs. However, thread scheduling will be a little different because there is the notion of a shared pipeline for the HT core and shared caches between the cores. Unlike a normal SMP system, these cores are still sharing a lot of silicon.
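As a small illustration of the "one chip will appear to be four CPUs" point, the sketch below multiplies cores by hardware threads per core and compares that with what the OS reports. It is written in Python purely for illustration; `logical_cpus` is a made-up helper, and the count returned by `os.cpu_count()` is of logical processors, so it says nothing about which of them share a pipeline or a cache.

```python
import os


def logical_cpus(packages, cores_per_package, threads_per_core):
    """Number of CPUs the OS would see for a given (hypothetical) topology."""
    return packages * cores_per_package * threads_per_core


# One dual-core chip with Hyper-Threading: two cores, two threads each.
print("dual-core HT chip:", logical_cpus(1, 2, 2), "logical CPUs")

# A classic two-socket SMP box with single-core, non-HT chips.
print("2-way SMP box:", logical_cpus(2, 1, 1), "logical CPUs")

# What this machine's OS reports (may be None if it cannot be determined).
print("this machine:", os.cpu_count(), "logical CPUs")
```

A flat count like this is all a naive application sees, which is why the scheduler has to know more than the count in order to place threads sensibly on cores that share silicon.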
Allen re: HT multi-cores
Hm, the shared hardware of the HT cores will make it very difficult to predict exactly where and how the performance increase, if any, will be achieved. The equation isn't linear. It will really depend on the kind of application in question and whether Hyper-Threading is being intelligently utilized. More to learn...
-d
Dennis,
Indeed it will. However, again, cache tuning and other internal trickery will help mitigate this. What *is* interesting is that it seems that Intel is leaving NUMA up to third parties rather than trying to build it into their systems like AMD has already done with their Opteron and Athlon 64 lines.
<the Intel folks actually said that they have seen many cases where 64bit applications were actually slower than their 32bit counterparts>
I think the reason they are not giving much importance to 64-bit now is that Intel's current 64-bit x86 technology is not very good performance-wise, according to some websites.
No doubt the 4GB limitation will start to hurt very soon, even on desktops.
Allen, re: NUMA & HT
NUMA is particularly intriguing in the HT context. But since you have multiple logical processors hitting the same local cache (hyper-threads per CPU on the same NUMA node as the memory cache), it seems to me that pointer distance issues are still in play. Not as egregious as in the SMP architecture, but architecturally replicated nevertheless (albeit with smaller distances). It would be interesting to know if anybody is contemplating creating logical local caches per hyper-thread... I don't see, at the moment, why that would be a bad idea. We seem to have a mirror-within-a-mirror effect here, but the complexity isn't that bad.
-d
Here's an interesting article that is basically saying the same thing.
http://www.gotw.ca/publications/concurrency-ddj.htm
According to the article below, not all the dual-core'd "Smithfield" chips will have Hyperthreading support enabled -- only the "Extreme Edition" parts, which so far have mainly been targeted at the gaming/performance market as far as I know.
Although I can see the benefits of multi-core'd CPUs, I can't help but think that (at least initially) it will end up causing Joe Average a headache. A perfect example of this comes from a work colleague who was telling me earlier today how a VPN client he has will not run on his computer (Intel P4) due to it detecting multiple processors! It seems ridiculous that anyone would code such a limitation in this day and age, but I guess many more problems like this (and even worse, for all those multi-threaded applications which have never seen a multi-CPU machine in their life) will start rearing their ugly heads in the near future... be warned!
<http://www.theregister.co.uk/2005/03/01/intel_pentium_d/>
M. Williams,
Actually, if you read his article, which I already referred to earlier, you'll see that Sutter's VC++ compiler runs at the same speed on both Win32 and Win64, because of the "intensive use of pointers" issue. There isn't a performance loss for such applications and there isn't a performance gain. So he isn't saying the same thing as the "industry analyst".
-d
Dennis,
It wasn't the "analyst" that made the comment about the performance hit... it was the *Intel* dude. That was what I found rather odd, and was why I even bothered mentioning it. If it had been the analyst, it would have passed on by with nary a second thought.
Oh. I got the impression the "intel dude" and the "analyst" were the same guy... Sorry.
Several benchmarks show WinXP 64 around 30-40% faster on the same AMD system. The WOW64 layer lets 32bit applications run around 5% faster too :)
>the Intel folks actually said that they have seen many cases where
>64bit applications were actually slower than their 32bit counterparts!
Yep, but all the benchmarking sites and reviews have pointed out that this is an Intel problem. Athlon64 doesn't suffer from it anywhere near as much, and speed-ups with recent 64bit compiler revisions are now the norm.
Whatever that analyst said is totally wrong! :-)
Microsoft, Intel: The Time For 64-Bit is Now
http://www.internetnews.com/dev-news/article.php/3486791
I can now buy a 24 GByte RAM PC for <EUR10K.
RAM is 1000 times faster than disk. I don't care
that 64 bit code is a bit slower than 32 bit. The factor 1000 is important, the factor 2 is not.
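A rough way to put numbers on that trade-off, taking the two factors above (disk 1000 times slower than RAM, 64 bit code at worst 2 times slower) at face value. The cost model and the `relative_time` helper are illustrative assumptions, not measurements:

```python
DISK_PENALTY = 1000  # cost of a disk access relative to a RAM access (from the comment)
CODE_PENALTY = 2     # assumed worst-case slowdown of 64 bit code (from the comment)


def relative_time(disk_fraction, code_penalty=1):
    """Average cost per access, relative to 32 bit code that only touches RAM."""
    ram_fraction = 1 - disk_fraction
    return code_penalty * (ram_fraction + disk_fraction * DISK_PENALTY)


# 64 bit box with the whole working set in RAM: pays only the code penalty.
all_in_ram_64 = relative_time(0.0, CODE_PENALTY)

# Fraction of accesses a 32 bit box may send to disk before it becomes slower.
break_even = (CODE_PENALTY - 1) / (DISK_PENALTY - 1)

print(f"64 bit, all in RAM: {all_in_ram_64:.0f}x")
print(f"32 bit breaks even when {break_even:.2%} of accesses go to disk")
```

With these numbers, a 32 bit machine that has to go to disk for even about one access in a thousand is already no faster than the 64 bit machine that keeps everything in RAM.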