PC Review



Intel might revive Hyperthreading with Nehalem

 
 
lyon_wonder
Guest
Posts: n/a
 
      21st Nov 2006
http://www.vr-zone.com/?i=4322

The next generation Intel processor based on the Nehalem architecture
is clearly exciting as VR-Zone has learned. Successor to quad core
Yorkfield which forms part of the 45nm Penryn architecture, Bloomfield
will come along and sit right on top of the 45nm Nehalem desktop
processors in mid 2008. Bloomfield will have 4 cores and is capable of
8 threads like the old Hyper-Threading technology but only more
advanced. Bloomfield will contain an integrated memory controller that
requires a new socket refresh called Socket B with 1366 contact pads.
 
 
 
 
 
Yousuf Khan
Guest
Posts: n/a
 
      21st Nov 2006
lyon_wonder wrote:
> http://www.vr-zone.com/?i=4322
>
> The next generation Intel processor based on the Nehalem architecture
> is clearly exciting as VR-Zone has learned. Successor to quad core
> Yorkfield which forms part of the 45nm Penryn architecture, Bloomfield
> will come along and sit right on top of the 45nm Nehalem desktop
> processors in mid 2008. Bloomfield will have 4 cores and is capable of
> 8 threads like the old Hyper-Threading technology but only more
> advanced. Bloomfield will contain an integrated memory controller that
> requires a new socket refresh called Socket B with 1366 contact pads.


I'm guessing that this will be true hardware multithreading as opposed
to the "exploit the time between our various inefficiencies" type of
multithreading that was Hyperthreading.

Yousuf Khan
 
 
 
 
 
David Kanter
Guest
Posts: n/a
 
      26th Nov 2006

Yousuf Khan wrote:
> lyon_wonder wrote:
> I'm guessing that this will be true hardware multithreading as opposed
> to the "exploit the time between our various inefficiencies" type of
> multithreading that was Hyperthreading.


Perhaps you'd care to explain the difference between the two?

AFAIK, all multithreading relies on exploiting the time between our
various inefficiencies to improve performance...

DK

 
 
krw
Guest
Posts: n/a
 
      26th Nov 2006
In article <(E-Mail Removed)>,
(E-Mail Removed) says...
>
> Yousuf Khan wrote:
> > lyon_wonder wrote:
> > I'm guessing that this will be true hardware multithreading as opposed
> > to the "exploit the time between our various inefficiencies" type of
> > multithreading that was Hyperthreading.

>
> Perhaps you'd care to explain the difference between the two?
>
> AFAIK, all multithreading relies on exploiting the time between our
> various inefficiencies to improve performance...


Can two threads be dispatched/completed simultaneously? Does one
thread have to wait for the other to flush? These are two
improvements on the P4's implementation that I can imagine.

--
Keith
 
 
joshk18@gmail.com
Guest
Posts: n/a
 
      27th Nov 2006

krw wrote:
> In article <(E-Mail Removed)>,
> (E-Mail Removed) says...
> >
> > Yousuf Khan wrote:
> > > lyon_wonder wrote:
> > > I'm guessing that this will be true hardware multithreading as opposed
> > > to the "exploit the time between our various inefficiencies" type of
> > > multithreading that was Hyperthreading.

> >
> > Perhaps you'd care to explain the difference between the two?
> >
> > AFAIK, all multithreading relies on exploiting the time between our
> > various inefficiencies to improve performance...


> Can two threads be dispatched/completed simultaneously?


AFAIK, yes. The only hardware structures that cannot be used by both
threads simultaneously are the trace cache and decoder (see Tullsen's
PACT03 paper).

> Does one
> thread have to wait for the other to flush? These are two
> improvements on the P4's implementation that I can imagine.


I don't know the answer to that; however, I suspect only certain types
of flushes impact both threads. For instance, a TC flush would hit
both threads. However, flushing the ROB and RS should only impact one
thread, since they are statically partitioned.
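
To make the static-partitioning point concrete, here is a minimal sketch in Python. It is a toy model, not the P4's actual design: the class name PartitionedBuffer, the 8-entry size, and the uop labels are all hypothetical. It only shows why a flush in a statically partitioned structure need not disturb the other thread.

```python
class PartitionedBuffer:
    """Toy reorder buffer whose entries are split evenly between threads.

    Hypothetical model for illustration only; sizes and names are assumptions.
    """

    def __init__(self, total_entries, threads=2):
        self.capacity = total_entries // threads  # fixed share per thread
        self.entries = {t: [] for t in range(threads)}

    def allocate(self, thread, uop):
        """Add a uop to the thread's partition; return False if it is full."""
        if len(self.entries[thread]) >= self.capacity:
            return False
        self.entries[thread].append(uop)
        return True

    def flush(self, thread):
        """Discard only the given thread's in-flight uops; return how many."""
        dropped = len(self.entries[thread])
        self.entries[thread].clear()
        return dropped


rob = PartitionedBuffer(total_entries=8)
for i in range(4):
    rob.allocate(0, f"t0-uop{i}")
    rob.allocate(1, f"t1-uop{i}")

# Thread 0 cannot borrow thread 1's entries, even if thread 1 were idle.
overflow_ok = rob.allocate(0, "t0-extra")

# Flushing thread 0 leaves thread 1's partition untouched.
dropped = rob.flush(0)
print(overflow_ok, dropped, len(rob.entries[1]))
```

The trade-off visible in the sketch is that a fixed split keeps flushes local to one thread, at the cost of capping each thread at its own share of the buffer.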

Again, I'll ask what is this 'true' multithreading that Yousuf
mentions? The P4 uses simultaneous multithreading and it's just as
real as the POWER5/6, or the SoEMT used in Montecito, Niagara and the
older IBM systems (Northstar or Pulsar) and Tera's systems.
Multithreading fundamentally relies on exploiting the difference
between average IPC and peak IPC, i.e. making up for inefficiencies in
a design. Where's the beef?

DK
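
The average-versus-peak IPC argument in the post above can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the peak width of 3, the 50% stall probability, and the idea that a thread simply has some number of instructions ready each cycle. It does not model any real core.

```python
import random

PEAK_WIDTH = 3  # hypothetical peak issue width, in instructions per cycle


def ready_counts(cycles, stall_prob, rng):
    """Per-cycle count of ready instructions for one thread.

    With probability stall_prob the thread is stalled (0 ready);
    otherwise it has between 1 and PEAK_WIDTH instructions ready.
    """
    return [0 if rng.random() < stall_prob else rng.randint(1, PEAK_WIDTH)
            for _ in range(cycles)]


def ipc_single(ready):
    """Average IPC when one thread owns the whole core."""
    return sum(min(PEAK_WIDTH, r) for r in ready) / len(ready)


def ipc_smt(ready_a, ready_b):
    """Average IPC when two threads share the same issue slots."""
    issued = sum(min(PEAK_WIDTH, a + b) for a, b in zip(ready_a, ready_b))
    return issued / len(ready_a)


rng = random.Random(0)  # fixed seed so runs are repeatable
a = ready_counts(10_000, 0.5, rng)
b = ready_counts(10_000, 0.5, rng)

single_a = ipc_single(a)
single_b = ipc_single(b)
combined = ipc_smt(a, b)
print(f"thread A alone: {single_a:.2f}  thread B alone: {single_b:.2f}  "
      f"sharing the core: {combined:.2f}  peak: {PEAK_WIDTH}")
```

In this model the shared core can never do worse than either thread alone, and never better than the peak width or the sum of the two solo rates. The gain comes entirely from slots that one thread would have left empty.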

 
 
krw
Guest
Posts: n/a
 
      27th Nov 2006
In article <(E-Mail Removed)>,
(E-Mail Removed) says...
>
> krw wrote:
> > In article <(E-Mail Removed)>,
> > (E-Mail Removed) says...
> > >
> > > Yousuf Khan wrote:
> > > > lyon_wonder wrote:
> > > > I'm guessing that this will be true hardware multithreading as opposed
> > > > to the "exploit the time between our various inefficiencies" type of
> > > > multithreading that was Hyperthreading.
> > >
> > > Perhaps you'd care to explain the difference between the two?
> > >
> > > AFAIK, all multithreading relies on exploiting the time between our
> > > various inefficiencies to improve performance...

>
> > Can two threads be dispatched/completed simultaneously?

>
> AFAIK, yes. The only hardware structures that cannot be used by both
> threads simultaneously are the trace cache and decoder (see Tullsen's
> PACT03 paper).
>
> > Does one
> > thread have to wait for the other to flush? These are two
> > improvements on the P4's implementation that I can imagine.

>
> I don't know the answer to that, however, I suspect only certain types
> of flushes impact both threads. For instance, a TC flush would hit
> both threads. However, flushing the ROB and RS should only impact one
> thread, since they are statically partitioned.
>
> Again, I'll ask what is this 'true' multithreading that Yousuf
> mentions? The P4 uses simultaneous multithreading and it's just as
> real as the POWER5/6, or the SoEMT used in Montecito, Niagara and the
> older IBM systems (Northstar or Pulsar) and Tera's systems.
> Multithreading fundamentally relies on exploiting the difference
> between average IPC and peak IPC, i.e. making up for inefficiencies in
> a design. Where's the beef?


The P4 cannot dispatch or complete two instructions from opposite
threads in the same cycle (I thought *star could). This may be
caused by the trace cache limitation you mention. I'm not sure
this is all that important, and it certainly adds complication.

--
Keith
 
 
David Kanter
Guest
Posts: n/a
 
      27th Nov 2006
krw wrote:

> The P4 cannot dispatch or complete two instructions from opposite
> threads in the same cycle (I thought *star could). This may be
> caused by the trace cache limitation you mention. I'm not sure
> this is all that important and certainly adds complication.


So you're claiming that the P4 SMT is not *real* because it does round
robin retirement?

Again, this discussion is in the context of Yousuf's ridiculous claim
(which I'd note he is too chicken, or unable to come out and back up).
I don't consider this evidence that hyperthreading is some sort of
half-assed or less than 'true' multithreading.

I'd also point out that by that criterion, the POWER5 is also not
*real* SMT; it can only dispatch instructions from one thread at a time
to form a group....

I guess according to Yousuf this means that the POWER5 doesn't have
*real* multithreading either.

DK

 
 
Robert Redelmeier
Guest
Posts: n/a
 
      27th Nov 2006
David Kanter <(E-Mail Removed)> wrote in part:
> krw wrote:
>> The P4 cannot dispatch or complete two instructions from opposite
>> threads in the same cycle (I thought *star could). This may be
>> caused by the trace cache limitation you mention. I'm not sure
>> this is all that important and certainly adds complication.

>
> So you're claiming that the P4 SMT is not *real* because
> it does round robin retirement?


Sorry to butt in, but strict RR would kill the main SMT
advantage: running another thread during the 200-300 clocks
that one thread spends waiting on a RAM fetch. RR would force all
threads to block whenever one couldn't retire.

This is why SMT isn't a universal win: it is highly app-dependent.
If the app is reasonably optimized and doesn't suffer too many
memory stalls, there aren't many scraps left for a second thread.
Particularly not on a dispatch-limited arch like the Pentium 4.

> I'd also point out that by that criterion, the POWER5 is
> also not *real* SMT; it can only dispatch instructions from
> one thread at a time to form a group....


"Chunking" is not a problem so long as the CPU doesn't stall.

-- Robert
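
The point above about strict round-robin retirement can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not a description of the P4 or POWER5: one instruction retires per cycle, thread 0 has a single hypothetical 250-cycle memory stall (cycles 100 to 349), and thread 1 never stalls. The names retire, is_stalled, and STALLS are made up for the example.

```python
def is_stalled(thread, cycle, stall_windows):
    """True if the thread cannot retire during this cycle."""
    start, end = stall_windows[thread]
    return start <= cycle < end


def retire(cycles, stall_windows, strict):
    """Count instructions retired per thread, one retirement slot per cycle.

    strict=True:  only the thread whose turn it is may retire; if it is
                  stalled, the slot is wasted and the turn does not advance.
    strict=False: the turn is a preference only; a stalled thread is skipped.
    """
    retired = [0, 0]
    turn = 0
    for cycle in range(cycles):
        candidates = (turn,) if strict else (turn, 1 - turn)
        for t in candidates:
            if not is_stalled(t, cycle, stall_windows):
                retired[t] += 1
                turn = 1 - t  # the other thread gets the next preference
                break
    return retired


# Thread 0 waits on memory for cycles 100..349; thread 1 has an empty window.
STALLS = [(100, 350), (0, 0)]

strict_counts = retire(1000, STALLS, strict=True)
flexible_counts = retire(1000, STALLS, strict=False)
print("strict round robin:", strict_counts, "total", sum(strict_counts))
print("skip stalled thread:", flexible_counts, "total", sum(flexible_counts))
```

Under the strict policy the whole stall window is lost for both threads, while the skipping policy lets thread 1 use those slots. That is the sense in which strict alternation would defeat the purpose of running a second thread during a memory stall.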

 
 
krw
Guest
Posts: n/a
 
      27th Nov 2006
In article <(E-Mail Removed)>,
(E-Mail Removed) says...
> krw wrote:
>
> > The P4 cannot dispatch or complete two instructions from opposite
> > threads in the same cycle (I thought *star could). This may be
> > caused by the trace cache limitation you mention. I'm not sure
> > this is all that important and certainly adds complication.

>
> So you're claiming that the P4 SMT is not *real* because it does round
> robin retirement?


Cool your jets, Dave. Do you have to make *every* thread personal?
There is a reason I've been trying to ignore you, but I thought
this was a chance to discuss. I guess not.

It is not *I* who is claiming a DAMNED THING! No sense in
continuing...

--
Keith

 
 
Yousuf Khan
Guest
Posts: n/a
 
      28th Nov 2006
David Kanter wrote:
> Again, this discussion is in the context of Yousuf's ridiculous claim
> (which I'd note he is too chicken, or unable to come out and back up).
> I don't consider this evidence that hyperthreading is some sort of
> half-assed or less than 'true' multithreading.


Whoa there buddy-boy, I'm just seeing this thread for the first time,
didn't even know there was a reply to my reply, until today. I just
don't have time to go through every thread and follow it up diligently.

Now I'll go ignore my sleep and read through this thread just to
continue this pointless debate. :-)

Yousuf Khan
 