IBM's new memory latency reduction technology for Opterons

Yousuf Khan

Will allow IBM Opteron servers to keep running at full 667MHz speeds
even with a full complement of 8 DIMMs.

Turbo-charged memory and Opterons rev up IBM servers - Computing
"IBM hopes to distinguish itself in Opteron servers through a
patent-pending capability it calls Xcelerated Memory Technology that
will boost memory access speeds by up to 15 percent over rivals, the
firm claimed. “It’s like a turbo-charger for memory,” said Stuart McRae,
IBM system marketing manager. “It gives us the full 128GB [RAM capacity]
at the full 667MHz speed. Others will be restricted to four Dimms per
CPU whereas we will have eight.”"
http://www.computing.co.uk/itweek/news/2161520/turbo-charged-memory-opterons

Yousuf Khan
 
Tony Hill

Anyone care to throw out a guess as to just what the hell this
technology actually does? I certainly can't figure it out off the top of
my head and the above article doesn't give any details at all.
 
nobody

Tony said:
Anyone care to throw out a guess as to just what the hell this
technology actually does? I certainly can't figure it out off the top of
my head and the above article doesn't give any details at all.

Yes, the whole thing looks fishy. Unless IBM made a change to the
memory controller, it wouldn't work as described. As we all know,
Opteron has a built-in controller. Even though it is theoretically
possible that AMD makes some special-order chips for IBM, or that IBM
fabs a few Opterons for itself (after all, AMD and IBM share the
technology), the real chance of it is less than that of a snowfall in NY
tomorrow (ConEd wish ;-)
The other possibility is that IBM implements its own memory
controller in the chipset. Sure, Opteron can use an external memory
controller, but why? The overhead from going to memory via an HT link
would eat up all the advantages the external controller may offer, and
then some.
Yet another possibility is a BIOS hack (can't call it anything else)
that forces the memory to run at 667, no matter how much is installed.
But then, how do they deal with the instabilities of such a setup? With
extra DIMMs installed, the memory clock drops down for a reason, and
a corporate server ain't your overclocked gaming rig: it's first, second,
and third about stability and only then about performance.
 
George Macdonald

Tony said:
Anyone care to throw out a guess as to just what the hell this
technology actually does? I certainly can't figure it out off the top of
my head and the above article doesn't give any details at all.

Some kinda mbrd buffering, which they've managed to make compatible with
the AMD memory controller? Remember the FET switches on the 4-slot i440BX
mbrds?.. though that kinda thing, with DDR2, is way outside my area of
expertise. Hmm, is Keith gonna have to remain silent on this "patented
technology"?:)

I wonder which chipset IBM is using here??
 
YKhan

Tony said:
Anyone care to throw out a guess as to just what the hell this
technology actually does? I certainly can't figure it out off the top of
my head and the above article doesn't give any details at all.

My guess, they've found a way to arrange the memory slots in a way that
keeps all of them equidistant from each of the Opterons. Thus
propagation delays are reduced. Also let's not forget that since these
are servers, these are registered DIMMs too, so the DIMMs' own internal
buffer would help them out.

Yousuf Khan
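
To put rough numbers on the trace-length idea, here is a minimal Python sketch. It assumes a typical FR-4 effective dielectric constant of about 4.0 and a purely hypothetical 50 mm length difference between the nearest and farthest slot; neither figure comes from IBM or the article, and the function names are made up for illustration.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in vacuum, in mm per nanosecond


def bit_time_ps(mega_transfers_per_s: float) -> float:
    """Duration of one data transfer in picoseconds."""
    return 1e6 / mega_transfers_per_s


def delay_ps_per_mm(eps_eff: float) -> float:
    """Propagation delay of a board trace in ps per mm.

    eps_eff is the effective dielectric constant seen by the signal
    (assumed value, not a measured one).
    """
    return 1000.0 * math.sqrt(eps_eff) / C_MM_PER_NS


def skew_fraction(length_diff_mm: float, eps_eff: float,
                  mega_transfers_per_s: float) -> float:
    """Skew caused by a trace-length mismatch, as a fraction of one bit time."""
    skew_ps = length_diff_mm * delay_ps_per_mm(eps_eff)
    return skew_ps / bit_time_ps(mega_transfers_per_s)


# Hypothetical inputs: 50 mm mismatch, eps_eff = 4.0, DDR2-667 (667 MT/s)
fraction = skew_fraction(50.0, 4.0, 667)
print(f"bit time at DDR2-667: {bit_time_ps(667):.0f} ps")
print(f"trace delay: {delay_ps_per_mm(4.0):.2f} ps/mm")
print(f"skew as share of bit time: {fraction:.0%}")
```

Under those assumptions the skew comes to roughly a fifth of one transfer period at DDR2-667, which is why matched trace lengths matter at these speeds. Whether that is what IBM actually did is not stated anywhere in the article.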
 
krw

George Macdonald said:
Hmm, is Keith gonna have to remain silent on this "patented
technology"?:)

Yep! If one knows nothing, remaining silent is a good idea. ;-)
....wrong end of the company, though Del might know more.
 
Tony Hill

My guess, they've found a way to arrange the memory slots in a way that
keeps all of them equidistant from each of the Opterons. Thus
propagation delays are reduced. Also let's not forget that since these
are servers, these are registered DIMMs too, so the DIMMs' own internal
buffer would help them out.

I suppose that's possible, but it hardly seems worthy of a patent
and a fancy name that starts with the letter 'X' :)

Here's an odd-ball solution: the big thing they are claiming is that
they can handle 8 DIMMs/CPU at DDR2-667 speeds. From what I can glean
off the marketing fluff there is nothing that will actually improve
performance any, only that the memory speed is 15% faster (ie it's
running at DDR2-667 speeds instead of DDR2-533).

So, keeping the above in mind, maybe they've hung another memory
controller off hypertransport? 4 DIMMs connect to the CPU directly, 4
hang off a HT memory controller. It's a terrible way of doing things
and almost certainly won't improve performance at all vs. having 8
DDR2-533 DIMMs connected, but it might give them a marketing
advantage.

Ok, it's a long-shot, but I just thought I would toss this option out.
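
For reference, the raw ratio between the two speed grades can be checked with a few lines of Python. This is only a sketch of theoretical peak numbers, assuming a standard 64-bit (8-byte) DIMM data path; the helper names are invented for illustration, and nothing here says what IBM's "15 percent" actually measures.

```python
BUS_WIDTH_BYTES = 8  # assumption: standard 64-bit DIMM data bus, ECC bits excluded


def peak_bandwidth_mb_s(mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth of one memory channel in MB/s."""
    return mega_transfers_per_s * BUS_WIDTH_BYTES


def speedup_percent(fast: float, slow: float) -> float:
    """How much faster `fast` is than `slow`, in percent."""
    return (fast / slow - 1.0) * 100.0


ddr2_533 = peak_bandwidth_mb_s(533)
ddr2_667 = peak_bandwidth_mb_s(667)

print(f"DDR2-533 peak per channel: {ddr2_533} MB/s")
print(f"DDR2-667 peak per channel: {ddr2_667} MB/s")
print(f"raw speed-grade difference: {speedup_percent(667, 533):.1f}%")
```

The raw DDR2-667 to DDR2-533 ratio comes out at about 25%, not 15%, so the article's figure presumably refers to something other than the nominal transfer rate. The article doesn't say what.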
 
YKhan

Tony said:
I suppose that's possible, but it hardly seems worthy of a patent
and a fancy name that starts with the letter 'X' :)

Here's an odd-ball solution: the big thing they are claiming is that
they can handle 8 DIMMs/CPU at DDR2-667 speeds. From what I can glean
off the marketing fluff there is nothing that will actually improve
performance any, only that the memory speed is 15% faster (ie it's
running at DDR2-667 speeds instead of DDR2-533).

So, keeping the above in mind, maybe they've hung another memory
controller off hypertransport? 4 DIMMs connect to the CPU directly, 4
hang off a HT memory controller. It's a terrible way of doing things
and almost certainly won't improve performance at all vs. having 8
DDR2-533 DIMMs connected, but it might give them a marketing
advantage.

Ok, it's a long-shot, but I just thought I would toss this option out.

That would do more to hurt IBM's reputation than enhance it.

My next guess would be that maybe IBM created a repeater device? Not a
switch, because that would add latency, but a simple repeater which is
positioned equidistantly from all of the DIMMs, and then the Opteron's
memory controller would be equidistant only to the repeater. The
repeater would not reprocess the signal in any way other than to maybe
reboost the signal a bit.
 
jack

: On Wed, 02 Aug 2006 13:33:42 -0400, Yousuf Khan
: wrote:
:
: >Will allow IBM Opteron servers to keep running at full 667MHz speeds
: >even with a full complement of 8 DIMMs.
: >
: >Turbo-charged memory and Opterons rev up IBM servers - Computing
: >"IBM hopes to distinguish itself in Opteron servers through a
: >patent-pending capability it calls Xcelerated Memory Technology that
: >will boost memory access speeds by up to 15 percent over rivals, the
: >firm claimed. "It's like a turbo-charger for memory," said Stuart McRae,
: >IBM system marketing manager. "It gives us the full 128GB [RAM capacity]
: >at the full 667MHz speed. Others will be restricted to four Dimms per
: >CPU whereas we will have eight.""
:
:
: Anyone care to throw out a guess as to just what the hell this
: technology actually does? I certainly can't figure it out off the top of
: my head and the above article doesn't give any details at all.

On deck and "ready to bat," Daytripper???

j.
 
krw

On deck and "ready to bat," Daytripper???

Wrong company. ;-)
 
jack

<snip>

: > : Anyone care to throw out a guess as to just what the hell this
: > : technology actually does? I certainly can't figure it out off the top of
: > : my head and the above article doesn't give any details at all.
: >
: > On deck and "ready to bat," Daytripper???
:
: Wrong company. ;-)

Yeah but, the guy's "organic knowledge base" is immense, and he really seems
to be wired in to all the leading (bleeding?) edge stuff!

j.
 
Del Cecchi

krw said:
Yep! If one knows nothing, remaining silent is a good idea. ;-)
....wrong end of the company, though Del might know more.

My curiosity is aroused. I will try to check it out when I get back to
work. On vacation at the moment.
 
daytripper

Yeah but, the guy's "organic knowledge base" is immense, and he really seems
to be wired in to all the leading (bleeding?) edge stuff!

j.

Shucks ;-)

Honestly, I just got back from some much needed r&r and marrying off one of my
sons, and this jumped up out of nowhere. But it has captured my curiosity.

Someone already suggested some motherboard buffer devices. I'd think dropping
simple fet switches down ala hot-plug slots wouldn't make this work - too much
capacitance from all those components. Otoh, perhaps multichannel fet-muxes
and radial wiring rules might do the trick, but the wiring to 8 dimms would
likely drive up the layer count.

If one is allowed to sacrifice latency for bandwidth, those could be clocked
fanout/fanin devices as well, which would allow somewhat simpler wiring rules.
Taken to the extreme, the moral equivalent of an AMB but for DDR2 dimms could
up the bandwidth at the expense of latency and extra cost.

Gonna have to do some digging on this one...

Cheers

/daytripper
 
krw

Shucks ;-)

Honestly, I just got back from some much needed r&r and marrying off one of my
sons, and this jumped up out of nowhere. But it has captured my curiosity.

Marrying off a son is R&R? I've had one on that line for a couple
of years. ...never seen any R&R. ;-)

Someone already suggested some motherboard buffer devices. I'd think dropping
simple fet switches down ala hot-plug slots wouldn't make this work - too much
capacitance from all those components. Otoh, perhaps multichannel fet-muxes
and radial wiring rules might do the trick, but the wiring to 8 dimms would
likely drive up the layer count.

...and you think layers are a problem? It is interesting, including
the lack of detail, other than marketing.

If one is allowed to sacrifice latency for bandwidth, those could be clocked
fanout/fanin devices as well, which would allow somewhat simpler wiring rules.
Taken to the extreme, the moral equivalent of an AMB but for DDR2 dimms could
up the bandwidth at the expense of latency and extra cost.

Sure. Bandwidth is a matter of $$, latency is forever. ;-)

Gonna have to do some digging on this one...

Let us know if you see anything public. I can't even find anything
of interest private.
 
David Kanter

Let us know if you see anything public. I can't even find anything
of interest private.

Keith, do you know how the buffering for the pSeries works?

The POWER5 also has an integrated memory controller and I know they can
use either DDR1 or DDR2, due to some external buffering (I discussed
this with some of the xSeries system designers). My guess is that they
migrated this technology down to Opteron system boards...

DK
 
Tony Hill

That would do more to hurt IBM's reputation than enhance it.

Yeah, I'm not really expecting them to actually try the above. I can't
really see it being at all worthwhile, and it would probably hurt
performance more than help it.

My next guess would be that maybe IBM created a repeater device? Not a
switch, because that would add latency, but a simple repeater which is
positioned equidistantly from all of the DIMMs, and then the Opteron's
memory controller would be equidistant only to the repeater. The
repeater would not reprocess the signal in any way other than to maybe
reboost the signal a bit.

This definitely seems a bit more likely. Daytripper mentioned a
couple of options that might make sense along these lines. I would
guess that IBM doesn't need to worry too much about adding a couple
extra layers to the board, so perhaps they can do it with just tossing
some transistors at the problem and a bit of smart wiring. The
specifics of this are venturing well out of my area of expertise
though, so it's definitely possible that I'm missing something.
 
Del Cecchi

Tony said:
Yeah, I'm not really expecting them to actually try the above. I can't
really see it being at all worthwhile, and it would probably hurt
performance more than help it.

This definitely seems a bit more likely. Daytripper mentioned a
couple of options that might make sense along these lines. I would
guess that IBM doesn't need to worry too much about adding a couple
extra layers to the board, so perhaps they can do it with just tossing
some transistors at the problem and a bit of smart wiring. The
specifics of this are venturing well out of my area of expertise
though, so it's definitely possible that I'm missing something.

I asked around and IBM didn't do anything that would be considered
bizarre or stupid. That's all I feel comfortable saying.
 
daytripper

I asked around and IBM didn't do anything that would be considered
bizarre or stupid. That's all I feel comfortable saying.

What - you spent an entire 26 minutes sleuthing about just to tease us?

/daytripper (sheesh! he's no fun at all ;-)
 
krw

Keith, do you know how the buffering for the pSeries works?

The POWER5 also has an integrated memory controller and I know they can
use either DDR1 or DDR2, due to some external buffering (I discussed
this with some of the xSeries system designers). My guess is that they
migrated this technology down to Opteron system boards...

Good grief! The "Power5" memory controller is an IBM design in an
IBM system. Certainly IBM doesn't have an embedded Opteron memory
controller. Think, man!
 
David Kanter

krw said:
Good grief! The "Power5" memory controller is an IBM design in an
IBM system. Certainly IBM doesn't have an embedded Opteron memory
controller. Think, man!

I'm not an idiot Keith. It is patently obvious that the POWER5 memory
controller is proprietary. My point is that the buffering that IBM
uses in the pSeries and X3 could probably be modified to work for an
Opteron OR any other system IBM sells. I suspect that this is how they
are able to have heavily populated memory channels.

If you could comment on the substance of my posts, rather than details
which you misinterpret, I'd appreciate it.

DK
 