
Cray to buy Octigabay

 
 
Black Jack
 
      4th Mar 2004
Cray wants the low-end Opteron supercomputer and server market too.
They are already designing the RedStorm Opteron-based Strider systems:

http://makeashorterlink.com/?N27115B97

I wonder if the Octigabay 12K servers are cache-coherent or something?
I mean what's so special about them if they're just yet another
supercluster?

Yousuf Khan
 
Tony Hill
 
      4th Mar 2004

Yousuf, are you using your trolling/flamebait alias again? :>

On 3 Mar 2004 20:17:53 -0800, (E-Mail Removed) (Black
Jack) wrote:
>Cray wants the low-end Opteron supercomputer and server market too.
>They are already designing the RedStorm Opteron-based Strider systems:
>
>http://makeashorterlink.com/?N27115B97
>
>I wonder if the Octigabay 12K servers are cache-coherent or something?


Nope, or at least not in the classic sense of the term. Same deal as
with RedStorm.

>I mean what's so special about them if they're just yet another
>supercluster?


The only difference between Octigabay and traditional cluster designs
is that they're hanging their node interconnects right off of
hypertransport. Sound familiar? Yeah, they're doing pretty much the
same thing as Cray is doing with RedStorm, only on a somewhat smaller
scale, mainly targeting 12-144 processor setups.

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca
 
Black Jack
 
      8th Mar 2004
Tony Hill <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> Yousuf, are you using your trolling/flamebait alias again? :>


Yup, I'm gonna have to do all of this from my Deja...er...Google
Groups account for the next several weeks. Speed of posting will also
be consistently lower, to reflect my demotion to dial-up Internet
connections for the next several weeks. :-)

That's one thing about countries new to the Internet, they've never
heard of the older parts of the Internet like Usenet, so everything up
here is web-based. Uggh.

> On 3 Mar 2004 20:17:53 -0800, (E-Mail Removed) (Black
> Jack) wrote:
> >Cray wants the low-end Opteron supercomputer and server market too.
> >They are already designing the RedStorm Opteron-based Strider systems:
> >
> >http://makeashorterlink.com/?N27115B97
> >
> >I wonder if the Octigabay 12K servers are cache-coherent or something?

>
> Nope, or at least not in the classic sense of the term. Same deal as
> with RedStorm.


What do you mean by "classic sense" vs. any other sense? It's either
cache coherent or it's not.

Black Widow's architecture still hasn't been fully disclosed by Cray
yet, except to give a very highly undetailed statement that it is
cache-coherent.

> >I mean what's so special about them if they're just yet another
> >supercluster?

>
> The only difference between Octigabay and traditional cluster designs
> is that they're hanging their node interconnects right off of
> hypertransport. Sound familiar? Yeah, they're doing pretty much the
> same thing as Cray is doing with RedStorm, only on a somewhat smaller
> scale, mainly targeting 12-144 processor setups.


Which I think some of the Intel fanboys were criticizing not so long
ago as one of the weaknesses of AMD64. Why aren't there peripherals
directly using Hypertransport instead of going through the PCI buses?
Well, it looks like there definitely are such peripherals now.
Extremely high speed ones at that.

Yousuf Khan
 
Tony Hill
 
      8th Mar 2004
On 7 Mar 2004 21:21:47 -0800, (E-Mail Removed) (Black
Jack) wrote:
>Tony Hill <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
>> Yousuf, are you using your trolling/flamebait alias again? :>

>
>Yup, I'm gonna have to do all of this from my Deja...er...Google
>Groups account for the next several weeks. Speed of posting will also
>be consistently lower, to reflect my demotion to dial-up Internet
>connections for the next several weeks. :-)
>
>That's one thing about countries new to the Internet, they've never
>heard of the older parts of the Internet like Usenet, so everything up
>here is web-based. Uggh.


<shudder> everything web-based AND dial-up?! Yuck!

>> On 3 Mar 2004 20:17:53 -0800, (E-Mail Removed) (Black
>> Jack) wrote:
>> >Cray wants the low-end Opteron supercomputer and server market too.
>> >They are already designing the RedStorm Opteron-based Strider systems:
>> >
>> >http://makeashorterlink.com/?N27115B97
>> >
>> >I wonder if the Octigabay 12K servers are cache-coherent or something?

>>
>> Nope, or at least not in the classic sense of the term. Same deal as
>> with RedStorm.

>
>What do you mean by "classic sense" vs. any other sense? It's either
>cache coherent or it's not.
>
>Black Widow's architecture still hasn't been fully disclosed by Cray
>yet, except to give a very highly undetailed statement that it is
>cache-coherent.


Yeah, that's kind of what I'm getting at. Neither Octigabay nor the
Cray Red Storm/Black Widow thing is really cache coherent the way
that we would usually think of cache coherency. However they do have
a couple statements that claim some form of cache coherency. We had a
discussion about this a little while back, and I believe we figured
that the "coherency" only came in the fact that all remote memory
requests went through the processor and it would do coherency checks
at that time.

>> The only difference between Octigabay and traditional cluster designs
>> is that they're hanging their node interconnects right off of
>> hypertransport. Sound familiar? Yeah, they're doing pretty much the
>> same thing as Cray is doing with RedStorm, only on a somewhat smaller
>> scale, mainly targeting 12-144 processor setups.

>
>Which I think some of the Intel fanboys were criticizing not so long
>ago as one of the weaknesses of AMD64. Why aren't there peripherals
>directly using Hypertransport instead of going through the PCI buses?
>Well, it looks like there definitely are such peripherals now.
>Extremely high speed ones at that.


Yup. Only a few special cases, but the potential is definitely there.
AMD did document this a while back, suggesting that custom chips could
connect directly to hypertransport links in some of their early Hammer
presentations. The possibility is definitely there, though the cost
of developing such a chip probably isn't worth it for all except a
tiny few cases. Going through an HT to PCI-X bridge and connecting
your I/O devices through PCI-X is probably fast enough for almost
everything and certainly much cheaper.
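
To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch. The figures are assumptions that do not come from the thread: a 64-bit PCI-X bus at 133 MHz, and a 16-bit HyperTransport link clocked at 800 MHz with two transfers per clock, which is what early Opterons are generally quoted as having. These are nominal peaks, not measured throughput.

```python
# Back-of-the-envelope peak bandwidth comparison.
# All figures are assumed nominal peaks, not measurements.

def peak_bandwidth_mb_s(width_bits: int, clock_mhz: float,
                        transfers_per_clock: int = 1) -> float:
    """Peak bandwidth in MB/s for a parallel link, one direction."""
    return width_bits / 8 * clock_mhz * transfers_per_clock

# PCI-X: 64 bits wide, 133 MHz, one transfer per clock, shared bus.
pcix = peak_bandwidth_mb_s(width_bits=64, clock_mhz=133)

# HyperTransport: 16 bits wide, 800 MHz, double data rate, per direction.
ht = peak_bandwidth_mb_s(width_bits=16, clock_mhz=800, transfers_per_clock=2)

print(f"PCI-X 64-bit/133 MHz : {pcix:.0f} MB/s")
print(f"HT 16-bit/800 MHz DDR: {ht:.0f} MB/s per direction")
print(f"ratio                : {ht / pcix:.1f}x")
```

Under these assumptions a direct HT attachment has roughly three times the peak bandwidth of PCI-X in each direction, which only matters for devices that can actually saturate PCI-X, such as a cluster interconnect.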

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca
 
Felger Carbon
 
      8th Mar 2004
"Black Jack" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Tony Hill <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
>
>> What do you mean by "classic sense" vs. any other sense? It's either
>> cache coherent or it's not.

>
> Black Widow's architecture still hasn't been fully disclosed by Cray
> yet, except to give a very highly undetailed statement that it is
> cache-coherent.


Yousuf, enough of Red Storm's (and hence its Black Widow)
interconnection scheme *has* been revealed to fully support Cray's
statement that Red Storm is cache-coherent.

The mechanism to support cache-coherency is **exactly** the same as on
your desktop PC when the disk does a DMA into DRAM. No difference
whatsoever. Honest injun.

Let me expand on this for some of the other less techie NG readers
(not you and Tony). Red Storm has over 10K CPUs. It may seem to you
that a write to one CPU's memory space must be simultaneously snooped
by each and every other CPU's cache to maintain cache coherency. This
is not the case at all.

Each of those 10K CPUs has its own memory space, not shared with any
other CPU. Each CPU's cache need only track external DMAs into that
*one* memory space, just as *your* PC's CPU has to track DMAs into its
memory space. In the case of Red Storm, the DMA in question is the
message passing interface (MPI), which uses DMA via the Black Widow
architecture.
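
The mechanism described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the `Node` class, its dictionary "cache", and `deliver_message` are hypothetical names, and real hardware snoops cache lines rather than single addresses.

```python
# Toy model: one node with private memory and a cache.
# The only outside writer is a DMA engine (e.g. the network interface),
# so coherence only has to hold between this cache and this memory.

class Node:
    def __init__(self, size: int):
        self.memory = [0] * size
        self.cache = {}                      # address -> cached value

    def cpu_read(self, addr: int) -> int:
        if addr not in self.cache:           # miss: fill from local memory
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]

    def dma_write(self, addr: int, value: int) -> None:
        """Incoming DMA: update memory and drop any stale cached copy."""
        self.memory[addr] = value
        self.cache.pop(addr, None)           # the snoop/invalidate step


def deliver_message(dest: Node, addr: int, payload: list[int]) -> None:
    """Hypothetical message delivery: the interconnect DMAs into dest's memory."""
    for offset, word in enumerate(payload):
        dest.dma_write(addr + offset, word)


b = Node(16)
b.cpu_read(4)                        # b now caches address 4 (value 0)
deliver_message(b, 4, [7, 8])        # a message lands in b's memory
print(b.cpu_read(4), b.cpu_read(5))  # prints "7 8", not the stale 0
```

Note that no other node's cache appears anywhere in the model: the receiving node only has to watch writes into its own memory, however many nodes the machine has.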

An aside: at one time I believed that "Black Widow" referred to the
I/O chip, of which there is one for each CPU. This is not the case.
"Black Widow" evidently refers to the message-passing mesh
architecture, and especially to the ability to segment the Red Storm
system into "black" and "red" divisions. A black widow spider is all
black except for red markings on its belly.




 
Robert Myers
 
      8th Mar 2004
On Mon, 08 Mar 2004 18:51:06 GMT, "Felger Carbon" <(E-Mail Removed)>
wrote:

<snip>
>
>Each of those 10K CPUs has its own memory space, not shared with any
>other CPU. Each CPU's cache need only track external DMAs into that
>*one* memory space, just as *your* PC's CPU has to track DMAs into its
>memory space. In the case of Red Storm, the DMA in question is the
>message passing interface (MPI), which uses DMA via the Black Widow
>architecture.
>

At the risk of revealing simultaneously to the world that I haven't
read Hennessy and Patterson cover to cover (and I haven't--any
edition, mind you) and that I haven't read every scrap of available
Red Storm documentation, I will readily admit that there is a piece of
this I either don't understand or that makes me even more certain of
the foolishness of dense mesh networks.

There are two possible approaches to using remote data that I can
think of:

Approach No. 1: leave the data in place and do RDMA reads and writes
to the remote memory location every time you touch the data. Result:
perfect cache coherency, ungodly latency and endless (and
hard-to-predict) traffic jams on the dense mesh. Remote latencies
are, at a minimum, twenty times typical local latencies. If you have
to get 100 instructions in flight to keep an OoO processor from
becoming stalled with local data, then you need to keep 20*100
instructions in flight to keep an OoO processor from getting stalled
with remote data.

Approach No. 2: Copy the data into your own memory space, manipulate
it there, and copy it back when done. Result: more than one copy of
the data exists at a time, and you have to use something like locks to
keep the data from becoming inconsistent. Only one processor can use
the data at a time, waits to get access to the data could be very
long, and the notion of ccNUMA is no more than a marketer's gimmick,
and, to correct my previous opinion on the subject, significantly less
useful than Hyperthreading in actual practice.
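
A rough cost model makes the tradeoff between the two approaches concrete. Only the 20x remote/local ratio and the 100 in-flight instructions come from the text above. The 100 ns local latency, the 64-word working set, and the assumption that a copy pays one remote latency per word (no pipelined bulk transfer, no lock wait) are made up for illustration.

```python
# Rough cost model for the two approaches. Apart from the 20x ratio and the
# 100 in-flight instructions, every number here is an illustrative assumption.

LOCAL_NS = 100                  # assumed local memory latency
REMOTE_FACTOR = 20              # remote latency as a multiple of local
REMOTE_NS = LOCAL_NS * REMOTE_FACTOR


def in_flight_needed(local_in_flight: int, latency_factor: int) -> int:
    """Instructions in flight needed to hide a latency_factor-times-longer latency."""
    return local_in_flight * latency_factor


def approach1_ns(touches: int) -> int:
    """Leave data remote: every touch pays the remote latency (no overlap assumed)."""
    return touches * REMOTE_NS


def approach2_ns(touches: int, words: int) -> int:
    """Copy in, work locally, copy back.

    Assumes one remote latency per word in each direction and ignores
    any time spent waiting for the lock.
    """
    return 2 * words * REMOTE_NS + touches * LOCAL_NS


print(in_flight_needed(100, REMOTE_FACTOR))   # 2000
for touches in (10, 1_000):
    print(touches, approach1_ns(touches), approach2_ns(touches, words=64))
```

With these assumed numbers, Approach No. 1 is cheaper when the data is touched only a handful of times, and Approach No. 2 wins once the data is reused heavily. The lock wait that the model leaves out is exactly the cost the post is worried about.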

RM

 
Felger Carbon
 
      8th Mar 2004
"Robert Myers" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
>
> Only one processor can use
> the data at a time, waits to get access to the data could be very
> long, and the notion of ccNUMA is no more than a marketer's gimmick.


As Robert correctly points out, there are problems associated with
using lotsa COTS microprocessors as the basis for a supercomputer
(e.g. Red Storm). These problems are real.

All of us would rather have one processor and memory system that is
10,000 times faster than an Opteron and its DRAM memory. Alas, such a
device is not available in the real world. In the real world, the
only alternative that's available is the vector processor a la Japan's
Earth Simulator, at ~$600M per copy.

The available evidence suggests that most folks with a checkbook
believe the 10K+ COTS approach provides a better tradeoff than a
vector machine. Neither my opinion nor Robert's counts since neither
of us owns a large enough checkbook.

I respectfully disagree with Robert about ccNUMA being a marketing
gimmick. Red Storm **is** cache coherent. This is a fact, not an
opinion. Robert is free to suggest that ccNUMA is not a panacea -
nobody claims it is - but IMHO it's more than a gimmick.

BTW - on slide 8 (of 18) on Cray's Red Storm PDF, the "system I/O"
chip is named the "Seastar". So Seastar is the chip and Black Widow
is the mesh architecture, not a chip.


 
Robert Myers
 
      8th Mar 2004
On Mon, 08 Mar 2004 22:02:50 GMT, "Felger Carbon" <(E-Mail Removed)>
wrote:

>"Robert Myers" <(E-Mail Removed)> wrote in message
>news:(E-Mail Removed)...
>>
>> Only one processor can use
>> the data at a time, waits to get access to the data could be very
>> long, and the notion of ccNUMA is no more than a marketer's gimmick.

>
>As Robert correctly points out, there are problems associated with
>using lotsa COTS microprocessors as the basis for a supercomputer
>(e.g. Red Storm). These problems are real.
>
>All of us would rather have one processor and memory system that is
>10,000 times faster than an Opteron and its DRAM memory. Alas, such a
>device is not available in the real world. In the real world, the
>only alternative that's available is the vector processor a la Japan's
>Earth Simulator, at ~$600M per copy.
>
>The available evidence suggests that most folks with a checkbook
>believe the 10K+ COTS approach provides a better tradeoff than a
>vector machine. Neither my opinion nor Robert's counts since neither
>of us owns a large enough checkbook.
>
>I respectfully disagree with Robert about ccNUMA being a marketing
>gimmick. Red Storm **is** cache coherent. This is a fact, not an
>opinion. Robert is free to suggest that ccNUMA is not a panacea -
>nobody claims it is - but IMHO it's more than a gimmick.
>

In order for cache coherency to make sense as a useful concept (world
according to RM, obviously), remote latencies have to fit the
requirement that another poster imposed on them in comp.arch: they
have to be comparable to local latencies. Such a requirement on
remote latencies is in general unreasonable and unattainable for MPP,
but that also means that cache coherency (world according to RM,
obviously) for a NUMA supercomputer is not a useful concept. That's
in contrast to AMD's original concept of a small (up to eight way)
cluster, where remote latencies are a small multiple of local
latencies, and the idea of ccNUMA is a useful concept. All of this
with a possible caveat.

The caveat has to do with the actual mechanics of message passing
and/or RDMA. I don't think that remote memory reads and writes for
RedStorm are necessarily limited to MPI, and even if they were,
writing to a (remote) memory location is surely a lower-overhead
operation than writing to an I/O socket. At this level of detail, I
am more than happy to admit that I don't really know what I'm talking
about.

Were I posting to newsgroups in Japanese, I probably would have been
jumping up and down, hooting and hollering about the economics of
Earth Simulator. We don't know what the economics of the Cray SV-2
aka X-1 would be if it ever achieved significant market volume, but
it's all speculation, since such a machine is probably never going to
achieve significant market volume.

A dense mesh network with one router and one garden-variety processor
per compute node (the architecture of both Blue Gene and Red Storm)
and an Earth Simulator style vector processor are not the only
possibilities. The Cray X-1 (aka SV2) is significantly more cost
effective than ES. NSA special order machines like the X-1 probably
won't make much of a dent in HPC even if a place like ORNL
occasionally breaks down and buys one, but that doesn't mean that
streaming architectures won't. Whether the DoE (which always has the
biggest checkbook) picks up on streaming architectures or not,
somebody else will.

Always, of course, with the greatest of respect.

RM
 
Felger Carbon
 
      9th Mar 2004
"Robert Myers" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
>
> I don't think that remote memory reads and writes for
> RedStorm are necessarily limited to MPI


My understanding is that MPI is the *only* way for the Red Storm CPUs
to communicate with other CPUs or anything else. One of us must be
wrong here, Robert. Cray's PDF repeatedly - and only - refers to Red
Storm as being an "MPI" machine.

> Whether the DoE (which always has the
> biggest checkbook) picks up on streaming architectures or not,
> somebody else will.


Really? Who? And when? I confess that my impression is that a
streaming architecture is a chip-level implementation of an algorithm.
Change the algorithm, design a new chip. How long did it take to
develop the Opteron? Itanium? Intel's NetBurst CPU? Do I have this
wrong?

I trust the readers following this thread have noted that I'm in
agreement with **almost** all of your points. ;-)






 
Robert Myers
 
      9th Mar 2004
On Tue, 09 Mar 2004 00:46:51 GMT, "Felger Carbon" <(E-Mail Removed)>
wrote:

>"Robert Myers" <(E-Mail Removed)> wrote in message
>news:(E-Mail Removed)...
>>
>> I don't think that remote memory reads and writes for
>> RedStorm are necessarily limited to MPI

>
>My understanding is that MPI is the *only* way for the Red Storm CPUs
>to communicate with other CPUs or anything else. One of us must be
>wrong here, Robert. Cray's PDF repeatedly - and only - refers to Red
>Storm as being an "MPI" machine.
>


The distinction may be entirely academic, as the latest implementation
of MPI, MPICH2, aims at exploiting RDMA:

http://www-unix.mcs.anl.gov/mpi/mpich2/

http://www.gup.uni-linz.ac.at/pvmmpi/talks/gropp.pdf

MPI is software. The actual message-passing of MPI has to be
implemented somehow at the physical link level. On a shared-memory
architecture, messages can be passed through shared memory, without
using any I/O at all. I would guess that the original implementations
of MPI for clusters went through the entire TCP/IP stack or its
equivalent, with all the associated overhead. My interpretation of
Red Storm being characterized as an MPI machine (without knowing what
exactly you're referring to) is that nodes communicate by transmitting
and receiving packets of information (and not through shared memory).

It may well be that MPI will be the only communication mode for which
software is ever developed for Red Storm, but whatever link layer is
used by MPI could just as well be used by some other communication
protocol implemented in software.
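
The layering argument can be sketched in a few lines. Everything here is hypothetical and for illustration only: `Transport`, `SharedMemoryTransport`, `SocketTransport` and `Endpoint` are invented names, not the MPI API, and a local socket pair stands in for a real network link.

```python
# Sketch: one message-passing layer over interchangeable transports.
# All class names are hypothetical; this is not the MPI API.
import queue
import socket
import struct
from typing import Protocol


class Transport(Protocol):
    def put(self, data: bytes) -> None: ...
    def get(self) -> bytes: ...


class SharedMemoryTransport:
    """Messages handed over through memory both sides can see; no I/O involved."""

    def __init__(self) -> None:
        self._q: "queue.Queue[bytes]" = queue.Queue()

    def put(self, data: bytes) -> None:
        self._q.put(data)

    def get(self) -> bytes:
        return self._q.get()


class SocketTransport:
    """Messages sent as length-prefixed packets through the OS socket layer."""

    def __init__(self) -> None:
        self._tx, self._rx = socket.socketpair()

    def put(self, data: bytes) -> None:
        self._tx.sendall(struct.pack("!I", len(data)) + data)

    def get(self) -> bytes:
        (length,) = struct.unpack("!I", self._recv_exact(4))
        return self._recv_exact(length)

    def _recv_exact(self, n: int) -> bytes:
        buf = b""
        while len(buf) < n:
            chunk = self._rx.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("socket closed")
            buf += chunk
        return buf


class Endpoint:
    """The software layer: the same send/recv calls whatever sits underneath."""

    def __init__(self, transport: Transport) -> None:
        self._t = transport

    def send(self, text: str) -> None:
        self._t.put(text.encode("utf-8"))

    def recv(self) -> str:
        return self._t.get().decode("utf-8")


for transport in (SharedMemoryTransport(), SocketTransport()):
    ep = Endpoint(transport)
    ep.send("hello")
    print(type(transport).__name__, ep.recv())
```

The `Endpoint` code never changes when the transport does, and nothing stops a different protocol from being written against the same `Transport` interface, which is the sense in which the link layer is not tied to MPI.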

>> Whether the DoE (which always has the
>> biggest checkbook) picks up on streaming architectures or not,
>> somebody else will.

>
>Really? Who? And when? I confess that my impression is that a
>streaming architecture is a chip-level implementation of an algorithm.
>Change the algorithm, design a new chip. How long did it take to
>develop the Opteron? Itanium? Intel's NetBurst CPU? Do I have this
>wrong?
>


There is such a thing as a programmable stream processor:

http://merrimac.stanford.edu/

http://www.sc-conference.org/sc2003/...dfs/pap246.pdf

The work is currently being funded by DARPA (or it was last time I
looked).

Trying to make a distinction between a true stream processor and a
garden-variety modern microprocessor gets harder all the time as
garden-variety microprocessors implement many of the ideas of stream
parallelism with or without explicit hardware support and with or
without explicit compiler support (streaming apparently just falls out
of OoO scheduling and register bypass).

As to how long it takes to develop a CPU, most of the complication and
cost in developing a modern microprocessor comes from the generality
and complexity of on-die scheduling. Most of that complexity would be
missing from a programmable stream processor.

Most of us have working examples of a stream processor right in our
own computers in the form of a GPU.
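
As a loose illustration of what "programmable" means here, the stream model applies small kernels to every element of a stream, with data flowing from one kernel to the next. The sketch below is an assumption-laden analogy: the `kernel` helper and the `scale`/`offset` stages are invented, and Python generators only mimic the dataflow, not the hardware parallelism or the Merrimac programming tools.

```python
# Sketch of the stream programming model: per-element kernels chained into
# a pipeline. Names are made up; generators only imitate the dataflow.
from typing import Callable, Iterable, Iterator


def kernel(fn: Callable[[float], float]) -> Callable[[Iterable[float]], Iterator[float]]:
    """Lift a per-element function into a stream-to-stream stage."""
    def stage(stream: Iterable[float]) -> Iterator[float]:
        for x in stream:
            yield fn(x)
    return stage


# Changing the algorithm means swapping these functions,
# not designing new hardware.
scale = kernel(lambda x: 2.0 * x)
offset = kernel(lambda x: x + 1.0)


def pipeline(stream: Iterable[float]) -> Iterator[float]:
    return offset(scale(stream))


print(list(pipeline([0.0, 1.0, 2.0])))   # [1.0, 3.0, 5.0]
```

Each kernel sees one element at a time and has no data-dependent control over scheduling, which is the property that lets the hardware stay simple.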

RM


 