AMD CPUs in ASRock motherboards

Tony Hill · Oct 26, 2005

I believe DDR has a max burst bandwidth of 3200 MB/s, so
it's at least 11%. But the bursts have very limited length,
a cacheline or 4 before there's a whole new latency cycle. (I
think VRAM can burst longer). As a result AFAIK average
bandwidth is around 1 GB/s. Your shared mem eats 35%.
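
As a rough check on the two percentages quoted above: both imply the same refresh traffic of roughly 350 MB/s. The sketch below only redoes that arithmetic. The 350 MB/s figure is back-computed from the quoted 11% and 35%, not measured, and the function name is made up for illustration.

```python
def share_of_bandwidth(traffic_mb_s: float, available_mb_s: float) -> float:
    """Return the fraction of available memory bandwidth used by some traffic."""
    return traffic_mb_s / available_mb_s

# Assumption: ~350 MB/s of display refresh traffic, back-computed from the
# 11% and 35% figures quoted in the post (not a measured value).
REFRESH_MB_S = 350.0
PEAK_MB_S = 3200.0      # quoted DDR max burst bandwidth
AVERAGE_MB_S = 1000.0   # quoted "around 1 GB/s" average bandwidth

peak_share = share_of_bandwidth(REFRESH_MB_S, PEAK_MB_S)
average_share = share_of_bandwidth(REFRESH_MB_S, AVERAGE_MB_S)

print(f"share of peak bandwidth:    {peak_share:.0%}")
print(f"share of average bandwidth: {average_share:.0%}")
```

The same traffic looks small against the burst peak and large against the sustained average, which is the whole disagreement in a nutshell.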

Felger actually high-balled the amount of overall performance it costs
you though, usually you're looking at less than a 1% loss in
performance for most office tasks. Honest.

There's a good reason why something like 60% of all PCs use integrated
graphics these days, it just doesn't make a difference for what the
vast majority of us do.

George Macdonald · Oct 26, 2005

Ok, but the graphics card need only open a new page when it crosses the
line. I believe SDRAM can keep a couple of pages open too, but I'll
have to try to find D.

All current DDR-DRAM chips have four banks which can all be kept open
simultaneously. Memory Controllers vary on how open pages on the banks are
handled: Intel has traditionally (since the 440-BX) allowed all banks on
all ranks (up to 32) to be kept open, with an idle cycle counter for
precharge; VIA used to precharge when jumping rank to rank. Dunno exactly
what AMD's does - I can't find a limit though there is a mention of a
maximum in the docs. It ain't easy to figure out with all the different
access modes... now in latest revisions we have "bank swizzling" (each
device bank bit is XORed from 3 different address bits) on top of Chip
Select Interleaving; they do have an idle cycle limit counter and for those
things to work they have to keep banks open across rank accesses, up to
some limit?
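
To make the "bank swizzling" idea above concrete, here is a minimal sketch of deriving each bank bit as the XOR of three address bits. The bit positions are invented for illustration and are not taken from AMD's documentation; only the "XOR of 3 address bits per bank bit" shape comes from the description above.

```python
def xor_bits(addr: int, positions) -> int:
    """XOR together the address bits at the given bit positions."""
    bit = 0
    for p in positions:
        bit ^= (addr >> p) & 1
    return bit

# Hypothetical mapping: four banks means two bank bits, each built from
# three address bits. Real controllers pick their own positions.
BANK_BIT_SOURCES = ((13, 17, 21), (14, 18, 22))

def swizzled_bank(addr: int) -> int:
    """Return the bank number (0-3) selected for a physical address."""
    bank = 0
    for i, positions in enumerate(BANK_BIT_SOURCES):
        bank |= xor_bits(addr, positions) << i
    return bank

# Four addresses whose plain bank bits (13 and 14) are all zero.
addrs = [0, 1 << 17, 1 << 18, (1 << 17) | (1 << 18)]
banks = [swizzled_bank(a) for a in addrs]
print(banks)
```

Without the XOR, all four addresses would land in bank 0 and fight over one open page; with it they spread over the four banks, which is presumably why the controller has to keep several banks open at once for the scheme to pay off.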

BTW Robert seemed to suggest a cache line of 128-Bytes but on both Intel &
AMD systems the maximum burst is 64-Bytes, i.e. 4-beat burst on dual
channel systems. The P4 L2 cache line *is* 128Bytes but each line is two
sectors of 64Bytes.
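
The burst sizes above follow from beats times bus width. A small sketch of that arithmetic, assuming the standard 64-bit (8-byte) DDR channel; the function name is just for illustration.

```python
CHANNEL_WIDTH_BYTES = 8  # one DDR channel is 64 bits wide

def burst_bytes(beats: int, channels: int) -> int:
    """Bytes moved by one burst: beats x channels x bytes per channel."""
    return beats * channels * CHANNEL_WIDTH_BYTES

dual_channel_burst = burst_bytes(beats=4, channels=2)
single_channel_burst = burst_bytes(beats=8, channels=1)

# The P4 L2 line is 128 bytes, split into sectors of one burst each.
sectors_per_p4_l2_line = 128 // dual_channel_burst

print(dual_channel_burst, single_channel_burst, sectors_per_p4_l2_line)
```

A 4-beat burst over two channels and an 8-beat burst over one channel both move 64 bytes, so a 128-byte line needs two such bursts either way.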

My personal feeling here is that the integrated video is not that big a
deal, though it would seem wasteful to have the video refresh cycles
hitting main memory, especially when a couple of 256Mb chips hanging
directly off the chipset chip could be put to good use there.

Kai Harrekilde-Petersen · Oct 26, 2005

George Macdonald said:
BTW Robert seemed to suggest a cache line of 128-Bytes but on both Intel &
AMD systems the maximum burst is 64-Bytes, i.e. 4-beat burst on dual
channel systems. The P4 L2 cache line *is* 128Bytes but each line is two
sectors of 64Bytes.

From personal experience with PCI read bursts vs a P6 bus, I was under
the impression that 128 bytes was "the thing". We saw a huge
degradation in performance because our PCI chip only read 64 bytes at
a time*, and the P6 memory controller would prefetch 128 bytes, discard
the latter 64 when the burst stopped, and then re-fetch the same damn
128 bytes when we came back for the next 64 bytes on the PCI bus.

*) It was a PCI-SCI bridge chip, so all internal buffers were designed
for 64 bytes bursts, as this fitted into the 64-byte packets on
SCI. In a later spin, we changed the internal buffer handling to use
128 bytes at a time on PCI, and then schedule 2 64-byte packets on SCI
for each buffer.
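
The cost of that mismatch can be sketched with a toy model: every PCI read triggers a fresh 128-byte prefetch, and anything the device does not take is thrown away. This ignores latency and arbitration entirely and is only meant to show the wasted memory traffic; the function name and the 4096-byte transfer size are made up.

```python
def memory_bytes_fetched(total_bytes: int, read_size: int,
                         prefetch_size: int = 128) -> int:
    """Bytes pulled from DRAM when each device read triggers its own prefetch."""
    reads = -(-total_bytes // read_size)  # ceiling division
    # Each read fetches a whole number of prefetch blocks, at least one.
    fetch_per_read = -(-read_size // prefetch_size) * prefetch_size
    return reads * fetch_per_read

TRANSFER = 4096  # arbitrary example transfer size in bytes

fetched_64 = memory_bytes_fetched(TRANSFER, read_size=64)
fetched_128 = memory_bytes_fetched(TRANSFER, read_size=128)

print(f"64-byte reads:  {fetched_64} bytes fetched, "
      f"{TRANSFER / fetched_64:.0%} useful")
print(f"128-byte reads: {fetched_128} bytes fetched, "
      f"{TRANSFER / fetched_128:.0%} useful")
```

In this model the 64-byte reads fetch every byte twice, while matching the read size to the prefetch size removes the waste.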

Kai

Robert Redelmeier · Oct 26, 2005

Tony Hill said:
Felger actually high-balled the amount of overall performance
it costs you though, usually you're looking at less than a 1%
loss in performance for most office tasks. Honest.

I really have no idea what % loss this is for "most office
tasks". You might very well be correct if said tasks are
not memory bound and sufficient cache makes CPU speed the
most important factor.

However, I'd hate to have a machine that bogged horribly
on memory-bound tasks (photoediting?)

There's a good reason why something like 60% of all PCs
use integrated graphics these days, it just doesn't make
a difference for what the vast majority of us do.

The good reason is probably cost. Do you have benchmarks
on the speed loss from integrated video?

-- Robert

Robert Redelmeier · Oct 26, 2005

George Macdonald said:
All current DDR-DRAM chips have four banks which can all be
kept open simultaneously. Memory Controllers vary on how
open pages on the banks are handled: Intel has traditionally
(since the 440-BX) allowed all banks on all ranks (up
to 32) to be kept open, with an idle cycle counter for
precharge; VIA used to precharge when jumping rank to rank.
Dunno exactly what AMD's does - I can't find a limit though
there is a mention of a maximum in the docs. It ain't easy
to figure out with all the different access modes... now
in latest revisions we have "bank swizzling" (each device
bank bit is XORed from 3 different address bits) on top
of Chip Select Interleaving; they do have an idle cycle
limit counter and for those things to work they have to
keep banks open across rank accesses, up to some limit?

BTW Robert seemed to suggest a cache line of 128-Bytes but
on both Intel & AMD systems the maximum burst is 64-Bytes,
i.e. 4-beat burst on dual channel systems. The P4 L2 cache
line *is* 128Bytes but each line is two sectors of 64Bytes.

Thank you for the detailed elaboration. Some datasheets I
saw seemed to indicate an 8-beat burst was possible.

My personal feeling here is that the integrated video is not
that big a deal, though it would seem wasteful to have the
video refresh cycles hitting main memory, especially when
a couple of 256Mb chips hanging directly off the chipset
chip could be put to good use there.

This is my main complaint. I have nothing against mobo
graphics if there was 8+MB VRAM also soldered on (inside
chip?) to hold the framebuffer and take the video refresh hits.

-- Robert

George Macdonald · Oct 26, 2005

Thank you for the detailed elaboration. Some datasheets I
saw seemed to indicate an 8-beat burst was possible.

Hmm, I dunno what to believe any longer: the i875 was a good detailed
datasheet which covered DDR-DRAM well - it's quite explicitly stated there
that the 8-beat burst is single channel or asynchronous (stacked) dual
channel... and dual channel interleaved is 4-beat. The i925x is a shitty
datasheet and DDR2 only of course, lacking much detail and it's
contradictory: in features it says burst length of 8 but later in the
document it talks of 64Byte blocks with no mention of 128Bytes. I don't
know what to make of it - haven't looked for a possible updated datasheet
yet.

The CPU L1 cache line is 64Bytes (instr & data) and that's where the data
goes on a line fill. If you're going to bring in 128Bytes I think it's
going to have to be in two bursts, where the 2nd burst goes to L2 only. I
could be wrong but that's the way I've understood things till now.

George Macdonald · Oct 26, 2005

From personal experience with PCI read bursts vs a P6 bus, I was under
the impression that 128 bytes was "the thing". We saw a huge
degradation in performance because our PCI chip only read 64 bytes at
a time*, and the P6 memory controller would prefetch 128 bytes, discard
the latter 64 when the burst stopped, and then re-fetch the same damn
128 bytes when we came back for the next 64 bytes on the PCI bus.

Well I can't argue with empirical evidence, :-) but which chipset was that?
When I wrote the above, I'd only looked at i875's datasheet in detail; on a
look at i925x, I'm confused: it's contradictory on 8-beat vs. 4-beat for
dual channel interleaved. I guess there's nothing like practical
experience but I'd still like to see Intel's docs cover it.

George Macdonald · Oct 26, 2005

I really have no idea what % loss this is for "most office
tasks". You might very well be correct if said tasks are
not memory bound and sufficient cache makes CPU speed the
most important factor.

However, I'd hate to have a machine that bogged horribly
on memory-bound tasks (photoediting?)

If those tasks are bandwidth friendly, I think the 1GB/s "average" you gave
is a bit off on a modern system. My Athlon64 s939 nForce3 system gets a
bandwidth of ~5.8GB/s on Sandra's memory benchmark - that's sequential of
course, so as good as it gets.

The good reason is probably cost. Do you have benchmarks
on the speed loss from integrated video?

And as often mentioned here, many of the "buyers" are cluel.... err
non-expert - they just buy whatever is on sale at Dell's monthly inventory
run-down... yes, cost does count. :-)

Kai Harrekilde-Petersen · Oct 26, 2005

George Macdonald said:
Well I can't argue with empirical evidence, but which chipset was that?

Uhm, good question. We had to test against just about every
hostbridge in the world, because they all - at the time - had peculiar
bugs that had to be worked around or explained to the customers, why
the performance was lousy. The earliest were the P5 chipsets (430NX,
TX, HX, VX). I don't recall the P6 ones anymore, but it was around
96-98 when I was at Dolphin, if it helps.

I think we went from shoddy 35-60MB/s to almost 100MB/s when we went to
128B bursts, but my memory on that is failing me.

Kai

Robert Redelmeier · Oct 27, 2005

George Macdonald said:
If those tasks are bandwidth friendly, I think the 1GB/s
"average" you gave is a bit off on a modern system.
My Athlon64 s939 nForce3 system gets a bandwidth of ~5.8GB/s
on Sandra's memory benchmark - that's sequential of course,
so as good as it gets.

Better than good! That sounds like a cache bandwidth. DDR
at 200 MHz gives 3.2 GByte/s burst bandwidth, so your number
is barely within the realm of possibility for a dual-channel
system. If so, the average interburst "latency" is tiny,
less than one clock.
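
A back-of-envelope version of that reasoning, as a sketch: peak bandwidth is clock x 2 transfers x 8 bytes per channel, and the gap between bursts is whatever time is left over at the measured rate. The 64-byte burst size is an assumption carried over from earlier in the thread, and the 5.8 GB/s figure is the Sandra number quoted above.

```python
CLOCK_MHZ = 200          # DDR memory clock
TRANSFERS_PER_CLOCK = 2  # double data rate
CHANNEL_BYTES = 8        # 64-bit channel
BURST_BYTES = 64         # assumption: one 64-byte burst per line fill

def peak_gb_s(channels: int) -> float:
    """Theoretical burst bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return CLOCK_MHZ * 1e6 * TRANSFERS_PER_CLOCK * CHANNEL_BYTES * channels / 1e9

single = peak_gb_s(1)
dual = peak_gb_s(2)

measured_gb_s = 5.8                       # quoted benchmark result
clock_ns = 1e3 / CLOCK_MHZ                # one memory clock in ns
burst_ns = BURST_BYTES / dual             # bytes / (GB/s) gives ns
average_ns = BURST_BYTES / measured_gb_s  # average time per burst as measured
gap_clocks = (average_ns - burst_ns) / clock_ns

print(f"single channel peak: {single:.1f} GB/s")
print(f"dual channel peak:   {dual:.1f} GB/s")
print(f"average gap between bursts: {gap_clocks:.2f} clocks")
```

Under these assumptions the leftover gap per burst is a small fraction of a clock, which is why the number looks suspiciously good for plain DRAM.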

And as often mentioned here, many of the "buyers" are
cluel.... err non-expert - they just buy whatever is on
sale at Dell's monthly inventory run-down... yes, cost
does count.

And for many tasks, MS-Windows remains CPU bound.

-- Robert

George Macdonald · Oct 27, 2005

Better than good! That sounds like a cache bandwidth DDR
at 200 MHz gives 3.2 GByte/s burst bandwidth, so your number
is barely within the realm of possibility for a dual-channel
system. If so, the average interburst "latency" is tiny,
less than one clock.

Well I do have a mistrust in Sandra for absolute numbers but no, with an
effective 128-bit wide channel, the theoretical peak is 6.4GB/s so though I
think the 90% efficiency is a bit optimistic and probably somewhat due to
cache effects, it compares well: if I change the command rate from 1T to 2T
(adds a clock cycle between address signal and chip select) it drops to
~4.5GB/s. Other results, e.g. Xeon/1MB cache on an Intel E7520 is only at
~3.6GB/s, highlight just how good the Athlon64 is.
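
For reference, the efficiency figures work out as below. This is only the ratio of the quoted Sandra results to the quoted 6.4GB/s theoretical peak; nothing here is newly measured.

```python
THEORETICAL_PEAK_GB_S = 6.4  # effective 128-bit channel, as stated above

# Quoted Sandra results for the two command-rate settings.
measured_gb_s = {"1T": 5.8, "2T": 4.5}

efficiency = {
    setting: rate / THEORETICAL_PEAK_GB_S
    for setting, rate in measured_gb_s.items()
}

for setting, value in efficiency.items():
    print(f"command rate {setting}: {value:.0%} of theoretical peak")
```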

Stuart Krivis · Oct 27, 2005

I believe DDR has a max burst bandwidth of 3200 MB/s, so
it's at least 11%. But the bursts have very limited length,
a cacheline or 4 before there's a whole new latency cycle. (I
think VRAM can burst longer). As a result AFAIK average
bandwidth is around 1 GB/s. Your shared mem eats 35%.

Not something I'd want to hobble a great CPU with. IMHO, you'd
be _far_ better off disabling on board graphics and getting
an old Matrox PCI card, especially if you're only doing 2D.

I've been toying with something like this. I have an Intel 915G board
and don't like the idea of shared memory for video. Unfortunately, I
also don't like the idea of graphics cards that require cooling fans.
:-)

I can't use one of my old Matrox PCI cards because this mobo only has
2 PCI slots, and they're already full with a SCSI HBA and sound card.
(The onboard sound was really bad. The onboard video seems to be
livable for now.)

The only likely candidates for PCI-Express graphics cards seem to be
also shared memory (like the Nvidia Turbo Cache). Everything else
seems to have a fan. I think the Matrox P650 might be available now in
PCI-E, but the price was sky high last time I looked.

I'm ok with things as they are more or less. It appears I'm I/O
limited anyway, so I'm not sure how much it would speed up with a new
graphics card. For what I paid for the system I'm _very_ happy with
it. :-)

I never thought I would buy a prebuilt system from one of the major PC
assemblers, but Dell made me an offer I couldn't refuse. hehe

Stuart Krivis · Oct 27, 2005

Felg, I gots some *old* Matrox PCI cards if you want 'em[*]. I use a
"new" AGP Matrox 2D card. ;-)

[*] One has been doing dual-display duty for my laptop, which will be
replaced in a week or so, or so I'm told.

I still have a Mill I, a couple of Mill IIs in PCI, and a G200 and
G400 in AGP. :-)

Great cards and I wish I could use them now.

Rob Stow · Oct 27, 2005

Stuart said:
I can't use one of my old Matrox PCI cards because this mobo only has
2 PCI slots, and they're already full with a SCSI HBA and sound card.
(The onboard sound was really bad. The onboard video seems to be
livable for now.)

Go to a "for sale" newsgroup for your area.
Check current ads and if nothing suitable is there, put a wanted
ad in a "for sale" group.

Not long ago I bought a Matrox G550, AGP, 32 MB for $25
(Canadian) - less than $20 YankeeBucks - that was advertised in a
newsgroup in my area.

And of course there is always E-Bay.

I bought the G550 for a friend who was quite satisfied with the
single monitor performance of the integrated video on his nForce2
motherboard - he just wanted something that would let him use two
monitors. He could have added a PCI video card and used both it
and the integrated video, but the Matrox AGP card was available
and cheap - plus he and I are both Matrox fans.

Disabling the motherboard's integrated video did not help
performance in anything - spreadsheets still take just as long to
recalc, documents still take just as long to open, converting
.mov to .avi still takes as long, etc. It might have been a
different story if he didn't have very much RAM, but freeing up
the 32 MB that was used by the integrated video isn't significant
when you have a pair of 512 MB DIMMs.

keith · Oct 28, 2005

Felg, I gots some *old* Matrox PCI cards if you want 'em[*]. I use a
"new" AGP Matrox 2D card. ;-)

[*] One has been doing dual-display duty for my laptop, which will be
replaced in a week or so, or so I'm told.

I still have a Mill I, a couple of Mill IIs in PCI, and a G200 and
G400 in AGP.

How about *mystique 220s*. ;-) The ones I've kept will do dual display
(one is in my laptop's docking station at work). The others were junked
*long* ago. I use a G550-dual here at home.

Great cards and I wish I could use them now.

Could? I *do*. ...though the new laptop should understand dual-display
out of the box; WinXP (groan).

Actually, if I was sure I could get decent dual-screen performance and all
that jazz, I *might* go for a lower end 3D card. That's on the list for
the 'blows machine upgrade, perhaps in the spring.

George Macdonald · Oct 28, 2005

I've been toying with something like this. I have an Intel 915G board
and don't like the idea of shared memory for video. Unfortunately, I
also don't like the idea of graphics cards that require cooling fans.

I can't use one of my old Matrox PCI cards because this mobo only has
2 PCI slots, and they're already full with a SCSI HBA and sound card.
(The onboard sound was really bad. The onboard video seems to be
livable for now.)

The only likely candidates for PCI-Express graphics cards seem to be
also shared memory (like the Nvidia Turbo Cache). Everything else
seems to have a fan. I think the Matrox P650 might be available now in
PCI-E, but the price was sky high last time I looked.

How much graphics power do you need?... some of the TurboCache cards come
with 128MB of on-board memory and most have 64MB. Apart from the liquid
cooled jobs with a heat exchanger hanging off them there are also a couple
of 6600s without fan - MSI has one.

Tony Hill · Oct 29, 2005

I really have no idea what % loss this is for "most office
tasks". You might very well be correct if said tasks are
not memory bound and sufficient cache makes CPU speed the
most important factor.

However, I'd hate to have a machine that bogged horribly
on memory-bound tasks (photoediting?)

True enough, I'm not saying that integrated video is for everyone,
just that it isn't nearly as bad as it was at one time and for a LOT
of people it's well past the "good enough" point.

As a point of note though, it's really more a question of 2D graphics
vs. 3D stuff where you REALLY hit the difference, not memory-bound
tasks.

The good reason is probably cost.

Of course it's cost, though in more ways than just up-front cost.
Having integrated video also tends to simplify things for IT
departments in that it reduces the number of variables in their
systems. Just one type of video card across a whole line of
computers makes things a lot easier when looking at drivers and
images.

Do you have benchmarks
on the speed loss from integrated video?

I have seen some, though I'm having trouble tracking down any recent
ones that really compare integrated video to non-integrated video on
anything other than games. Here's one old example:

http://www.realworldtech.com/page.cfm?ArticleID=RWT110500000000

Intel's i810 and i815 chipsets, along with nVidia's first nForce
chipset, were some of the first integrated video chipsets where the
performance hit wasn't all that great. Later chipsets have closed the
gap even further, primarily through new techniques that greatly reduce
the amount of time they need to go to main memory.

Of course, the flip side to this coin is that there are some add-in
video cards that are quite respectable and sell for dirt-cheap these
days, so it's still tough to justify integrated video to anyone other
than real penny-pinchers.

Tony Hill · Oct 29, 2005

On Mon, 24 Oct 2005 20:47:47 GMT, Robert Redelmeier

I can't use one of my old Matrox PCI cards because this mobo only has
2 PCI slots, and they're already full with a SCSI HBA and sound card.
(The onboard sound was really bad. The onboard video seems to be
livable for now.)

The only likely candidates for PCI-Express graphics cards seem to be
also shared memory (like the Nvidia Turbo Cache). Everything else
seems to have a fan.

Most of the nVidia GeForce 6200 (non-Turbo Cache) models as well as
some 6600LE models are available without fans on them. Same goes for
some of the ATI x300 and x550 cards. Here are a couple of examples:

http://www.newegg.com/Product/Product.asp?Item=N82E16814125201

http://www.newegg.com/Product/Product.asp?Item=N82E16814127168

http://www.newegg.com/Product/Product.asp?Item=N82E16814121544

http://www.msi.com.tw/program/products/vga/vga/pro_vga_detail.php?UID=693

Felger Carbon · Oct 30, 2005

Tony Hill said:
Of course, the flip side to this coin is that there are some add-in
video cards that are quite respectable and sell for dirt-cheap these
days, so it's still tough to justify integrated video to anyone other
than real penny-pinchers.

Also power-pinchers (like me) and persons who have absolutely no
application for 3d (like me), notably including multimedia and games.
