Serious Performance on the MPX2: the PCI-buses
Leaving this overclocking malarkey to one side . . .
The major bottleneck in present-day workstations & entry-level servers is not the available CPU-speed [massive] - or the memory bandwidth of DDR [massive again] - it's the PCI-bus, off which run the system's hard disk controllers.
The AMD 760MPX chipset allows at last for an affordable platform with 64-bit/66MHz PCI-slots; in fact it allows for exactly & only two of 'em; & the physical layout of the MPX2 - like that of the Gigabyte competition - gives sufficient space for bulky full-length PCI-cards to be fitted & used.
Most 760MPX motherboards cram the two 64-bit slots together, so that awkwardly situated or protruding sockets, plugs, memory-cards, or heatsinks on these or on adjacent AGP or PCI-cards can block off at least one of these priceless high-speed slots.
The MPX2 has a 32-bit slot between this pair of 64/66 slots - full marks to Iwill.
The importance of 64-bit/66MHz PCI is that it quadruples the theoretical bandwidth available to high-end storage:
PCI-bus: bits/speed | maximum theoretical data-rate
32-bit/33MHz PCI | 133MB/s
32-bit/66MHz PCI | 266MB/s
64-bit/33MHz PCI | 266MB/s
64-bit/66MHz PCI | 533MB/s
Since current 10K rpm SCSI HD's like the Fujitsu MAN3xx4 series have a peak read-rate of around 55MB/s, & ordinary 7200rpm IDE HD's manage something in the 40s of MB/s, it doesn't take much calculation to see that RAID-arrays of 4 or more of these devices can or should test or exceed the limits of the old 32-bit/33MHz PCI-bus.
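As a rough back-of-the-envelope sketch (our own illustration, using the approximate peak read-rates quoted above rather than measured figures), the arithmetic looks like this:

```python
# Rough aggregate-throughput check: does a striped array saturate a given PCI-bus?
# Drive figures are the approximate peak read-rates quoted above, not measurements.

PCI_LIMITS_MB_S = {
    "32-bit/33MHz": 32 / 8 * 33.33,   # ~133MB/s
    "32-bit/66MHz": 32 / 8 * 66.66,   # ~266MB/s
    "64-bit/33MHz": 64 / 8 * 33.33,   # ~266MB/s
    "64-bit/66MHz": 64 / 8 * 66.66,   # ~533MB/s
}

def aggregate_read_mb_s(drives: int, per_drive_mb_s: float) -> float:
    """Ideal RAID0 scaling: N drives x per-drive peak read-rate."""
    return drives * per_drive_mb_s

six_scsi = aggregate_read_mb_s(6, 55.0)   # six 10K rpm U160 SCSI drives
four_ide = aggregate_read_mb_s(4, 40.0)   # four 7200rpm IDE drives

for bus, limit in PCI_LIMITS_MB_S.items():
    print(f"{bus}: bus limit ~{limit:.0f}MB/s; "
          f"6x SCSI ~{six_scsi:.0f}MB/s, 4x IDE ~{four_ide:.0f}MB/s")
```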
PCI-bus storage-controller performance is much more complex than these crude numbers might suggest; VIA received much unfavourable comment recently for the dire default performance of 32-bit buses off their Southbridges.
One of the keys to resolving these issues turned out to be controlling the PCI-latency [how long a device can hang on to the bus] - PCI-latency issues are also associated with the many problems seen in systems with Creative SB Live audio-accelerators installed.
Not only did VIA have to release a series of patches to correct this PCI-bus latency issue, but better-quality SMP motherboards using VIA chipsets - like the RioWorks SDVIA - include PCI-latency adjustment within the BIOS: similar adjustment is provided on MPX-chipset motherboards from Asus & Tyan.
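To put those latency-timer values into perspective, here is a minimal sketch (our own arithmetic, not a vendor formula) converting a timer setting in bus-clocks into the maximum time a device can hang on to the bus:

```python
# The PCI latency timer sets how many bus clocks a master may keep the bus
# once another device has requested it.  Converting clocks to time shows why
# the same value means different things on 33MHz & 66MHz buses.

def bus_hold_time_us(latency_clocks: int, bus_mhz: float) -> float:
    """Maximum bus tenure in microseconds for a given latency-timer value."""
    return latency_clocks / bus_mhz

for clocks in (32, 64, 96):
    print(f"{clocks:3d} clocks: "
          f"{bus_hold_time_us(clocks, 66.0):.2f}us on a 66MHz bus, "
          f"{bus_hold_time_us(clocks, 33.0):.2f}us on a 33MHz bus")
```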
We've tested the MPX2 - which by default appears to have 32-clock latency on all its PCI-buses - with 3 different U3W 2-channel 64-bit SCSI RAID-hosts in a 64/66 slot controlling 6-disk RAID0 & RAID5 arrays built from an unchanging hardware configuration of Fujitsu MAN3184 HD's; & with a Promise TX2 IDE s/w RAID-controller with 2-disk & 4-disk RAID0 arrays of Maxtor D740X HD's in both 64/66 & 32/33 slots:
The ATTO benchmarks below give a rough initial impression of how well the PCI-bus controllers of the AMD762/768 chipset work - with specific reference to its implementation in the MPX2. (Please note these benchmarks are all from a single & quite simple benchmarking utility: we are running IOMeter tests to confirm basic Read & Write transfer rates.)
U3W SCSI RAID0 & RAID5, 64-bit/66MHz:
[ATTO chart: Compaq SmartArray 5302/64 - 6-HD RAID5]
[ATTO chart: Compaq SmartArray 5302/64 - 6-HD RAID0]

U3W SCSI RAID5, 64-bit/33MHz:
[ATTO chart: IBM ServeRAID-4M - 6-HD RAID5]
[settings throughout: writeback cache for physical & logical drives; Adaptive read-ahead cache mode; both channels U160]

U3W SCSI RAID5, 64-bit/33MHz:
[ATTO chart: ICP-Vortex GDT7523RN - 6-HD RAID5]
[settings throughout: Synchronous transfer = ON; Synchronous transfer rate = U160; Disk Read Cache = ON; Disk Write Cache = ON; Tagged queues = ON]

IDE RAID0: Promise TX2 100 32-bit/66MHz host in 66MHz & 33MHz slots with 2 & 4 drives:
[ATTO chart: 66MHz RAID0]
[ATTO chart: 33MHz RAID0]
[ATTO chart: 66MHz 4-HD RAID0]
[ATTO chart: 66MHz 4-HD RAID0/1]
Further questions (not conclusions):
IDE:
The Promise TX2 is a cheap-'n-cheerful s/w RAID-host, dependent on OS-drivers: however, its performance - specifically write-performance - varies markedly between the 66 & 33MHz PCI-buses with no other obvious variable present & when nowhere near theoretical 32-bit bandwidth limits [no other PCI-card was installed during these tests].
Its disastrously poor exploitation of 4 drives in both RAID0 & 0/1 is no doubt partly due to IDE's unhappy combination of one-command-at-a-time operation & master-slave contention.
We are now testing a 64-bit/33MHz 4-channel hardware IDE RAID-host with 4 Maxtor D740X's [3ware 7410] & will be adding benchmarks to this review.
SCSI:
The good:
Our Compaq SmartArray 5302/64 64-bit/66MHz U3W RAID-host shows acceptable if mysterious performance in the MPX2 [& reportedly in a similar system installed in an MPX-chipset MSI K7D]: normally, one would expect RAID5 & RAID0 arrays using the same hardware to show contrasting performance-profiles, whereas those above are near-identical:
Typically, write-performance in RAID5 is markedly slowed in comparison to read-performance [the XOR overhead]; whereas RAID0 - essentially overhead-free - scales fairly predictably from the individual drive's read & write performance.
The benchmarks above may show a quirk of this enterprise-level host [optimised RAID5/compromised RAID0 performance], or that the MPX2's 64-bit PCI-bus limits performance towards the higher end.
Two other enterprise-level 2-channel U3W RAID-hosts, both using the Intel i960RN co-processor & LSI 33MHz U3W SCSI-bus silicon [IBM ServeRAID-4M & ICP-Vortex GDT7523RN], showed very poor performance on the MPX2.
The bad:
The IBM ServeRAID-4M is a decent-quality 64-bit 2-channel U3W host, very similar to Mylex hosts of similar specification [IBM owns Mylex]. The ServeRAID-4M's performance when installed in the MPX2 is very poor indeed, despite being optimally set up & using identical hardware to the Compaq 5302.
The ugly:
The ICP-Vortex is a supposedly high-performing host; it is much more tweakable than the other hosts tested, & we used the optimal settings recommended by ICP-Vortex, again with identical hardware to the Compaq & IBM. The GDT7523RN's performance when installed in the MPX2 is dreadful - worse than a single 7200rpm IDE HD.
Latency? I2O?
We have nothing like enough evidence on the MPX2's PCI-bus performance to draw firm conclusions: if we had to give a first opinion based on these simple benchmarks, it would be that it is a disappointment compared to other 64-bit PCI platforms.
However, what we have seen over several weeks' testing suggests this platform has several issues with commonly used RAID-hosts, both IDE & SCSI:
We suspect that the performance of the Compaq 53xx host on the MPX2 is limited by PCI-latency issues as it approaches 200MB/s.
We suspect that the Promise TX2's very different performance on the two PCI-buses of the MPX2 is also a latency issue.
The dire performance of the two other SCSI RAID-hosts tested on the MPX2 points to a much more profound problem, which seems likely to be common to all motherboards using the AMD MPX chipset: we suspect that there is an issue between this chipset & devices using versions of the I2O protocol - which may be cured to some degree by firmware updates.
We would very much like to see Iwill join other manufacturers of MPX-chipset motherboards - such as Tyan & Asus - & provide an in-BIOS option to vary the PCI-bus latency [ideally both buses' latencies] to the user's requirements: the MPX2's default setting of 32 clocks is not optimal for a 64-bit/66MHz bus used by higher-end storage controllers - 96 clocks is not an uncommon setting.
We are deeply concerned that the MPX-chipset platform may at present be fundamentally unsuited to many of the Intel i960 co-processor, I2O-architecture SCSI RAID-hosts on the market [this is the most common RAID-host hardware configuration, used by Adaptec & AMI/LSI as well as those mentioned above]. We have seen many reports of similar performance issues using these & similar hosts on other MPX-chipset motherboards.
The complex architecture of the MPX chipset has 533MB/s of bandwidth to the Northbridge shared between three points: the two PCI-slots of the 64-bit/66MHz bus, & the Southbridge on a 32-bit/66MHz datapath. From this Southbridge run the secondary 32-bit/33MHz bus, the IDE ports, USB, etc.
The snag with this cunning arrangement may be that the latency timers for each PCI-bus are additive, starting from the latency of the 66MHz bus: this means - with default 32-clock timings - that the 66MHz bus runs at 32 clocks' latency & the 33MHz bus at 64 clocks'.
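A minimal sketch of that additive arrangement (our reading of the chipset layout, not an AMD specification):

```python
# Sketch of the additive-latency arrangement described above: the secondary
# 32-bit/33MHz bus behind the Southbridge inherits the 66MHz bus's latency
# on top of its own timer setting.  This is our model, not an AMD document.

def effective_latencies(primary_clocks: int, secondary_clocks: int) -> dict:
    """Effective latency-timer values if the secondary bus adds the primary's."""
    return {
        "64-bit/66MHz bus": primary_clocks,
        "32-bit/33MHz bus": primary_clocks + secondary_clocks,
    }

# MPX2 defaults: 32 clocks on each bus -> 32 clocks upstream, 64 downstream.
for bus, clocks in effective_latencies(32, 32).items():
    print(f"{bus}: {clocks} clocks")
```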
This is precisely the opposite of what most users need: you don't mind a high-end 64-bit storage-controller hogging its bus, but you mind very much indeed if some 32-bit device [often a soundcard] takes the bus over because it enjoys excessive latency. A further concern is that these additive latencies may affect the message layer between the hardware device modules & the OS modules of I2O devices such as SCSI RAID-controllers.