Category: Network & Server (網絡及服務器)

Impressive Equallogic PS6000XV IOPS result

By admin, October 9, 2010 11:41 am

Impressive Equallogic PS6000XV IOPS result

I just performed the test again three times and confirmed the following. This is with the default single worker only; IOmeter is testing the VM’s VMFS directly (no MPIO direct mapping to the EQL array). The VM is hardware version 7, the disk controller is Paravirtual (PVSCSI), and the NIC is VMXNET3.
SERVER TYPE: VM on ESX 4.1 with EQL MEM Plugin, VAAI enabled with Storage Hardware Acceleration
CPU TYPE / NUMBER: vCPU / 1
HOST TYPE: Dell PE R710, 96GB RAM; 2 x XEON 5650, 2,66 GHz, 12 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS6000XV x 1 (15K), / 14+2 600GB Disks / RAID10 / 500GB Volume, 1MB Block Size
SAN TYPE / HBAs : ESX Software iSCSI, Broadcom 5709C TOE+iSCSI Offload NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……4.1913………13934.42………435.45

RealLife-60%Rand-65%Read……13.4110………4051.49………31.65

Max Throughput-50%Read………5.5166………10240.39………320.01

Random-8k-70%Read……………14.1525………3915.15………28.95

EXCEPTIONS: CPU Util. 67.82, 38.12, 56.80, 40.2158%;

##################################################################################
RealLife-60%Rand-65%Read 4051 IOPS is really impressive for a single array with 14 15K RPM spindles!
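As a quick sanity check on the table above, the MB/s column should equal IOPS multiplied by the transfer size each pattern uses. A minimal sketch, assuming the transfer sizes from the unofficial VMTN IOmeter config (32KB for the Max Throughput patterns, 8KB for the 8k/RealLife patterns):

```python
# Sanity-check the table above: Av. MB/s should equal Av. IOPS * transfer size.
# Transfer sizes are assumed from the unofficial VMTN IOmeter config:
# 32KB for the Max Throughput patterns, 8KB for the 8k random patterns.

def mbps_from_iops(iops, block_kb):
    """MB/s implied by an IOPS figure at a given transfer size (in KB)."""
    return iops * block_kb / 1024.0

rows = [
    ("Max Throughput-100%Read", 13934.42, 32, 435.45),
    ("RealLife-60%Rand-65%Read", 4051.49, 8, 31.65),
    ("Max Throughput-50%Read", 10240.39, 32, 320.01),
    ("Random-8k-70%Read", 3915.15, 8, 28.95),
]

for name, iops, block_kb, reported in rows:
    calc = mbps_from_iops(iops, block_kb)
    print(f"{name}: calculated {calc:.2f} MB/s, reported {reported}")

# The first three rows agree to within rounding; the Random-8k row
# calculates to ~30.6 MB/s, so one of its two posted figures likely
# carries a small transcription error.
```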

Windows Server Backup in Windows Server 2008 R2

By admin, October 8, 2010 1:17 pm

I’ve been using Windows Server Backup in Windows Server 2008 R2 for almost a month and found it can do everything Acronis True Image Server (ATIS) does; in my opinion, there is really no need to buy ATIS in the future.

See what WSB can offer you (my requirement list):

  • WSB works at block level, so taking a backup or snapshot is very fast, and so is restoring.
  • Full server baremetal backup.
  • Full backup the first time and incrementals afterwards; file exclusion during backup; compression.
  • System State backup is integrated with AD, so you won’t end up with a crash-consistent restore where the server comes back but AD cannot start.
  • Individual folder/file restore
  • Backup to network shared folders (with the limitation that you cannot keep incremental copies, only ONE copy; the later backup overwrites the previous one). This does suck badly! However, I don’t use a network folder to store my backups, so it’s fine for me.
  • Maximum of 64 copies (in my case, that’s almost 2 months, since I only have 1 scheduled backup running per day), or limited by your backup disk size.
  • The backup copies are hidden from the file system; in ATIS, you need to create a dedicated partition (the Acronis Secure Zone) to hide the backup copies.
  • WSB can backup Hyper-V vm images as well.

Best of all, Windows Server Backup (WSB) in Windows Server 2008 R2 IS FREE! ATIS costs over USD 1,200, and I don’t need features like Convert to VM, Universal Restore, or Central Management, so WSB works perfectly for a standalone server.

Finally, you may ask: what about a rescue bootable DVD/CD-ROM? You don’t have one? Actually you do: the Windows Server 2008 DVD is your ultimate rescue bootable DVD. Fair enough? :)
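The 64-copy limit above maps directly onto a retention window once you know the schedule. A trivial sketch, assuming one scheduled backup per day (the setup described above):

```python
# Estimate the WSB retention window implied by the 64-copy limit.
# Assumption: one scheduled backup per day, as described above.
MAX_COPIES = 64
backups_per_day = 1

retention_days = MAX_COPIES / backups_per_day
retention_months = retention_days / 30.4  # average month length

print(f"{retention_days:.0f} days ~ {retention_months:.1f} months")
```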

A Possible Bug in VMware ESX 4.1 or EQL MEM 1.0 Plugin

By admin, October 8, 2010 12:37 am

This week I encountered a strange problem during redundancy testing. All paths among our network switches, servers, and EQL arrays had been set up correctly with redundancy. Each ESX host iSCSI VMkernel (or pNIC) has 16 paths to the EQL arrays; we tested every single possible failure situation and found that ONLY ONE scenario doesn’t work. See: when we power off the master PC5448 switch, we are no longer able to ping the PS6000XV.

After two days of troubleshooting with local Pro-Support as well as US EQL support, we have narrowed the problem down to "A Possible Bug in VMware ESX 4.1 or the EQL MEM 1.0 Plugin".

During the troubleshooting, I found another EqualLogic user in Germany having a similar, though not identical, problem (see Failover problems between esx an Dell EQL PS4000): he’s using a PS4000 with only two iSCSI paths, and his problem is more serious than ours.

Oct 5, 2010 3:16 AM
Fix-List from v5.0.2 Firmware:
iSCSI Connections may be redirected to Ethernet ports without valid network links.

His problem is also similar in that whatever iSCSI connections are left in the LAG don’t get redirected to the slave switch after the master switch is shut down. I have 4 paths and his PS4000 has two, so my iSCSI connections survived because there is an extra path to the slave switch, but somehow vmkping doesn’t work.

And if you look at comment #30:

Jul 27, 2010
Dell acknowledged that the known issue they report in the manual of the EqualLogic Multipathing Extension Module is the same one I am seeing.

They didn’t open a ticket with VMware for now, but they will, after some more tests.

I think this issue has existed since ESX 4.0. In VI3 they used only one VMkernel for SW iSCSI, with redundancy at layer 1/2, so it should not have been the case there.

My case number for this issue at vmware is 1544311161, the case number at dell is 818688246.

If VMware acknowledges this as a bug in 4.1 and doesn’t have a workaround, we will go with at least 4 logical paths for each volume and hope that at least one path is still connected after switch1 fails, until they fix it.
Finally, it could also be something related to the EQL MEM Plugin for ESX, which we have installed. (Comment #29 on page 2)

It indicates there is a known issue: once a network link fails (which could be caused by shutting down the master switch), and the physical NIC with the network failure is the only uplink for the VMkernel port used as the default route for the subnet, several types of kernel network traffic are affected, including the ICMP pings the EqualLogic MEM uses to test for connectivity on the SAN.

Jul 23, 2010

from the dell eql MEM-User_Guide 4-1:

Known Issues and Limitations
The following are known issues for this release.

Failure On One Physical Network Port Can Prevent iSCSI Session Rebalancing
In some cases, a network failure on a single physical NIC can affect kernel traffic on other NICs. This occurs if the physical NIC with the network failure is the only uplink for the VMKernel port that is used as the default route for the subnet. This affects several types of kernel network traffic, including ICMP pings which the EqualLogic MEM uses to test for connectivity on the SAN. The result is that the iSCSI session management functionality in the plugin will fail to rebuild the iSCSI sessions to respond to failures of SAN changes.

Could it be the same problem I have? So they (DELL/VMware) already know about this problem?

Aside from this, it looks like the Dell MEM only makes sense in setups with more than one array per PS group, because the PSP selects a path to an interface of the array where the volume’s data is stored. And it has a lot of limitations. We only have one array per group for now, so I think I’ll skip it.

I still don’t understand why there is no way to prevent the connections from going through the LAG in the first place; it should be possible to prefer direct connections…

My last reply to EQL Support today:

Some updates; maybe you can pass them on to L3 for further analysis.

The problem seems to be due to an EQL MEM version 1.0 known issue. (User Manual 4-1)

==================================================
Failure On One Physical Network Port Can Prevent iSCSI Session Rebalancing

In some cases, a network failure on a single physical NIC can affect kernel traffic on other NICs. This occurs if the physical NIC with the network failure is the only uplink for the VMKernel port that is used as the default route for the subnet. This affects several types of kernel network traffic, including ICMP pings which the EqualLogic MEM uses to test for connectivity on the SAN. The result is that the iSCSI session management functionality in the plugin will fail to rebuild the iSCSI sessions to respond to failures of SAN changes.
==================================================
I’ve performed the test again, and it showed your prediction is correct: the path is always fixed to vmk2.

1. I restarted the 2nd ESX host and found the paths C0 to C1 showed up correctly.
2. Then I rebooted the master switch. From the 1st or 2nd ESX host, I cannot ping vCenter or vmkping the EQL array or the iSCSI vmks on other ESX hosts, and this time I CANNOT PING OTHER VMKernel ports such as VMotion or FT either.
3. But the VMs did not crash or restart, so underneath, all iSCSI connections stayed online. That’s good news.
4. After the master switch comes back, under the storage paths on the ESX hosts, I see those C4, C5 paths generated.

Could you please confirm with VMware and EQL whether this bug exists in ESX 4.1? (i.e., the path is somehow always fixed to vmk2)

I even did a test switching off the iSCSI ports on the master switch one by one; the problem ONLY HAPPENS when I switch off the ESX host’s vmk2 port (which is the first iSCSI vmk port, i.e., the default route for the subnet?).

It confirmed that vmkping IS BOUND to the 1st iSCSI vmk port, which is vmk2, and this time all my vmkpings died, including VMotion as well as FT.

The good news is that underneath the surface everything works perfectly as expected: iSCSI connections are working fine, VMs are still running, they can be VMotioned around, and FT is also working fine.

We do hope VMware and EQL can release a patch soon to fix this vmkping problem: vmkping always goes out of the default vmk2 (the 1st iSCSI VMkernel in the vSwitch) rather than any other vmk connection, so when the master switch dies, vmkping dies with vmk2, since vmkping uses vmk2 for ICMP pings to other VMkernel IP addresses.
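The failure mode described above can be captured in a toy model (purely illustrative, not VMware code; the vmk names simply mirror the setup described above): per-session iSCSI traffic is bound to individual vmk ports, while vmkping/ICMP follows the subnet’s default route, which is pinned to the first iSCSI VMkernel (vmk2).

```python
# Toy model of the observed behaviour: iSCSI sessions are bound per-vmk,
# but ICMP (vmkping) always leaves via the subnet's default-route vmk.
# Illustrative only; vmk2..vmk5 mirror the setup described in the post.

class Host:
    def __init__(self, iscsi_vmks):
        self.vmks = {vmk: True for vmk in iscsi_vmks}  # vmk -> link up?
        self.default_route_vmk = iscsi_vmks[0]         # first iSCSI vmk (vmk2)

    def fail_link(self, vmk):
        self.vmks[vmk] = False

    def vmkping_works(self):
        # ICMP is pinned to the default-route vmk, regardless of other links.
        return self.vmks[self.default_route_vmk]

    def iscsi_sessions_up(self):
        # Sessions survive as long as ANY bound vmk still has a live link.
        return any(self.vmks.values())

host = Host(["vmk2", "vmk3", "vmk4", "vmk5"])
host.fail_link("vmk2")  # e.g. master switch powered off

print("vmkping:", host.vmkping_works())       # False: ping dies with vmk2
print("iSCSI up:", host.iscsi_sessions_up())  # True: VMs keep running
```

This is exactly the split observed in the tests: vmkping (and other default-route traffic) dies with vmk2, while the remaining iSCSI sessions keep the VMs online.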

Some interesting findings from the VMware Unofficial Storage Performance thread

By admin, October 6, 2010 5:17 pm

Some more interesting comments contributed by all the testers:

1) VMFS block size (1-8MB) seems to have little to no effect on performance.

2) Thin/Thick provisioning doesn’t have much impact on performance.

3) RDM offers minimal performance gains over VMFS (except in 100% sequential tests, which just won’t happen in the real world). VMFS has minimal impact on performance, achieving approximately 98%+ of physical performance, so others suggest using VMFS all the way.

4) The most realistic test seems to be "RealLife-60%Rand-65%Read": in normal life you have random and sequential I/O mixed (often 60% random vs 40% sequential).

5) We can see that a VM on iSCSI loses more throughput and response time relative to a physical server than its counterpart on an FC SAN (but not by much, <5%, especially for sequential read/write).

6) First, my suggestion of disabling Jumbo Frames applies only to the guest OS. Leave them enabled on the switch and on the host’s vSwitch.

7) As for performance best practice, configure the VM with hardware version 7, the Paravirtual disk controller, and a VMXNET3 NIC.
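One more cross-check worth running on any of these tables: average response time, IOPS, and in-flight I/Os are tied together by Little’s law (mean outstanding I/Os = IOPS × mean response time). A sketch, using figures from the PS6000XV table earlier on this page; the 64 outstanding-I/O queue depth is an assumption about the IOmeter config, so check your own .icf:

```python
# Little's law: mean outstanding I/Os = IOPS * mean response time.
# Handy for sanity-checking posted results: the implied concurrency should
# not exceed the outstanding-I/O setting in the IOmeter config.

def implied_outstanding(iops, resp_ms):
    """Mean number of I/Os in flight implied by IOPS and response time."""
    return iops * (resp_ms / 1000.0)

ASSUMED_QUEUE_DEPTH = 64  # assumption: outstanding I/Os in the .icf

rows = [
    ("Max Throughput-100%Read", 13934.42, 4.1913),
    ("RealLife-60%Rand-65%Read", 4051.49, 13.4110),
]
for name, iops, resp_ms in rows:
    n = implied_outstanding(iops, resp_ms)
    print(f"{name}: ~{n:.0f} I/Os in flight")
    assert n <= ASSUMED_QUEUE_DEPTH
```

Both patterns come out in the mid-50s, comfortably under the assumed queue depth, which is what you would expect from a consistent result.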

Extract from the VMware Unofficial Storage Performance thread, comparing EqualLogic and other SAN vendors

By admin, October 6, 2010 4:47 pm

It’s not official, but after comparing the results, I would still say EqualLogic ROCKS!

Finally, I wonder why there aren’t many results from LeftHand, NetApp, 3PAR, and HDS?

My own result:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
TABLE OF RESULTS
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SERVER TYPE: VM on ESX 4.1 with EQL MEM Plugin
CPU TYPE / NUMBER: vCPU / 1
HOST TYPE: Dell PE R710, 96GB RAM; 2 x XEON 5650, 2,66 GHz, 12 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS6000XV x 1 (15K), / 14+2 600GB Disks / RAID10 / 500GB Volume, 1MB Block Size
SAN TYPE / HBAs : ESX Software iSCSI, Broadcom 5709C TOE+iSCSI Offload NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..5.4673……….10223.32………319.48

RealLife-60%Rand-65%Read……15.2581……….3614.63………28.24

Max Throughput-50%Read……….6.4908……….4431.42………138.48

Random-8k-70%Read……………..15.6961……….3510.34………27.42

EXCEPTIONS: CPU Util. 83.56, 47.25, 88.56, 44.21%;
##################################################################################
Compare with other EqualLogic users’ results:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
TABLE OF RESULTS
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SERVER TYPE: VM ON ESX 3.0.1
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: Dell PE6850, 16GB RAM; 4x XEON 7020, 2,66 GHz, DC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EQL PS3600 x 1 / 14+2 SAS10k / R50
SAN TYPE / HBAs : iSCSI, QLA4050 HBA

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..__17______……….___3551___………___111____

RealLife-60%Rand-65%Read……___21_____……….___2550___………____20____

Max Throughput-50%Read……….____10____……….___5803___………___181____

Random-8k-70%Read……………..____23____……….___2410___………____19____

EXCEPTIONS: VCPU Util. 60-46-75-46 %;
##################################################################################

 

SERVER TYPE: VM.
CPU TYPE / NUMBER: VCPU / 1 ) JUMBO FRAMES, MPIO RR
HOST TYPE: Dell PE2950, 32GB RAM; 2x XEON 5440, 2,83 GHz, DC
STORAGE TYPE / DISK NUMBER / RAID LEVEL:EQL PS5000 x 1 / 14+2 Disks (sata)/ R5

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..____9,6___……….____5093___………___159,00_

RealLife-60%Rand-65%Read……____26,6___……….___1678___………___13,11__

Max Throughput-50%Read………._____8,5__……….____4454___………___139,20_

Random-8k-70%Read…………….._____31,3_……….____1483___………___11,58____

EXCEPTIONS: CPU Util.-XX%;
##################################################################################

 

SERVER TYPE: PHYS
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: DL380 G3, 4GB RAM; 2X XEON 3.20 GHZ
STORAGE TYPE / DISK NUMBER / RAID LEVEL: PS6000XV / 14+2 DISK (15K SAS) / R10)
NOTES: 2 NIC, MS iSCSI, no-jumbo, flowcontrol on

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..___13.60____……….___3788____………___118____

RealLife-60%Rand-65%Read…….___14.87____……….___3729____………___29.14__

Max Throughput-50%Read………___12.75____……….___4529____………___141____

Random-8k-70%Read…………..___15.42____……….___3580____………___27.97__

EXCEPTIONS: CPU Util.-XX%;
##################################################################################

 

SERVER TYPE: PHYS
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: DL380 G3, 4GB RAM; 2X XEON 3.20 GHZ
STORAGE TYPE / DISK NUMBER / RAID LEVEL: PS6000XV / 14+2 DISK (15K SAS) / R50)
NOTES: 2 NIC, MS iSCSI, no-jumbo, flowcontrol off, ntfs aligned w/ 64k alloc, mpio-rr

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..____9.84____……….___5677____………___177_____

RealLife-60%Rand-65%Read…….___13.20____……….___3712____………___29.00___

Max Throughput-50%Read………____8.39____……….___6742____………___211_____

Random-8k-70%Read…………..___13.91____……….___3783____………___29.55___

EXCEPTIONS: CPU Util.-XX%;
##################################################################################

 

SERVER TYPE: VM windows 2008 enterprise.
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: Dell PE2950, 32GB RAM; 2x XEON 5450, 3,00 GHz
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EQL PS5000E x 1 / 14+2 Disks / R10 / MTU: 9000

####################################################################
TEST NAME——————-Av. Resp. Time ms—-Av. IOs/sek—-Av. MB/sek—–AV. CPU Utl.
Max Throughput-100%Read………….16,3……………..3638,3…………..113,7……………..35………
RealLife-60%Rand-65%Read………21,7………………2237,8…………….17,5……………..43………
Max Throughput-50%Read…………..17,7……………….2200,6…………….67,8……………..80………
Random-8k-70%Read………………….23,6………………2098,4…………….16,3……………..41………
####################################################################

 

SERVER TYPE: database server
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: Dell PowerEdge M600, 2*X5460, 32GB RAM.
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS5000E / 14*500GB SATA in RAID10

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek—
##################################################################################

Max Throughput-100%Read……___10.29____……._5694__………_177.94___
RealLife-60%Rand-65%Read…..___31.75____…….__1382__………__10.80___
Max Throughput-50%Read…….___10.51____…….__5664__………_177.02___
Random-8k-70%Read…………___34.34____…….__1345__………__10.51___

EXCEPTIONS: CPU Util. 20% – 15% – 10% – 13%;
####################################################################

 

SERVER TYPE: VM, VMDK DISK
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: DELL R610, 16GB RAM; 2 x Intel E5540, QuadCore
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EqualLogic PS5000XV / 14+2 DISK (15k SAS) / R50)
NOTES: 3 NIC, modified ESX PSP RR IOPS parameter, jumbo on, flowcontrol on

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..__6,48__……….__9178,56__………_286,83__

RealLife-60%Rand-65%Read…….__13,08__……….__3301,94__………__25,8__

Max Throughput-50%Read………__9,06__……….__6160,2__………__192,51__

Random-8k-70%Read…………..__13,59__……….__3215,69__………__25,12__
##################################################################

 

SERVER TYPE: Windows XP VM w/ 1GB RAM on ESXi 4
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: Sun SunFire x4150, 48GB RAM; 2x XEON E5450, 2.992 GHz
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Two EQL PS6000E’s with / 14+2 SATA Disks / R50

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..15.025……….3915.89………122.37

RealLife-60%Rand-65%Read……12.20……….3324.92………25.97

Max Throughput-50%Read……….13.18……….4460.97………139.40

Random-8k-70%Read……………..13.40……….3033.14………23.69

EXCEPTIONS: CPU%= 44 – 66 – 40 – 63

Using iscsi w/ software initiator. 4 nics, each with a vmkernel assigned to it.
##################################################################################

 

Server Type: VM Windows Server 2008 R2 x64 Std. on VMware ESXi 4.1
CPU Type / Number: vCPU / 1
VM Hardware Version 7
Two vmxnet3 NICs (10 GBit) used for iSCSI Connection (10 GB LUN directly connected to VM, no VMFS/RDM)
MS iSCSI Initiator (integrated in 2008 R2)
SAN Type: EQL PS6000XV (14+2 SAS HDDs, 15K, RAID 50)
Switches: Dell PowerConnect 6224
ESX Host is equipped with four 1GBit NICs (only for iSCSI connection)
Jumbo Frames and Flow Control enabled.

##################################################################################
Test——————-Av. Resp. Time ms——Total IOs/sek——-Total MB/sek——
##################################################################################

Max Throughput-100%Read……..___10.1929_____…….___4967.06_____…..____155.22______

RealLife-60%Rand-65%Read……_____12.6970____…..____3933.39____…..____30.73______

Max Throughput-50%Read………____9.5941____…..____5115.05____…..____159.85______

Random-8k-70%Read…………..____12.9845_____…..____4030.60______…..____31.49______
##################################################################################

 

SERVER TYPE: Dell NX3100
CPU TYPE / NUMBER: Intel 5620 x2 24GB RAM
HOST TYPE: Server 2008 64bit
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS4000XV-600 14 * 600GB 15K SAS @ R50 

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################
Max Throughput-100%Read 8.3 7163 223
RealLife-60%Rand-65%Read 11.4 4516 35
Max Throughput-50%Read 8.4 6901 215
Random-8k-70%Read 11.9 4415 34
##################################################################################

 

SERVER TYPE: W2K8 32bit on ESXi 4.1 Build 320137, 1 vCPU, 2GB RAM
CPU TYPE / NUMBER: Intel X5670 @ 2.93Ghz
HOST TYPE: Dell PE R610 w/ Broadcom 5709 Dual Port w/ EQL MPIO PSP Enabled
NETWORK: Dell PC 6248 Stack w/ Jumbo Frames 9216
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EQL PS4000X 16 Disk Raid 50

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——CPU Util.——
##################################################################################
Max Throughput-100%Read 8.12 7410 231 29%
RealLife-60%Rand-65%Read 10.65 3347 26 59%
Max Throughput-50%Read 7.19 7861 245 34%
Random-8k-70%Read 11.37 3387 26 55%
##################################################################################

  

Also compare with other major iSCSI/FC SAN vendors:

 SERVER TYPE: VM ON ESX 3.5 U3
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP DL380 G5, 24GB RAM; 4x XEON 5410(Quad), 2,33 GHz,
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EMC CX4-120 / 4+1 / R5 / 14+1total disks
SAN TYPE / HBAs : 4GB FC HP StorageWorks FC1142SR (Qlogic)
MetaLUNs are configured with 200GB LUNs striped across all 14 disks for a total datastore size of 600GB

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……………__6______……….___9320___………___291____

RealLife-60%Rand-65%Read……___24_____……….__1638___………____13____

Max Throughput-50%Read…………….____5____……….___11057___………___345____

Random-8k-70%Read……………..____23____……….___1800___………____14____
####################################################################

 

SERVER TYPE: VM on ESX 3.5.0 Update 4
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP Proliant DL385C G5, 32GB RAM; 2x AMD 2,4 GHz Quad-Core
SAN Type: HP EVA 4400 / Disks: 4GB FC 172GB 15k / RAID LEVEL: Raid5 / 38+2 Disks / Fiber 8Gbit FC HBA

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read…….______5___……….___10690__……..___334____

RealLife-60%Rand-65%Read……______8___……….____5398__……..____42____

Max Throughput-50%Read…….._____49___……….____1452__……..____45____

Random-8k-70%Read………….______9___……….____5390__……..____42____

EXCEPTIONS: NTFS 32k Blocksize

##################################################################################

 

SERVER TYPE: VM WIN2008 64bit SP2 / ESX 4.0 ON Dell MD3000i via PC 5424
CPU TYPE / NUMBER: VCPU / 2 )JUMBO FRAMES, MPIO RR
HOST TYPE: Dell R610, 16GB RAM; 2x XEON 5540, 2,5 GHz, QC
ISCSI: VMWare iSCSI software initiator , Onboard Broadcom 5709 with TOE+ISCSI
STORAGE TYPE / DISK NUMBER / RAID LEVEL:Dell MD3000i x 1 / 6 Disks (15K 146GB / R10)

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..____14,55___……….____4133__………____128,48____
RealLife-60%Rand-65%Read……_____22,69_………._____2085__………____16,92____
Max Throughput-50%Read………._____14,13___………._____4289__………____134,04____
Random-8k-70%Read…………….._____21,7__………._____2272__………____17,75___

##################################################################################

 

####################################################################
SERVER TYPE: Windows Server 2003r2 x32 VM with LSI Logic controller, ESX 4.0
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP BL490c G6, 64GB RAM; 2x XEON E5540, 2,53 GHz, QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: HP EVA6400 / 23 Disks / RAID5
Test name Avg resp time Avg IO/s Avg MB/s Avg % cpu
Max Throughput-100%Read 5.5 10831 338.47 38
RealLife-60%Rand-65%Read 10.8 4313 33.70 45
Max Throughput-50%Read 31.6 1822 56.95 17
Random-8k-70%Read 9.9 4613 36.04 47
SERVER TYPE: Windows Server 2008 x64 VM with LSI Logic SAS controller, ESX 4.0
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: HP BL490c G6, 64GB RAM; 2x XEON E5540, 2,53 GHz, QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: HP EVA6400 / 48 Disks / RAID10
Test name Avg resp time Avg IO/s Avg MB/s Avg % cpu
Max Throughput-100%Read 5.51 10905 340.8 32
RealLife-60%Rand-65%Read 8.20 6366 49.7 39
Max Throughput-50%Read 9.31 5279 165 43
Random-8k-70%Read 7.81 6734 52.6 39
####################################################################

 

SERVER TYPE: VM ON VI4
CPU TYPE / NUMBER: VCPU / 1
HOST TYPE: Supermicro , 64GB RAM; 4x XEON , E5430 2,66 GHz, QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: SUN 7410 11×1tb + 18gb ssd write + 100gb ssd read

##################################################################################
SAN TYPE / HBAs : 1gb NIC NFS
##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——

Max Throughput-100%Read……..__17______……….___3421___………___106____

RealLife-60%Rand-65%Read……___6_____……….___7771___………____60____

Max Throughput-50%Read……….____11____……….___5321___………___166____

Random-8k-70%Read……………..____6____……….___2662___………____60____

##################################################################################

 

SERVER TYPE: VM Windows 2003, 1GB RAM
CPU TYPE / NUMBER: 1 VCPU
HOST TYPE: IBM x3650 M2, 34GB RAM, 2x X5550, 2,66 GHz QC
STORAGE TYPE / DISK NUMBER / RAID LEVEL: IBM DS3400 (1024MB CACHE/Dual Cntr) 11x SAS 15k 300GB / R6 + EXP3000 (12x SAS 15k 300GB) for the tests
SAN TYPE / HBAs : FC, QLA2432 HBA

##################################################################################
RAID10- 10HDDs ——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read_______5,8_______________9941_______310

RealLife-60%Rand-65%Read_____16,7______________3083_________24

Max Throughput-50%Read________12,6______________4731________147

Random-8k-70%Read___________15,5______________3201________25

##################################################################################

 

####################################################################
SERVER TYPE: 2008 R2 VM ON ESX 4.0 U1
CPU TYPE / NUMBER: VCPU / 1 / 2GB Ram
HOST TYPE: HP BL460 G6, 32GB RAM; XEON X5520
STORAGE TYPE / DISK NUMBER / RAID LEVEL: EMC CX4-240 / 3x 300GB 15K FC / RAID 5
SAN TYPE / HBAs: 8Gb Fiber Channel

Test Name Avg. Response Time Avg. I/O per Second Avg. MB per Second CPU Utilization
Max Throughput – 100% Read 5.03 12,029.33 375.92 21.87
Real Life – 60% Rand / 65% Read 42.81 1,074.93 8.39 19.57
Max Throughput – 50% Read 3.63 16,444.30 513.88 29.67
Random 8K – 70% Read 51.44 1,039.38 8.12 14.01
SERVER TYPE: 2003 R2 VM ON ESX 4.0 U1
CPU TYPE / NUMBER: VCPU / 1 / 1GB Ram
HOST TYPE: HP DL360 G6, 24GB RAM; XEON X5540
STORAGE TYPE / DISK NUMBER / RAID LEVEL: LeftHand P4300 x 1 / 7 +1 Raid 5 10K SAS Drives
SAN TYPE / HBAs: iSCSI, SWISCSI, 2x 82571EB GB Eth Port Nics, One connection on each MPIO enabled – Jumbo Frames Enabled – 4 iSCSI connections to Volume – 1x Hp Procurve Switch
Test Name   Avg. Response Time   Avg. I/O per Second   Avg. MB per Second   CPU Utilization
Max Throughput – 100% Read   13.94   4289.95   134.06   22.17
Real Life – 60% Rand / 65% Read   18.95   1952.18   15.25   54.70
Max Throughput – 50% Read   41.95   1284.81   40.13   27.41
Random 8K – 70% Read   15.56   2132.71   16.66   60.32
####################################################################


SERVER TYPE: VMWare ESX 4u1
GUEST OS / CPU / RAM Win2K3 SP2, 2 VCPU, 2GB
HOST TYPE: DELL R610, 32GB RAM, 2 x Intel E5520, 2.27GHz, QuadCore
STORAGE TYPE / DISK NUMBER / RAID LEVEL: PILLAR DATA AX500 180 drives 525GB SATA, RAID5
SAN TYPE / HBAs : FCOE CNA EMULEX LP21002C on NEXUS 5010

####################################################################
TEST NAME———-Av.Resp.Time ms—Av.IOs/se—Av.MB/sek——
##################################################################
Max Throughput-100%Read….5.1609……….11275……… 362.86 CPU=22.84%

RealLife-60%Rand-65%Read…3.2424……… 17037…….. 131.68 CPU=32.6%

Max Throughput-50%Read……4.2503 ………12742 …….. 403.35 CPU=26.45%

Random-8k-70%Read………….3.2759……….16824………128.19 CPU=30.39%
##################################################################

 

SERVER TYPE: ESXi 4.10 / Windows Server 2008 R2 x64, 2 vCPU, 4GB RAM
CPU TYPE / NUMBER: Intel Xeon X5670 @ 2.93GHz
HOST TYPE: HP ProLiant BL460c G7
STORAGE TYPE / DISK NUMBER / RAID LEVEL: NetApp FAS6280 Metrocluster, FlashCache / 80 Disks / RAID DP

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——CPU Util.——
##################################################################################
Max Throughput-100%Read 4.07 11562 361 63%
RealLife-60%Rand-65%Read 1.67 22901 178 1%
Max Throughput-50%Read 3.93 11684 365 61%
Random-8k-70%Read 1.45 25509 199 1%
##################################################################

 

SERVER TYPE: HP Proliant DL360 G7
CPU TYPE / NUMBER: Intel Xeon 5660 @2.8 (2 Processors)
HOST TYPE: Server 2008R2, 4vCPU, 12GB RAM
STORAGE TYPE / DISK NUMBER / RAID LEVEL: HP P4500 SAN, 24 600GB 15K in NETRAID 10.  4 Paths to Virtual iSCSI IP, RoundRobin host IOPS policy set to 1 Jumbo Frames Enabled Netflow Enabled

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——CPU Util.——
##################################################################################
Max Throughput-100%Read 8.45 7119 222 22%
RealLife-60%Rand-65%Read 15.68 2423 18 55%
Max Throughput-50%Read 9.75 6000 187 25%
Random-8k-70%Read 11.71 2918 22 61% 
##################################################################

EMC VNX5500, 200GB FAST Cache (4×100GB EFD, RAID1)
Pool of 25×300GB 15K disks
Cisco UCS blades
##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################
Max Throughput-100%Read—— 1.71 —– 16068 —– 502
RealLife-60%Rand-65%Read—– 10.95 —– 3498 —– 27
Max Throughput-50%Read——– 0.885 —– 12697 —- 198
Random-8k-70%Read—————- 8.635 —– 4145 —– 32.38
##################################################################

Equallogic PS6000XV using VMWare Unofficial Storage Performance IOMeter parameters

By admin, October 6, 2010 4:27 pm

Unofficial Storage Performance Part 1 & Part 2 on VMTN

Here is my result from the newly set-up EQL PS6000XV. I noticed the OEM hard disk is the Seagate Cheetah 15K.7 (6Gbps) even though the PS6000XV is a 3Gbps array (I originally thought they would ship me the 3Gbps Seagate Cheetah 15K.6); also, the firmware has been updated to EN01 instead of the EN00 shown.

eql_15k

I’ve also spent half a day today conducting the test on servers of different generations, covering local storage, DAS, and SAN.

The result is pretty making sense and reasonable if you look deep into it.

That’s is RAID10 > RAID5, SAN > DAS >= Local and EQL PS6000XV Rocks despite warning saying all 4 links being 99.9% saturated during the sequential tests. (I increased the workers to 5, that’s why, it’s not in the result but in a seperate test for Max Throughput-100%Read)

 

(image: raid-triangle)

 

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
TABLE OF RESULTS
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
SERVER TYPE: VM on ESX 4.1 with EQL MEM Plugin
CPU TYPE / NUMBER: vCPU / 1
HOST TYPE: Dell PE R710, 96GB RAM; 2 x XEON 5650, 2.66 GHz, 12 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS6000XV x 1 (15K) / 14+2 600GB 15K Disks (Seagate Cheetah 15K.7) / RAID10 / 500GB Volume, 1MB Block Size
SAN TYPE / HBAs : ESX Software iSCSI, Broadcom 5709C TOE+iSCSI Offload NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..5.4673……….10223.32………319.48

RealLife-60%Rand-65%Read……15.2581……….3614.63………28.24

Max Throughput-50%Read……….6.4908……….4431.42………138.48

Random-8k-70%Read……………..15.6961……….3510.34………27.42

EXCEPTIONS: CPU Util. 83.56, 47.25, 88.56, 44.21%;

##################################################################################
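As a quick sanity check on these tables, the Av. MB/sek column is simply Av. IOs/sek times the access-spec block size (32KB for the throughput tests, 8KB for the real-life/random tests), with IOmeter reporting binary megabytes. A minimal sketch, using the figures from the table above:

```python
# IOmeter's Av. MB/s equals Av. IOPS times the block size of the access
# spec, reported in binary megabytes (MiB). Block sizes from the VMTN
# specs: 32768 bytes (throughput tests), 8192 bytes (random tests).
def mib_per_sec(iops, block_bytes):
    return iops * block_bytes / 2**20

# 10223.32 IOPS at 32KB in the table above:
print(round(mib_per_sec(10223.32, 32768), 2))  # 319.48
# RealLife at 8KB: 3614.63 IOPS:
print(round(mib_per_sec(3614.63, 8192), 2))    # 28.24
```

Both reproduce the table's MB/sek values exactly, which is a handy way to spot a mislabeled column in posted results.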

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 1
HOST TYPE: Dell PE R610, 12GB RAM; Xeon E5620, 2.4 GHz, 4 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS6000XV x 1 (15K) / 14+2 600GB 15K Disks (Seagate Cheetah 15K.7) / RAID10 / 500GB Volume, 1MB Block Size
SAN TYPE / HBAs : Broadcom 5709C NICs with 2 paths only (ie, 2 physical NICs to SAN)
Worker: Using 2 Workers to push the PS6000XV to its IOPS peak!

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..14.3121……….6639.48………207.48

RealLife-60%Rand-65%Read……12.8788……….7197.69………150.51

Max Throughput-50%Read……….11.3125……….6837.76………213.68

Random-8k-70%Read……………..13.7343……….6739.38………142.22

EXCEPTIONS: CPU Util. 25.99, 24.10, 28.22, 25.36%;

##################################################################################

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 1
HOST TYPE: Dell PE R610, 12GB RAM; Xeon E5620, 2.4 GHz, 4 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS6000XV x 1 (15K) / 14+2 600GB 15K Disks (Seagate Cheetah 15K.7) / RAID10 / 500GB Volume, 1MB Block Size
SAN TYPE / HBAs : Broadcom 5709C NICs with 2 paths only (ie, 2 physical NICs to SAN)

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……8.7584………5505.30………172.04

RealLife-60%Rand-65%Read……12.5239………4032.84………31.51

Max Throughput-50%Read………6.8786………6455.76………201.74

Random-8k-70%Read……………14.96………3435.59………26.84

EXCEPTIONS: CPU Util. 19.37, 10.33, 18.28, 9.78%;

##################################################################################

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 1
HOST TYPE: Dell PE R610, 12GB RAM; Xeon E5620, 2.4 GHz, 4 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Local Storage, PERC H700 (LSI), 512MB Cache with BBU, 4 x 300GB 10K SAS / RAID5 / 450GB Volume
SAN TYPE / HBAs : Broadcom 5709C NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..2.7207……….22076.17………689.88

RealLife-60%Rand-65%Read……50.4486……….906.69………7.08

Max Throughput-50%Read……….2.5429……….22993.78………718.56

Random-8k-70%Read……………..55.1896……….841.89………6.58

EXCEPTIONS: CPU Util. 6.32, 6.94, 5.95, 6.98%;

##################################################################################

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: Dell PE2450, 2GB RAM; 2 x PIII-S, 1.26 GHz, 2 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Local Storage, PERC3/Si (Adaptec), 64MB Cache, 3 x 36GB 10K U320 SCSI / RAID5 / 50GB Volume
SAN TYPE / HBAs : Intel Pro 100 NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..44.1448……….1326.03………41.44

RealLife-60%Rand-65%Read……93.1499……….456.88………3.57

Max Throughput-50%Read……….143.9756……….269.51………8.42

Random-8k-70%Read……………..80.27……….502.63………3.93

EXCEPTIONS: CPU Util. 23.33, 13.23, 11.65, 12.51%;

##################################################################################

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: DIY, 3GB RAM; 2 x PIII-S, 1.26 GHz, 2 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Local Storage, LSI Megaraid 4D (LSI), 128MB Cache, 4 x 300GB 7.2K SATA / RAID5 / 900GB Volume
SAN TYPE / HBAs : Intel Pro 1000 NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..15.1582……….3882.81………121.34

RealLife-60%Rand-65%Read……60.2697……….499.05………3.90

Max Throughput-50%Read……….2.8170……….2337.38………73.04

Random-8k-70%Read……………..152.8725……….244.40………19.1

EXCEPTIONS: CPU Util. 16.84, 18.79, 15.20, 17.47%;

##################################################################################

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: Dell PE2650, 4GB RAM; 2 x Xeon, 2.8 GHz, 2 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Local Storage, PERC3/Di (Adaptec), 128MB Cache, 5 x 36GB 10K U320 SCSI / RAID5 / 90GB Volume
SAN TYPE / HBAs : Broadcom 1000 NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..33.9384……….1743.55………54.49

RealLife-60%Rand-65%Read……111.2496……….310.62………2.43

Max Throughput-50%Read……….55.7005……….518.47………16.20

Random-8k-70%Read……………..122.5364……….317.95………2.48

EXCEPTIONS: CPU Util. 7.66, 6.97, 7.78, 9.27%;

##################################################################################

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: DIY, 3GB RAM; 2 x PIII-S, 1.26 GHz, 2 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Local Storage (DAS), PowerVault 210S with LSI Megaraid 1600 Elite (LSI), 128MB Cache with BBU, 12 x 73GB 10K U320 SCSI split into two channels of 6 disks each / RAID5 / 300GB Volume x 2, fully utilizing the RAID card’s two U160 interfaces.
SAN TYPE / HBAs : Intel Pro 1000 NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..28.9380……….3975.19………124.22

RealLife-60%Rand-65%Read……30.2154……….2913.15………84.17

Max Throughput-50%Read……….31.0721……….3107.95………97.12

Random-8k-70%Read……………..33.0845……….2750.71………78.00

EXCEPTIONS: CPU Util. 23.91, 22.02, 26.01, 20.24%;

##################################################################################

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: DIY, 4GB RAM; 2 x Opteron 285, 2.4GHz, 4 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Local Storage, Areca ARC-1210, 128MB Cache with BBU, 4 x 73GB 10K WD Raptor SATA  / RAID 5 / 200GB Volume
SAN TYPE / HBAs : Broadcom 1000 NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..0.2175……….10932.45………341.64

RealLife-60%Rand-65%Read……88.3245……….393.66………3.08

Max Throughput-50%Read……….0.2622……….9505.30………296.95

Random-8k-70%Read……………..109.6747……….336.66………2.63

EXCEPTIONS: CPU Util. 14.11, 7.04, 13.23, 7.80%;

##################################################################################

 

SERVER TYPE: Physical
CPU TYPE / NUMBER: CPU / 2
HOST TYPE: Tyan, 8GB RAM; 2 x Opteron 285, 2.4GHz, 4 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Local Storage, LSI Megaraid 320-2X, 256MB Cache with BBU, 4 x 36GB 15K U320 SCSI  / RAID 5 / 90GB Volume
SAN TYPE / HBAs : Broadcom 1000 NIC

##################################################################################
TEST NAME——————-Av. Resp. Time ms——Av. IOs/sek——-Av. MB/sek——
##################################################################################

Max Throughput-100%Read……..0.4261……….7111.26………222.23

RealLife-60%Rand-65%Read……30.1981……….498.56………3.90

Max Throughput-50%Read……….0.5457……….5974.71………186.71

Random-8k-70%Read……………..42.7504……….496.88………3.88

EXCEPTIONS: CPU Util. 29.71, 24.51, 27.74, 32.93%;

##################################################################################

Some interesting IOPS benchmarks comparing with Equallogic PS6000XV

By admin, October 5, 2010 10:47 pm

The following are my own findings, tested with IOmeter 2006 (4K, 100% random, read and write tests). I ran the test from a VM and used as many as 10 workers to saturate that single PS6000XV.

                # of Harddisks   Type / RPM       RAID   Read IOPS   Write IOPS
EQL PS6000XV    14               SAS 15K          10     5142        5120
Server1         4                SATA 7.2K        5      558         200
Server2         3                SCSI U320 10K    5      615         263
Server3         6                SCSI U320 10K    5      514         547
Server4         4                SCSI U320 15K    5      1880        820
Server5         4                SATA 10K         5      750         525
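A back-of-envelope way to read this table is the classic spindle-count estimate with RAID write penalty (2 for RAID10, 4 for RAID5). The per-disk figures below are rules of thumb I am assuming (roughly 175 IOPS for a 15K spindle, 125 for 10K, 75 for 7.2K), not numbers from these tests; the measured results above run well past the spindle-bound estimate largely because of controller cache.

```python
# Rough random-IOPS estimate from spindle count and RAID write penalty.
# Per-disk IOPS values are rules of thumb, not measurements; cache on the
# controller can push real short-run results far above these estimates.
PENALTY = {10: 2, 5: 4}  # writes per logical write for RAID10 / RAID5

def est_iops(disks, per_disk_iops, raid_level, write=False):
    penalty = PENALTY[raid_level] if write else 1
    return disks * per_disk_iops / penalty

# 14 x 15K SAS spindles (~175 IOPS each) in RAID10:
print(est_iops(14, 175, 10))             # ~2450 spindle-bound reads
print(est_iops(14, 175, 10, write=True)) # ~1225 spindle-bound writes
```

The gap between ~1225 estimated and 5120 measured write IOPS is a good illustration of how much the PS6000XV's controller cache contributes on a short random test.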

There are also two very interesting IOPS threads on VMware Communities; I am going to run those tests tomorrow as well.

Unofficial Storage Performance Part 1 & Part 2

 

 

The VMTN IOMETER global options and parameters:
=====================================

Worker
Worker 1
Worker type
DISK
Default target settings for worker
Number of outstanding IOs,test connection rate,transactions per connection
64,ENABLED,500
Disk maximum size,starting sector
8000000,0

Run time = 5 min

For the test, disk C: is used; the test file (8,000,000 sectors) is created on the first run, so you need free space on the disk.

The cache size has a direct influence on the results. On systems with more than 2GB of cache, the test file should be enlarged.
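To see why the default file size is usually enough, note that IOmeter sizes the file in 512-byte sectors, so the 8,000,000-sector setting works out to about 3.8 GiB:

```python
# The IOmeter test file is specified in 512-byte sectors. At 8,000,000
# sectors the working set is ~3.8 GiB, which is why a controller cache
# over 2GB calls for a bigger file (the file must not fit in cache).
sectors = 8_000_000
size_bytes = sectors * 512
print(size_bytes)                     # 4096000000 bytes
print(round(size_bytes / 2**30, 2))   # 3.81 GiB
```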
LINK TO IOMETER:
http://sourceforge.net/projects/iometer/

Significant results are: Av. Response Time, Av. IOs/sek, Av. MB/sek.
Also mention: whether the server is a VM or physical, processor number/type, what storage system, and how many disks.

Here is the config file (*.icf); copy it and save it as vmware.icf for IOMETER. (Do not copy the top and bottom lines containing ####### BEGIN or END #######.)
####################################### BEGIN of *.icf
Version 2004.07.30
'TEST SETUP ====================================================================
'Test Description
IO-Test
'Run Time
' hours minutes seconds
0 5 0
'Ramp Up Time (s)
0
'Default Disk Workers to Spawn
NUMBER_OF_CPUS
'Default Network Workers to Spawn
0
'Record Results
ALL
'Worker Cycling
' start step step type
1 5 LINEAR
'Disk Cycling
' start step step type
1 1 LINEAR
'Queue Depth Cycling
' start end step step type
8 128 2 EXPONENTIAL
'Test Type
NORMAL
'END test setup
'RESULTS DISPLAY ===============================================================
'Update Frequency,Update Type
4,WHOLE_TEST
'Bar chart 1 statistic
Total I/Os per Second
'Bar chart 2 statistic
Total MBs per Second
'Bar chart 3 statistic
Average I/O Response Time (ms)
'Bar chart 4 statistic
Maximum I/O Response Time (ms)
'Bar chart 5 statistic
% CPU Utilization (total)
'Bar chart 6 statistic
Total Error Count
'END results display
'ACCESS SPECIFICATIONS =========================================================
'Access specification name,default assignment
Max Throughput-100%Read,ALL
'size,% of size,% reads,% random,delay,burst,align,reply
32768,100,100,0,0,1,0,0
'Access specification name,default assignment
RealLife-60%Rand-65%Read,ALL
'size,% of size,% reads,% random,delay,burst,align,reply
8192,100,65,60,0,1,0,0
'Access specification name,default assignment
Max Throughput-50%Read,ALL
'size,% of size,% reads,% random,delay,burst,align,reply
32768,100,50,0,0,1,0,0
'Access specification name,default assignment
Random-8k-70%Read,ALL
'size,% of size,% reads,% random,delay,burst,align,reply
8192,100,70,100,0,1,0,0
'END access specifications
'MANAGER LIST ==================================================================
'Manager ID, manager name
1,PB-W2K3-04
'Manager network address
193.27.20.145
'Worker
Worker 1
'Worker type
DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
64,ENABLED,500
'Disk maximum size,starting sector
8000000,0
'End default target settings for worker
'Assigned access specs
'End assigned access specs
'Target assignments
'Target
C:
'Target type
DISK
'End target
'End target assignments
'End worker
'End manager
'END manager list
Version 2004.07.30

####################################### END of *.icf
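Each access spec in the .icf is one comma-separated line whose fields are named by the comment line above it (size, % of size, % reads, % random, delay, burst, align, reply). A small hypothetical helper, just to make the field layout concrete:

```python
# Decode one access-spec line from the .icf above. Field names come from
# the comment line preceding each spec in the file; this helper is an
# illustration, not part of IOmeter itself.
FIELDS = ("size", "pct_of_size", "pct_reads", "pct_random",
          "delay", "burst", "align", "reply")

def parse_spec(line):
    return dict(zip(FIELDS, (int(v) for v in line.split(","))))

# The RealLife-60%Rand-65%Read spec: 8KB blocks, 65% reads, 60% random
real_life = parse_spec("8192,100,65,60,0,1,0,0")
print(real_life["size"], real_life["pct_reads"], real_life["pct_random"])
```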

SAN HQ requires all ports on the active controller to be up in order to work properly

By admin, October 4, 2010 4:15 pm

Thank you for contacting Dell / EqualLogic about this issue.

Yes, you are absolutely correct. 

For SAN HQ to work properly, the host running the SAN HQ software must be able to ping each interface on the array. The reason each port needs to be accessible is that the array responds to SNMP requests on any randomly selected port.

Thus, if one or two network ports on your array go offline due to a network-related issue, SAN HQ may or may not receive updates during the expected interval, depending on whether a problematic port happens to be selected.
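Given that behaviour, it is worth checking reachability of every eth port on the active controller, not just the group IP. A minimal sketch; the interface addresses below are hypothetical placeholders for your own array's port IPs:

```python
# Build a ping check covering each array interface that SAN HQ's SNMP
# polling might land on. The IPs are hypothetical placeholders; replace
# them with the eth0-eth3 addresses of your active controller.
ETH_PORTS = ["10.0.8.11", "10.0.8.12", "10.0.8.13", "10.0.8.14"]

def ping_cmd(ip, count=1):
    # Linux ping syntax; Windows would use -n instead of -c
    return ["ping", "-c", str(count), ip]

commands = [ping_cmd(ip) for ip in ETH_PORTS]
for cmd in commands:
    print(" ".join(cmd))
```

Run each command (or feed the lists to subprocess); any interface that fails to answer is one on which SAN HQ polls can silently miss their interval.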

When Powering OFF the Master PC5448 Switch, We Are No Longer Able to Ping the PS6000XV

By admin, October 3, 2010 12:19 pm

Can anyone offer some suggestions please or simulate to see if this could also happen in your environment? Thanks!

We are currently testing the final switch and array redundancy. We have performed every possible failure scenario (on the switch, switch ports, LAN on ESX, ESX hosts, etc.) and they all worked perfectly as expected, EXCEPT ONE of the following situations.

This is how we performed the test:

1. Putty into the ESX 4.1 host Service Console and issue "vmkping 10.0.8.8 -c 3000", where 10.0.8.8 is our group IP; it pings without problem.

2. Turn OFF the master PowerConnect 5448 switch (we have two PC5448, master and slave, no STP, and all LAG/VLAN etc. have been set up properly according to the guide/best practice, with all the redundant paths correctly connected between switches and ESX hosts). In vCenter, the ESX 4.1 host then shows 2 out of 4 ports failed with a red cross in the iSCSI VMkernel vSwitch.

3. The "vmkping 10.0.8.8 -c 3000" stopped working until we turned the master PowerConnect 5448 switch back on.
Please note the following special findings:

a. Even though we cannot ping 10.0.8.8 from the ESX host while the master switch is off, EQL Group Manager still shows that the ESX host CAN STILL ESTABLISH iSCSI connections to it, all the VMs on that ESX host keep working with no problem, and we can still VMotion between ESX hosts with the master switch turned off. So the iSCSI connection is not dead; it just cannot be pinged from the ESX host somehow.

b. We also performed ANOTHER SIMILAR test by turning off the individual array iSCSI ports on the master switch: we used OpenManage to connect to the master switch and TURNED OFF the TWO ports connecting to the PS6000XV, so the PS6000XV active controller again shows 2 out of 4 ports failed with a red cross.

Note that in both cases the EQL PS6000XV active controller sees 2 out of 4 ports failed, but we reached that state by TWO different methods (the first by turning off the whole switch, the second by turning off the switch ports connecting to the array). In the second case, "vmkping 10.0.8.8 -c 3000" IS STILL WORKING! Why doesn't the first case work? The conclusion is that "vmkping 10.0.8.8 -c 3000" STOPS WORKING ONLY when WE TURN OFF the master switch.

Do not update firmware/BIOS from within ESX console

By admin, October 2, 2010 10:55 pm

Today I learnt a lesson the hard way; with the help of Dell ProSupport, I was able to rectify a problem that almost rendered my PowerEdge R710 non-bootable.

  1. I saw there is a new BIOS update (FW 2.1.15, released on September 13, 2010) on Dell’s web site for the PowerEdge R710/R610 today. As described, it fixes some serious problems that can hang the server, especially with Xeon Westmere 5600 series CPUs, so it’s strongly suggested. I upgraded the BIOS on the R610 without any problem, as it runs Windows Server 2008 R2.
  2. The big problem came next: how do you update BIOS/firmware on servers running ESX? The answer sounds simple: put the host in Maintenance Mode (VMotion all the VMs off that host, of course), reboot, press F10 to enter the Unified Server Configurator (USC), set the IP and DNS server address, then use Update System with FTP to grab a bunch of updates directly from Dell’s FTP server. Easy, right? It should be, but what if Dell has not updated the catalog.xml that lists the latest BIOS path? As of today, Oct 2, 2010, Dell still hasn’t updated that important file, leaving every R710 seeing both the existing and available BIOS as 2.1.09. What the Hack! So you are stuck: there is no easy way to update your BIOS, and there is no Virtual Floppy anymore in iDRAC6. If there were, I could simply boot into DOS and attach another ISO containing the BIOS. Or rather, I do not know where to download a bootable DOS image (ISO).
  3. With the USC method failed, I booted my ESX again, Googled around and called ProSupport; they suggested running the Linux version of the BIOS update program (.BIN) directly from the ESX console. Some sources on Google say it’s doable, so I used FastSCP to upload the BIN to /tmp, Putty’d into the server, ran chmod u+x BIOS.BIN, then ./BIOS.BIN. After pressing “q”, it asked if I wanted to continue updating the BIOS (Y/N); I pressed Y, and after 5 seconds it stopped, saying Update failed!
  4. Then the “BEAUTY” came! When I issued a reboot from vCenter, it just hung there (viewed from iDRAC6’s console) at “Waiting for Inventory Collector to finish” with many “……” counting up. After 20 minutes, the server finally rebooted itself. I tried rebooting again and it hung again; this time I used Reset from iDRAC6, and then found F10 was no longer available, with a message saying System Services is NOT AVAILABLE! What!!! Dell ProSupport told me to enter iDRAC with Ctrl+E and set Cancel System Services to YES; that clears the failed state and brings F10 back after exiting iDRAC. THIS IS DEFINITELY NOT GOOD! SOMETHING in the ./BIOS.BIN script MUST HAVE changed my server’s settings!!!
  5. I searched through Google and luckily found Dell’s official KB: after OpenManage Server Administrator 6.3 is installed on ESX 4.1, when the system is rebooted, it may not reboot until the Inventory Collector has completed. A message may be displayed that states “Waiting for Inventory Collector to Finish”. The system will not reboot for approximately 15 to 20 minutes. Note: this issue can also affect the Server Update Utility (SUU) and Dell Update Packages (DUPs). The key fix is to issue the command “chkconfig usbarbitrator off” to turn off the usbarbitrator service.
  6. The Dell ProSupport Level 2 engineer told me to type a list of things:
    - “chkconfig --list” to show the Linux service configuration
    - “cat /etc/redhat-release” to show that the Service Console is actually RHEL 5.0; I then Googled around and found others have also failed when directly updating server firmware, as it may not be compatible with general Red Hat Linux
    - “service usbarbitrator stop” to stop the usbarbitrator service
    - “ps aux | grep usb” again to show usbarbitrator is no longer running
    - finally, “chkconfig usbarbitrator off” to permanently disable the usbarbitrator service.
  7. Finally I compared the original system config (“chkconfig --list”) with my other untouched R710s and found the only line that changed is usbarbitrator 3:on; it should be 3:off!!! So ./BIOS.BIN must have changed that along the way, failed to update the BIOS, and never rolled the change back, so my system configuration was left altered. Dell’s KB 374107 doesn’t mention that the original ESX 4.1 configuration for usbarbitrator is indeed 3:off!

Why hasn’t Dell updated the catalog.xml on their FTP servers (both ftp.dell.com and ftp.us.dell.com)? The BIOS has been released for two weeks. Anyway, I will wait till the end of October and try USC again.

The following is quoted from the official Dell Update Packages README for Linux

* Due to the USB arbitration services of VMWare ESX 4.1, the USB devices appear invisible to the Hypervisor. So, when DUPs or the Inventory Collector runs on the Managed Node, the  partitions exposed as USB devices are not shown, and it reaches the timeout after 15 to 20 minutes.

This timeout occurs in the following cases:

* If you run DUPs or Inventory Collector on VMware ESX 4.1, the partitions exposed as USB devices are not visible due to the USB arbitration service of VMware ESX 4.1 and timeout occurs.

The timeout occurs in the following instances:

• When you start “DSM SA Shared Service” on the VMware ESX 4.1 managed node, it runs Inventory Collector. To work around this issue,  uninstall Server Administrator or wait until the Inventory Collector completes execution before attempting to stop the “DSM SA Shared Service”.

• When you manually try to run DUPs or the Inventory Collector on the VMware ESX 4.1 managed node while USB arbitration service is running.  To fix the issue, stop the USB arbitration service and run the DUPs  or the Inventory Collector.

To stop the USB arbitration service:

1. Use “ps aux | grep usb” to check if the USB arbitration service is running.
2. Use the “chkconfig usbarbitrator off” command to prevent the USB
arbitration service from starting during boot.
3. After you stop the usbarbitrator, reboot the server to allow the
DUPs and/or the Inventory collector to run.

Note: If you require the usbarbitrator, enable it manually. To enable the usbarbitrator, run the command – chkconfig usbarbitrator on.

Update: April 6, 2012

* The USB arbitration service of VMWare ESX 4.1 makes the USB devices invisible to the Hypervisor. So, when DUPs or the Inventory Collector runs on the MN, the partitions exposed as USB devices are not shown, and it reaches the timeout after 15 to 20 minutes. This timeout occurs in the following cases:

When you start “DSM SA Shared Service” on the VMware ESX 4.1 managed node, it runs the Inventory Collector.  While the USB arbitration service is running, you must wait for 15 to 20 minutes for the Inventory collector to complete the execution before attempting to stop this service, or uninstall Server Administrator.

When you manually run the Inventory Collector (invcol) on the VMware ESX 4.1 managed node while the USB arbitration service is running, you must wait for 15 to 20 minutes before the operations end. The invcol output file has the following:

<InventoryError lang="en">
<SPStatus result="false" module="MaserIE -i">
<Message> Inventory Failure:  Partition Failure – Attach
partition has failed</Message>
</SPStatus><SPStatus result="false" module="MaserIE -i">
<Message>Invalid inventory results.</Message>
</SPStatus><SPStatus result="false">

To fix the issue, stop the USB arbitration service and run the DUPs, or Inventory Collector.
Do the following to stop the USB arbitration service:

1. Use ps aux | grep usb to find out if the USB arbitration service is running.
2. To stop the USB arbitration service from starting up at bootup, use chkconfig usbarbitrator off.
3. Reboot the server after stopping the usbarbitrator to allow the
DUPs and/or the Inventory collector to run.

If you require the usbarbitrator, enable it manually. To enable the usbarbitrator, run the command – chkconfig usbarbitrator on. (373924)
