Category: Network & Server (網絡及服務器)

Installing vCenter in a Workgroup instead of joining a Domain causes warnings and problems?

By admin, October 16, 2010 10:03

I am having the following problem on our Virtual Center, if you know how to solve this, please kindly let me know, many many thanks in advance!

EventID 1000[VpxdLdap] Failed to search OU=Instances container.  This may indicate a problem with LDAP permissions for the account running VirtualCenter, or that the schema is not compatible with this version of VirtualCenter.

The error occurs on the hour and again 15 minutes past the hour (i.e. 9:00am, then 9:15am, then 10:00am, then 10:15am).

It only happens when:
1. The vSphere Client is running and left open (1-3 times a day).
2. Both the vSphere Client and Veeam Monitor are running (24 times a day). It seems Veeam Monitor competes with the vSphere Client when polling for resources, which is why the error occurs more often.

The vCenter server alarm section then periodically produces alerts saying vCenter Health Status is YELLOW because the LDAP server cannot be contacted, all because I did not join an AD Domain. This sounds ridiculous.

Btw, the vCenter server DID NOT JOIN A DOMAIN, it only uses its own Workgroup. I know it’s not right or the best way according to the vCenter setup guide, but I really want to keep it simple (i.e. I do not want another physical server just for AD). I really wish VMware would release a patch for vCenter that allows us to select the Domain or Workgroup model during installation, or even better, to change the option on the fly.

I suspect it’s a client polling problem and/or the client can’t search through AD/LDAP, so it reports this error?

It’s just a warning, nothing really affects operation, so I think I can safely ignore it, but I would appreciate hearing from anyone who came across and solved this strange problem.

 

Update:

From vCenter Error Log:

[2010-10-24 04:19:24.791 05976 error 'App'] [LDAP Client] Failed to poll search: 0x0 (The call completed successfully.)
[2010-10-24 04:19:24.791 05976 warning 'App'] [LDAP Client] Reinitializing search -1 (ou=Licenses,ou=Licensing,dc=virtualcenter,dc=vmware,dc=int)
[2010-10-24 04:19:24.791 05976 error 'App'] [LDAP Client] Failed to perform asynchronous search for base DN = ou=Licenses,ou=Licensing,dc=virtualcenter,dc=vmware,dc=int: 0x51 (Cannot contact the LDAP server.)

[2010-10-24 08:11:56,116 Timer-4  INFO  com.vmware.vim.jointool.util.ldaphealth.LdapHealthMonitor] Encountered an error when checking domain trust health : error code: $@, result: 1717

From vCenter Health Check:

Ldap domain trust change monitor – Warning – encountered an error when checking domain trust health: error code: 1717
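To check how often the LDAP errors really fire (and whether they line up with the 15-minute pattern), the vCenter log lines quoted above can be tallied per minute. This is only a minimal sketch: the regular expression is inferred from the three sample lines in this post, and the function name `count_ldap_errors` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this post:
# [date time thread level 'App'] [LDAP Client] message
LINE_RE = re.compile(
    r"^\[(?P<date>\d{4}-\d{2}-\d{2}) (?P<hour>\d{2}):(?P<minute>\d{2}):\d{2}\.\d+ "
    r"\d+ (?P<level>\w+) '[^']*'\] \[LDAP Client\] (?P<msg>.*)$"
)

def count_ldap_errors(lines):
    """Count LDAP Client entries of level 'error', keyed by (date, hour, minute)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "error":
            counts[(m.group("date"), m.group("hour"), m.group("minute"))] += 1
    return counts

sample = [
    "[2010-10-24 04:19:24.791 05976 error 'App'] [LDAP Client] Failed to poll search: 0x0 (The call completed successfully.)",
    "[2010-10-24 04:19:24.791 05976 warning 'App'] [LDAP Client] Reinitializing search -1 (ou=Licenses,ou=Licensing,dc=virtualcenter,dc=vmware,dc=int)",
    "[2010-10-24 04:19:24.791 05976 error 'App'] [LDAP Client] Failed to perform asynchronous search for base DN = ou=Licenses,ou=Licensing,dc=virtualcenter,dc=vmware,dc=int: 0x51 (Cannot contact the LDAP server.)",
]

for key, n in sorted(count_ldap_errors(sample).items()):
    print(key, n)
```

Feeding it the whole log file (one line per list entry) instead of `sample` gives a quick per-minute histogram of the errors.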

 

Solution:

From VMware Communities:

The message “Encountered an error when checking domain trust health: error code 1717” is simply an informational message in Virtual Center. The “vCenter Service Status plugin for Virtual Center 4” runs some LDAP checks, including checking whether it can perform domain trust lookups. When it cannot perform this domain trust lookup, it shows this message.

This message is simply an informational message and should have no major impact on the running of the Virtual Center Server. The only way to stop this message from appearing is to join the vCenter Server to an AD Domain. Btw, you CANNOT install an AD Domain Controller on the same machine as vCenter, it will not work, because vCenter 4.1 installs an instance of ADAM (Active Directory Application Mode), which it uses for vCenter Linked Mode, and ADAM will conflict with the machine’s own AD services if the server is also a Domain Controller.

 

From ESX 4.1 vCenter Installation Guide:

The system that you use for your vCenter Server installation must belong to a domain rather than a
workgroup. If assigned to a workgroup, the vCenter Server system is not able to discover all domains and
systems available on the network when using such features as vCenter Guided Consolidation Service. To
determine whether the system belongs to a workgroup or a domain, right-click My Computer and click
Properties and the Computer Name tab. The Computer Name tab displays either a Workgroup label or
a Domain label.

 

It seems there is no workaround for running vCenter in a standalone Workgroup, but why would I use an extra physical machine for the sole purpose of running an AD Domain Controller? It’s TOTALLY AGAINST VIRTUALIZATION and it’s not Green at all. Most of all, if I have a small environment with fewer than 5 ESX Hosts, why would I bother to set up an AD?

My own solution would be to disable the vCenter Health Check alarm, or simply remove the part of the alarm that fires when Health Check changes to Yellow, which should be fine.

 

Finally, some people may install vCenter on Windows Server 2008 R2 and encounter the following problem, according to VMware KB1025668.

Installing vCenter Server 4.1 on a Windows 2008 R2 system fails

Symptoms
•Cannot install vCenter Server 4.1 on a Windows 2008 R2 system
•Installing vCenter Server 4.1 on a Windows 2008 R2 system fails
•You see one of these errors:

◦The trust relationship between this workstation and the primary domain failed (in the jointool-0.log)
◦Setup cannot create vCenter Server directory Services Instance

Resolution
This issue may occur if the Active Directory in your environment is hosted by a Windows 2000 domain controller (THAT’S OLD!!!). This issue occurs because vCenter Server 4.1 is unable to retrieve the security identifier (SID) for an account.

To resolve this issue, you must apply a Microsoft hotfix. For more information and to download the hotfix, see the Microsoft Knowledge Base article 976494.

Note: You must reboot the system before installing vCenter Server again.

Storage Protocol Choices & Storage Best Practices for VMware ESX

By admin, October 15, 2010 11:33

Just found a great article for ESX Storage Best Practices from Cisco, definitely worth reading for understanding how storage really works in VMware vSphere

At the end, it also mentioned the future: VAAI, well the paper was written in 2009 and one year after, we are already using it in our Equallogic SAN. :)

There is even a section called “Day After Tomorrow”, future technologies like vMotion between Datacenters, DRS and DPM for storage, etc.

Mouse is very slow in Windows Server 2008 R2 under ESX 4.1

By admin, October 12, 2010 13:04

Basically, all you need to do is update the SVGA driver to the WDDM driver, but why didn’t VMware include that in its latest VMware Tools?

Troubleshooting SVGA drivers installed with VMware Tools on Windows 7 and Windows 2008 R2 running on ESX 4.0

WDDM and XPDM graphics driver support with ESX 4.x, Workstation 7.0, and Fusion 3.0

Solution to Dell OEM Windows Server Requires Re-Activation in ESX 4.1

By admin, October 12, 2010 09:26

So you have been there and encountered that annoying thing. You called Dell Pro-Support and they replied that there is DEFINITELY NO WAY. You also called Microsoft, who pointed the finger back at Dell and asked you to contact Dell directly as it’s an OEM product. You asked the local Microsoft distributor, and they also said there is no way to do it: you have to buy a Box set or Open License, because your existing Dell OEM license will not allow you to reactivate using the key printed on it.

Well, THEY ARE ALL WRONG!!!

  • Dell’s Pro-Support is unprofessional in this case.
  • Microsoft is responsible for its own product, NOT!
  • Local Microsoft Distributor wants you to pay more, huh?

 

This is the Official Solution from Dell, hope it’s useful for others, the key point is to use Virtual Key to re-activate, and then either activate on-line or use phone to activate again and finally clone it as master gold image for further deployment.

You cannot automatically pre-activate the Windows Server 2008 operating system installed on VMs by using the product activation code in the Dell OEM installation media. You must use the virtual product key to activate the guest operating system. For more information, see the whitepaper Dell OEM Windows Server 2008 Installation on Virtual Machines using Dell OEM Media at dell.com.

 

I always thought the Virtual Key is for Microsoft’s own Hyper-V only and cannot be used in a VMware environment, but I was wrong.

 

Alternatively, you can force the VM to load a BIOS containing DELL SLIC 2.1 (supports Windows 7 and Windows Server 2008 R2), which will trick the VM into thinking it’s actually a PHYSICAL DELL server.

1. Simply add bios440.filename = "DELL.ROM" to the VM configuration parameters using the VC Client. Of course you do need to upload DELL.ROM to your VM directory, and please don’t ask me where to get that DELL.ROM, google around for it yourself. One drawback is that this VM can’t be vMotioned around, as the DELL.ROM won’t get vMotioned. (Update: The solution for vMotion is to put DELL.ROM on every host or simply on the SAN, such as bios440.filename = "/vmfs/volumes/san/DELL.ROM". :)

2. Very importantly, you will also need to find the corresponding certificate dell.XRM-MS, then use slmgr.vbs -ilc c:\dell.XRM-MS to import the certificate.

3. Insert the Key by slmgr.vbs -ipk XXXXX-XXXXX-XXXXX-XXXXX-XXXXX

 

Finally, some say even adding SMBIOS.reflectHost = "true" will work, but I COULD NEVER get this method working!

Update: The reason I didn’t get it working is because I didn’t use Dell’s W2k8R2 installation disk, see this link from IBM, sounds so simple! Really?

Solution

Edit the virtual machine’s .vmx file to contain the following line:

SMBIOS.reflectHost = "true"

Note: Encoding of the text added to the .vmx file must be in UTF8.

This updates the virtual machine BIOS with the IBM Original Equipment Manufacture (OEM) information required to use IBM-provided Operating System (OS) installation media.

IBM-provided Microsoft Windows 2008 media must be “BIOS Locked” to ensure that the OS will only install on IBM hardware. Virtual machines use a virtual BIOS that does not contain information that identifies the system as being manufactured by IBM.

The installation of Microsoft Windows Server 2008 from IBM OEM media to such a virtual machine will fail until the virtual BIOS has been updated to include this information. Alteration of the virtual machine’s .vmx file to state SMBIOS.reflectHost = "true" performs this function for servers using VMware’s ESX/ESXi technology.

The workaround resolves this issue by using media that is not locked to a specific OEM.

The solution resolves this issue by adding IBM information to the virtual BIOS.
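Both .vmx tweaks mentioned in this post (bios440.filename and SMBIOS.reflectHost) come down to adding one key = "value" line to the VM’s .vmx file and saving it as UTF-8. Here is a minimal sketch of doing that with a script. The helper name `set_vmx_option` and the demo file contents are hypothetical, and it assumes the VM is powered off while the file is edited.

```python
import os
import tempfile

def set_vmx_option(path, key, value):
    """Set `key = "value"` in a .vmx file, replacing an existing entry if present."""
    with open(path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()
    new_line = f'{key} = "{value}"'
    replaced = False
    for i, line in enumerate(lines):
        # Keys are compared case-insensitively here (an assumption of this sketch).
        if line.split("=", 1)[0].strip().lower() == key.lower():
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    # Write back as UTF-8 with plain straight quotes.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("\n".join(lines) + "\n")

# Demonstration on a throwaway file (contents are made up for illustration).
with tempfile.TemporaryDirectory() as d:
    vmx = os.path.join(d, "demo.vmx")
    with open(vmx, "w", encoding="utf-8") as f:
        f.write('displayName = "demo"\n')
    set_vmx_option(vmx, "bios440.filename", "DELL.ROM")
    with open(vmx, encoding="utf-8") as f:
        result = f.read()
    print(result)
```

Note the straight double quotes: curly quotes pasted from a web page would be written into the file as-is, so they need to be replaced by hand.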

Update Apr-16

Tried again today, the method SMBIOS.reflectHost = "true" is DEFINITELY NOT working! Even with Dell’s OEM w2k8r2 std installation disk on a PowerEdge R710 server, it still asked for activation. In addition, I discovered I can install from Dell’s OEM w2k8r2 std disk on a VM even without SMBIOS.reflectHost = "true", so this means Dell’s w2k8r2 disk can be used on a non-Dell server.

So only the above two methods are working, but not the last one, if you got the last one working, pls drop me a line, thanks.

Update Apr-17

Maybe the answer is that SMBIOS.reflectHost = "true" WILL ONLY WORK for ESX 3.5 or before, as VMware’s KB didn’t indicate this method applies to ESX 4.0.

ESX 4.1, VM is Version 7, VMware PVSCSI and VMXNet3, Safely Remove Hardware?

By admin, October 11, 2010 11:15

After upgrading to ESX4.1, my VM with latest Version 7, VMware PVSCSI and VMXNet3 starts to show “Safely Remove Hardware” alert in tray, but why would you want to remove your harddisk and NIC? Huh?

Then I found this useful link, case solved!

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1012225

To Thin or Not To Thin? On Equallogic and/or ESX Datastore?

By admin, October 10, 2010 12:42

About a month ago, I was told by an experienced Dell Equallogic Consultant to use Normal (non-thin) on the EQL array and Thin on ESX VMFS. I wasn’t exactly sure what he meant at the time.

So I did a simple test on my EQL box:

Create a 10GB volume (non-thin), attach to Windows, write 5GB, then remove 4GB, leave with 1GB, to EQL it’s 5GB used.

Then I write another 4GB, EQL still reports 5GB, then I write 1GB more, now EQL reports 6GB.

However, in my Thin Provisioning test with the same 10GB volume, things look completely different.

Create a 10GB volume, attach to Windows, write 5GB, then remove 4GB, leave with 1GB, to EQL it’s 5GB used.

Then I write another 4GB, and somehow the EQL volume reports its size continuously growing to 5GB, then 6GB, then finally, bang, 9GB. WHY? WHY doesn’t it use the UNUSED SPACE? (Inside Windows it’s actually still 5GB, as you will see later.)

HOWEVER, please note THIS: as I continue to add another 4GB to the volume (EQL now reports 9GB, Windows reports 5GB), EQL reaches the 10GB maximum (somehow the volume didn’t go offline, why? I don’t know), but I can still add this 4GB to the volume, and Windows reports 9GB/10GB used.

So in a strange way, even when EQL reports the volume as fully used, we can still add data to it at the Windows level, but it’s just TOTALLY CONFUSING, and a false “volume is going to be full” alarm will keep firing when using Thin Provisioning.

That’s why I would WAIT UNTIL FW5.0.x or FW5.x comes out with a REAL THIN RECLAMATION feature like what HDS and 3PAR did a year ago. (Yes, EQL is behind in this particular area)

We are probably better off NOT USING Thin Prov. in EQL; what I mean is that using Thick Prov. in EQL but Thin in ESX VMFS would be the best way.

For snapshots, just set the reserve to a smaller % during volume creation (10% would be good, as you can always grow it later). This applies to the volume as well: make your own Thin Provisioning by setting the volume to a smaller size when you first create it, then gradually expanding it as you need later, so you won’t waste a lot of space from the beginning.

 

Update:

I did another test and it proved I was wrong above.

The GB are reported in EQL Group Manager under Volume Used Size

Step  Change  Thick (20GB)  Thin (20GB)
------------------------------------------
1.    +5GB    5GB           5GB
2.    +5GB    10GB          10GB
3.    -5GB    10GB          10GB
4.    +5GB    10GB          10GB
5.    -10GB   10GB          10GB
6.    +15GB   15GB          15GB  (Warning, as it is over the default 60%)
7.    -5GB    15GB          15GB
8.    +5GB    15GB          15GB

So we are safe to use Thin Provisioned VMFS now I think. 
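The numbers in that second test behave like a simple high-water mark: the array’s used size only goes up, and it only goes up when the file system needs more space than it has ever touched before. Here is a small sketch of that model. It assumes the file system reuses freed blocks before touching new ones and that the array never reclaims a written block; it is my reading of the results, not anything taken from EQL documentation.

```python
def simulate(changes_gb):
    """Yield (change, filesystem_used, array_used) in GB for each step.

    Assumes the file system reuses freed blocks first and the array
    never reclaims a block once it has been written.
    """
    fs_used = 0
    array_used = 0
    for change in changes_gb:
        fs_used += change
        # The array only grows: it tracks the high-water mark of written space.
        array_used = max(array_used, fs_used)
        yield change, fs_used, array_used

# Same sequence of writes and deletes as in the test above.
steps = [+5, +5, -5, +5, -10, +15, -5, +5]
rows = list(simulate(steps))
for n, (change, fs_used, array_used) in enumerate(rows, start=1):
    print(f"{n}. {change:+d}GB  filesystem={fs_used}GB  array={array_used}GB")
```

With these eight steps the modelled array usage matches what Group Manager reported for both the thick and the thin volume.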

Btw, I also received a reply from EQL indicating they are working on the Re-Thin feature.

In response to “reclaiming unallocated array disk space” on the PS Series arrays:

An enhancement request for this feature (reclaim space that was previously used) has already been submitted.  Firmware version 5.0.2 does not introduce this feature.  Engineering has not updated support as to when such a feature will be available in future firmware releases.

Finally, I looked into the details of Hitachi HDS’s Re-Thin feature. A 3PAR guy points out that HDS’s Re-Thin is in fact a…migration followed by zeroing out the unused blocks, but 3PAR can REALLY, I MEAN REALLY do the Re-Thin in real time, with no need to copy the volume to another copy and then zero out the unused blocks. I do hope Equallogic can have this kind of feature instead of a “Not so real” Re-Thin like the HDS one.

 

Oct 14, 2010 Some update from Dell Pro-Support regarding NTFS/VMFS can REUSE the touched blocks somehow.

=======================
A similar problem is when the initiator OS reports significantly more space in use than the array does. This can be pronounced in systems like VMWare that create large, sparse files. In VMWare, if you create yourself a 10GB disk for a VM as a VMDK file, VMWare does not write 10GB of zeros to the file. It creates an empty (sparse) 10GB file, and subtracts 10GB from free space. The act of creating the empty file only touches a few MB of actual sectors on the disk. So VMWare says 10GB missing, but the array says, perhaps, only 2MB written to.

Since the minimum volume reserve for any volume is 10%, the filesystem has a long way to go before the MB-scale writes catch up with the minimum reservation of a volume. For instance, a customer with a 100GB volume might create 5 VMs with 10GB disks. That’s 50GB used according to VMWare, but only perhaps 5 x 2MB (10MB) written to the array. Until the customer starts filling the VMDK files with actual data, the array won’t know anything is there. It has no idea what VMFS is; it only knows what’s been written to the volume.

• Example: A file share is thin-provisioned with 1 TB logical size. Data is placed into the volume so that the physical allocation grows to 500 GB. Files are deleted from the file system, reducing the reported file system in use to 100 GB. The remaining 400 GB of physical storage remains allocated to this volume in the SAN.

• This issue can also occur with maintenance operations including defragmentation, database re-organization, and other application operations.

In most environments, file systems do not dramatically reduce in size, so this issue occurs infrequently. Also some file systems will not make efficient re-use of previously allocated space, and may not reuse deleted space until it runs out of unused space (this is not an issue for NTFS, VMFS).
=======================

 

Update Oct-15-2010

If you ask me again now, I would say THIN PROVISIONING (aka TP) ALL THE WAY, both on Equallogic AND on the ESX Datastore, is the BEST way to go, and I think it is going to be the trend in the storage management world, especially if Equallogic releases its upcoming Re-Thin or Space Reclaim feature in a coming 5.x firmware update. (So far only 3PAR is able to do it I think)


Update Sep-3-2011

Storage APIs for Array Integration (VAAI) have been enhanced to reclaim blocks when a virtual disk is deleted. Previously, the storage array was not aware that blocks which used to contain data had been freed after a virtual disk was deleted.

This new feature in vSphere 5.0 may have finally solved the problem, but is it only going to work in vSphere 5.0? I really do hope ESX 4.1 can also get this VAAI enhancement after upgrading the Equallogic firmware with such a thin provisioning reclaim capability.

Currently, the only way to reclaim a thin provisioned volume (TP) in Equallogic is to Storage VMotion all existing VMs to a new TP volume and then delete the existing one.

Impressive Equallogic PS6000XV IOPS result

By admin, October 9, 2010 11:41

I just performed the test again 3 times and confirmed the following. This is with the default 1 Worker only, with IOmeter testing the VM’s VMFS disk directly and no MPIO direct mapping to the EQL array; the VM is version 7, the Disk Controller is Paravirtualized and the NIC is VMXNet3.
SERVER TYPE: VM on ESX 4.1 with EQL MEM Plugin, VAAI enabled with Storage Hardware Acceleration
CPU TYPE / NUMBER: vCPU / 1
HOST TYPE: Dell PE R710, 96GB RAM; 2 x XEON 5650, 2,66 GHz, 12 Cores Total
STORAGE TYPE / DISK NUMBER / RAID LEVEL: Equallogic PS6000XV x 1 (15K), / 14+2 600GB Disks / RAID10 / 500GB Volume, 1MB Block Size
SAN TYPE / HBAs : ESX Software iSCSI, Broadcom 5709C TOE+iSCSI Offload NIC

TEST NAME                   Av. Resp. Time (ms)   Av. IOs/sec   Av. MB/sec
---------------------------------------------------------------------------
Max Throughput-100%Read     4.1913                13934.42      435.45
RealLife-60%Rand-65%Read    13.4110               4051.49       31.65
Max Throughput-50%Read      5.5166                10240.39      320.01
Random-8k-70%Read           14.1525               3915.15       28.95

EXCEPTIONS: CPU Util. 67.82, 38.12, 56.80, 40.2158%;

RealLife-60%Rand-65%Read 4051 IOPS is really impressive for a single array with 14 15K RPM spindles!
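The MB/s column is just IOPS multiplied by the I/O size, so the results can be sanity-checked with a few lines. This is a quick sketch that assumes the block sizes of the commonly used unofficial IOmeter profile (32 KB for the two Max Throughput tests, 8 KB for the other two). Those sizes are not stated in this post, so treat them as an assumption.

```python
def throughput_mb_s(iops, block_kb):
    """Convert IOPS at a given I/O size in KB to MB/s (1 MB = 1024 KB)."""
    return iops * block_kb / 1024

# (reported IOPS, assumed block size in KB, reported MB/s)
results = {
    "Max Throughput-100%Read": (13934.42, 32, 435.45),
    "RealLife-60%Rand-65%Read": (4051.49, 8, 31.65),
    "Max Throughput-50%Read": (10240.39, 32, 320.01),
    "Random-8k-70%Read": (3915.15, 8, 28.95),
}

for name, (iops, block_kb, reported) in results.items():
    computed = throughput_mb_s(iops, block_kb)
    print(f"{name}: computed {computed:.2f} MB/s, reported {reported} MB/s")
```

Under these assumed sizes the first three rows match the reported MB/s to two decimals. The Random-8k row comes out near 30.6 rather than the reported 28.95, so either the assumption or one of the reported averages is off for that test.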

Windows Server Backup in Windows Server 2008 R2

By admin, October 8, 2010 13:17

I’ve been using Windows Server Backup in Windows Server 2008 R2 for almost a month, and found it can do everything Acronis True Image Server (TIS) does. In my opinion there is really no need to buy TIS in the future.

See what WSB can offer you (my requirement list):

  • WSB works at Block Level, so taking a backup or snapshot is very fast, and so is restore.
  • Full server bare-metal backup.
  • Full backup the first time and incremental afterwards, excluding files during backup, compression.
  • System State is integrated with AD, so you won’t end up with a crash-consistent state where you restore your server but find AD cannot be started.
  • Individual folder/file restore.
  • Backup to network shared folders (there is a limit: you cannot have incremental copies, only ONE copy is kept, and the later one overwrites the previous one), this does suck badly! However I don’t use a network folder to store my backup, so it’s fine for me.
  • Maximum 64 copies (in my case, that’s almost 2 months since I only have 1 scheduled backup running) or limited to your backup disk size.
  • The backup copies are hidden from the file system; in TIS, you need to create a partition to hide the backup copies (Acronis Secure Zone).
  • WSB can back up Hyper-V VM images as well.

Best of all Windows Server Backup (WSB) in Windows Server 2008 R2 IS FREE! TIS is over USD1,200, I don’t need any features like Convert to VM, Universal Restore, Central Management, so WSB works perfectly for the standalone server.

Finally, you may ask what about the rescue bootable DVD/CD-ROM? Well, you don’t have one. What? Yes, the Windows Server 2008 DVD is your ultimate rescue bootable DVD, fair enough? :)

A Possible Bug in VMware ESX 4.1 or EQL MEM 1.0 Plugin

By admin, October 8, 2010 00:37

This week, I encountered a strange problem in redundancy testing. All paths to our network switches, servers and EQL arrays have been set up correctly with redundancy. Each ESX Host iSCSI VMKernel (or pNIC) has 16 paths to the EQL arrays, and we tested every single possible failure situation and found ONLY ONE scenario that doesn’t work. See “When Power OFF Master PC5448 Switch, we are no longer able to ping PS6000XV”.

After two days of troubleshooting with local Pro-Support as well as US EQL support, we have narrowed down the problem to “A Possible Bug in VMware 4.1 or EQL MEM 1.0 Plugin”.

During the troubleshooting, I found another Equallogic user in Germany who is having a similar, though not exactly the same, problem (see “Failover problems between esx an Dell EQL PS4000”). He’s using a PS4000 and only has two iSCSI paths, and his problem is more serious than ours.

Oct 5, 2010 3:16 AM
Fix-List from v5.0.2 Firmware:
iSCSI Connections may be redirected to Ethernet ports without valid network links.

His problem is also similar in that whatever iSCSI connection is left in the LAG won’t get redirected to the slave switch after shutting down the master switch. I have 4 paths and his PS4000 has two, so my iSCSI connections survived because there is an extra path to the slave switch, but somehow vmkping doesn’t work.

and if you look at comment #30 .

Jul 27, 2010
Dell acknowledged that the known issue they are reporting in the manual of the EqualLogic Multipathing Extension Module is the same one I get.

They didn’t open a ticket at vmware for now, but they will, after some more tests.

I think this issue is there since esx 4.0. In VI3 they used only one vmkernel for swiscsi with redundancy on layer1/2, so there it should not be the case.

My case number for this issue at vmware is 1544311161, the case number at dell is 818688246.

If vmware acknowledge this as a bug in 4.1, and don’t have a workaround, we will go with at least 4 logical paths for each volume and hope that at least one path is still connected after switch1 fails, until they fix it.

Finally, it could also be something related to the EQL MEM Plugin for ESX which we have installed. (Comment #29 on page 2)

It indicates there is a known issue: once a network link fails (which could be caused by shutting down the master switch), kernel traffic on other NICs can be affected if the physical NIC with the failure is the only uplink for the VMKernel port that is used as the default route for the subnet. This affects several types of kernel network traffic, including the ICMP pings which the EqualLogic MEM uses to test for connectivity on the SAN.

Jul 23, 2010

from the dell eql MEM-User_Guide 4-1:

Known Issues and Limitations
The following are known issues for this release.

Failure On One Physical Network Port Can Prevent iSCSI Session Rebalancing
In some cases, a network failure on a single physical NIC can affect kernel traffic on other NICs. This occurs if the physical NIC with the network failure is the only uplink for the VMKernel port that is used as the default route for the subnet. This affects several types of kernel network traffic, including ICMP pings which the EqualLogic MEM uses to test for connectivity on the SAN. The result is that the iSCSI session management functionality in the plugin will fail to rebuild the iSCSI sessions to respond to failures of SAN changes.

Could it be the same problem I have? So they (DELL/VMware) already know about this problem?

Aside from this, it looks like the Dell MEM only makes sense in setups with more than one array per psgroup, because the PSP selects a path to an interface of the array where the data of the volume is stored. And it has a lot of limitations. We only have one array per group for now, so I think I will skip this.

Still don’t understand why there is no way to prevent the connections from going through the LAG in the first place, it should be possible to prefer direct connections…

My last reply to EQL Support today:

Some updates, may be you can pass them to L3 for further analysis.

The problem seemed to be due to EQL MEM version 1.0 Known Issue. (User Manual 4-1)

==================================================
Failure On One Physical Network Port Can Prevent iSCSI Session Rebalancing

In some cases, a network failure on a single physical NIC can affect kernel traffic on other NICs. This occurs if the physical NIC with the network failure is the only uplink for the VMKernel port that is used as the default route for the subnet. This affects several types of kernel network traffic, including ICMP pings which the EqualLogic MEM uses to test for connectivity on the SAN. The result is that the iSCSI session management functionality in the plugin will fail to rebuild the iSCSI sessions to respond to failures of SAN changes.
==================================================
I’ve performed the test again and it showed your prediction is correct that the path is always fixed to vmk2.

1. I’ve restarted the 2nd ESX host, and found the path C0 to C1 showed correctly.
2. Then I rebooted the master switch. From the 1st or 2nd ESX Host, I cannot ping vCenter or vmkping EQL or the iSCSI vmks on other ESX hosts, and this time I CANNOT PING OTHER VMKernel ports such as VMotion or FT either.
3. But the VMs did not crash or restart, so underneath, all iSCSI connections stay on-line, that’s good news.
4. After the master switch comes back, under storage paths on the ESX Hosts, I see those C4, C5 paths generated.

Could you confirm with VMware and EQL if they have this bug in their ESX 4.1 please? (ie, Path somehow is always fixed to vmk2)

I even did a test by switching off the iSCSI ports on the master switch one by one, and the problem ONLY HAPPENS when I switch off the ESX Host VMK2 port (which is the first iSCSI vmk port, i.e. the default route for the subnet?).

It confirmed that vmkping IS BOUND to the 1st iSCSI vmk port, which is vmk2, and this time all my vmkpings died, including those for VMotion as well as FT.

The good news is that underneath the surface everything works perfectly as expected: iSCSI connections are working fine, VMs are still working and can be VMotioned around, and FT is also working fine.

We do hope VMware and EQL can release a patch soon to fix this vmkping problem: vmkping always has to go out of the default vmk2, which is the 1st iSCSI VMKernel in the vSwitch, and never any other vmk connection. So when the master switch dies, vmkping dies with vmk2, because vmkping uses vmk2 for ICMP pings to other VMKernel IP addresses.
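The behaviour I observed can be written down as a toy model: iSCSI sessions survive on any vmk port whose uplink is still alive, while vmkping only ever leaves through the default-route port. The sketch below is just that model, not VMware or EQL code. The port names other than vmk2 and the port-to-switch layout are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Vmk:
    name: str
    switch: str  # the physical switch that this port's only uplink connects to

def surviving_ports(ports, failed_switch):
    """Ports whose uplink is still alive: iSCSI sessions bound to these stay up."""
    return [p.name for p in ports if p.switch != failed_switch]

def vmkping_works(ports, default_route, failed_switch):
    """ICMP from the kernel leaves only via the default-route port in this model."""
    return default_route in surviving_ports(ports, failed_switch)

# Hypothetical layout: four iSCSI vmk ports split across the two switches.
ports = [
    Vmk("vmk2", "master"),  # 1st iSCSI vmk, i.e. the default route for the subnet
    Vmk("vmk3", "slave"),
    Vmk("vmk4", "master"),
    Vmk("vmk5", "slave"),
]

for failed in ("master", "slave"):
    alive = surviving_ports(ports, failed)
    ping = vmkping_works(ports, "vmk2", failed)
    print(f"{failed} switch down: iSCSI paths via {alive}, vmkping works: {ping}")
```

In this model, losing the master switch leaves iSCSI paths alive but kills vmkping, while losing the slave switch affects neither, which is the asymmetry seen in the tests above.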

Some interesting findings from VMWare Unofficial Storage Performance

By admin, October 6, 2010 17:17

Some more interesting comments contributed by all the testers:

1. VMFS block size (1-8MB) seems to have little to no effect on performance.

2. Thin/Thick provisioning doesn’t have much impact on performance.

3. RDM has minimal performance increases over VMFS (except in 100% sequential tests, which just won’t ever happen in the real world). VMFS has minimal impact on performance, achieving approximately 98%+ of physical performance, so others suggest using VMFS all the way.

4. The most realistic test seems to be “RealLife-60%Rand-65%Read” – in normal life you have random and sequential connections mixed (often 60% Random vs 40% Sequential).

5. We can see a VM on iSCSI loses more throughput and response time compared to a physical server than its counterparts on FC SAN (but not much, <5%, especially for sequential Read/Write).

6. First, my suggestion for disabling Jumbo Frames would only be on the Guest O/S. Leave it enabled on the switch and the vSwitch on the host.

7. As a performance best practice, configure the VM using Version 7, Paravirtual for the Disk Controller and VMXNET3 for the NIC.
