Category: Network & Server

Damn…Even eBay Got Hacked!

By admin, May 24, 2014 08:30

I received a letter from Devin Wenig, President, eBay, asking everyone to change their password.


If you used the same eBay password on any other site, I encourage you to change your password on those sites too. And if you are a PayPal user, we have no evidence that this attack affected your PayPal account or any PayPal financial information, which is encrypted and stored on a separate secure network.

This is Something New for Me

By admin, April 7, 2014 13:10

It is generally accepted as common knowledge that the high-end RISC server vendors—IBM and Oracle—have been bleeding market share in favor of high-end Intel Xeon based servers. Indeed, the RISC market accounts for about 150k units while the x86 market has almost 10 million servers. About 5% of those 10 million units are high-end x86 servers, so the Xeon E7 server volume is probably only 2-4 times the size of the whole RISC market. Still, that tiny amount of RISC servers represents about 50% of the server market revenues.
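
As a quick sanity check of the arithmetic in the quoted paragraph, the sketch below recomputes the high-end x86 volume from the rounded figures given there (150k RISC units, 10 million x86 units, about 5% high-end). The function and variable names are mine, purely for illustration.

```shell
# Rounded figures quoted in the paragraph above.
risc_units=150000
x86_units=10000000
high_end_pct=5   # share of x86 units that are high-end

# Print the estimated high-end x86 volume and its ratio to the RISC volume.
high_end_ratio() {
  LC_ALL=C awk -v r="$1" -v x="$2" -v p="$3" 'BEGIN {
    high = x * p / 100
    printf "%d high-end x86 units, %.1fx the RISC volume\n", high, high / r
  }'
}

high_end_ratio "$risc_units" "$x86_units" "$high_end_pct"
```

With these inputs the ratio comes out at about 3.3, which sits inside the 2-4x range the paragraph quotes.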

Lenovo plans to acquire IBM’s x86 Server Business

By admin, January 30, 2014 11:45

It seems Lenovo is making a move again, but to me it seems IBM is always selling its rubbish to Lenovo… haha…

On January 23, Lenovo and IBM have entered into a definitive agreement in which Lenovo plans to acquire IBM’s x86 server business. As a valued client of IBM, I wanted to reach out to you with this exciting news, and give you some of the highlights:

Under the agreement, Lenovo plans to acquire IBM’s System x, BladeCenter and Flex System blade servers and switches, x86-based Flex Systems, NeXtScale and iDataPlex servers and associated software, networking and maintenance operations.

IBM will retain its System z mainframes, Power Systems, Storage Systems, Power-based Flex servers, and PureApplication and PureData appliances. This is consistent with our strategy over the last several years to continually remix our portfolio to focus on high value solutions and services for our clients.

Lenovo and IBM plan to enter into a strategic relationship which will include a global OEM and reseller agreement for sales of IBM’s industry-leading entry and midrange Storwize disk storage systems, tape storage systems, General Parallel File System software, SmartCloud Entry offering, and elements of IBM’s system software portfolio, including Systems Director and Platform Symphony.

Various Troubleshooting Notes

By admin, January 11, 2014 19:38

Yesterday was like a long fight: different parts started to fall apart within a few hours. First it was one of the ESX hosts, then the EqualLogic, and finally Group Manager and iDRAC problems. People say shxt happens, and that fits my case exactly!

One thing I’m glad about is that Dell fulfilled its promise this time and fixed everything within the 4-hour ProSupport window (hardware-wise, of course). The poor guy had to go to the NOC twice and worked until almost midnight while I worked remotely.

So let me list the problems in order:

1. ESX Host:
I suddenly received a host failure alert. vCenter showed the problem host as disconnected, and all the VMs on it were greyed out. The funny thing is that all the VMs were still pingable and functioning perfectly normally, as if nothing were wrong.

Telnet/SSH and even the console hung completely; there was no way to log in as root, and OpenManage wouldn’t load. Later I found out from the iDRAC system log that a 15K 146GB disk had failed in a RAID1 configuration.

Worse still, the replacement disk did not start rebuilding. Dell’s technician later went into the MegaRAID BIOS utility and found he had to add the disk back manually. I suspect the problem was that the replacement disk is a Fujitsu whereas the faulty disk was a Hitachi, which is why they didn’t work together initially. (They should in theory, but in reality, no.)

At this stage, since there was no way to shut down the live VMs or do a vMotion, I had no choice but to power down the host manually. Even stranger, HA didn’t kick in: the VMs did not restart on other hosts in the cluster, even after 5 minutes.

The whole rebuild took about 15 minutes, thanks to RAID1. The rebuild status in OpenManage stayed at 33% even though the disk light had stopped blinking (meaning the rebuild had completed), funny! After another reboot, the Optimal status could be verified in the MegaRAID BIOS and was later reflected in OpenManage too, so it seems OpenManage takes time to fetch the status from the different hardware components.

So I still have no clue why a faulty disk in a RAID1 set caused the ESX host to become unresponsive.

2. EqualLogic:

I received the following notice multiple times via email. Group Manager showed it as Information type, in green, but I sensed something must be wrong, so I called Dell EQL support. As expected, local support knew nothing about it.

—————————————–
INFO event from storage array eql01
subsystem: SP
event: 14.2.22
time: Fri Jan 10 12:10:30 2014

I/Os containing bad blocks were read from drive 10 and successfully reconstructed in the last 8 minutes.
—————————————–

Approximately 8 hours later, the following error alert confirmed my earlier worry.

—————————————–
ERROR event from storage array eql01
subsystem: SP
event: 14.4.22
time: Fri Jan 10 20:11:15 2014

Disk drive 10 failed in RAID LUN 0.
—————————————–

So the previous notice was actually EQL’s predictive failure detection in action!!!

SANHQ also generated a similar alert.

Warning conditions:

  • 1/10/2014 8:10:50 PM to 1/10/2014 8:12:50 PM
    • Warning: Member eql01 RAID Set Is Degraded
      • Warning: Member eql01 RAID set is degraded because a disk drive failed or was removed.
    • Warning: Member eql01 RAID More Spares Expected
      • Warning: Member eql01 The current RAID configuration requires more spare drives then are currently available.
    • Warning: Member eql01 has a failed drive in slot 10

With the replacement disk in place, reconstruction started immediately and took about 1 hour to complete, again thanks to RAID1.
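
Since the early INFO notice turned out to be the real warning, it may be worth flagging it automatically rather than relying on the green status in Group Manager. Below is a minimal sketch, assuming the alert emails keep exactly the layout quoted in this section. The function name bad_block_drives and the file name in the usage comment are hypothetical.

```shell
# Read EQL alert text on stdin and print "<array> <drive>" for every
# "bad blocks" notice. Assumes the e-mail layout quoted above; other
# firmware versions may word the message differently.
bad_block_drives() {
  awk '
    /event from storage array/ { array = $NF }
    /bad blocks were read from drive/ {
      for (i = 1; i < NF; i++)
        if ($i == "drive") print array, $(i + 1)
    }
  '
}

# Example usage (eql-alerts.txt is a hypothetical file of saved alert mails):
#   bad_block_drives < eql-alerts.txt | sort | uniq -c
```

Counting repeats per drive, as in the usage comment, gives a rough idea of which disk is worth replacing before it fails outright.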

3. EQL Group Manager

I needed to verify that the replaced EQL disk had successfully become a hot spare, but then I found I could no longer log in to EQL Group Manager because of a strange Java error, in both IE and Firefox. The Java version was v7 u45; I tried different versions until I figured out that only v7 u17 worked. My conclusion is that the EQL firmware plays a big role here: I am still on v5.2.2, so EQL probably hard-coded the requirement into their application. Anyway, the Java JRE version always causes nasty problems in my environment one way or another, so I’ve decided not to upgrade it.

4. iDrac

Back to the disconnected host with the faulty disk: I found I could no longer log in to the iDRAC web UI. IE worked but produced all sorts of problems, not to mention that the console didn’t show up at all with its ActiveX stuff. I even tried removing the iDRAC certificate from the advanced options and rebooting the managed machine, which didn’t help at all. It turned out a simple content cache clear in Firefox solved the problem completely! Ridiculous, really!

If it still doesn’t work, do a soft reset with “racadm racreset soft”.

5. Veeam

Yes, it’s not over yet. I also found that Veeam’s scheduled jobs had stopped working, as I am still using v5.0.1. There is a Veeam KB and an update (v5.0.2) for this issue, but I can’t explain why it worked for 3+ years and then suddenly stopped for no reason. So I removed all the old backups and created a new full backup; tomorrow morning will tell, and I shall verify the scheduled job again then.

Update: I had to install the update to fix the scheduled jobs not running. Also remember to close all the extra TCP/UDP ports that were re-enabled by the Veeam B&R upgrade. (Potential risks: Veeam Agent, NFS and Windows shares in particular.)

Updated:

Restarting the management agents on ESX may help:

  1. Log in to your ESX Server as root (by su -) from either an SSH session or directly from the console of the server.
  2. Type “service mgmt-vmware restart”.
    Caution: Ensure Automatic Startup/Shutdown of virtual machines is disabled before running this command or you risk rebooting the virtual machines.
  3. Press Enter.
  4. Type “service vmware-vpxa restart”.
  5. Press Enter.
  6. Type “logout” and press Enter to disconnect from the ESX host.
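
The two restart commands from the steps above can be wrapped in a small function. This is only a sketch: the function name and the optional "runner" argument are my own additions, so that passing echo previews the commands instead of running them.

```shell
# Restart the ESX management agents (commands taken from the steps above).
# Caution, as noted above: make sure Automatic Startup/Shutdown of virtual
# machines is disabled first, or the VMs may be rebooted.
# Optional first argument: a runner such as "echo" for a dry run.
restart_mgmt_agents() {
  runner=${1:-}
  $runner service mgmt-vmware restart &&
    $runner service vmware-vpxa restart
}

# Example usage on the ESX service console, as root:
#   restart_mgmt_agents echo   # preview only
#   restart_mgmt_agents        # actually restart the agents
```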

On the Naming of EqualLogic Firmware Over the Past Two Years

By admin, December 24, 2013 12:24

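To be honest, I am still using v5.2.2. At most, v6 could be called 5.4, and v7 is really just 5.7.

EQL’s naming scheme over the past year or two is simply self-deception. It may work for scaring newcomers, but behind it there is really just a paper tiger. That is why the v6 and v7 updates of the past year have not drawn anything like the response in the EQL community that 5.1 did, and the so-called new features are of rather marginal value anyway.

By the same logic, it is no surprise that the move from ESX 4.1 to ESX 5.5 (which should really be called ESX 4.8) has not drawn as big a response as before.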

If it ain’t broke don’t fix it

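Also, as a senior Dell L2 engineer in China advised me years ago: if nothing is wrong, don’t drop a rock on your own foot. Don’t upgrade the firmware and drivers of your hardware at every opportunity; stability must always come first.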

e.g., http://communities.vmware.com/message/2300998

Last but not least, keep your network & storage design simple but elegant!

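Finally, I have always had the following question, and over the years I have never found a good answer to it:

The chance of an individual member failing should go up as you add more EQL members to a group, so isn’t that dangerous? Think about it: a group of 16 members must have a higher chance of some hardware fault than a group of 4 members.

If it were up to me, I would split the 16-member group into 4 groups of 4 members each, mainly because I would not feel at ease otherwise.

So what is the best balance when dividing members into groups?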
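
One rough way to frame this question is to treat member faults as independent. If each member has probability p of a hardware fault in a given period, the chance that at least one of n members faults is 1 - (1 - p)^n. The sketch below uses p = 2% purely as an illustrative assumption, not a measured EqualLogic figure, and the function name is hypothetical.

```shell
# Chance (in percent) that at least one of n members has a fault,
# assuming independent faults with per-member probability p.
# p = 0.02 below is an illustrative assumption only.
any_member_fault_pct() {
  LC_ALL=C awk -v n="$1" -v p="$2" 'BEGIN { printf "%.1f\n", (1 - (1 - p) ^ n) * 100 }'
}

for members in 4 16; do
  printf '%2d members: %s%%\n' "$members" "$(any_member_fault_pct "$members" 0.02)"
done
```

Under this simple model, splitting 16 members into four groups of four does not lower the chance that some member somewhere faults. What changes is how much is exposed to each fault, since a fault then touches only one group, at the cost of having more groups to manage.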

Time to Clean Up Windows 7 Rubbish

By admin, November 14, 2013 13:22


6 months ago, I found my Windows 7 C:\ drive (50GB) was almost full. Using WinDirStat (another neat tool), I quickly found out that C:\Windows\WinSxS and my iPhone backup took up 20% and 15% of the space respectively.

I’ve been searching for a simple way to reclaim this wasted space, especially in C:\Windows\WinSxS, without success. It came to the point that I had almost used up my entire C:\ today, and my only option was to use Acronis Disk Director to expand C:\, but that involves some risk.

Luckily, I found out Microsoft released a nice updated Disk Cleanup tool just about two weeks ago that solves the whole problem at once.

So problem solved for now!

As for changing the default iTunes backup folder location, here is a good link.

Gigantic External Storage for Desktop

By admin, November 12, 2013 10:20


I couldn’t believe my 2TB LaCie Minimus drives (two of them; the 2nd backs up the 1st) filled up in less than 2 years, probably due to all those 720p MKVs. :)

So I decided to look around for something bigger. This time I was looking for at least 3TB to 4TB, and it turns out 4TB is the one to go for in terms of $ per TB; at least this is the case for the LaCie Porsche Design P′9230, as the 3TB and 4TB models cost the same per TB.

Most importantly, I wanted another LaCie because I can reuse the existing power supplies (i.e., the 3A plug) instead of adding two extra power supplies, which I don’t have room for on the power bar.

Physical dimensions are also important: the LaCie Minimus is currently the smallest external disk available, not to mention how cool it looks! The only drawback is that the LaCie Minimus comes with only a 1-year warranty.

In fact, I looked at other brands as well, such as Seagate and Hitachi (Touro Pro, 7,200 RPM), but they are simply much larger than the LaCie and they all look dull in terms of design and style.

Yes, I am well aware of the extra 20% premium I have to pay if I stick with LaCie, but considering all the facts and limitations, plus the fact that the LaCie Porsche Design P′9230 comes with a 2-year warranty, the choice is very clear!

Somehow all LaCie disks are in fact Seagate inside: my Minimus 2TB is an ST2000DL001 and the Porsche Design P′9230 is an ST4000DM000; both are 5,400 RPM.

People say the best is yet to come, and yeah, the shop owner agreed to give me an extra 2.5% off as I was buying two at the same time. So here we go: two gigantic 4TB disks sitting in front of my desktop, doing the data transfer now.

The only complaint I have is: why couldn’t LaCie make the Porsche Design P′9230 the same size as the Minimus? It’s about 20% larger, and the bottom part is perforated plastic instead of metal (that’s why it runs hotter). I thought a Seagate hard disk would be the same size whether it’s 2TB or 4TB; or does the bigger brother require more space to dissipate heat?

Of course, 400GB was lost after formatting, leaving only 3.66TB usable. This somehow reminds me of the same 3.66TB usable size on my EqualLogic PS6000XV (600GB x 16). So think of having all the storage of a PS6000XV in a little LaCie box; the idea is quite neat and funny.
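
Most of the “missing” space is a units effect rather than real waste: drive makers count 1TB as 10^12 bytes, while operating systems such as Windows report sizes in units of 2^40 bytes. Here is a small sketch of the conversion. The function name is mine, and file system overhead is ignored.

```shell
# Convert a marketed capacity in TB (10^12 bytes) to the figure an OS
# shows when it counts in units of 2^40 bytes. Ignores file system overhead.
tb_to_tib() {
  LC_ALL=C awk -v tb="$1" 'BEGIN { printf "%.2f\n", tb * 1e12 / 2 ^ 40 }'
}

tb_to_tib 4
```

For the 4TB disk this gives roughly 3.64, in the same neighbourhood as the usable figure mentioned above.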

Finally, there will be 6TB and 8TB to be released in 2014, well, reliability is always another issue of course. :)

Big Brother is Watching You (Reposted Article)

By admin, November 6, 2013 12:24

Tapping Data Center Links: Privacy of Over a Hundred Million People Wiped Out
US and UK Team Up to Intercept Google and Yahoo User Data


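The tactics used by the US National Security Agency (NSA) to eavesdrop on and intercept online activity look more brazen with every new revelation. Ordering technology companies to hand over data was not enough to satisfy the NSA’s “Big Brother” desire to pry into everything: it went so far as to work with the UK’s Government Communications Headquarters (GCHQ) to secretly tap directly into the links between the data centers of two internet giants, Google and Yahoo, and freely intercept the data of hundreds of millions of users. This conduct leaves users with no privacy at all, and Google and Yahoo have expressed outrage.

Earlier this year, American “renegade spy” Edward Snowden revealed the NSA’s surveillance methods to the media. Among them, the PRISM program compels nine major technology companies to hand over online communication data, but requires an order from the Foreign Intelligence Surveillance Court. Not content with this front-door authority to demand data, the NSA also set up the MUSCULAR program to take data through the back door.

The Washington Post, citing Snowden documents, reported the day before yesterday (Wednesday) that MUSCULAR is run jointly by the NSA and GCHQ. It targets the fiber-optic cable links between Google’s data centers and between Yahoo’s data centers, copying the transmitted data in full at undisclosed interception points.

No encryption for data sent between data centers

According to the report, an internal NSA document dated January 9 this year says the data intercepted from the two companies each day is sent to headquarters for analysis. In the preceding 30 days, a total of 181 million records were sent to the Maryland headquarters, ranging from “metadata” showing who sent and received emails to text, audio and video content. The briefing document says these records provide important leads on the movements of hostile nations.

The data centers of Google and Yahoo, spread around the world, store the communication records and data of countless users. To speed up processing and keep backups in case of failure, data centers in different locations send user data to one another to keep it synchronized, sometimes transferring even entire email databases. If this traffic can be intercepted in transit, both real-time communications and past records are laid bare.

The Post said the documents do not explain how the NSA intercepts traffic on the data center links. One possibility is that the NSA is a cut above and can break through the tight security of the companies’ private networks to take data directly from the links. However, people familiar with the matter said both companies trusted the security of their internal networks, so data sent between data centers was not encrypted.

Where their private submarine fiber-optic cables connect to land networks, however, they must pass through cable landing stations run by third parties. The companies also sometimes lease network facilities from other organizations or share the same data center, so the NSA and GCHQ may have coerced or induced a third party to install interception devices on the links. One document shows that the interception takes place outside the United States, with the assistance of an unnamed telecommunications provider.

The Post said PRISM is bound by law and cannot freely collect data on US citizens, whereas MUSCULAR operates overseas, where it can be assumed that anyone using overseas links is a foreigner. Even if data on US citizens is collected, it can be treated as incidental, and there is no need to notify the technology companies. Schindler, a former NSA chief analyst, said: “The job of the NSA’s army of lawyers is to exploit loopholes and collect as much data as the law allows.” The Electronic Privacy Information Center said MUSCULAR is very likely “illegal surveillance”.

Google says it was unaware, expresses outrage

The exposure of MUSCULAR leaves both companies struggling to answer to their users. Google’s chief legal officer Drummond said he was unaware of the program and was “outraged at the government intercepting data from our private fiber networks”, adding that Google would step up encryption of data transfers between data centers. A Yahoo spokesperson said its data centers are all tightly protected and that it has not given the NSA or other government agencies access.

An NSA spokesperson denied that the report was accurate. Director Keith Alexander said the NSA had not broken into Google’s and Yahoo’s servers, but did not say whether it had intercepted data in transit.

The Washington Post / Reuters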

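The US eavesdropping scandal keeps growing. Italian media said the NSA did not spare even the Vatican, eavesdropping on matters relating to the election of the new pope before and after the conclave of cardinals. An Australian newspaper reported that the US uses its own and Australian embassies in the Asia-Pacific region for covert electronic spying operations. China and several Southeast Asian governments are furious and have demanded an explanation from the US.

The Italian weekly Panorama reported the day before yesterday (Wednesday) that between December 10 last year and January 8 this year the NSA eavesdropped on as many as 46 million phone calls in Italy, including Vatican communications. The intelligence was reportedly sorted into four categories: leadership intentions, threats to the financial system, foreign policy objectives, and human rights. The report also raised the concern that even the closed-door conclave that elected the new pope in March may have been bugged. The NSA promptly denied eavesdropping on the Vatican and criticized the report as inaccurate. Vatican spokesman Father Lombardi said: “We know nothing about this matter and have not been worried about it.”

No sooner had one wave subsided than another rose. Australia’s Fairfax Media reported yesterday that the US uses US and Australian embassies in Jakarta, Bangkok, Hanoi, Beijing, Kuala Lumpur and elsewhere as listening posts to intercept phone communications and network data across Asia. Chinese Foreign Ministry spokeswoman Hua Chunying said China is very concerned about the matter and demanded that the US clarify and explain. Indonesia lodged a strong protest, and Malaysia and Thailand are both very concerned.

Senior White House officials and NSA Director Alexander met separately with German intelligence officials and members of the European Parliament to explain, hoping to dispel their doubts and rebuild mutual trust. The US also assured the United Nations that it does not, and will not, monitor communications at UN headquarters.

AFP / AP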

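Snowden’s exposure of the NSA’s global surveillance operation has been like a stone that raises a thousand ripples, drawing an enormous response. Besides the authorities having to explain themselves to governments around the world, members of Congress also believe the intelligence agencies have gone too far and are calling for legislation to rein in their eavesdropping powers. People in the intelligence community, however, see little hope of change in the short term.

John Sano, a retired deputy director of the CIA’s clandestine service, said frankly that the politicians’ criticism is justified, but that “pointing out the need to change the rules is an entirely different matter from actually creating a mechanism to change them effectively and allow Congress to oversee them”. In particular, the current eavesdropping programs are counterterrorism tools put in place after 9/11 that have proved quite effective and are very important to US interests, so the intelligence agencies are unwilling to give them up, and real change may well never happen.

The only consensus between the intelligence community and politicians is their unanimous condemnation of Snowden as a traitor and criminal who sold out the United States. The Justice Department announced the day before yesterday that it has filed suit against USIS, the company that performs employee background checks for the US government, including Snowden’s, accusing it of inadequate vetting.

ABC / Reuters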

Ubuntu Server Configuration Experience

By admin, October 21, 2013 21:40

Today I got a chance to play with the Ubuntu Linux distro.

1. The latest compatible release for ESX 4.1 is 10.04 64-bit. The download page lists ubuntu-10.04.4-server-amd64 and there is no separate version for Intel 64-bit; it turns out the amd64 ISO also works on Intel platforms.

I suspect you can of course install the latest Ubuntu 13, but you may not be able to install VMware Tools, which is very important.

2. The bare server iso does not come with a GUI, so the following steps will help you to install a nice GUI.

$ sudo apt-get update
$ sudo apt-get install ubuntu-desktop --no-install-recommends

3. I noticed security is better and more finely tuned in Ubuntu than in Red Hat or CentOS: root is completely disabled by default, so you need to prefix commands with ‘sudo’ every time you change something in the system configuration.

If you want to enable root ssh, then edit /etc/ssh/sshd_config and change the line PermitRootLogin to yes.

4. Enabling SNMP is similar to CentOS: add the following line to /etc/snmp/snmpd.conf

rocommunity community_string mrtg.yourhost.com

but you will also need to modify the SNMPDOPTS line in /etc/default/snmpd:

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid -c /etc/snmp/snmpd.conf'

5. To configure the Ubuntu firewall graphically, you need to install gufw; the rest is a piece of cake, same as in CentOS. In fact, you can also use ufw on the command line to block a DDoS source IP address.

e.g.,
sudo ufw deny proto tcp from 12.34.56.78 to any port 22

6. After deploying a VM from an Ubuntu template, I found eth0 had gone missing (which reminds me of the W2K8 VMXNET 3 issue). Eventually I found this VMware KB. Or, even easier, simply deleting /etc/udev/rules.d/70-persistent-net.rules will do the trick.

7. ‘gksudo gedit /etc/hostname’ is an example of editing a file graphically (no more vi), which is very useful for many new Ubuntu or Linux users.

8. Finally, regarding extending a disk in Ubuntu, the method is similar, but with a twist.

9. There is a very good link for installing VMware Tools. One specific thing: you need to create a special directory with ‘sudo mkdir /usr/lib64’ in order to install VMware Tools successfully. Just make sure you download the latest VMware Tools (latest is 10.04), as the older one that comes with ESX 4.1 (8.0.x) doesn’t work on the latest Ubuntu 16.x. I also noticed the VMware Tools status shows as “Unmanaged”! That’s actually OK: the tools were installed from an individual package instead of the default attached CD-ROM (whose version doesn’t work anyway), so you can safely ignore it.
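
Items 3 and 5 above lend themselves to small helper functions. What follows is only a sketch, and both function names and the blocklist.txt file are hypothetical. permit_root_login rewrites an existing PermitRootLogin line in whatever file it is given, keeping a backup with a .bak suffix. ufw_deny_rules only prints ufw commands for a list of addresses, so they can be reviewed before anything is applied.

```shell
# Item 3: set "PermitRootLogin yes" in an sshd_config-style file.
# Rewrites an existing (possibly commented-out) PermitRootLogin line;
# if the file has no such line, nothing is changed.
permit_root_login() {
  sed -i.bak 's/^#\{0,1\}[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' "$1"
}

# Item 5: read one IP address per line on stdin and print a ufw deny
# rule for each, in the same form as the example in item 5.
# Optional first argument: the TCP port (defaults to 22).
ufw_deny_rules() {
  port=${1:-22}
  while read -r ip; do
    case "$ip" in
      ''|'#'*) continue ;;   # skip blank lines and comments
    esac
    printf 'ufw deny proto tcp from %s to any port %s\n' "$ip" "$port"
  done
}

# Example usage (as root):
#   permit_root_login /etc/ssh/sshd_config && service ssh restart
#   ufw_deny_rules 22 < blocklist.txt        # review the output first
#   ufw_deny_rules 22 < blocklist.txt | sh   # then apply
```

Printing the rules instead of running them directly is a deliberate choice here: a typo in a block list is much cheaper to catch before it reaches the firewall.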

Update: Oct 23, 2013

It turns out even the latest release 12.04 worked perfectly on ESX4.1.

Update: Oct 28, 2015

Tested: the latest 14.04 also works perfectly on ESX4.1.

Update: Nov 14, 2016

Tested: the latest 16.04 also works perfectly on ESX4.1.


Biggest Disappointment About vSphere 5.5 New Feature AppHA (Application High Availability) by Veeam

By admin, September 23, 2013 10:39

Just read the following in the latest Veeam Community Forums Digest and it’s quite interesting.

In fact, I use a much simpler method in my Windows environment: I simply set the particular services to restart themselves should there be any failure. It has worked perfectly so far, no hassle at all. :)

You may remember after sorting through all of the vSphere 5.5 features a few weeks ago; I was most excited for the vSphere AppHA (Application High Availability). Well, I have to admit it turned into my biggest disappointment based on some hands-on experience.

The theory behind this feature sounded excellent: in addition to vSphere HA (high availability) that VMware provided for a few years now (VM monitoring, with automatic VM restart after VM or host failure), the same will now be possible at the application level (application monitoring, with automatic restart of services and/or VM in case of application failure). And because this will be built right into the platform, it’s going to be transparent and easy to use… or so I thought, based on years watching VMware dishing out incredible functionality that was always integrated, intuitive and “just worked”.

I assumed VMware will simply “enlighten” VMware Tools with the ability to detect known applications and monitor key metrics, and also make this framework extensible for custom applications (similar to pre-freeze / post-thaw scripts for application-specific snapshot logic). In case of application failure detected, VMware Tools would throw events into vCenter and first attempt “local” recovery by restarting services, and if that does not help, message vCenter to restart the VM. This architecture would make AppHA work out of box for every VM (including newly added), with zero hassle for admins: huge value that EVERY user would immediately benefit from.

Well, it appears that I assumed too much. In reality, the feature comes with incredible complexity, and is based on legacy architecture I would not expect leading virtualization vendor to release in 2013. First, this feature is not something built into the platform, but rather completely “glued” on top of it. Before you can even start using this feature, you will need to deploy two separate appliances… yes, one was not enough! The first appliance is Hyperic appliance (recent VMware acquisition), which is Microsoft SCOM like tool with ugly web interface (carrying maybe 10% of SCOM functionality), and sporting identical architecture (thus bringing 100% of SCOM complexity along). Second appliance is actual VMware AppHA appliance, which seems to orchestrate “stuff” between Hyperic server and vCenter Server.

And the “best” part? AppHA requires that you deploy special monitoring agents in every VM, so welcome back to the agent management fun we’ve made great strides to avoid (having to remember to install, upgrade, and babysit yet another agent in your VMs). And even worse, you will also need to ensure that every VM is accessible to Hyperic server over the network! Direct network connectivity to a VM from core infrastructure servers? What’s up with that, I thought cloud was all about complete isolation? In other words, just think about all the things you like about agent-free Veeam solutions, remember how you struggled with agent-based solutions before, and apply all that to vSphere AppHA. I totally expected they would simply reuse VMware Tools, because it is the necessary evil we have to live with… but unfortunately, this is not the case.

This is probably the first time ever that VMware delivers the feature that sounds good on paper, but has horrible implementation in reality. It feels very much like a “buy and glue on top” approach, rather than “innovate and build” acquisition. Are we seeing the change of VMware approach to R&D? I honestly hope this was more of an exception, rather than a rule, but this is still worrying and very annoying for me, hardened VMware fan. I will definitely be looking for VMware folks behind AppHA at VMworld Europe next month to discuss this, and understand what’s going on with this feature.
