Thursday, June 27, 2013

Virtualization

I am surprised to learn there are other good virtualization geeks out there. I have only worked with two really good virtualization guys in the last ten years. I don't know why it's so hard to convince other old-school people that more processors != more performance, most of the time. Here is another site confirming much of what I have posted previously about virtualization (requires login with an email address): http://searchservervirtualization.techtarget.com/tip/Sizing-server-hardware-for-virtual-machines. I disagree with sizing the same as you would physical; do not do that. http://blogs.softchoice.com/advisor/ssn/4-principles-for-right-sizing-physical-servers-in-your-virtual-environment-intel/ Pretty good post by a geek: http://www.datacenterpost.com/2012/03/vcpu-sizing-considerations.html

Wednesday, June 26, 2013

Virtual machine sizing

Unfortunately, lately I have been pushed to make bad decisions when it comes to VM sizing. A few of the reasons have been: "We are not utilizing all of our hardware; all that extra processing power and memory should be allocated, or it just goes to waste", "Each additional VM costs upwards of $30K to support, so we should build fewer, larger VMs", and "X company won't certify or support their products unless you match their minimum specs".
     While these are valid business concerns and should be factored in when creating a VM infrastructure, they should not be the deciding factor. An older rule of thumb when creating virtual machines was to always start with a single vCPU and 4 GB of memory. Times have changed and the rules of thumb need to be updated.
     Sizing is highly dependent on what applications you run and how you plan to utilize the machine. I will delve into Apache httpd, because that is what I have tuned most frequently.
For instance, say you have a virtual machine hosting the Apache httpd front end to a wiki server, and you want to run it for a perceived load of 1000 concurrent users. I have tested these sizes and they give the best default performance: for a virtual Apache httpd server, 1 vCPU, 4 GB of memory, and two gigabit Ethernet interfaces for production traffic. Of course, httpd needs to be set up correctly; fresh out of the box, httpd will use about 250 MB per thread.
Tuning similar to http://chrisgilligan.com/consulting/tuning-apache-and-mysql-for-best-performance-in-a-shared-virtual-hosting-environment/ and http://oxpedia.org/wiki/index.php?title=Tune_apache2_for_more_concurrent_connections gets you most of the way: Apache httpd can be tuned down to about 15 MB per thread while servicing SSL/TLS. For 225 concurrent user connections on a reasonably well-tuned system you will use about 3.5 GB of memory, and the OS will use the other 500 MB. Even if you have a requirement for a large number of concurrent users, say 1000, which is the perceived maximum per system based on my httperf testing, an Apache httpd server should never be allocated more than 2 cores and 16 GB of memory. If you have a 10,000 concurrent user requirement, you should not build an eight-core box with 160 GB of memory. Scaling horizontally and spreading the load out to 10 machines with 2 cores and 16 GB each will provide much better response times, even on the same virtual host. Depending on load, while replaying production traffic through httperf during testing you may need to move the processor count up to 4 cores, but that is a test result telling you to split the load across more virtuals. Don't take this at face value: test two or four 4-proc Apache httpd setups against four or eight 2-proc setups. The performance metrics don't lie.
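To make the memory math concrete, here is a minimal sketch of the prefork settings implied by those numbers, assuming roughly 15 MB per httpd process and about 3.5 GB budgeted for Apache on a 4 GB guest. The specific values are illustrative and need to be validated against your own httperf runs, not dropped in as-is.

# httpd.conf fragment (sketch): ~3.5 GB for httpd at ~15 MB per process => ~225 workers
<IfModule prefork.c>
    StartServers          20
    MinSpareServers       20
    MaxSpareServers       40
    ServerLimit          225
    MaxClients           225
    MaxRequestsPerChild 4000
</IfModule>
# keep idle clients from pinning workers
KeepAlive On
KeepAliveTimeout 3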

Wednesday, November 02, 2011

Linux vs Windows resurfacing.

I was recently forwarded a link.
http://www.zdnet.com/blog/diy-it/why-ive-finally-had-it-with-my-linux-server-and-im-moving-back-to-windows/245?pg=2

I need to start off with an "oh, my": after reading the article and formulating my rebuttal, I think it was written more for shock value than for any technical content. David complains about not being able to run Linux because he doesn't have any time. Hmm, as any decent Linux admin will tell you, the number one reason to run Linux is that you don't have the time to properly administer a Windows environment. Just based on the amount of time for setup, patching, and maintenance, I am going to say declaratively: Linux takes way less time.

David complains about the excess of options: too many shells and UIs. I submit that you absolutely, positively never run a GUI in a production environment. There is one caveat to that comment, and I know someone will catch it, so I will go ahead and say it: the first time you install Oracle in a cluster you need a GUI. But after it is installed you shut it off.
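Shutting it off is a one-liner on the RHEL/CentOS boxes of this era; a sketch, assuming the machine currently boots to runlevel 5 (graphical):

# make runlevel 3 (multi-user, no GUI) the default at boot
sed -i 's/^id:5:initdefault:/id:3:initdefault:/' /etc/inittab
# drop out of the GUI on the running system without a reboot
init 3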
Next I notice David references an ini file; hmm, he probably meant a .conf file. David refers to compiling everything; when is the last time you compiled something on a modern distro that wasn't a one-off, like needing Apache 2.3.5? It's a misunderstanding to think you need to compile anything these days, unless, that is, you have the time and you really want to.

Most of David's article is FUD, and he lowers my opinion of ZDNet. My favorite part is:

"I just can’t afford to waste any more time with Linux. Not when — by design — everything is held together with toothpicks, duct tape, and bailing wire.

No way. You couldn’t pay me to run Linux on my raw iron.

Never again."

Nice, very well rounded and inflammatory. So, David is permanently swearing off Linux for good. I for one say thank you. We don't need or want you...

Wednesday, October 19, 2011

Running Oracle in VMware.
One area where I am constantly beaten up is my Oracle practices. When I refer back to the Oracle config documents, I always call them recommendations. Over the last three years this has been a major point of contention between me and my DBAs. I have built multi-tenant Oracle RAC implementations leveraging InfiniBand, and I have built small three-node RAC clusters for data availability. I have recently moved my Oracle experience into the VM world. Different rule sets apply, and through performance testing I have proved or disproved several theories of my own and of my peers. For those who don't know, Oracle scales excellently vertically, and well, but not quite as well, horizontally. Previous world-record benchmarks were all set on big Sun iron; now the world records are all on multi-node RAC clusters.

A good starting point Document is https://docs.google.com/viewer?a=v&q=cache:rbFmvCq0CxoJ:www.vmware.com/files/pdf/Oracle_Databases_on_vSphere_Deployment_Tips.pdf+&hl=en&gl=us&pid=bl&srcid=ADGEEShEbhUk5I76OpN_S7SBmuu0TnW_FFBO60DjpJ4gLZKAXb2TmbHlwQVyo00dxS9RKdHSZLDJhLkv4oFe4pwI7Y9YnylgTS9K-2lX2MSfnWL2kFAoz3bAKbCl4ycbU3hmSBnDOCav&sig=AHIEtbSK4GOjlyrMTBhTHH90Kov_jStbWw&pli=1

I start to deviate on page 6. Leave iptables on; duh, every box must run a system-level firewall. Also, the doc doesn't tell you to shut off enough services. Next, pay particular attention to tip 12 on page 13: never, ever use RDMs. If in doubt, read my previous top ten rules of VMware. Tune your OS based on the install guide; many of the OS variables change depending on the final amount of memory you add to the system, so start with a reasonable baseline and move from there.
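For the OS tuning piece, a reasonable baseline is the kernel parameter set from the Oracle install guide. A sketch of the /etc/sysctl.conf additions for a 16 GB guest might look like this; the shmmax/shmall values depend on your final memory size, so treat everything here as a starting point to recalculate, not gospel:

# Oracle guest kernel parameters (sketch; values track the 11gR2 install guide)
kernel.shmmax = 8589934592          # roughly half of a 16 GB guest, in bytes
kernel.shmall = 2097152             # shmmax / page size (4096)
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
# apply with: sysctl -p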

The biggest performance gains are made by right-sizing your VMs. Use the performance suite of your choice; I like the combination of Toad, dynaTrace, esxtop, and GKrellM. When you run a performance test against your database server, if you hit more than 80% CPU for a sustained period, bump up your vCPU count. Make sure you do not cross socket boundaries: don't use 6 vCPUs if you have 4-core procs. Most of the time, barring oddities, your best Oracle performance will be had at 4 vCPUs. Next, play with your memory. Always set a reservation on memory used by Oracle virtuals. Other apps are very tolerant of memory being swapped in and out, but Oracle takes a steep performance penalty when it swaps. Depending on the dataset, the final numbers will end up around 16-24 GB of system memory. Next, set your SGA, making sure to leave enough space for your operating system; on most systems SGA max should be set to 14 GB with 16 GB of system memory. Next, tune your SGA. This is also a point of contention between me and the DBAs: I think you should set your SGA targets, watch them over time, and then lock them down. Finally, set your sessions and your processes. This is also a tuning issue; if your performance test replicates your production load this should be easy. Increase the count until performance falls off, then back off.
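Once testing has settled on the numbers, here is a rough sketch of how the memory pieces get locked in on a 16 GB guest. The 14G SGA and the process count of 600 are placeholders from the example above, not recommendations for your workload:

# run as the oracle user on the guest (sketch; adjust values to your own test results)
sqlplus / as sysdba <<'EOF'
-- fix the SGA once you have watched sga_target under load
ALTER SYSTEM SET sga_max_size = 14G SCOPE=SPFILE;
ALTER SYSTEM SET sga_target   = 14G SCOPE=SPFILE;
-- sessions is derived from processes unless you set it explicitly
ALTER SYSTEM SET processes = 600 SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
STARTUP
EOF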

The final outcome of all these tuning options is that your little 4-vCPU VMware box will run almost as well as physical hardware with similar core counts.

Wednesday, September 14, 2011

No more Tiers.

Data center networking has fundamentally changed over the last few years. Previously there was a clear separation of tiers: a couple of cores, a few distros, some server edges, some aggregation, and a bunch of access layer switches. The current datacenter network design philosophy is no more tiers. A data center network should be flat and encompass the previous functionality in one layer. Cisco called this a single-tier datacenter model; now they call it the FabricPath-based network. I call it a distributed core architecture: all switches are access layer, distro layer, and core layer switches at once.

The move to a single layer in the datacenter is hard. There are many entrenched CCNAs who are not capable of free thought unless Cisco says so. Encouraging networkers to think outside the cert is challenging.

To move to a single tier you must evaluate newer technologies. In a single tier everything counts, so you want all ports to run at line speed. The next key piece of the architecture is uplinks.

Although Cisco provides nice equipment, there is currently no way to get there from here. The current Cisco inventory has too little backplane speed and throughput. Look for Cisco to buy another company that makes decent 10/40 GbE gear.

A properly laid out single-tier datacenter network has over-subscription rates of less than 3:1 at 10 GbE, and less when the density is not as high. With a multi-tier layout, over-subscription on the access switch is at least 1.2:1 at 1 GbE and 9+:1 at 10 GbE. You can lay out a multi-tier datacenter in many different ways, but with tiers you will always have high over-subscription.
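To make the 3:1 number concrete, here is the arithmetic for one hypothetical 48-port 10 GbE edge switch with four 40 GbE uplinks; the port counts are made up purely for illustration:

# 48 server ports x 10 Gb = 480 Gb of edge capacity
# 4 uplinks x 40 Gb      = 160 Gb of uplink capacity
# over-subscription      = 480 / 160 = 3:1
echo "scale=1; (48*10)/(4*40)" | bc    # prints 3.0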

A benefit of a single tier is latency: the maximum number of switches you will traverse to get between servers is two. Low latency is king in the datacenter network. I like to illustrate this with a VoIP call: on a high-latency network, VoIP calls are somewhat choppy, occasionally you can hear yourself echo, and then you have to wait for the other end to catch up. Most network engineers will tell you latency doesn't matter when it's below a few ms, but many applications are sensitive to latency, so we buy separate networking for those applications; with a single tier all of these pools of technology can be integrated.

The drawbacks of moving to a single tier in the datacenter are the complexity of configuration, the type of equipment you have to buy, and the number of ports used up by full-speed uplinks. The biggest drawback is scalability: each datacenter will be limited to roughly 8 switches because of the number of interconnects. A single tier only scales to about 4 Tb and, depending on the switch, around 800 10 GbE server links.

Wednesday, June 22, 2011

No one owns knowledge, it should be freely shared. It is the processes and software derived from this knowledge that are leveraged to create products which are bought and sold. You can not patent a process or an idea.

You can patent how you do this process, what you have created from this process or unique methods that allow you to use this idea. Once a process exists in the wild, and is used by the community, such as a design pattern, you can no longer claim your patent unless it was delivered to the public by means of corruption.

If you write code that allows you to authenticate through a unique security process, and then someone figures out your method and decides to write code to do the same thing, you have no claim to their code and you cannot stop them from using their process.

This fits into a larger open source argument, that has been ongoing between colleagues.

Thursday, May 26, 2011

My top ten networking rules.

10. Always segment traffic. Storage traffic should be on a storage VLAN, backup traffic on a backup VLAN, and database traffic - guess where - on the database VLAN. I even promote further segregation: prod, stage, test, and dev VLANs.

9. Scan your network. If you properly segment traffic, you should never see ssh on your database VLAN. If you do, it should throw a big red flag and be investigated (see the quick nmap sketch after this list).

8. Always allow all VLANs at the core, and filter which VLANs are allowed at the aggregation / distro layer. The core should be static. The core is the backbone upon which your network is built; it should be redundant and bulletproof. Admins should almost never log in to it. The core should be transparent.

7. Create ACLs; some protocols should not transit the network (telnet, NetBIOS). Server administrators should filter out 90% of unnecessary traffic. Network admins should put in rules to catch the other 10%, and in case the server admins are lazy.


6. There shall not be more than 5 network devices between the user and the internet (or the VoIP phone and the router), not including firewalls. So, the worst case: a user PC connects to an access layer switch, to the aggregation layer switch, to the distro switch, to the core switch, to the core router.

5. There shall not be more than 4 layers of switches between the top and the bottom of any network. (Thanks for your rebuttal, Kevin; I still don't agree. This rule is a sign of good design, and I think you should re-evaluate: 6 layers is more than excessive. I know I break the Cisco mold. The data center stack should be directly connected to the core, but on the client side and the access layer you should not have more than four layers: core, distro, aggregation, access. How would you even name 6 layers?)

4. There shall never be more than 3 devices between server and server, and this includes servers in other data centers. The worst case: a server connects to its data center edge switch, to the core, to a different data center's edge switch.

3. Avoid over-subscription. In the data center the rule is 1.2:1 for 1 GbE and 8.4:1 for 10 GbE.

2. The core router should be used for network ingress and egress traffic only.

1. Always route at the access layer if possible. Layer 3 switches provide greater throughput and routing speed for trivial routing, and VoIP behaves better if you route as close to the user as possible.


Last but not least: keep it simple. Always err towards simplicity; complexity kills.
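As promised in rule 9, here is the kind of quick sweep I mean, as a sketch; the 10.20.30.0/24 subnet is a made-up stand-in for your own database VLAN:

# scan the database VLAN for anything answering on ssh; the only acceptable result is nothing
nmap -n -p 22 --open 10.20.30.0/24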