Tag Archives: design

Example Architectural Decision – HA Admission Control Policy with Software licensing constraints via @josh_odgers



High Availability Admission Control Setting & Policy with a Software Licensing Constraint

Problem Statement

The customer has a requirement to virtualize “Application X” which is currently running on physical servers. The customer is licensed for a maximum of 32 cores and the software vendor has strict licensing restrictions which do not recognize the use of DRS rules to restrict virtual machines to a sub-set of hosts within a cluster.

The application is Tier 1, and requires maximum availability. A capacity planner assessment has been conducted and found that 32 cores and 256GB RAM are sufficient to run all servers.

The servers' requirements vary greatly, from 1 vCPU/2GB RAM to 8 vCPU/64GB RAM, with the bulk of the VMs at 2 vCPUs or less and varying RAM sizes.

What is the most suitable hardware configuration and HA admission control policy/setting that complies with the licensing restrictions while ensuring N+1 redundancy and minimizing the change…

View original post 512 more words


Example Architectural Decision – Network I/O Control Shares/Limits for ESXi Hosts Using IP Storage via Josh Odgers (@josh_odgers)

Another in a great series by Josh Odgers


Problem Statement

With 10Gb connections becoming the norm, ESXi hosts will generally have fewer physical connections than in the past, when 1Gb NICs were standard, but more bandwidth per connection (and in total).

In this case, the hosts have only 2 x 10Gb NICs and the design needs to cater for all traffic (including IP storage) for the ESXi hosts.

The design needs to ensure each traffic type has sufficient burst and sustained bandwidth without significantly impacting the other traffic types.

How can this be achieved?


1. No additional network cards (1Gb or 10Gb) can be supported
2. vSphere 5.1
3. Multi-NIC vMotion is desired


1. Two (2) x 10GB NICs


1. Ensure IP Storage (NFS) performance is optimal
2. Ensure vMotion activities (including a host entering maintenance mode) can be performed in a timely…

View original post 615 more words

My first written “design” proposal – can you help me rip this design apart?

For the past 12 years I have straddled design, administration and management for various systems, including VMware, without ever having the opportunity to really master any one specific aspect.  As I have decided to focus on VMware design, I am hoping all the VMware gurus can (nicely) rip this apart.  I think my ego is ready for a butt-kicking – it's the only way I will get better.  While this first written design happens to be a very basic setup – a small business wanting to have some CAPEX spend before the end of 2012 – I'd still love your feedback.  This is also written as part of a proposal; more detailed documentation would be created as part of the actual implementation.  Dell is the vendor of choice for most (if not all) of the hardware, so if you don't like Dell that is certainly fine, but I would appreciate your feedback on the high-level aspects of what I have written and how I can improve.  Also, storage is intentionally light right now as I am still going back and forth with a few vendors.  Here it goes…

Company X wishes to purchase and implement the necessary servers, storage and networking devices to support their business and future needs, including the ability to host Microsoft Dynamics GreatPlains.  The network will be built on two physical servers with shared storage to support the necessary Windows and VMware operating systems.  An initial system design was provided; however, it has been updated based on the 12/17/2012 meeting.  This design factors in the ability to add capacity to support the growth of the company.

Phase 1: Infrastructure Design

The following items need to be considered as part of the design to ensure performance and availability:

  • Storage
  • Networking
  • VMware Host (ESXi)
  • VMware vCenter
  • Virtual Machines (VMs)
  • Backup and Disaster Recovery
  • Expansion

Requirements:  Provide basic network services (Active Directory, DNS, DHCP) and security (Group Policy) and the capacity to run Microsoft Dynamics GreatPlains during 2013 (possibly during Q2) and support future growth.

Assumptions:  Power, cooling, space and internet access will be provided and available.

Constraints:  While the driving factor is to purchase the best available platform by the end of 2012, budget is a constraint in building a true n+1 configuration.

The current server room may be too small for a server rack, and there is currently no dedicated AC to maintain a temperature below 70°F.

Risks:  Several single points of failure exist, including internet access, the core network switch and storage.  A failure of any of these services may result in extended downtime.  There is no generator available; power outages (unplanned or building maintenance) will result in downtime.

Current State Analysis:  There is currently a Sonicwall TZ170 Wireless firewall connected to RCN and two unmanaged switches for computers and phones.

Stakeholders:  Owner of Company X and Company Y (a provider of managed services for ongoing support of the environment).

Storage:  A single 12-drive enclosure will be used, connected via iSCSI through a single dedicated switch for iSCSI traffic.

Networking:  A total of 4 switches will be used.  Rather than use a single 48 port POE switch for both phones and computers, 2 separate 24 port switches will be used.  The reason for this change is to provide redundancy in the event of a switch failure.  With a single switch, all connectivity (servers, computers and phones) would be lost; with 2 switches, a failure of either the phone or computer switch would not impact the other services.  The server switch is still a single point of failure in this design.

Dell 6224 24 port switches will be used, one for servers/core networking and one for desktops.  A 6224P 24 port Power over Ethernet (PoE) switch will be used for phones.  Finally, a Dell 5524 24 port switch will be used for storage connectivity to the host servers.

VMware vCenter:  The VMware vCenter appliance will be deployed.  This will eliminate Microsoft Windows licensing requirements and still allow the cluster to grow to up to 5 hosts or 50 virtual machines.  Proposed vSphere licensing will limit the cluster to only 3 hosts.

Cluster:  There will be 2 hosts configured in the cluster.  DRS will not be used, as it is not available with this edition of vSphere.  HA will be configured; this will allow virtual machines to be powered on by the remaining host should there be a hardware failure.  HA performs a restart, so there will be a period where services are unavailable while the VMs are restarted.  Fault Tolerance (FT) is also not available with the proposed edition of vSphere; even if it were, FT supports only single-vCPU VMs, so it would not be an option for vCenter, the Microsoft Dynamics GreatPlains application server or the database server, as they require multiple vCPUs.

Virtual Machines (VMs):  The following VMs will be needed at a minimum:

–  2x Windows Servers for Active Directory, DNS, and DHCP

  • Approximate specifications:  Single vCPU, 2GB vRAM, 30GB thin formatted VMDK for OS, single vNIC connected to the VM Network.  Suggest using memory reservation to reduce disk consumption.
  • An additional 166MB of memory is required for Virtual Machine overhead memory.  Should 2 vCPUs and 4GB of vRAM be required, 243MB of overhead memory would be required.

–   1x VMware vCenter Appliance for vCenter

  • Approximate specifications:  Dual vCPU, 8GB vRAM, 60GB thin formatted VMDK, single vNIC connected to the VM Network.  Suggest using memory reservation to reduce disk consumption.
  • An additional 332MB of memory is required for Virtual Machine overhead memory.  Should 16GB of vRAM be required, 509MB of overhead memory would be required.

Because the vCenter appliance will be used, Update Manager will not be available to update the hosts.  Manual patching will be required; however, this is manageable because there are only 2 hosts (maximum of 3 based on the proposed vSphere license).

– 1x Windows server for Microsoft Dynamics GreatPlains application remote access

  • Specifications will be dependent on the number of concurrent users running the Dynamics GreatPlains client.
  • Estimated specifications:  Dual vCPU, 8GB vRAM, 30GB thin formatted VMDK for OS, 60GB thin formatted VMDK for application installation (exact requirements to be determined with Dynamics consultant)
  • An additional 332MB of memory is required for Virtual Machine overhead memory.  Should 16GB of vRAM be required, 509MB of overhead memory would be required.

– 1x Windows server for Microsoft Dynamics GreatPlains database

  • Specifications will be dependent on the number of concurrent users running the Dynamics GreatPlains client and size of the database
  • Estimated specifications:  Dual vCPU, 8GB vRAM, 30GB thin formatted VMDK for OS, 60GB thin formatted VMDK for database installation (exact requirements to be determined with Dynamics consultant).  Additional space may be required based upon the size of the database to store backups.
  • An additional 332MB of memory is required for Virtual Machine overhead memory.  Should 16GB of vRAM be required, 509MB of overhead memory would be required.
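Collecting the per-VM overhead figures quoted above in one place shows the pattern: overhead grows with both vCPU count and configured vRAM. A small sketch, using exactly the numbers stated in this proposal:

```python
# Per-VM memory overhead figures as quoted in this proposal (ESXi 5.x era).
overhead_mb = {
    # (vCPUs, vRAM_GB): overhead_MB
    (1, 2): 166,
    (2, 4): 243,
    (2, 8): 332,
    (2, 16): 509,
}

# Sorting by (vCPUs, vRAM) should yield monotonically increasing overhead.
configs = sorted(overhead_mb)
values = [overhead_mb[c] for c in configs]
assert values == sorted(values), "overhead should grow with VM size"
print(values)  # [166, 243, 332, 509]
```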

VMware Host (ESXi): Two physical servers will be built on the Dell PowerEdge R720.  The R720xd was not selected: its increased cost buys additional drive bays, and that storage role is instead being moved to an iSCSI storage appliance.

The Dell R720 with the Intel Xeon E5-2600 Series CPU is supported on ESXi 5.1 as of 12/19/2012 on the VMware Hardware Compatibility List (HCL).

The Intel Xeon E5-2620 is recommended; the incremental cost per server is $900 (retail) over the base model CPU and provides two additional processing cores as well as Hyper-Threading.  This processor provides a cost savings of $3533 (retail) per server over the originally proposed E5-2670.  While there are performance differences, I do not feel as though they warrant a $3533 price increase per server.
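For clarity, the arithmetic behind that recommendation across the two proposed servers (retail figures from this proposal):

```python
# CPU cost comparison across the cluster, using the retail deltas
# quoted in this proposal (per-server figures, two servers).
SERVERS = 2
e5_2620_premium_over_base = 900    # $ per server vs. the base model CPU
e5_2670_premium_over_2620 = 3533   # $ per server vs. the E5-2620

print("Cost to upgrade to E5-2620:", SERVERS * e5_2620_premium_over_base)
print("Saved vs. E5-2670:        ", SERVERS * e5_2670_premium_over_2620)
```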

28GB of memory is required for the 5 necessary virtual machines.  An additional 2GB of memory is budgeted for VM overhead memory, for a total of 30GB required.  Because this is only a 2 host cluster, in order to support running all 5 VMs on a single host I would suggest 32GB of RAM per host.  RDIMMs are suggested because there will be multiple DIMMs per memory channel.  The cost of 16GB of RAM per host would be $320; the cost for 32GB of RAM per host would be $640.  4x 8GB DIMMs will be used, with 2x 8GB DIMMs per memory channel (CPU).
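The sizing above can be sanity-checked with a quick sketch. Figures are taken from this proposal; note the 2GB overhead budget is a conservative round-up, since the per-VM overhead figures quoted earlier sum to roughly 1.3GB:

```python
# Memory sizing check: the 5 VMs' configured vRAM plus per-VM overhead
# must fit on a single 32GB host so the cluster survives a host failure.
HOST_RAM_GB = 32

vms = [
    # (name, vRAM_GB, overhead_MB) -- per the VM specifications above
    ("AD/DNS/DHCP #1", 2, 166),
    ("AD/DNS/DHCP #2", 2, 166),
    ("vCenter appliance", 8, 332),
    ("GreatPlains app", 8, 332),
    ("GreatPlains DB", 8, 332),
]

vram_gb = sum(v for _, v, _ in vms)             # 28 GB configured vRAM
overhead_gb = sum(o for _, _, o in vms) / 1024  # ~1.3 GB overhead
total_gb = vram_gb + overhead_gb

print(f"Configured vRAM: {vram_gb} GB, overhead: {overhead_gb:.1f} GB")
print("Fits on one 32GB host:", total_gb <= HOST_RAM_GB)
```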

ESXi will be installed and configured locally on each host.  2x 500GB 7200RPM SATA drives will be used.  2x 1TB SATA drives will be added to each host to perform backups of the VMs.  An additional 4 SATA drives can be added per host at a later date to support new storage requirements.  Local storage in each host would be used for archives, one-off backups or other non-critical data only.

Typically multiple NICs and switches are used for various functions – management, VM network, vMotion, and FT.  Because FT is not likely to be useful in this environment and because there is only a single server switch, we will set up 3 interfaces – management, VM network and vMotion – though all 3 will traverse the same physical switch.  A separate VLAN will be created for the vMotion network to isolate broadcast traffic that could otherwise impact the management and VM networks.  Three Broadcom 5720 Dual Port (DP) 1Gb cards are suggested.  While only 1 port per card will be used initially, the additional ports will already be available should they be needed for increased performance or availability.
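As a sketch of the intended separation, the port-group layout might look like the following. The VLAN IDs and vmnic names are placeholders I've chosen for illustration, not values from this proposal:

```python
# Hypothetical vSwitch port-group layout for the three interfaces.
# VLAN IDs and vmnic names are illustrative placeholders only.
port_groups = {
    "Management": {"vlan": 10, "nics": ["vmnic0"]},
    "VM Network": {"vlan": 20, "nics": ["vmnic2"]},
    "vMotion":    {"vlan": 30, "nics": ["vmnic4"]},
}

# The design goal: vMotion traffic sits on its own VLAN, isolated from
# management and VM traffic even though all share one physical switch.
vmotion_vlan = port_groups["vMotion"]["vlan"]
others = {pg["vlan"] for name, pg in port_groups.items() if name != "vMotion"}
assert vmotion_vlan not in others, "vMotion must be on its own VLAN"
print("vMotion isolated on VLAN", vmotion_vlan)
```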

One Broadcom 57810 Dual Port (DP) 10Gb card is suggested for iSCSI connectivity as they provide hardware iSCSI offload capabilities.


–  VMware vSphere Essentials Plus Kit for 3 hosts will be used.

  • Maximum of 3 hosts, each with up to 2 processors.
  • Includes HA, vMotion and vSphere Replication
  • Current promotion includes VMware Go Pro for VM patch management.

– Microsoft Windows Standard licenses will be used.

  • As the company grows, suggest investigating Windows Data Center edition license for unlimited Windows VM deployments.

VM Backup and Disaster Recovery:  VMware Data Protection will be used which is part of the vSphere Essentials Plus Kit.  For disaster recovery, the storage appliance can replicate data to a second offsite appliance.

Expansion:  The limiting factors in the current configuration (performance only) will be:

  • Storage and disk I/O per second (IOPS) – Additional VM density per host can be achieved by adding storage shelves and drives.  If IOPS becomes a limiting factor, the storage appliance can be replaced with one that supports Solid State Disk (SSD) or Serial Attached SCSI (SAS) drives.
  • Hosts – There are currently only two hosts in this configuration, adding a 3rd would allow for more VMs, greater redundancy and the ability to enable DRS for automatic workload provisioning.
  • Licensing – VMware licensing allows for a third host with up to two processors be added.  Additional licensing will be needed to add more than 3 hosts.
  • Memory – The hosts are configured with 32GB of memory each.  Assuming both hosts are functional, this would allow for additional VMs to be added; additional memory can also be added to each host.

Back to VCAP-DCD Prep

I let myself lose focus last week; here is the plan:

1.  Watch APAC #vBrownBag recordings
2.  Finish Clustering Deep Dive
3.  Watch DR training on MyLearn
4.  Watch SRM training on VMware.com
5.  Read Storage Implementation from VMware Press
6.  Take notes (from slides) from the VCAP-DCD and VCDX #vBrownbags
7.  Re-read
8.  Review VCAP-DCD blueprint and study guides by Gregg Robertson (@GreggRobertson5) and Shane Williford (@coolsport00)

I also have a small startup looking for a basic office setup so I am going to “design” the hell out of it.

Example Architectural Decision – Virtual Machine swap file location via Josh Odgers (@josh_odgers)


Problem Statement

When using shared storage where deduplication is utilized along with an array level snapshot based backup solution, what can be done to minimize the wasted capacity of snapping transient files in backups and the CPU overhead on the storage controller having to attempt to deduplicate data which cannot be deduped?
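To illustrate the problem being described: each powered-on VM gets a .vswp file sized at its configured vRAM minus any memory reservation, so with no reservations the full configured vRAM becomes transient, non-dedupable data captured by every array snapshot. A rough sketch with hypothetical VM sizes:

```python
# Sketch of the vswap capacity problem. VM names and sizes below are
# hypothetical; the formula (configured vRAM minus reservation) is the
# standard vswap sizing behavior.
def vswap_gb(vram_gb: float, reservation_gb: float = 0.0) -> float:
    """Size of a powered-on VM's .vswp file in GB."""
    return vram_gb - reservation_gb

# With no reservations, every GB of configured vRAM is transient data
# swept up by each array-level snapshot and churned by dedup processing.
vms = {"app1": 16, "app2": 8, "db1": 32}
snapshot_waste = sum(vswap_gb(v) for v in vms.values())
print(f"Transient vswap captured per snapshot: {snapshot_waste} GB")
```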


1. Virtual machine memory reservations cannot be used to reduce the vswap file size


1. Reduce the snapshot size for backups without impacting the ability to backup and restore
2. Minimize the overhead on the storage controller for deduplication processing
3. Optimize the vSphere / Storage solution for maximum performance

Architectural Decision

1. Configure the HA swap file policy to store the swap file in a datastore specified by the host.
2. Create a new datastore per cluster which is hosted on Tier 1 storage and ensure deduplication is disabled on that volume
3. Configure all…

View original post 276 more words

Example Architectural Decision – DRS Automation Level by Josh Odgers @josh_odgers


Problem Statement

What is the most suitable DRS automation level and migration threshold for a vSphere cluster running an IaaS offering with a self-service portal and unpredictable workloads?


1. Workload types and sizes are unpredictable in an IaaS environment; workloads may vary greatly and without notice
2. The solution needs to be as automated as possible without introducing significant risk


1. Prevent unnecessary vMotion migrations which will impact host & cluster performance
2. Ensure the cluster standard deviation is minimal
3. Reduce administrative overhead of reviewing and approving DRS recommendations


1. Use Fully automated with Migration threshold 1 – Apply priority 1 recommendations
2. Use Fully automated with Migration threshold 2 – Apply priority 1 & 2 recommendations
3. Use Fully automated with Migration threshold 4 – Apply priority 1, 2, 3 and 4 recommendations
4. Use Fully automated with Migration threshold 5 – Apply priority 1, 2, 3, 4 & 5 recommendations
5. Set DRS to manual…

View original post 208 more words