Tag Archives: performance

Service Level Agreements (SLAs) – A silent victim in the Consumerization of IT

Last week I was having a discussion with an Infrastructure as a Service (IaaS) provider, and one of the questions I left with was how I could integrate their service with my existing internal infrastructure.  My thought was that I could build a private-hybrid cloud (if that’s not a term yet, I own it – my definition of a private-hybrid cloud is being able to provide on-demand resources between an internal, traditional company-owned data center and an external service provider who can dedicate private resources) between my internal infrastructure and this vendor to provide on-demand infrastructure, scaling and high availability.  This got me thinking about how I might meet the sometimes unreasonable SLAs asked of technology groups by the business, and wondering why SLAs have kept increasing even as IT budgets are being slashed.

I started to think back to meetings between non-technology business leaders (e.g. sales, marketing, finance, etc.) and myself as we discussed what they wanted from IT.  Typically when I am architecting a system or network design, one of my first questions to those business leaders is to explain what their expectations are.  On many occasions the answer has been “100% up-time.”  We all know in IT that’s not really reasonable, which is why we add in carve-outs so that maintenance windows and vendor bugs do not count against that SLA.  Now, if I am an IT person and my CEO and CFO say they want 100% up-time – great, I can certainly design a very resilient, high-performance infrastructure that can even overcome poor software code to recover from application or system crashes.

One problem typically comes up, however: the desire for 100% up-time does not usually come with the budget to build that type of infrastructure.  When we review the design and budget required to reach that 100% up-time target, the comment I hear quite often is something along the lines of “Why does it cost so much?  Facebook/LinkedIn/some other consumer website never goes down and I use it for free,” or “Why do we have to spend so much on storage?  I can upload all the pictures I want to Facebook and I use that for free.”  Once I explain that the services they consume from Facebook or LinkedIn are not actually those companies’ business – their business is big data, business intelligence and advertising – I am typically able to re-focus the meeting on the real needs of the business rather than on a false expectation set by the perceived up-time of consumer services, and to determine a real SLA for the various systems and applications.
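
To put numbers on why “100% up-time” and a realistic SLA live in different worlds, it helps to write out the allowed-downtime math. A minimal sketch in Python (nothing assumed beyond a 365-day year):

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

def downtime_per_year(sla_percent):
    """Hours of allowed downtime per year for a given up-time SLA."""
    return HOURS_PER_YEAR * (1 - sla_percent / 100)

for sla in (99.0, 99.9, 99.99, 99.999):
    print(f"{sla}% up-time allows {downtime_per_year(sla):.2f} hours "
          f"({downtime_per_year(sla) * 60:.1f} minutes) of downtime per year")

Even “four nines” only leaves about 52 minutes a year for every patch window, failed component and human error combined – that is the number the budget conversation really needs to start from.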

So while we typically think of the consumerization of IT in terms of BYOD and related needs such as security and monitoring or enterprise social networking, it reaches all the way through the network and infrastructure right into policies, procedures and service level agreements.  Have you had a similar experience when working on your projects or budgets?

VNX Best Practices for Performance by Nick Fritsch (@nfritsch)

Thanks Nick!

VMware View Mobile Secure Desktop Bootcamp Day 8 PCoIP

PCoIP Optimization

Real-time protocol – like VoIP
Host-based pixel encoding – only changed pixels are sent
UDP-based – no TCP overhead, application-layer reliability

PCoIP Protocol Features
“Build to lossless” – lower-quality updates on images in motion until they are static
Utilizes multiple codecs depending on content
Adaptive bandwidth consumption – uses as much bandwidth as possible
Client side caching of frequent data
WMI session monitoring

Text Codec
– Lossless text
– Increased compression on ClearType
– 10-20% bandwidth savings

Disable Build-to-lossless
– Builds to “perceptually” lossless instead – you can’t tell the difference in most use cases
– 10-15% bandwidth savings

Client-Side Caching
– Store frequent content on endpoint
– Sends ‘address’ and ‘location’ of content (see the sketch below)
– 30-40% bandwidth savings
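
A rough illustration of the caching idea – this is not Teradici’s actual implementation, just the generic content-addressed pattern with hypothetical names: the host hashes each tile of pixels, and on a cache hit it sends only the short hash “address” instead of re-encoding the pixels.

import hashlib

client_cache = set()  # hashes of tiles the client already stores

def encode_tile(tile_bytes):
    """Return (hash, payload); payload is None on a cache hit, meaning
    only the short hash 'address' needs to cross the wire."""
    tile_hash = hashlib.sha1(tile_bytes).hexdigest()
    if tile_hash in client_cache:
        return tile_hash, None       # hit: send the 40-char address only
    client_cache.add(tile_hash)      # assume the client stores the tile
    return tile_hash, tile_bytes     # miss: send the full encoded pixels

icon = b"\x00\xff" * 512             # e.g. a toolbar icon that never changes
print(encode_tile(icon)[1] is None)  # False – the first frame pays full cost
print(encode_tile(icon)[1] is None)  # True – later frames send the address

Sending a 40-character reference instead of a full tile is where the 30-40% savings on repetitive desktop content comes from.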

PCoIP Monitoring tools
– Lakeside SysTrack
– Liquidware Labs
– Xangati

Free
– Perfmon
– PCoIP Log Viewer

SSL VPN encapsulates UDP in TCP packets – you lose the UDP benefits
Bypass WAN acceleration and IDS/IPS

Avoid use cases with round-trip latency above 300 ms

Great post – I find so many VMs oversubscribed for no good reason.

CloudXC

I thought this example may be useful to show the benefits of right-sizing a virtual machine.

The VM is an SQL Database server with 4 vCPUs on a cluster which is highly overcommitted with lots of oversized VMs.

As we can see from the graph below, CPU ready was more or less averaging 10%, and on the 24th of July most vCPUs spiked to greater than 30% CPU ready each – i.e., 30% of the time the vCPU was waiting to be scheduled onto the physical CPU cores.

The performance of applications using databases hosted on the server was suffering serious issues during this time.

On the 24th the VM was dropped from 4 vCPUs down to 2 vCPUs, and the results are obvious.

CPU ready dropped immediately (even in a heavily over-committed environment) to around 1% and CPU utilization remained at around the same levels. Performance also improved for…

View original post 169 more words
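
For anyone wanting to reproduce this, vCenter reports CPU ready as a summation counter in milliseconds per sample interval, and converting it to the percentages discussed above is a one-line formula. A minimal helper, assuming the standard 20-second real-time chart interval:

def cpu_ready_percent(ready_ms, interval_s=20, num_vcpus=1):
    """Percent of the sample interval the VM sat ready-but-unscheduled,
    averaged across its vCPUs."""
    return ready_ms / (interval_s * 1000 * num_vcpus) * 100

# A 4-vCPU VM reporting 24,000 ms of ready time in one 20 s sample
# averages 30% ready per vCPU – roughly the spike described above.
print(cpu_ready_percent(24_000, num_vcpus=4))  # 30.0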

Great IOPS Calculator

Yes, I know most of the world already knows about this great IOPS calculator, but I want to make sure I don’t forget, so here is the link!

http://wmarow.com/strcalc/
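
The core arithmetic behind calculators like this one is the RAID write penalty: every front-end write costs multiple back-end disk operations depending on the RAID level. A quick sketch of that math, using the commonly quoted penalty values (verify against your array’s documentation):

WRITE_PENALTY = {"RAID0": 1, "RAID1/10": 2, "RAID5": 4, "RAID6": 6}

def backend_iops(workload_iops, read_pct, raid):
    """Back-end disk IOPS needed to service a given front-end workload."""
    reads = workload_iops * read_pct
    writes = workload_iops * (1 - read_pct)
    return reads + writes * WRITE_PENALTY[raid]

# 1,000 front-end IOPS at 70% read on RAID 5:
# 700 reads + 300 writes x 4 = 1,900 back-end IOPS
print(backend_iops(1000, 0.70, "RAID5"))  # 1900.0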

Great walk-through of analyzing SQL Server performance.

VirtuallyMikeBrown

I was recently asked to pull the performance metrics for a new SQL cluster at work. In an effort to finally get back to blogging, I thought I’d share my results and how someone else might look at their clusters for ways to improve. I should start by saying that although this analysis was performed on a two-node Windows Server Failover Cluster (WSFC, formerly MSCS) running Windows Server 2008 R2 x64 and SQL Server 2008, SQL-specific metrics were not pulled. Rather, I looked at the Big Four: CPU, Memory, Disk, and Network. The second node in the cluster, Node B, was analyzed because the application using the first node was not yet in production, so we knew that node would barely be utilized.

Using Microsoft System Center Operations Manager (a behemoth in its own right!), I was able to pull the previous six days’ worth of performance…

View original post 856 more words
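
If you want to do a similar Big Four pull yourself, most of it can be scripted against a perfmon (or SCOM-exported) CSV. A rough sketch – the file name and exact counter paths here are illustrative, so adjust them to whatever your export actually contains:

import csv
import statistics

BIG_FOUR = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available MBytes",
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Transfer",
    r"\Network Interface(_Total)\Bytes Total/sec",
]

def summarize(csv_path):
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return
    for counter in BIG_FOUR:
        # perfmon prefixes each header with \\HOSTNAME, so match the tail
        col = next((c for c in rows[0] if c.endswith(counter)), None)
        if col is None:
            continue
        values = [float(r[col]) for r in rows if r[col].strip()]
        if values:
            print(f"{counter}: avg={statistics.mean(values):.2f} "
                  f"max={max(values):.2f}")

summarize("nodeb_perf.csv")  # hypothetical export file name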