A service level agreement (SLA) is a binding legal document that can help your organization if it’s done right and harm your organization if it’s done wrong. Trust me, it’s worth your time to pay close attention to what your SLAs contain.
Your examination (and willingness to revise, attach addendums, etc.) should begin with…
The basic elements of an SLA
Every SLA should include two kinds of elements:
Service elements, including…
- Service particulars — what’s included, what isn’t;
- Service availability/uptime parameters and standards;
- Specifics of any cost/service tradeoffs;
- Delineation of each party’s responsibilities; and
- Escalation procedures.
Management elements, including…
- Definitions of service metrics and measurement methods;
- Service report content, frequency, and processes;
- A dispute resolution process;
- An indemnification clause (if this is not already part of the contract) that protects you from third-party litigation due to service level breaches;
- Service agreement update mechanisms (updates should occur at least annually, though many top-performing organizations do them quarterly); and
- A service agreement exit strategy that spells out what the provider must do (e.g., moving your data) if/when your relationship with the provider terminates.
Five key aspects of an SLA
Next, I suggest you focus on…
1 The metrics you monitor
Generally, less is more, since greater metric complexity tends to create overlap, needless work, and less effective monitoring. Also, whenever possible, opt for automated monitoring, which will save you time and money.
Among the metrics you don’t want to neglect:
- Service availability/uptime, typically measured in nines (remember, higher availability/uptime costs more, so you need to decide what’s practical for the service/application);
The Nines: Measuring IT and Application Uptime

| Level of availability | Percent of uptime | Downtime per day | Downtime per year |
|---|---|---|---|
| 1 nine | 90% | 2.4 hours | 36.5 days |
| 2 nines | 99% | 14 minutes | 3.65 days |
| 3 nines | 99.9% | 86 seconds | 8.76 hours |
| 4 nines | 99.99% | 8.6 seconds | 52.6 minutes |
| 5 nines | 99.999% | 0.86 seconds | 5.26 minutes |
| 6 nines | 99.9999% | 86 milliseconds | 31.5 seconds |

- Security-related metrics, notably patching and antivirus updating;
- Defect rates indicating errors/failures in major deliverables (missed deadlines, incomplete backups/restores, coding errors/rework); and
- Operational outcomes, often measured via key performance indicators (KPIs), assuming the service provider’s role in those KPIs is measurable.
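If you want to check the arithmetic behind a "nines" commitment yourself, here is a minimal Python sketch. It assumes a 365-day year and a 24-hour day, and the function names (`allowed_downtime`, `nines_to_percent`) are my own illustrations, not part of any standard tool.

```python
from datetime import timedelta


def nines_to_percent(nines: int) -> float:
    """Convert a count of nines to an uptime percentage (3 -> roughly 99.9)."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return 100 * (1 - 10 ** -nines)


def allowed_downtime(availability_percent: float, period: timedelta) -> timedelta:
    """Return the downtime budget for a period at a given uptime percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    downtime_fraction = 1 - availability_percent / 100
    return timedelta(seconds=period.total_seconds() * downtime_fraction)


if __name__ == "__main__":
    day = timedelta(days=1)
    year = timedelta(days=365)  # assumption: non-leap year
    for nines in range(1, 7):
        percent = nines_to_percent(nines)
        per_day = allowed_downtime(percent, day)
        per_year = allowed_downtime(percent, year)
        print(f"{nines} nines: {per_day} per day, {per_year} per year")
```

The same function works for any reporting period, so you can pass a 30-day `timedelta` to see what a monthly SLA window actually allows.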
2 Verification of service levels
Typically, you can access service level data via an online portal. When it comes to mission-critical services, consider using third-party tools that automatically capture SLA performance data.
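To make verification concrete, here is a rough Python sketch of how you might turn a list of outages into a measured availability figure and compare it with your SLA target. The outage records, the reporting period, and the function names are hypothetical; the data you export from a provider portal or third-party tool will have its own format.

```python
from datetime import datetime, timedelta


def total_downtime(outages, period_start, period_end):
    """Sum outage time inside the period, counting overlapping outages once."""
    # Clip every (start, end) outage to the reporting period, then sort by start.
    clipped = sorted(
        (max(start, period_start), min(end, period_end)) for start, end in outages
    )
    total = timedelta()
    counted_until = period_start
    for start, end in clipped:
        start = max(start, counted_until)  # skip time that was already counted
        if end > start:
            total += end - start
            counted_until = end
    return total


def measured_availability(outages, period_start, period_end):
    """Return the percentage of the period during which the service was up."""
    period = period_end - period_start
    if period <= timedelta(0):
        raise ValueError("period_end must be after period_start")
    down = total_downtime(outages, period_start, period_end)
    return 100 * (1 - down / period)


if __name__ == "__main__":
    # Hypothetical outage log for a 30-day reporting period.
    start, end = datetime(2024, 6, 1), datetime(2024, 7, 1)
    outages = [
        (datetime(2024, 6, 3, 10, 0), datetime(2024, 6, 3, 10, 30)),
        (datetime(2024, 6, 3, 10, 20), datetime(2024, 6, 3, 10, 40)),  # overlaps
    ]
    availability = measured_availability(outages, start, end)
    target = 99.9  # assumption: a three-nines SLA
    print(f"Measured {availability:.4f}% against a target of {target}%")
    print("SLA met" if availability >= target else "SLA breached")
```

Clipping outages to the reporting period and merging overlaps keeps you from double-counting when two alerts cover the same incident, which is a common source of disputes over the numbers.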
3 Disaster recovery testing/failover
In any SLA covering mission-critical services and applications that reside in provider environments, include a clause requiring that these services undergo at least one disaster recovery test/failover per year, and make sure the SLA specifies a maximum failover time.
4 Governance/compliance
Make sure your service contracts and SLAs (which you retain and can retrieve, right?) require your providers to give you up-to-date financial and IT audit information.
5 Knowing your provider account manager
Stipulate in your services contract/SLA that you have the right to interview the provider’s account manager before this person is assigned to you so that you can ensure their ability to adequately communicate with your staff.
Don’t let the legalese of that SLA intimidate you. Take an extra moment for review, focus on its most critical aspects, and ensure that it suits your organization’s needs. You’ll be happy you did.
Until next time,
Tim