SLAs for SaaS: What to Promise, What to Avoid, and How to Measure

A Service Level Agreement converts your infrastructure reliability into a contractual commitment. Done well, it gives enterprise customers confidence that you’ll perform and gives you a defensible framework when things go wrong. Done poorly, or copied from a template without reading it, it creates obligations you can’t meet and remedies that can end customer relationships on a bad month.

The same principle that applies to DPA security commitments applies here: your SLA should reflect your actual architecture, not your aspirations. A single-region deployment cannot reliably promise 99.99% uptime. Promising it anyway doesn’t make your infrastructure more resilient. It just makes your contract inaccurate.

The Core Components of a SaaS SLA

Every SLA, regardless of complexity, needs to answer five questions: what are you promising, how do you measure it, what’s excluded, what happens when you miss it, and how do customers know how you’re doing.

1. Uptime Commitment

The uptime commitment is the number everyone focuses on: 99.9%, 99.95%, 99.99%. It expresses the percentage of time your service will be available during a measurement period, typically a calendar month.

The number you choose should be derived from your infrastructure stack, not from what sounds impressive in a sales conversation. The relevant questions: Are you running in a single region or multiple? Do you have redundancy at the database layer? What does your deployment process look like? Does a failed deploy take the service down? What’s your dependency on third-party services that could themselves go down?

A realistic framework by infrastructure tier:

Single-region, managed cloud (e.g. a standard Heroku or Render deployment): 99.5% to 99.9% is defensible. 99.99% is not.
Multi-region active-passive (primary region with failover): 99.9% to 99.95% depending on your failover automation.
Multi-region active-active with redundant data layer: 99.95% to 99.99% becomes achievable, but requires the engineering investment to match.
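
To make these tiers concrete, the short sketch below converts an uptime percentage into a monthly downtime budget. It assumes a 30-day measurement month; the function name allowed_downtime_minutes is illustrative and not taken from any particular tool.

```python
def allowed_downtime_minutes(uptime_percent: float, days_in_period: int = 30) -> float:
    """Minutes of downtime permitted in the period at the given uptime percentage."""
    total_minutes = days_in_period * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (100.0 - uptime_percent) / 100.0


# Print the budget for the targets discussed above (30-day month assumed).
for target in (99.5, 99.9, 99.95, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows {minutes:.1f} minutes of downtime per month")
```

Under that assumption, 99.5% leaves roughly 216 minutes a month, 99.9% about 43, 99.95% about 22, and 99.99% just over 4. A single failed deploy can consume the entire 99.99% budget.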

For more on what these numbers mean in practice, including actual downtime minutes per month, see The Math Behind SLA Uptime: 99.9% vs. 99.99% and What It Actually Means.

2. Measurement Methodology

Promising uptime without defining how it’s measured is meaningless. Your SLA needs to specify:

What counts as downtime. The narrowest definition, and the most provider-friendly, is complete service unavailability: the service is unreachable for all users. A broader definition includes partial outages (a subset of users affected) and degraded performance (the service is reachable but materially slower than normal). Where you land on this spectrum affects how often you’re technically in breach.

The measurement window. Monthly measurement is the norm. Annual measurement is more favorable to providers because a bad month gets averaged out over the year, but most enterprise buyers won’t accept it.

How availability is calculated. The standard formula: ((total minutes in period - downtime minutes) / total minutes in period) × 100. Your SLA should specify this explicitly, including how downtime minutes are counted when an incident spans multiple hours.

Who does the measuring. Your own monitoring, a third-party status page provider, or some combination. Relying exclusively on your own monitoring gives customers less confidence; a third-party tool such as a public status page adds credibility and reduces disputes about whether an incident actually occurred.
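
The availability formula above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes incidents arrive as (start, end) timestamps, that overlapping incidents should be counted once, and that an incident crossing a month boundary only counts for the minutes inside the measured month. Your SLA may define these cases differently, and the function name monthly_availability is hypothetical.

```python
from datetime import datetime, timedelta


def monthly_availability(period_start, period_end, incidents):
    """Return availability (percent) for the period, given (start, end) incident pairs."""
    total = period_end - period_start
    # Keep only incidents that touch the period, clipped to its boundaries.
    clipped = sorted(
        (max(start, period_start), min(end, period_end))
        for start, end in incidents
        if end > period_start and start < period_end
    )
    downtime = timedelta()
    merged_start = merged_end = None
    for start, end in clipped:
        if merged_end is None or start > merged_end:
            # No overlap with the running interval: bank it and start a new one.
            if merged_end is not None:
                downtime += merged_end - merged_start
            merged_start, merged_end = start, end
        else:
            # Overlapping incidents are merged so no minute is counted twice.
            merged_end = max(merged_end, end)
    if merged_end is not None:
        downtime += merged_end - merged_start
    # ((total - downtime) / total) x 100, as in the formula above.
    return (total - downtime) / total * 100


june = (datetime(2024, 6, 1), datetime(2024, 7, 1))
incidents = [
    (datetime(2024, 6, 5, 10, 0), datetime(2024, 6, 5, 10, 30)),
    (datetime(2024, 6, 5, 10, 20), datetime(2024, 6, 5, 11, 0)),    # overlaps the first
    (datetime(2024, 6, 30, 23, 50), datetime(2024, 7, 1, 0, 20)),   # spills into July
]
availability = monthly_availability(*june, incidents)
print(f"{availability:.3f}%")
```

In this example the two overlapping incidents count as 60 minutes and the boundary incident as 10, giving 70 downtime minutes out of 43,200, or about 99.84% for the month.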

3. Exclusions

Not all downtime counts against your SLA commitment, and your agreement needs to define what’s excluded. Standard exclusions include scheduled maintenance (with advance notice requirements), force majeure events, outages caused by customer actions or third-party integrations outside your control, and issues with the customer’s own network or infrastructure.

Exclusions are where SLAs get negotiated hard. Customers want narrow exclusions; providers want broad ones. The reasonable position is exclusions that are specific enough to be enforceable and tied to events genuinely outside your control, not a blanket carve-out for anything involving a third party.
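
One way to keep exclusions specific is to encode them as an explicit list of causes rather than a judgment call made after each incident. The sketch below is hypothetical throughout: the cause labels, the 72-hour maintenance notice period, and the Incident fields are assumptions chosen for illustration, not terms from any standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical cause labels; a real SLA would define each of these in its text.
EXCLUDED_CAUSES = {"force_majeure", "customer_action", "customer_network"}
# Assumed notice requirement for scheduled maintenance.
MIN_MAINTENANCE_NOTICE = timedelta(hours=72)


@dataclass
class Incident:
    cause: str
    start: datetime
    minutes: float
    notice_sent: Optional[datetime] = None  # only relevant for maintenance


def counts_against_sla(incident: Incident) -> bool:
    """True if the incident's downtime should count toward the uptime calculation."""
    if incident.cause in EXCLUDED_CAUSES:
        return False
    if incident.cause == "scheduled_maintenance":
        # Maintenance is excluded only when the required advance notice was given.
        notice_ok = (
            incident.notice_sent is not None
            and incident.start - incident.notice_sent >= MIN_MAINTENANCE_NOTICE
        )
        return not notice_ok
    # Anything not explicitly excluded counts, including third-party failures
    # that the agreement does not name.
    return True


def countable_downtime_minutes(incidents) -> float:
    """Sum the downtime minutes of incidents that count against the SLA."""
    return sum(i.minutes for i in incidents if counts_against_sla(i))
```

The default in the last branch is the design choice that matters: an unlisted cause counts against you, which mirrors the position that exclusions should be enumerated rather than open-ended.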

For a full breakdown of how to structure exclusions, see SLA Exclusions: What Shouldn’t Count Against Your Uptime.

4. Remedies

When you miss your uptime commitment, what happens? The industry standard remedy is service credits, meaning a percentage of the affected month’s fees applied to a future invoice. Credits are not refunds. They keep the customer relationship intact while acknowledging the failure.

The alternatives, including pro-rata refunds and termination rights triggered by SLA misses, are significantly more dangerous for providers. A termination-for-cause right triggered by repeated SLA failures can give a customer an exit from a contract they wanted out of anyway, on your dime.

Structure your credit schedule as a tiered table: the worse the miss, the larger the credit. A common structure:

Monthly Uptime	Service Credit
99.0% – 99.X%	10% of monthly fees
95.0% to below 99.0%	25% of monthly fees
Below 95.0%	50% of monthly fees

Cap total credits per calendar month, typically at 50% of monthly fees, and require customers to submit a claim within a defined window (usually 30 days after the incident) to be eligible. Credits that aren’t claimed within the window are forfeited.
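
The credit schedule, cap, and claim window can be expressed as a small lookup. In this sketch the committed uptime of 99.9% is an assumption standing in for the "99.X%" placeholder in the table, and all function and constant names are hypothetical. Measuring the 30-day window from the incident date follows the text above.

```python
from datetime import date, timedelta

COMMITTED_UPTIME = 99.9  # assumption: stands in for the "99.X%" in the table
# (inclusive lower bound of the tier, credit as a percent of monthly fees)
CREDIT_TIERS = [
    (99.0, 10),
    (95.0, 25),
    (0.0, 50),
]
CREDIT_CAP_PERCENT = 50              # cap on total credits per calendar month
CLAIM_WINDOW = timedelta(days=30)    # claims after this window are forfeited


def credit_percent(monthly_uptime: float) -> int:
    """Credit owed for the month, as a percent of monthly fees."""
    if monthly_uptime >= COMMITTED_UPTIME:
        return 0  # commitment met, no credit
    for lower_bound, percent in CREDIT_TIERS:
        if monthly_uptime >= lower_bound:
            return min(percent, CREDIT_CAP_PERCENT)
    return CREDIT_CAP_PERCENT


def service_credit(monthly_fee: float, monthly_uptime: float,
                   incident_date: date, claim_date: date) -> float:
    """Credit amount to apply to a future invoice, or 0.0 if the claim is late."""
    if claim_date - incident_date > CLAIM_WINDOW:
        return 0.0
    return monthly_fee * credit_percent(monthly_uptime) / 100


# A 99.5% month on a 2,000 monthly fee, claimed 15 days after the incident.
print(service_credit(2000, 99.5, date(2024, 6, 5), date(2024, 6, 20)))
```

With these assumed values, the example yields a credit of 200, which is 10% of the monthly fee. The same month claimed 40 days after the incident would yield nothing.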

For a deeper treatment of credit structure and why it’s the right remedy framework, see Service Credits: The SLA Remedy That Doesn’t Break Your Business.

5. Reporting

Enterprise customers want visibility into your uptime performance, not just a credit when something goes wrong. A public status page such as Statuspage or Instatus satisfies this requirement with minimal overhead. It gives customers a real-time view of service health, a record of past incidents, and a channel for incident communications that doesn’t require your team to manually update every customer during an outage.

Your SLA should reference your status page as the source of record for uptime reporting. Some enterprise customers will also ask for monthly uptime reports delivered directly, which you can accommodate with a simple automated export from your monitoring tooling.
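
As a rough idea of what such an export might look like, the sketch below writes monthly figures to CSV using only the Python standard library. The column names and the shape of the input rows are assumptions; a real export would pull these values from whatever your monitoring tool exposes.

```python
import csv
import io


def uptime_report_csv(rows) -> str:
    """Build a CSV report from (month, uptime_percent, downtime_minutes, incident_count) rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    # Hypothetical column layout for a customer-facing monthly report.
    writer.writerow(["month", "uptime_percent", "downtime_minutes", "incident_count"])
    for month, uptime_percent, downtime_minutes, incident_count in rows:
        writer.writerow([month, f"{uptime_percent:.3f}", f"{downtime_minutes:.1f}", incident_count])
    return buffer.getvalue()


report = uptime_report_csv([
    ("2024-05", 100.0, 0.0, 0),
    ("2024-06", 99.838, 70.0, 3),
])
print(report)
```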

What to Avoid

Promising uptime your infrastructure can’t support. Already covered, but worth repeating: the number needs to match your architecture. If your engineering team tells you 99.9% is achievable and 99.99% requires a significant infrastructure project, your SLA should say 99.9%.

Uncapped remedies. An SLA without a credit cap, or with termination rights tied to any single miss, creates open-ended exposure. One bad month shouldn’t threaten the contract.

Vague downtime definitions. “Materially degraded performance” with no definition of “materially degraded” is hard for you to enforce and easy for a customer to argue against you. Define it: response times exceeding X milliseconds for Y% of requests over a Z-minute window, as measured by your monitoring tooling.

SLA commitments that conflict with your DPA. Your SLA’s uptime targets and your DPA’s security and availability commitments need to be consistent. If your DPA implies high availability and your SLA promises 99.5%, you have an internal contradiction that a sophisticated buyer’s legal team will flag.

Committing to resolution times. You can commit to responding to incidents within a defined window. You cannot reliably commit to resolving them within one. Resolution time depends on the nature of the incident, and promising it contractually creates obligations you may not be able to honor during a complex outage.
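
Returning to the point about vague downtime definitions: a definition of the form "X milliseconds for Y% of requests over a Z-minute window" is precise enough to be checked mechanically. The sketch below shows one possible reading of that definition. The example values (2,000 ms, 5%) are placeholders and not recommendations, the function name is hypothetical, and treating a window with no traffic as "not degraded" is an assumption your SLA should settle explicitly.

```python
def is_materially_degraded(latencies_ms, threshold_ms: float, slow_fraction: float) -> bool:
    """
    latencies_ms: response times for every request observed in one Z-minute window.
    threshold_ms: the X in the definition.
    slow_fraction: the Y in the definition, as a fraction (0.05 means 5%).
    """
    if not latencies_ms:
        return False  # assumption: no traffic in the window means no degradation
    slow_requests = sum(1 for latency in latencies_ms if latency > threshold_ms)
    return slow_requests / len(latencies_ms) >= slow_fraction


# One window of 100 requests: 94 fast, 6 slower than the threshold.
window = [120.0] * 94 + [3500.0] * 6
print(is_materially_degraded(window, threshold_ms=2000, slow_fraction=0.05))
```

Here 6% of requests exceed the threshold, so the window counts as degraded under the assumed 5% cutoff. Whether a degraded window then counts as downtime is a separate choice the agreement needs to state.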

Choosing Your Uptime Target: A Framework

If you’re unsure where to set your uptime commitment, work through these questions before you draft the number:

What does your current monitoring show? Pull your actual uptime data for the last 12 months. That’s your baseline.
What’s your deployment risk? If a bad deploy takes down the service for 20 minutes, how often does that happen? Factor it in.
What’s your scheduled maintenance window? If you take the service down for maintenance, how long and how often? This affects your achievable uptime unless you carve it out.
What do your dependencies look like? If a third-party API your product depends on goes down, does your service go down too? If yes, either exclude that dependency or factor its reliability into your commitment.
What are your customers asking for? Enterprise buyers will often tell you their minimum acceptable uptime in the RFP or procurement questionnaire. If their floor is 99.9% and you can deliver it, match it. If their floor is 99.99% and you can’t, that’s a conversation to have before you sign, not after.
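
The dependency and deployment questions above lend themselves to a back-of-the-envelope estimate. The sketch below assumes that each dependency is a hard dependency (if it is down, you are down) and that failures are independent, so availabilities multiply. Both assumptions are simplifications, the input figures are invented for illustration, and subtracting deploy downtime as a percentage is only a rough approximation.

```python
def serial_availability(*availabilities_percent: float) -> float:
    """Combined availability (percent) of components that must all be up at once."""
    combined = 1.0
    for availability in availabilities_percent:
        combined *= availability / 100.0
    return combined * 100.0


def downtime_as_percent(minutes: float, days_in_period: int = 30) -> float:
    """Express a number of downtime minutes as a percentage of the period."""
    return minutes / (days_in_period * 24 * 60) * 100.0


# Illustrative inputs: your own stack at 99.95%, one third-party API at 99.9%,
# and one bad deploy per month costing 20 minutes.
baseline = serial_availability(99.95, 99.9)
deploy_cost = downtime_as_percent(20)
estimate = baseline - deploy_cost
print(f"Estimated achievable uptime: {estimate:.3f}%")
```

With these inputs the estimate lands at about 99.80%, below a 99.9% commitment even though each component looks fine on its own. That is the kind of gap worth finding before the number goes into a contract.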

No Boiler provides self-service legal document generation and educational content. This material and our service are not a substitute for legal advice. Please have a qualified attorney review any documents before relying on them. No Boiler is not a law firm, and communications with us do not create an attorney-client relationship or carry any expectation of confidentiality. Use of our platform and content is governed by our Terms of Service and Privacy Policy.