Service Continuity Management
Definition
"The practice of ensuring that the availability and performance of a service are maintained at sufficient levels in case of a disaster."
To fulfil the purpose, an organization needs to:
- Develop and manage service continuity plans
- Mitigate service continuity risks
- Ensure awareness and readiness
Key Terms
Disaster: a sudden unplanned event that causes great damage or serious loss to an organization. A disaster is significantly more severe than a normal incident.
Service continuity: the capability of the service provider to continue service operation at acceptable predefined levels following a disaster event or disruptive incident.
Business impact analysis (BIA): a key activity that identifies vital business functions (VBFs) and their dependencies.
Vital business function (VBF): a capability or function of a business process that is required to ensure the success of the business.
Service continuity plans: a set of clearly defined plans describing how an organization will recover from a disaster and return to a pre-disaster condition, considering the four dimensions of service management.
Recovery Objectives
Two critical metrics define recovery expectations:
| Metric | Definition | Example |
|---|---|---|
| Recovery Time Objective (RTO) | Maximum acceptable time to restore service after a disaster | Payment processing must be restored within 4 hours |
| Recovery Point Objective (RPO) | Maximum acceptable data loss measured in time | No more than 1 hour of transactions may be lost |
RTO and RPO drive cost: the shorter the recovery objective, the more investment is needed in redundancy, backup systems, and failover infrastructure. Organizations must balance their risk appetite against the cost of achieving aggressive recovery targets. The BIA helps determine which services justify the highest investment.
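As a rough illustration of how the two objectives are checked after an event, the sketch below compares an actual recovery against the 4-hour RTO and 1-hour RPO from the example column. The names `RecoveryObjectives` and `evaluate_recovery` and all timestamps are hypothetical, invented for this example; they do not come from any standard tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rto: timedelta  # maximum acceptable time to restore service
    rpo: timedelta  # maximum acceptable data loss, measured in time


def evaluate_recovery(objectives, disaster_at, restored_at, last_recovery_point_at):
    """Compare an actual recovery against the agreed RTO and RPO."""
    # Downtime runs from the disaster until service is restored.
    downtime = restored_at - disaster_at
    # Data written after the last usable backup or replica point is lost.
    data_loss_window = disaster_at - last_recovery_point_at
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= objectives.rto,
        "rpo_met": data_loss_window <= objectives.rpo,
    }


# Objectives from the example column: 4 h RTO, 1 h RPO.
payments = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))

# Illustrative (made-up) timeline of a single event.
result = evaluate_recovery(
    payments,
    disaster_at=datetime(2024, 1, 10, 9, 0),
    restored_at=datetime(2024, 1, 10, 12, 30),
    last_recovery_point_at=datetime(2024, 1, 10, 7, 45),
)
print(result["rto_met"], result["rpo_met"])  # prints: True False
```

In this made-up timeline the service is back after 3.5 hours, so the RTO is met, but the last recovery point is 75 minutes old, so the RPO is missed. The two objectives are independent: a fast restore does not compensate for infrequent backups.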
Processes
Business Impact Analysis
- VBF identification: Determine which business functions are vital.
- Analysis of the consequences of disruption: Assess financial, operational, and reputational impact.
- VBF interdependencies identification: Map dependencies between vital functions and supporting services.
- Determination of service continuity requirements: Define RTO and RPO for each vital function.
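The dependency-mapping and requirements steps above can be sketched as a small data model. This is a minimal example, assuming that a supporting service shared by several VBFs inherits the strictest RTO and RPO among them; that rule, the class `VitalBusinessFunction`, the function `service_requirements`, and the sample services are all illustrative assumptions rather than part of the practice definition.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class VitalBusinessFunction:
    name: str
    rto: timedelta
    rpo: timedelta
    depends_on: list = field(default_factory=list)  # names of supporting services


def service_requirements(vbfs):
    """Give each supporting service the strictest RTO/RPO of any VBF that needs it."""
    requirements = {}
    for vbf in vbfs:
        for service in vbf.depends_on:
            current = requirements.get(service)
            if current is None:
                requirements[service] = (vbf.rto, vbf.rpo)
            else:
                # Assumption: the tightest objective wins for a shared service.
                requirements[service] = (min(current[0], vbf.rto),
                                         min(current[1], vbf.rpo))
    return requirements


# Hypothetical BIA output for two vital business functions.
vbfs = [
    VitalBusinessFunction("payment processing", timedelta(hours=4), timedelta(hours=1),
                          ["payments-db", "auth"]),
    VitalBusinessFunction("customer support", timedelta(hours=24), timedelta(hours=8),
                          ["auth", "ticketing"]),
]

requirements = service_requirements(vbfs)
for service, (rto, rpo) in sorted(requirements.items()):
    print(f"{service}: RTO {rto}, RPO {rpo}")
```

Here the shared `auth` service ends up with the 4-hour RTO and 1-hour RPO of payment processing, even though customer support alone would tolerate a full day. Surfacing such hidden constraints is one reason the interdependency step comes before requirements are fixed.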
Developing, Exercising, and Maintaining Service Continuity Plans
- Scope definition: Define which services and functions are covered.
- Policy setting: Establish continuity policies and governance.
- Service continuity strategies development: Select appropriate strategies (hot standby, warm standby, cold standby, reciprocal arrangements).
- Service continuity plans development: Create detailed step-by-step plans.
- Awareness and exercise programme development: Design training and testing programmes.
- Performing exercises: Test plans through tabletop exercises, simulations, and full tests.
- Service continuity audit: Review plan effectiveness and compliance.
Response and Recovery
- Invocation: Formally activate the continuity plan when a disaster is declared.
- Executing service continuity plans: Follow the plan to restore services.
Recommendations for Practice Success
- Understand consumer expectations for continuity and recovery
- Know legal and regulatory continuity requirements
- Avoid creating all continuity plans at once: start with the most critical services
- Keep plans simple and clear: a plan nobody can follow under stress is useless
- Regularly test and practice continuity plans
- Automate response and recovery where possible
- Collaborate with suppliers who are part of the recovery chain
- Embed continuity management in value streams
Key Metrics
| Metric | What it measures |
|---|---|
| Products/services with documented continuity requirements and plans | Coverage |
| Timely updates to continuity plans | Currency |
| RTO achievement | Recovery speed |
| RPO achievement | Data loss prevention |
| Effective continuity measures | Control quality |
| Actual vs expected loss ratio | Risk assessment accuracy |
| Continuity awareness and readiness sessions | Training coverage |
| Continuity plans tested | Testing coverage |
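Several of these metrics are simple ratios over a service inventory. The sketch below computes three of them (coverage, testing coverage, and currency) from a hypothetical list of records. The `ServiceRecord` fields, the 365-day windows for "recently tested" and "recently updated", and the sample data are assumptions made for illustration; an organization would set its own thresholds in policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ServiceRecord:
    name: str
    has_plan: bool
    last_tested: Optional[date]   # None if the plan was never exercised
    last_updated: Optional[date]  # None if there is no plan to update


def percentage(part, whole):
    """Return part/whole as a percentage, or 0.0 when there is nothing to count."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def continuity_metrics(services, today,
                       test_window=timedelta(days=365),
                       review_window=timedelta(days=365)):
    planned = [s for s in services if s.has_plan]
    tested = [s for s in planned
              if s.last_tested and today - s.last_tested <= test_window]
    current = [s for s in planned
               if s.last_updated and today - s.last_updated <= review_window]
    return {
        "coverage_pct": percentage(len(planned), len(services)),
        "testing_coverage_pct": percentage(len(tested), len(planned)),
        "currency_pct": percentage(len(current), len(planned)),
    }


# Made-up inventory used only to exercise the function.
inventory = [
    ServiceRecord("payments", True, date(2024, 3, 1), date(2024, 2, 1)),
    ServiceRecord("email", True, date(2022, 11, 15), date(2023, 9, 1)),
    ServiceRecord("hr-portal", True, None, date(2021, 5, 1)),
    ServiceRecord("wiki", False, None, None),
]

metrics = continuity_metrics(inventory, today=date(2024, 6, 1))
print(metrics)
```

With this sample data, three of four services have a plan (75.0% coverage), one of those three was tested within the window (33.3%), and two were updated within the window (66.7%). Testing coverage and currency are measured against services that have a plan, not the whole inventory, so that a missing plan is counted once under coverage rather than three times.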
Key Roles
- Service continuity manager: Coordinates continuity planning, testing, and response activities
Software Tools
- Business continuity planning tools
- Emergency management tools
- Risk management tools
- Service configuration management tools
- Remote administration and deployment tools
- Monitoring and event management tools
- Orchestration and integration platforms
- Business Process Modelling (BPM) tools
- Knowledge and document management tools
- Collaboration and communication tools
- Analysis and reporting tools