Preparations for emergencies and crises are usually based on the analysis of business processes, key resources, and their criticality or vulnerability in the event of a problem. Business impact analysis and risk management are used to assess potential impacts and define precautionary measures to minimize or mitigate risks.
The resulting emergency plans or crisis manuals contain defined countermeasures for the various events, instructions for employees, action plans, checklists, alarm chains, and responsibilities. They also cover the establishment of crisis teams and operations centers.
Contingency planning also includes planning for post-crisis recovery: it determines what specifically needs to be observed during the emergency or crisis itself, such as priorities and responsibilities, so that the organization is well prepared for the subsequent restart and can support it in the best possible way.
Concrete exercises and training, simulations, playing through crises, and real tests are key drivers of improvement. Simple checks to see whether everyone knows what to do in the event of an incident and whether instructions are known or understood are usually the first steps.
Experience shows again and again: Only what is actually prepared and practiced — what is actually available and tested — works!
Successful emergency or crisis management in the event of an incident
In the acute phase, crisis management is usually passive, i.e. reactive: Activities are immediately focused on overcoming the crisis.
The better active crisis management (prevention, avoidance) was carried out in the past, i.e. the better the preparations for potential crises were, the easier it will be to deal with acute problems. Considering risks related to emergencies and crises in good time, together with the countermeasures to deal with them, significantly reduces response times in the event of an incident, and the decisions that have to be made rest on a firmer footing than decisions made ad hoc.
Managing a crisis successfully usually involves planning and defining the important goals that crisis management is meant to achieve, together with the corresponding strategies and measures, and, of course, more stringent control.
The more quickly actual crises are identified, the more room for maneuver remains for successful management.
Business Continuity Management (BCM)
BCM focuses on keeping business processes free of disruption. This involves clarifying which processes are to be maintained and in what form, and what measures, priorities, and resources are required to resume business operations quickly in the event of an incident.
From an IT perspective, the necessary data backup mechanisms and disaster recovery measures for restoring data and systems are defined, among other things, on the basis of the resulting specifications and derived parameters, e.g., RTO/RPO (recovery time objective and recovery point objective) or MTTF/MTTR (mean time to failure and mean time to repair).
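How such parameters can be turned into a simple check is sketched below in Python. This is only an illustration under stated assumptions: the names `ContinuityTargets`, `availability`, and `check_backup_plan` are hypothetical, the worst-case data loss is assumed to equal the backup interval, and availability uses the common steady-state formula MTTF / (MTTF + MTTR).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContinuityTargets:
    """Hypothetical targets, e.g. taken from a business impact analysis."""
    rto_hours: float  # recovery time objective: maximum tolerable downtime
    rpo_hours: float  # recovery point objective: maximum tolerable data-loss window


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def check_backup_plan(
    targets: ContinuityTargets,
    backup_interval_hours: float,
    restore_hours: float,
) -> dict[str, bool]:
    """Compare a backup schedule and a measured restore time with the targets.

    Simplifying assumption: worst-case data loss equals the backup interval.
    """
    return {
        "rpo_met": backup_interval_hours <= targets.rpo_hours,
        "rto_met": restore_hours <= targets.rto_hours,
    }
```

For example, with an RTO of 4 hours and an RPO of 1 hour, a nightly backup (24-hour interval) with a 3-hour restore would meet the RTO but miss the RPO, which suggests that more frequent backups or replication would be needed.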
However, implementing adequate backup/recovery mechanisms is only one area. Business continuity management as such goes much further and takes a holistic view: BCM encompasses the robustness of IT as a whole in dealing with problems, disruptions, and failures, as well as the implementation of appropriate technical and organizational measures to support business processes as seamlessly as possible.
Resilience of organization and IT
Resilience is essentially the organization’s ability to withstand disruption or to anticipate change and act quickly and successfully, even in an environment of rapidly changing conditions.
For the resilience of an organization, the actions of its employees and managers, together with technical and organizational aspects, are the decisive factors. This includes maturity in dealing with disruptions and in sharing experiences, but also how decisions are made in special situations and under variable conditions.
From an IT perspective, resilience is evident not least in cases of cybercrime, GDPR breaches, or major disruptions in general. IT resilience concerns not only business continuity management, which usually focuses on IT infrastructure and IT operations, but also software and system development. A corresponding software stack is therefore also important for the (technical) resilience of IT as a whole: resilient software design helps to develop robust software that, in an unexpected error situation, behaves in such a way that the user either does not notice the problem at all or, thanks to good (self-)healing, can quickly use the software again.
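One common building block of resilient software design is retrying a failing operation and, if it keeps failing, degrading to a fallback instead of surfacing the error to the user. The following Python sketch illustrates the idea. It is not taken from any particular framework: the function name `call_with_retry` and its parameters are hypothetical, and catching every `Exception` is a simplification that real code would narrow to known transient errors.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `operation` up to `attempts` times, then degrade to `fallback`."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Exponential backoff between attempts: base, 2*base, 4*base, ...
            # No wait after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay_s * 2 ** attempt)
    # All attempts failed: return a degraded but usable result.
    return fallback()
```

A caller might pass a live service request as `operation` and a cached or default value as `fallback`, so that a short outage is either hidden from the user or reduced to slightly stale data. The `sleep` parameter is injectable only to keep the sketch easy to test.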