Flow Chart for Disaster Recovery and Continuity

Disaster Recovery Plan Process

One Monday, mid-morning, Jake stared at the old UNIX server, his blank stare hidden from the management team that had just gathered outside the IT Data Center. This lone server had just stopped production, not a pretty statement for a manufacturing site. The managers had the same look on their faces as they asked, "Why were we not prepared for this?"

It must have been 15°C in the IT Data Center. Sweat rolled down Jake's forehead. It's going to be a long day, he thought.

What drives productive action in the moment of despair? As you ponder this deep, universal question, tag along with me as we look at the Disaster Recovery Plan: A Simple, Technical Flowchart.

In the previous chapter, we looked at the process-oriented Business Continuity Plan. Now, let us look at the technical plan.

The Disaster Recovery Plan (DRP) is part of the Business Continuity Strategy. The IT Business Continuity Plan (BCP) serves as an umbrella document for the Disaster Recovery Plan. In the event of a disaster, the IT Business Continuity Plan will be activated. Depending on the IT Recovery Team's assessment, the Disaster Recovery Plan may also be activated and implemented. Each critical IT resource will have an assigned Disaster Recovery Script (DRS; refer to Chapter 5) that includes installation, configuration, and testing. All of these documents rely on a robust Business Impact Analysis. Refer to Appendix A for a diagram of how the DRP document structure fits between the BCP and DRS.

To further clarify the difference between both plans:

  • The Business Continuity Plan applies to the functions, operations, and resources necessary to ensure the continuation of business processes if normal operations are disrupted or threatened.
  • The Disaster Recovery Plan (at a technical level) is a documented process used to outline the recovery path of IT resources in the event of a disaster. The Disaster Recovery Plan may be used as a continuation of the BCP or be used on its own, for IT resources out of the scope of the Business Continuity Plan. Refer to the following diagram for the document hierarchy and process.

The Disaster Recovery Plan (DRP) Document Format

Writing a strategic plan can take many routes. I have chosen an approach that makes it simple to find direction when time is of the essence. Instead of paragraph after paragraph of instructions, this Disaster Recovery Plan includes a flowchart as its base, which outlines the Disaster Recovery Plan and extends into the Disaster Recovery Script requirements. Both the DRP and the DRS work together and make up the technical Recovery Strategy. This approach might seem new, but I will break it up for you into easily digestible sections. By the end of this chapter, you will have a good grasp of this simple, technical Recovery Strategy.

Disaster Recovery Plan – Base Structure

In a nutshell, the DRP document contains standard sections like Purpose, Scope, and Responsibilities, but at its core, it handles the "What to do?" question through an easy-to-follow flowchart. This plan is all about recovery; it's about the core actions (at a technical level) that you need to complete when a disaster strikes. The flowchart is fully described throughout the plan, leading you to all the necessary documentation: Disaster Recovery Scripts, Incident Reports, and Change Control Requests.

Following a logical course of action, the Disaster Recovery Plan leads into the Disaster Recovery Script requirements. These DRP flowchart actions serve as the blueprint, and the starting point, for every new DRS that is developed.

The DRP document also includes annexed documents that contain the Recovery Priority Table, Suppliers/Services Contact Information, IT Recovery Team Members, the Disaster Recovery Report Form, and the Deviation Form.

The following diagram is an example of a Disaster Recovery Plan flowchart for the IT Data Center. In this particular example, software recovery is based on tape backup. Note that the IT Disaster Recovery Plan will need to be tailored to the specific business requirements and environment.

Disaster Recovery Plan Flowchart

Now, let's break the Disaster Recovery Plan flowchart into easily digestible pieces.

Disaster Recovery Plan Activation

Following safety precautions carefully, analyze the impact of the disaster. Identify the scope of the impact on the affected IT resources, external resources, and subsystems. Evaluate the event that has occurred, from a technical standpoint. Once the initial damage assessment is completed, a decision can be made on whether or not to activate the Disaster Recovery Plan.

The Disaster Recovery Plan is activated when an IT resource is expected to be or has been affected by a disaster and as a result, a key business process may be or is compromised. The DRP may be activated individually (if the IT resource is not within the scope of the BCP) or through the prior activation of the IT Business Continuity Plan (by the IT Recovery Team at Stage 6 of the BCP). For example, say the ABC server crashes and fails to boot up. The ABC server is the business's online sales server. Online sales is a key business process covered in the site's BCP. As part of the BCP recovery process, the DRP may be activated. On the other hand, the scenario might be that the ABC server is a secondary backup server that supports no key business process. In that case, it's covered only in the Disaster Recovery Plan. Remember that whether or not the Business Continuity Plan covers an IT resource depends on the results of the Business Impact Analysis, covered in Chapter 2.
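
The activation rule described above can be sketched in a few lines of code. This is only an illustration and not part of the plan itself: the names ITResource, ActivationPath, and decide_activation are hypothetical, and the in_bcp_scope flag is assumed to come out of the Business Impact Analysis. In practice, the decision rests with the IT Recovery Team's assessment, not with a script.

```python
from dataclasses import dataclass
from enum import Enum


class ActivationPath(Enum):
    NOT_ACTIVATED = "DRP not activated"
    VIA_BCP = "DRP activated through the IT Business Continuity Plan"
    STANDALONE = "DRP activated on its own"


@dataclass
class ITResource:
    name: str
    in_bcp_scope: bool  # assumption: decided earlier, during the Business Impact Analysis


def decide_activation(resource: ITResource, affected_or_at_risk: bool) -> ActivationPath:
    """Return how the DRP would be activated for one IT resource (simplified)."""
    if not affected_or_at_risk:
        return ActivationPath.NOT_ACTIVATED
    if resource.in_bcp_scope:
        # Key business process: the BCP is activated first, then the DRP.
        return ActivationPath.VIA_BCP
    # Resource outside the BCP scope: the DRP is used on its own.
    return ActivationPath.STANDALONE


# The two ABC server scenarios from the text (hypothetical records).
sales_server = ITResource("ABC online sales server", in_bcp_scope=True)
backup_server = ITResource("ABC secondary backup server", in_bcp_scope=False)

print(decide_activation(sales_server, affected_or_at_risk=True).value)
print(decide_activation(backup_server, affected_or_at_risk=True).value)
```

The point of the sketch is that the scope question is answered before the disaster, so the activation path is a lookup and not a debate.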

Responsibility

Business Continuity Leader (BCL)

  • Assesses the level of the disaster.
  • Activates the Disaster Recovery Plan.
  • Monitors the recovery process and provides regular reports on recovery status (if necessary) to appropriate groups (e.g., Senior Management).
  • Approves expenditures relating to the recovery process.

Business Continuity Coordinator (BCC)

  • Initiates the disaster notification process.
  • Activates the Disaster Recovery Plan (if the BCL is not available).
  • Serves as liaison between IT and the Senior Management team and escalates issues to Senior Management.
  • Acts as team leader for the IT Recovery Team.
  • Tracks actual progress/completion of recovery activities against the projected sequence of recovery events.

Disaster Recovery Plan—A Simple, Technical Flowchart

Although the DRP described in this chapter is aligned to a GxP-regulated environment, it can easily be adapted to most types of business scenarios. Let's look at each component of the flowchart to enhance our understanding.

The Disaster Recovery Plan is initiated with three critical steps:

  • Select the appropriate Disaster Recovery Script.
  • Open an Incident Report.
  • Open a Change Control Request.

Disaster Recovery Plan – Initiation

Once minimum network components (external resources and subsystems) are available and operational, the Recovery Strategy shall flow along as follows:

a) Select the appropriate Disaster Recovery Script for the affected IT resource. I will go over the Disaster Recovery Scripts (DRS) in Chapter 5. Just know that the DRS is the actual step-by-step document that is used to recover the IT resource. If Jake and his management had been diligent, they would have had the Disaster Recovery Plan activated, and then would have picked up a copy of the specific UNIX Server DRS and followed its precise steps on how to recover the server.

b) Open an Incident Report to document the unplanned event, identify a possible root cause, and record any required preventive/corrective actions.

Incident reports serve to record the following:

  • Who did that?
  • When did it take place?
  • What happened?
  • Where did it take place?
  • Why did that happen?

Add to this list how it was fixed and what preventive/corrective controls were put in place, so it does not occur again. Jake could only answer the What question; the server was dead, and he had no plan in place to recover it. Refer to Chapter 8 for more information related to Incident Management.
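
The questions an Incident Report answers map naturally onto a simple record. The sketch below is a hypothetical structure, not a form taken from the plan: the IncidentReport class and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IncidentReport:
    what: str                          # what happened
    who: Optional[str] = None          # who did that
    when: Optional[str] = None         # when it took place
    where: Optional[str] = None        # where it took place
    why: Optional[str] = None          # possible root cause
    how_fixed: Optional[str] = None    # how it was fixed
    preventive_actions: List[str] = field(default_factory=list)

    def open_questions(self) -> List[str]:
        """List the fields that still have no answer."""
        missing = [name for name in ("who", "when", "where", "why", "how_fixed")
                   if getattr(self, name) is None]
        if not self.preventive_actions:
            missing.append("preventive_actions")
        return missing


# Jake could only answer the "What" question.
jakes_report = IncidentReport(what="UNIX server is down and production has stopped")
print(jakes_report.open_questions())
```

An incident would stay open for as long as open_questions() returns anything.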

Events occur when you least expect them. I remember a busy day at IT when Jake received a call from the engineering team.

"Jake, we need to decommission the old training trailer, and there are some cables here that seem to be from IT equipment. Can we cut them?"

"What color are they?" Jake asked.

"Some are blue," the engineering tech answered.

"Okay, sure, go ahead, cut them," Jake replied.

Ten seconds later, DISASTER! The other color was grey, and those were the site's primary fiber cables.

c) Open an Emergency Change Control Request to document changes to the affected IT resource. Most systems (especially in GxP-regulated environments) are qualified or validated. This means that some documentation (usually some form of installation or operational qualification test) has been completed to assure that the IT resource was installed, configured, and operates as expected. An unplanned event (disaster) has likely changed the qualified or validated state of the IT resource, which creates the need to open a Change Control Request (CCR) that captures and documents the recovery actions with pre-approval of the business owners and Quality Assurance.

The Emergency Change Control Request serves as a bridge between the DRP and the Disaster Recovery Scripts. When creating the Change Control Request, use the CCR "Plan Section" in the form to identify the specific DRS required to recover the IT resource. Think of the CCR as the approval to proceed and execute the Disaster Recovery Scripts. Identifying the Disaster Recovery Script in the CCR connects the DRS (detailed recovery instructions) to the Disaster Recovery Plan (technical recovery plan).
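
To make the "bridge" idea concrete, here is a minimal sketch of a CCR that names the Disaster Recovery Scripts it authorizes. Everything in it is an assumption for illustration: the ChangeControlRequest class, the planned_drs field (standing in for the CCR "Plan Section"), and identifiers such as CCR-0001 and DRS-UNIX-01 are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChangeControlRequest:
    ccr_id: str
    resource: str
    planned_drs: List[str] = field(default_factory=list)  # stands in for the CCR "Plan Section"
    emergency: bool = True
    approved: bool = False  # pre-approval by business owners and Quality Assurance


def may_execute_drs(ccr: ChangeControlRequest, drs_id: str) -> bool:
    """A DRS may be executed only if an approved CCR names it in its plan."""
    return ccr.approved and drs_id in ccr.planned_drs


ccr = ChangeControlRequest("CCR-0001", "UNIX server", planned_drs=["DRS-UNIX-01"])
print(may_execute_drs(ccr, "DRS-UNIX-01"))  # not yet approved

ccr.approved = True
print(may_execute_drs(ccr, "DRS-UNIX-01"))  # approved and named in the plan
print(may_execute_drs(ccr, "DRS-SAP-01"))   # approved, but this DRS is not in the plan
```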

Why an "Emergency" Change Control Request? Emergency Change Control Requests usually require less pre-approval. Refer to Chapter 9 for more information related to IT Change Control Management.

Disaster Recovery Script (Procedure Blueprint)

As I noted at the start of this chapter, the Disaster Recovery Plan flowchart leads to the Disaster Recovery Script. Don't overthink this approach. All we are doing here is creating a Disaster Recovery Plan document that also includes a blueprint for the Disaster Recovery Script's procedure section. This blueprint will then be used to develop specific Disaster Recovery Scripts for each IT resource that supports a key business process. The Disaster Recovery Plan outlines a path for recovery and also provides the foundation for your DRS recovery procedure.

Again, the Disaster Recovery Plan document is just that—a documented plan that states what you need to do on your way toward recovery.

It initiates the recovery path by doing these three things:

  • Identify the applicable DRS.
  • Open an Incident Report.
  • Open a Change Control Request.

The Disaster Recovery Plan also includes a set of general actions that can be used to jump-start the process of creating a new Disaster Recovery Script. These base actions will require additional specific recovery instructions as per the affected IT resource. In practice, the DRP leads you to the Disaster Recovery Script's specific steps for hardware and software recovery. Note that the grey colored action items in the following flowchart reflect the DRS procedure blueprint. Steps (d)-(t) are general actions executed using the actual IT resource Disaster Recovery Script. (Covered in the next chapter.) Also, note that the Disaster Recovery Script's procedure blueprint is divided into two primary areas:

  • Specific Hardware Recovery Approach
  • Specific Software Recovery Approach

In Chapter 5, I will go over the Disaster Recovery Scripts.

DRS Procedure Blueprint

Specific Hardware Recovery Approach

Steps (d)-(h) are based on the selected hardware recovery approach used by the business. These steps are covered in the DRP for design purposes. Execution of the steps is performed using the DRS. The hardware recovery approach used will depend on your business requirements and the IT resources you are trying to recover. Grey colored action items in the following flowchart represent the hardware recovery approach using a new IT resource.

At Step (d), determine if this is a hardware recovery. If the answer is yes, follow the applicable actions of Steps (e)-(h), determining if it will be recovered on the original hardware component or not. Depending on the type of disaster, the original affected IT hardware may be re-imaged, and its data restored. But what if we are confronted with the original IT resource burnt into the shape of a glazed donut? We need to have planned for that, and that is where we reach for the IT resource replacement: yes, the one that the plan had taken into account before the disaster occurred. If the impact is more significant, like seeing the clouds overhead from inside the IT Data Center, then we need to have planned for that possibility as well, and have had an alternate site identified for the Disaster Recovery Process. It's all about the what-ifs. What if the server OS gets corrupted? What if the IT resource burns down? What if the IT Data Center gets blown away during a tornado? Know where to stop, based on your business's requirements and risk appetite. Remember, it's not at the Disaster Recovery Plan stage that these risk scenarios are evaluated; it's during the Business Impact Analysis covered in Chapter 2.

If the original hardware is not available, then identify its replacement. Then follow the Disaster Recovery Script instructions for the hardware recovery. Note: If the recovery will only consider the software aspect of the IT resource, then go to step (i).
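
The what-ifs above boil down to a small decision tree. The sketch below is a simplified reading of Steps (d)-(h), not a copy of the flowchart: the function name, the three yes/no inputs, and the HardwarePath labels are assumptions made for illustration.

```python
from enum import Enum


class HardwarePath(Enum):
    SOFTWARE_ONLY = "no hardware recovery; go to step (i)"
    ORIGINAL = "re-image and restore the original hardware"
    REPLACEMENT = "recover on the pre-identified replacement"
    ALTERNATE_SITE = "recover at the pre-identified alternate site"


def choose_hardware_path(hardware_affected: bool,
                         original_usable: bool,
                         data_center_usable: bool) -> HardwarePath:
    """Pick a hardware recovery path from three yes/no assessments."""
    if not hardware_affected:
        return HardwarePath.SOFTWARE_ONLY
    if not data_center_usable:
        # Clouds overhead from inside the Data Center: move to the alternate site.
        return HardwarePath.ALTERNATE_SITE
    if original_usable:
        return HardwarePath.ORIGINAL
    # The "glazed donut" case: the original hardware is gone.
    return HardwarePath.REPLACEMENT


print(choose_hardware_path(True, original_usable=False, data_center_usable=True).value)
```

Each branch only works if the replacement or alternate site was identified before the disaster, which is exactly what the Business Impact Analysis is for.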

Specific Software Recovery Approach

Steps (i) to (o) are based on the selected software recovery approach used by the business. These steps are covered in the DRP for design purposes. Execution of the steps is performed using the DRS. Data storage and recovery are discussed later in the chapter. The software recovery approach used will depend on your business requirements and the IT resources you are trying to recover. Grey colored action items in the following flowchart represent the software recovery approach using tape storage.

(i) Determine the software to be restored and its installation requirements. Take into consideration required disk space, server processors, client workstations, and connectivity, among other technical specifications. Is this a GxP application like SAP¹ or an off-the-shelf business application like Minitab²? You need to know the difference. Again, this is not a consideration that pops up out of nowhere during the Disaster Recovery Plan execution. The Disaster Recovery Scripts shall have pre-established IT resource recovery requirements based on the results of the BIA.

(j)-(k) If necessary, contact service providers and/or vendors for support. This applies when critical applications are vendor-installed and have been affected by the disaster. Having quick access to who to call can make the difference between having no idea of what to do next and buying a few hours by delegating some responsibility to the vendor. It does not make you less responsible, but it does give you some breathing room. Remember, management is standing at the IT Data Center door.

(l)-(m) If necessary, identify the most recent backup media, including IT resource images. Don't wait for the disaster recovery effort to be underway to check if your backup/restore process or image process has been working as expected. It's a sad day in your career when you find out that all the backups you believed were successfully written to tape are blank or on unreadable tape cartridges.

Images need to be up-to-date. If this is not possible, then Change Control Requests need to be kept accurate to track the changes that were not included in those images. I suggest that images be created after each new change to the IT resource, with a second copy of the images/backups stored at an external site. Storage and recovery options are discussed later in the chapter.
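
One way to picture the relationship between images and Change Control Requests is to pick the newest image that has passed a test restore and then list every change applied after it. The sketch below assumes hypothetical Image and Change records, made-up dates, and a verified flag that stands for "a test restore succeeded"; none of these come from the plan itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Image:
    label: str
    created: date
    verified: bool  # assumption: True only after a successful test restore


@dataclass
class Change:
    ccr_id: str
    applied: date


def pick_image(images: List[Image], changes: List[Change]) -> Tuple[Image, List[Change]]:
    """Return the newest verified image and the changes it does not contain."""
    usable = [img for img in images if img.verified]
    if not usable:
        raise ValueError("no verified image available")
    latest = max(usable, key=lambda img: img.created)
    # Changes applied after the image date must be re-applied after the restore.
    to_reapply = sorted((c for c in changes if c.applied > latest.created),
                        key=lambda c: c.applied)
    return latest, to_reapply


images = [
    Image("weekly-01", date(2024, 1, 7), verified=True),
    Image("weekly-02", date(2024, 1, 14), verified=False),  # never test-restored
]
changes = [Change("CCR-A", date(2024, 1, 5)), Change("CCR-B", date(2024, 1, 10))]

latest, to_reapply = pick_image(images, changes)
print(latest.label, [c.ccr_id for c in to_reapply])
```

Note that the newer image is skipped because nobody verified it, which is the situation the text warns about.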

(n) Manually (if applicable) install/configure the software as per the applicable Disaster Recovery Script. One note here is to be careful that you do not inadvertently change the original user rights without the required authorization. It's easy to focus on quickly getting everyone on the system, but a month later, end up with a SOX IT Compliance audit and get clobbered.

(o) Restore Data. This is the most crucial step of the Disaster Recovery Plan flowchart. It will not have mattered if you installed the OS correctly if there is no user/application data. The FDA mandates that we have procedures and controls in place to protect our data integrity and availability. When the FDA refers to data integrity, they mean the data stays the same when it's backed up and when it's restored. That data needs to be protected from any unauthorized change throughout its whole life cycle, which means having controls in place that have been well tested and documented. Assure that all the data was recovered and that its integrity was maintained.
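
A common way to check that restored data "stays the same" is to compare checksums taken before the backup with checksums taken after the restore. The sketch below uses SHA-256 from Python's standard library; the helper names, the file names, and the temporary directories are hypothetical, and this is an illustration of the idea, not a validated control.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Checksum one file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's relative path to its checksum."""
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def compare(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Report files that are missing or changed after the restore."""
    problems = []
    for name, checksum in before.items():
        if name not in after:
            problems.append(f"missing: {name}")
        elif after[name] != checksum:
            problems.append(f"changed: {name}")
    return problems


# Demo with throwaway directories standing in for the source and the restore target.
with tempfile.TemporaryDirectory() as tmp:
    source, restored = Path(tmp, "source"), Path(tmp, "restored")
    source.mkdir()
    restored.mkdir()
    (source / "batch.csv").write_text("lot,qty\n1,10\n")
    (source / "audit.log").write_text("ok\n")
    (restored / "batch.csv").write_text("lot,qty\n1,99\n")  # altered during restore
    problems = compare(build_manifest(source), build_manifest(restored))

print(problems)
```

The comparison only looks for missing or altered files; it does not flag extra files that appear in the restored copy.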

Disaster Recovery Plan Wrap Up

Steps (p) to (t) are used to wrap up the DRS and recovery efforts.

Chapter continues…

¹ SAP is a German software company whose products allow businesses to track customer and business interactions. SAP is especially well-known for its Enterprise Resource Planning (ERP) and data management programs.

² Minitab is a statistical software package used for improvement in statistics-based projects.

More of Chapter 4, IT Disaster Recovery Plan, in the book.

Source: https://www.ivancorderotorres.com/chapter-4-it-disaster-recovery-plan/
