Novell LAN Disaster Backup and Recovery (DBAR) Review
Purpose: Ensure that Information Technology (IT)
assets on LAN servers are properly backed up and the business
owner has the capability to recover and restore the computer environment
in the event of a system failure or a natural or man-made disaster.
1. Review and evaluate disaster recovery plans to ensure
business requirements are met.
- Verify that plans include:
- Identification of business continuity team (LAN Administrator(s),
key users, systems people, vendors), respective phone numbers,
and responsibilities
- Identification of levels of disruption (e.g. complete server failure,
partial loss of data, program-level, database)
- Documented scenarios/action plans for levels of disruption
which include specific procedures and assignment of responsibilities
- Identification of all critical applications software and data
files
- Identification of hardware and equipment needed
- Prioritized application software and data
- Identification and security requirements for alternative processing
locations
- Risk analysis for each area: impact analysis, acceptable downtime,
disaster definition
- Plan test procedures
- Restore procedures for program and data including any dependencies
- Determine if any necessary service level agreements have been
established with hardware and software vendors.
- Determine if any necessary service level agreements have been
established with business users. If so, analyze agreement and
note exceptions (if plan won't meet customer needs, etc.).
- Verify procedures exist for server and system shutdown and
system restart. For example:
a. Scheduling considerations
b. Backup journalizing
c. Disk/Tape archiving
d. Program sequence and dependencies
e. Retention of source documents
f. Re-input of data
g. Internal/external reliance
2. Ensure specific procedures are in place to adequately
review DBAR procedures.
- Verify that someone is responsible for maintenance of the
plan.
- Ensure plan is current.
- Ensure timely review and approval by appropriate levels of
management.
- Verify an authorized distribution list for contingency plans
exists.
3. Ensure backup and recovery plans are tested on a periodic
basis.
- Verify all critical data and programs are available in off-site
storage.
- Ensure the ability to reconstruct systems from backups.
- Ensure the ability to recover at an alternate site (if appropriate).
4. Verify that all the information, files, data, and production
resources required to resume business processing are appropriately
identified and backed up.
A. The following steps can be applied to all business areas:
- Identify critical systems and application data, file names,
communication equipment, configuration settings, programs, documentation,
manuals, and supplies.
- Prioritize applications based on business need and risk analysis
(e.g. business impact, acceptable downtime).
- Verify that provisions for hardware recovery exist, including contracts
for hot-sites, cold-sites, vendor replacement agreements, and
reciprocal agreements or "good neighbor" policies (e.g.
borrowing a server from another LAN).
- Identify interdependencies between business units, functions
and/or application systems.
- Determine whether backup jobs are scheduled automatically and/or
manually, and verify that they have been run as scheduled.
- Verify that backups of all LAN applications are performed
nightly and properly labeled.
- Verify that all stand-alone PC applications are regularly
backed up and properly labeled.
- Verify that backup media (e.g. tapes, diskettes, source code)
are stored in a physically secure location, both onsite and offsite,
on a regular basis.
- Identify procedures for reinstallation of data and programs,
including any dependencies.
- Verify employees are reminded to back up critical files from
their PC hard drives to tape or diskette regularly (e.g. weekly
or monthly).
- Verify that the backups are kept offsite (e.g. the ITA takes
them home and returns them the following day). Do they use the
Glastonbury facility?
- Verify that the backups are properly labeled.
- Verify that all users back up their local PCs on a regular
basis (monthly).
- Verify that the backup media is kept in a fireproof safe.
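The schedule check in the steps above (confirming that nightly backups actually ran) can be sketched in Python. This is only an illustration, assuming the auditor has already transcribed the completed backup dates from tape labels or backup logs. The function name missing_backup_dates and the sample dates are hypothetical and do not come from any backup product.

```python
from datetime import date, timedelta


def missing_backup_dates(completed, start, end):
    """Return every date in [start, end] with no completed backup.

    `completed` is an iterable of datetime.date values transcribed by the
    auditor from tape labels or backup logs (an assumption of this sketch).
    """
    done = set(completed)
    missing = []
    day = start
    while day <= end:
        if day not in done:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Hypothetical sample: one week in which the 3 March backup did not run.
sample = [date(1999, 3, d) for d in (1, 2, 4, 5, 6, 7)]
gaps = missing_backup_dates(sample, date(1999, 3, 1), date(1999, 3, 7))
print(gaps)
```

Any date returned is a night for which no backup evidence was found and should be followed up with the LAN Administrator. If backups are scheduled only on business days, the loop would need to skip weekends and holidays.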
B. If the business area is using the BK4 process, perform the
following steps (Note: Each File Server has an attached PC dedicated
to backing up the server. The PC and the backup application are
often referred to as BK4's):
- Verify that backups of all File Servers are performed nightly.
Determine what procedures exist if backup has failed.
- View the Presets (Volumes to be backed up and kick off times)
for a given File Server and compare to directory structure listing.
Verify all critical directories are being backed up.
- Verify backup reports are reviewed every morning. View Backup
Log Files for a few file servers. From Daily Verification Menu
select Display Daily Backup Log. Verify no "BAD Backup"
occurrences or wrong dates appear.
- Verify that the transfer of backups from the external drive on the
Unattended Backup PC to the optical cartridge was successful. From Daily Verification
Menu, select Display Daily XCopy Verification Log. Verify all
transfers were successful.
- Verify that 5 days of optical cartridge backups exist in the
media safe for all file server backups.
- View a few individual backup reports for a few different file
servers. An individual backup report is produced for each Preset
on the file server that has been backed up. Select Report Manager
from Daily Verification Menu, highlight report and select View
from options at top of screen. Unsuccessful file backups are grouped
at bottom of report, so press <End> to get to bottom. Note
unsuccessful files and determine what action was/will be taken
(e.g. HO intervention).
- Verify backup reports are transferred to optical on a regular
basis.
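The log review in the steps above (looking for "BAD Backup" occurrences in the Daily Backup Log) can be sketched in Python for cases where a log has been exported as plain text. The layout of the real log is not described in this program, so the sample lines and the function name find_bad_backups are hypothetical. Only the "BAD Backup" marker is taken from the step above.

```python
def find_bad_backups(log_lines, marker="BAD Backup"):
    """Return (line_number, text) pairs for log lines containing the marker.

    The match is case-insensitive. Line numbers start at 1.
    """
    needle = marker.lower()
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if needle in line.lower():
            hits.append((number, line.strip()))
    return hits


# Hypothetical log excerpt; the real log layout may differ.
sample_log = [
    "SERVER1 SYS: backup OK",
    "SERVER1 VOL1: BAD Backup",
    "SERVER2 SYS: backup OK",
]
for number, text in find_bad_backups(sample_log):
    print(f"line {number}: {text}")
```

A scan like this only flags the marker text. Wrong dates and failed transfers still need to be checked against the log itself.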
C. If the business area is using the ARCserve process, perform
the following steps:
- Identify employees authorized to use ARCserve. Verify that
these employees have Supervisor or equivalent status.
- Verify that backups of all File Servers are performed nightly
(or as deemed necessary by business need). Determine what procedures
exist if backup has failed (e.g. HO intervention, production interruptions).
- Verify that any targets (servers and/or workstations) that were
missed during backup are rescheduled in a "Make up"
job, if deemed necessary by business need. Select "Yes"
in the AUTOMATICALLY RESCHEDULED MISSED TARGETS field on the Auto
Pilot Set Configuration Form (select Auto Pilot Tape Management,
Auto Pilot Sets, <Ins>- to display the Select
Set Script Picklist).
- Determine whether weekly tapes are to be saved permanently.
If so, toggle the PRESERVE WEEKLY TAPE field on the Auto Pilot
Set Configuration Form to "Yes".
- Obtain a listing of all Tape Usage Reports (or a sample).
Select Tape Usage Log from the Administration Menu. Select the
tapes you wish to obtain detailed information on and press <Enter>.
Verify that tapes in use have not exceeded their expiration
dates and have not been used for an excessive number of hours (as
determined by the LAN Administrator).
- Obtain a listing of the Locate/QFA Restore Report by selecting
Locate/QFA Restore from the File Tracking System and QFA
Restore Menu. For those operations that have been "Canceled"
or "Failed", look at corresponding operations in the
Full Log from the Auto Pilot Menu. Verify with LAN Administrator
the impact of canceled or failed operations and identify actions
taken.
- Determine whether a verification method is being used during
backup operations. Obtain a sample of operations from the Full
Log and check for "Verification Completed".
- Obtain a copy of the File Tracking report. Select Generate
Report from the File Tracking System and QFA Restore Menu.
Compare to the directory structure listing to verify that all
critical directories are backed up.
- Verify backup reports are transferred to optical on a regular
basis, if required by the business area.
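The coverage check used in both the BK4 and ARCserve steps (comparing what was backed up against the directory structure listing) amounts to finding critical directories that no backed-up path covers. A minimal Python sketch follows, assuming both lists have been typed or exported in one consistent path form. The function name uncovered_directories and the sample paths are hypothetical and do not reflect the format of any actual report.

```python
def uncovered_directories(critical_dirs, backed_up_dirs):
    """Return critical directories that no backed-up path covers.

    A critical directory counts as covered when it equals a backed-up
    path or lies beneath one. Comparison is case-insensitive and treats
    "\\" and "/" as the same separator (assumptions of this sketch).
    """

    def normalize(path):
        return path.replace("\\", "/").rstrip("/").upper()

    backed_up = [normalize(p) for p in backed_up_dirs]
    missing = []
    for original in critical_dirs:
        target = normalize(original)
        covered = any(
            target == root or target.startswith(root + "/")
            for root in backed_up
        )
        if not covered:
            missing.append(original)
    return missing


# Hypothetical paths for illustration only.
critical = ["SYS:/PUBLIC", "VOL1:/APPS/PAYROLL", "VOL1:/DATA/CLAIMS"]
backed_up = ["SYS:", "VOL1:/APPS"]
print(uncovered_directories(critical, backed_up))
```

Each directory returned is a candidate exception to raise with the LAN Administrator. The list of critical directories itself still has to come from the business owner.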
5. Verify that all critical file servers, database servers,
and special device PCs have proper power backup and mirrored
disks if critical.
- Ensure each File Server, Database server, and special device
PC has a separate Uninterruptible Power Supply (UPS) attached.
- Ensure UPSs are tested on a regular basis, or based on the
manufacturer's recommendation.
- Determine if critical servers and special device PCs are
mirrored.
- Walk through the process to determine what happens after hours
when the backup process runs unattended.