Context
Tool mentors explain how a tool can perform tasks, which are part of ITUP processes and activities. The tasks are listed as Related Elements in the Relationships section.
To see how this tool mentor supports processes and activities, click the links next to the icons:
Details
IT Service Continuity Management is concerned with managing the ability of an organization to continue to provide a
pre-determined and agreed level of IT services to support the minimum business requirements, following an interruption
to the business. Other ITUP mentors have discussed the creation and planning for IT service continuity along with
preparation and execution of the process. This mentor discusses measuring the performance of the process.
IT service continuity performance management adds a feedback loop to your process, which allows you to make both
coarse and fine adjustments as needed. The IT service continuity plan has already documented information on SLAs and
QoS. The preparation, testing, and execution have demonstrated that recovery is possible. Now it is important, on an
ongoing basis, to ensure that operations continue to run smoothly, to flag any issues that arise, to recommend what to
do about them, to specify who is responsible for corrective action, and to ensure that the responsible party is
notified to take action. Doing all of this automatically keeps the process consistent and repeatable. By identifying
exceptions and trends, you can fold insights and lessons learned back into the process in a constant cycle of
measurement and improvement.
The IT service continuity plan includes information on which computers are to be backed up, along with their respective
SLAs. It is important to determine who is responsible for ensuring that each storage resource is backed up. For
example, in some organizations the IBM® Tivoli® Storage Manager administrator is responsible for ensuring that a
particular database is backed up; in that case, if the backup fails, the Tivoli Storage Manager administrator must take
corrective action. In other organizations the database owner is responsible.
Tivoli Storage Manager supports queries so that an administrator can determine the status of the system. At any point
in time, an administrator can issue ad-hoc queries to review status and metrics of the system and can then disseminate
the information as needed. This approach, however, is highly manual.
Tivoli Storage Manager provides a feature called Tivoli Storage Manager operational reporting, which is specifically
designed to automate this process. Operational reporting supports both the reporting and monitoring of Tivoli Storage
Manager. Reports and monitors can be scheduled and they can be viewed interactively, on a Web site, or in e-mail where
the subject line of the e-mail provides a status on whether a Tivoli Storage Manager server is running smoothly or if
it has issues and needs attention.
Operational reporting is highly customizable and extensible, allowing existing rules and sections to be adjusted or
removed and new sections and rules to be added. Multiple Tivoli Storage Manager servers are supported. Multiple reports
and monitors can be run for a single Tivoli Storage Manager server and each report or monitor can be sent to multiple
recipients. Reports and monitors differ in when they notify recipients. When a report runs, it queries the Tivoli
Storage Manager server, compares the results to the rules, flags any issues, and provides customizable recommendations
on how to resolve them. Its sections report on a wide variety of metrics that can be used to track SLA conformance.
A scheduled report always sends its information to the list of recipients, regardless of whether there are any issues.
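The report behavior described above can be illustrated with a minimal Python sketch. It is not the operational
reporting implementation: the Rule class, the run_report function, the metric names, the thresholds, and the
recommendation texts are all hypothetical, chosen only to show metrics being compared to rules and a report being
produced whether or not any rule is violated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A hypothetical rule: a check on the metrics plus advice if it fails."""
    name: str
    is_violated: Callable[[Dict[str, float]], bool]
    recommendation: str


def run_report(server: str, metrics: Dict[str, float], rules: List[Rule]) -> Dict[str, object]:
    """Build a report message; a report is produced even when no rule is violated."""
    issues = [rule for rule in rules if rule.is_violated(metrics)]
    status = "needs attention" if issues else "running smoothly"
    body = [f"{rule.name}: {rule.recommendation}" for rule in issues]
    # The subject line carries the overall status, as in the e-mail delivery option.
    return {"subject": f"{server}: {status}", "body": body}


# Hypothetical rules and thresholds, for illustration only.
RULES = [
    Rule("Failed backups", lambda m: m["failed_backups"] > 0,
         "Review the activity log and rerun the failed schedules."),
    Rule("Scratch tapes low", lambda m: m["scratch_tapes"] < 5,
         "Check in additional scratch volumes."),
]

healthy = run_report("SERVER1", {"failed_backups": 0, "scratch_tapes": 12}, RULES)
degraded = run_report("SERVER1", {"failed_backups": 2, "scratch_tapes": 12}, RULES)
```

Both calls return a message: the healthy run has an empty body, and the degraded run lists the violated rule with its
recommendation.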
A monitor, on the other hand, is intended to run much more frequently. Monitors use the same rule-based mechanism as
reports but they will notify recipients only if any rules are triggered, that is, if there are any issues. Monitors can
also optionally and conditionally execute statements in a self-healing fashion. For example, a rule can check whether a
tape drive has gone offline. If it has, the monitor can list all drives along with their status, issue a command to
Tivoli Storage Manager to bring the drive back online, and then list all the drives again with their status. In this
way, administrators can manage Tivoli Storage Manager by exception. In this case one or more administrators are
automatically notified that a drive went offline, they see that the drive was brought back online in a self-healing
fashion, and they can see the list and status of the drives before and after the action was attempted.
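The tape drive example can be sketched in Python as follows. This is an illustration under stated assumptions, not the
product's mechanism: run_drive_monitor and the two callables it receives are hypothetical, and the in-memory dictionary
stands in for a real Tivoli Storage Manager server so that the sketch runs on its own.

```python
from typing import Callable, Dict, List, Optional


def run_drive_monitor(
    list_drives: Callable[[], Dict[str, bool]],
    bring_online: Callable[[str], None],
) -> Optional[List[str]]:
    """Return notification lines if any drive is offline, otherwise None."""
    before = list_drives()
    offline = [name for name, online in sorted(before.items()) if not online]
    if not offline:
        return None  # No rule triggered, so the monitor notifies nobody.

    # Record the drive status before the corrective action.
    lines = [f"before: {name} online={online}" for name, online in sorted(before.items())]
    for name in offline:
        bring_online(name)  # Self-healing step (hypothetical callable).
        lines.append(f"action: requested {name} online")
    # Record the drive status again so recipients can compare.
    after = list_drives()
    lines += [f"after: {name} online={online}" for name, online in sorted(after.items())]
    return lines


# In-memory stand-in for the server's drive table (assumption for the sketch).
drives = {"DRIVE1": True, "DRIVE2": False}


def fake_list_drives() -> Dict[str, bool]:
    return dict(drives)


def fake_bring_online(name: str) -> None:
    drives[name] = True


first_run = run_drive_monitor(fake_list_drives, fake_bring_online)
second_run = run_drive_monitor(fake_list_drives, fake_bring_online)
```

The first run returns the before and after listings around the corrective action. The second run returns None because
no drive is offline any more, which is the "manage by exception" behavior: recipients hear from the monitor only when a
rule is triggered.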
Tivoli Storage Manager operational reporting can also automatically notify node owners of failed or missed backups and
provide customizable instructions that tell the node owner how to make corrections and whom to call if further help is
needed. In support of managing responsibility, operational reporting can send notifications to specific node owners
with details of their backup operations. Where the Tivoli Storage Manager administrator is responsible for ensuring
that a database application is backed up, the administrator can be automatically notified if there is a problem, and
the database owner can be notified as well.
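The routing of notifications by responsibility can be sketched as follows. Everything here is hypothetical: the
build_notifications function, the node names, the addresses, the status strings, and the instruction text are
assumptions made for illustration and do not reflect how operational reporting stores or sends this information.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical instruction text; the real text is customizable per installation.
INSTRUCTIONS = "Check that the client scheduler is running; contact the help desk if the problem persists."


def build_notifications(
    results: Dict[str, str],
    owners: Dict[str, str],
    admin_responsible: Set[str],
    admin_address: str,
) -> List[Tuple[str, str]]:
    """Return (recipient, message) pairs for every failed or missed backup."""
    notifications = []
    for node, status in sorted(results.items()):
        if status not in ("failed", "missed"):
            continue  # Completed backups need no notification.
        message = f"Backup {status} for node {node}. {INSTRUCTIONS}"
        # The node owner is always told about a problem with their own node.
        notifications.append((owners[node], message))
        # The administrator is also told when responsible for this node's backup.
        if node in admin_responsible:
            notifications.append((admin_address, message))
    return notifications


results = {"DBNODE": "failed", "LAPTOP7": "missed", "FILESRV": "completed"}
owners = {
    "DBNODE": "dbowner@example.com",
    "LAPTOP7": "user7@example.com",
    "FILESRV": "ops@example.com",
}
sent = build_notifications(results, owners, {"DBNODE"}, "tsmadmin@example.com")
recipients = [recipient for recipient, _ in sent]
```

In this example the failed database backup reaches both the database owner and the administrator, the missed laptop
backup reaches only its owner, and the completed backup produces no notification.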
Operational reporting provides an efficient, automated, customizable, and repeatable method of measuring the state of
Tivoli Storage Manager operations where rules can be configured to validate whether SLAs are being met. The resulting
information can be used to determine which areas need improvement, and the information can be sent to the most
appropriate set of people with the correct roles, responsibilities, and skills to address any issues as specified in
the IT service continuity plan.
For more information, refer to: http://publib.boulder.ibm.com/infocenter/tivihelp/index.jsp?toc=/com.ibm.itstorage.doc/toc.xml
Search for:
- "Operational Reporting"
- "Scheduling"
For More Information
For more information about this tool, click on the link for this tool at the top of this page.