Tool Mentor: TNO - Detect and Log Event

Context

Tool mentors explain how a tool can perform tasks that are part of ITUP processes and activities. The tasks are listed as Related Elements in the Relationships section.

To see how this tool mentor supports processes and activities, click the links next to the icons:

Event Management
- Detect and Log Event

Details

The IBM® Tivoli® Netcool/OMNIbus solution provides out-of-the-box monitoring capability across the network and IT infrastructure. Using software-based Netcool probes, the Netcool suite collects fault, performance, availability, and security events in the form of SNMP traps, log messages, and proprietary event formats directly from the element management system (EMS) or device. More than 200 dedicated probes are included in the Netcool suite, providing comprehensive coverage of systems, applications, and all major vendors' LAN and WAN routers, switches, firewalls, and load balancers. This figure rises to more than 1000 when the Micromuse Alliance partner integrations and the provided SNMP and Syslog rules files are included. Rules generation tools are provided to generate compatible rules files from MIB files, further extending the range of devices that can be supported in a single installation.

The Netcool probe rules parse events into an extensible common event format, enriched using local lookup data files. Enriched events are forwarded to the Netcool ObjectServer, a high-speed memory-resident database, where the OMNIbus automation system filters, correlates, and prioritizes events for display and management in customizable operator views.

The Netcool Knowledge Library provides a set of formally tested 'Ready to Run' probe rules for specific devices. These rules identify which alarms indicate actual failures, so that repair efforts can focus on the issues (or root causes) that truly affect the operation of the infrastructure, without the distraction of symptomatic or informational events. The device-specific rules dictate how events should be correlated by providing greater detail on the specific containment of events for a particular device.

The following commented script extract from the Netcool Knowledge Library rules files shows part of the processing of an SNMP trap, where the tokenized data from the trap varbinds ($1, $2, and so on) is tested and mapped to standard fields for inclusion in the common event structure:

@Class = "40057"
switch($generic-trap)
{
    case "0"|"1": ### coldStart, warmStart
        ##########
        # $1 = sysUptime - The time (in hundredths of a second) since the
        #        network management portion of the system was last
        #        re-initialized.
        # $2 = whyReload - This variable contains a printable octet string
        #        which contains the reason why the system was last
        #        restarted.
        ##########
        $sysUptime = $1
        $whyReload = $2
        details($sysUptime,$whyReload)
        @Summary = @Summary + ": " + $2
        @Identifier = @Identifier + " " + $2
    case "2"|"3": ### linkDown, linkUp
        if(nmatch($OID4, "1.3.6.1.2.1.31.1.1.1"))
        {
            ##########
            # $1 = ifIndex - A unique value for each interface. Its value
            #        ranges between 1 and the value of ifNumber. The value
            #        for each interface must remain constant at least from
            #        one re-initialization of the entity's network
            #        management system to the next re-initialization.
            # $2 = ifAdminStatus - The desired state of the interface. The
            #        testing(3) state indicates that no operational packets
            #        can be passed.
            # $3 = ifOperStatus - The current operational state of the
            #        interface. The testing(3) state indicates that no
            #        operational packets can be passed.
            # $4 = ifName - The textual name of the interface. The value of
            #        this object should be the name of the interface as
            #        assigned by the local device and should be suitable
            #        for use in commands entered at the device's `console'.
            #        This might be a text name, such as `le0' or a simple
            #        port number, such as `1', depending on the interface
            #        naming syntax of the device. If several entries in the
            #        ifTable together represent a single interface as named
            #        by the device, then each will have the same value of
            #        ifName. Note that for an agent which responds to SNMP
            #        queries concerning an interface on some other
            #        (proxied) device, then the value of ifName for such an
            #        interface is the proxied device's local name for it.
            #        If there is no local name, or if this object is otherwise
            #        not applicable, then this object contains a
            #        zero-length string.
            ##########
            if(regmatch($2, "^.*[A-Za-z].*$")||regmatch($3, "^.*[A-Za-z].*$"))
            {
                $MIBFileNotNull = 1
            }
            $ifIndex = $1
            $ifAdminStatus = lookup($2, ifAdminStatus) + " ( " + $2 + " )"
            $ifOperStatus = lookup($3, ifOperStatus) + " ( " + $3 + " )"
            $ifName = $4
            details($ifIndex,$ifAdminStatus,$ifOperStatus,$ifName)
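
For readers less familiar with probe rules syntax, the linkDown/linkUp branch of the extract can be approximated in Python. This is an illustrative sketch only, not Netcool code: the function name `enrich_link_event` and the dictionary layout are assumptions, the status tables reproduce the IF-MIB enumerations in place of the probe's lookup data files, and the "unknown" fallback for unmapped codes is a choice made for this example.

```python
# Illustrative sketch only: mimics the lookup-based enrichment in the
# rules extract. All names here are hypothetical, not Netcool/OMNIbus APIs.

# IF-MIB enumerations, standing in for the probe's lookup tables.
IF_ADMIN_STATUS = {"1": "up", "2": "down", "3": "testing"}
IF_OPER_STATUS = {
    "1": "up", "2": "down", "3": "testing", "4": "unknown",
    "5": "dormant", "6": "notPresent", "7": "lowerLayerDown",
}


def enrich_link_event(varbinds):
    """Map trap varbinds ($1..$4) to named, human-readable details."""
    if_index, admin, oper, if_name = varbinds[:4]
    return {
        "ifIndex": if_index,
        # Same "<text> ( <code> )" shape that the rules build with lookup().
        # Falling back to "unknown" for unmapped codes is an assumption.
        "ifAdminStatus": f"{IF_ADMIN_STATUS.get(admin, 'unknown')} ( {admin} )",
        "ifOperStatus": f"{IF_OPER_STATUS.get(oper, 'unknown')} ( {oper} )",
        "ifName": if_name,
    }


details = enrich_link_event(["3", "1", "2", "le0"])
print(details["ifAdminStatus"])  # up ( 1 )
print(details["ifOperStatus"])   # down ( 2 )
```

The point of the enrichment is that an operator sees "down ( 2 )" rather than a bare numeric code, while the original value is preserved alongside the text.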

The rules files and associated ObjectServer automations give the administrator powerful tools to manage the event flow so as to optimize performance and ensure that the most relevant information is presented quickly to the end user. Event routing between the probe and tables in multiple ObjectServers can be designed to handle a variety of conditions to suit local needs, for example:

Static rules
- Routing specific event types to a temporary table for pre-processing before they are made available for viewing. Pre-processing may be performed by an automation or by an external system such as Impact.
- Discarding events that do not require attention at the probe, or forwarding them for archiving.
Dynamic rules
- Routing lower-severity events to interim tables during event storms, allowing operator processing to focus on a lower volume of important alarms.
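
The static and dynamic routing rules listed above can be pictured with a small Python sketch. It is not probe rules syntax and not a Netcool API: the table names, event types, severity cutoff, and the `route_event` function are all hypothetical, chosen only to show how the decisions might be ordered.

```python
# Illustrative sketch only: the table names, event types, and severity
# cutoff below are hypothetical, not Netcool/OMNIbus defaults.

DISCARD_TYPES = {"heartbeat"}          # static rule: drop at the probe
PREPROCESS_TYPES = {"config_change"}   # static rule: hold for pre-processing
STORM_SEVERITY_CUTOFF = 4              # dynamic rule: below this is "lower severity"


def route_event(event, storm_active):
    """Return the destination table name, or None to discard the event."""
    if event["type"] in DISCARD_TYPES:
        return None
    if event["type"] in PREPROCESS_TYPES:
        return "preprocess_table"
    # Dynamic rule: only applies while an event storm is in progress.
    if storm_active and event["severity"] < STORM_SEVERITY_CUTOFF:
        return "interim_table"
    return "main_table"


events = [
    {"type": "heartbeat", "severity": 1},
    {"type": "config_change", "severity": 3},
    {"type": "link_down", "severity": 2},
    {"type": "link_down", "severity": 5},
]
routes = [route_event(e, storm_active=True) for e in events]
print(routes)  # [None, 'preprocess_table', 'interim_table', 'main_table']
```

With `storm_active=False`, the same lower-severity link_down event would go straight to the main table, which is what makes the last rule dynamic rather than static.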

A single event consists of 26 standard fields including, for example, the affected Node or device, the repeating event count, and the problem description or Summary.

The underlying ObjectServer schema can be modified to provide additional information as required. Events may be structured to conform to common standard forms, for example X.733. All fields of an event can be shown, but views more typically show only those fields that are important to an operator at first glance. A pull-down list is available to display all fields of a selected event.
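
The repeating event count mentioned above can be pictured as a counter kept on a single row per distinct event, rather than a new row for every repeat. The Python sketch below illustrates that idea under stated assumptions: a plain dictionary stands in for an event table, events are keyed on an identifier string, and the `EventTable` class and `Tally` counter field are names chosen for this example rather than a description of the ObjectServer implementation.

```python
# Illustrative sketch only: a plain dictionary stands in for an event
# table. The class name and the "Tally" counter field are assumptions
# made for this example.

class EventTable:
    def __init__(self):
        self.rows = {}  # one row per distinct Identifier

    def insert(self, identifier, node, summary):
        """Insert a new event, or count a repeat of an existing one."""
        row = self.rows.get(identifier)
        if row is None:
            row = {"Identifier": identifier, "Node": node,
                   "Summary": summary, "Tally": 1}
            self.rows[identifier] = row
        else:
            row["Tally"] += 1          # repeat: bump the count
            row["Summary"] = summary   # keep the latest description
        return row


table = EventTable()
for _ in range(3):
    table.insert("router1 linkDown le0", "router1", "Link down on le0")
table.insert("router2 coldStart", "router2", "Cold start")

print(len(table.rows))                              # 2
print(table.rows["router1 linkDown le0"]["Tally"])  # 3
```

Keeping one counted row per event is what lets an operator view stay short even when a device repeats the same alarm many times.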

Many management tools use an inclusive data paradigm, in which entities must be specifically included because nothing is monitored by default. This approach has inherent limitations, and the scope of what can be managed is often compromised as a result. The modeling process can also take a long time, reducing the effectiveness of monitoring in a dynamic network.

Netcool/OMNIbus uses an exclusive data paradigm. In other words, things must be specifically excluded, because all reachable devices are monitored by default. This offers several benefits in the form of reduced implementation and expansion time and significantly more thorough management of the service-impacting environment. It also has a profound effect on the depth of intelligence provided and on the ability to accurately pinpoint the true cause of service-affecting problems.

With respect to managed elements, Netcool/OMNIbus has no system-enforced limit, because its functionality does not depend on a finite model. The ObjectServer database holds current events for those elements with active alarms, so if a probe has access to an element, its events are reported in the ObjectServer regardless of whether the element appears in models used by other parts of Netcool. A key strength of Netcool/OMNIbus is that when an element accessible to an appropriate probe or monitor is added to the network, events from that element are processed through all Netcool/OMNIbus functionality to the display layer without the need for intermediate modeling. Events are therefore available in real time to automations, related systems, and user views.
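
The difference between the inclusive and exclusive paradigms can be shown with two small filters. This Python sketch is purely illustrative: the device names and both function names are hypothetical, and it models only the default-in versus default-out behavior described above, not any Netcool component.

```python
# Illustrative sketch only: device names and function names are hypothetical.

def monitored_inclusive(reachable, included):
    """Inclusive paradigm: only explicitly modeled devices are monitored."""
    return [d for d in reachable if d in included]


def monitored_exclusive(reachable, excluded):
    """Exclusive paradigm: everything reachable is monitored unless excluded."""
    return [d for d in reachable if d not in excluded]


reachable = ["router1", "switch1", "firewall1"]
included = {"router1", "switch1"}
excluded = {"firewall1"}

# A new device joins the network; neither list is updated.
reachable.append("switch2")

print(monitored_inclusive(reachable, included))  # ['router1', 'switch1']
print(monitored_exclusive(reachable, excluded))  # ['router1', 'switch1', 'switch2']
```

Under the inclusive filter the new device stays invisible until someone adds it to the model, whereas under the exclusive filter it is covered as soon as it becomes reachable.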

Comprehensive data logging and archive capabilities are provided throughout the applications. All log functions are optional and configurable to meet local needs, including:

- Logging at the probe to capture raw event data
- System and user-generated Journal entries to record actions against individual events
- Database audit to in-memory tables and external log files
- Event and Journal archive through Netcool/Gateways to a range of RDBMSs for long-term archive and historical reporting
