Section 1 - Overview
Mutiny is a self-contained appliance for monitoring network-attached devices such as servers, switches, routers, printers and other manageable devices. It has been designed to be simple to use, being aimed at the person who is more interested in the actual data gathered rather than the science of gathering the data.
The purpose of Mutiny is to monitor the systems on the network that are considered to be critical to the operation of the "Network Services" and to send alerts (Email and SMS messages) to the network's operators when problems are detected.
Mutiny gathers data from the devices on the network using a variety of standard Internet (TCP/IP) protocols, but the most important are "ping" and SNMP (the Simple Network Management Protocol). In order to get the best use out of Mutiny, all the devices it monitors should allow Mutiny SNMP read access, but, unlike other SNMP based network monitors, it is not necessary for the user, or even the administrator, to know very much about the protocol or how it works. In fact there are only 4 requirements:
1. To know that SNMP exists.
2. To know how to enable SNMP-read access on the critical network-attached devices such as Windows Servers, switches and routers.
3. To know how to set SNMP security using the "Community" (or "Community String") and restricting access over the network to a list of allowed monitoring devices.
4. To appreciate that when monitoring devices on the other side of a firewall, the firewall must be configured to allow SNMP access to those devices.
That is all: once Mutiny is instructed to monitor a specific device that is running SNMP (or one of our other polling technologies), it will automatically set up everything it needs to monitor that device and start collecting data immediately.
Mutiny's front end is entirely web based, so it can only be accessed using a web browser. To access the Mutiny system, you only need to type its IP address or DNS name into the address box of your browser.
There are five user levels for accessing Mutiny:
"Super Admin" mode allows full access to the Mutiny monitoring and system configuration.
"Admin" mode allows full access to the Mutiny monitoring configuration.
"Engineer" mode is a view-only level that also allows the user to edit node properties.
"View" mode allows the user to view the Mutiny system, but not make any changes.
"Embedded/Published" mode allows single views to be linked from an intranet, SharePoint, etc.
With the exception of "Published", all users require a username and a password to access Mutiny.
1.1. The Main Monitoring Views
In Mutiny it is possible to define a number of "Views" of the network service, each one containing the devices that are relevant to that service. Each view can be selected to show nodes as:
- Icon view
- Cards view
- Grid view
- Table view
- Graph view
- Top Tens
- Map view
- Wallboard
A number of Views are also pre-defined, and after logging into Mutiny you will see one of these: the "All Nodes View" of the nodes that are being monitored. As its name implies, All Nodes shows all of the devices, or nodes, that are being monitored by the Mutiny appliance. These are displayed as icons that represent either single devices or groups of devices.
Each icon has a name displayed beneath it that indicates which particular device (or group of devices) it represents. The icons will also have a Status Indicator Symbol to show the overall status of the device or group.
The full list of Mutiny Statuses and the Symbols used is as follows:
The symbols on the single-device icons indicate the worst state of all of the properties that are being monitored for that device. Thus, if an icon is shown at Warning, then at least one of its monitored properties will be in a Warning state. Note also that none of the monitored properties can be in a Critical state; otherwise the device icon would be shown as Critical, as this is worse than Warning.
The symbol on the group icons indicates the worst state of any of the devices in the group, so if a group icon is Critical, then at least one of the devices in the group will be in a Critical state.
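The worst-state rule for device and group icons can be sketched as follows. This is an illustrative Python sketch, not Mutiny's implementation: the `Status` enum, its numeric ordering and the function names are assumptions, and only the OK, Warning and Critical states are modelled.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical severity ordering: a higher value is a worse state."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def device_status(property_statuses):
    """Return the worst status among a device's monitored properties."""
    return max(property_statuses, default=Status.OK)


def group_status(devices):
    """Return the worst status of any device in the group.

    `devices` is a list of per-device property status lists.
    """
    return max((device_status(props) for props in devices), default=Status.OK)


# One Warning property and no Critical property gives a Warning icon.
server = [Status.OK, Status.WARNING, Status.OK]
router = [Status.OK, Status.CRITICAL]
print(device_status(server).name)           # WARNING
print(group_status([server, router]).name)  # CRITICAL
```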
1.2. The Mutiny Polling Cycle
Mutiny will contact each of the devices that it is configured to monitor once every minute (or multiple times per minute for ICMP). This is known as the polling cycle. Although it is possible to increase this interval so that devices are polled less frequently, this is not recommended, as the Alerting has been optimized to operate at this rate. Polling involves contacting each of the devices and asking them for the status of all of their monitored properties. As most of this data is obtained using SNMP, the first part of the process involves checking that the device is still alive (using ping) and checking that it is responding to SNMP requests. If Mutiny does not see any SNMP responses from a particular device, it will revert to monitoring it as "ping-only". Likewise, if there is no ping response from a device during a polling cycle, Mutiny will turn off all SNMP monitoring for subsequent polling cycles until it sees a ping response. By backing off in this manner, Mutiny reduces the amount of traffic it puts out onto the monitored network when connectivity problems have been observed.
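The back-off behaviour can be pictured with a small Python sketch. It is a simplification under stated assumptions, not Mutiny's code: the three probe callables are hypothetical stand-ins supplied by the caller, and the decision is re-evaluated on every cycle rather than remembered between cycles.

```python
def poll_cycle(address, ping, snmp_alive, read_properties):
    """Run one polling cycle and return what was collected.

    `ping`, `snmp_alive` and `read_properties` are hypothetical probe
    functions passed in by the caller; each takes the node address.
    """
    result = {"ping": ping(address), "snmp": None, "properties": None}

    if not result["ping"]:
        # No ping reply: send no SNMP traffic at all in this cycle.
        return result

    result["snmp"] = snmp_alive(address)
    if not result["snmp"]:
        # The SNMP agent is silent: the node is monitored as ping-only.
        return result

    # Both checks passed: fetch the monitored properties.
    result["properties"] = read_properties(address)
    return result


# Example with fake probes: the node answers ping but not SNMP.
outcome = poll_cycle(
    "192.0.2.10",
    ping=lambda addr: True,
    snmp_alive=lambda addr: False,
    read_properties=lambda addr: {"cpu": 0.12},
)
print(outcome)  # {'ping': True, 'snmp': False, 'properties': None}
```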
1.3. Drilling Down to Obtain Detailed Status Information
The Mutiny interface allows you to "drill-down" into a device or group to investigate the status of individual properties. Moving the mouse to one of the device icons that are shown with a Warning or Critical symbol will display a small message box listing all of the properties that are in a Warning or Critical state.
In order to initiate the drill-down procedure, move the mouse to one of the device or group icons and "double-click".
Double-clicking on a group icon will switch to the View that is represented by the group icon and show all of the devices in the View.
Double-clicking on a device icon will open the main Status Panel for that device (or node).
1.4. The Main Status Panel
The Status Panel for a device lists all of the sets of properties that are currently being monitored on that device. The list will vary depending on the type of device and its operating system (if it has one), but a typical set for a Microsoft Windows server is shown below:
You will see that each of the property sets (such as Disk) is represented by a hyperlink, and you can click on these to drill down further into the monitored properties and their Status panels. The Status panel also includes summary information for the device, listing its configured node name, IP address as polled by Mutiny, DNS name (if a PTR record is registered) and its operating system.
Working from top to bottom of the main Status Panel, the property sets and their associated Status panels are as follows:
1.5. Ping Status Panel
The Ping property set shows the results that Mutiny has obtained by pinging one or more of the IP addresses associated with network interfaces on the devices. The result for each address will be either success (OK Status - a ping-reply response has been received), or failure (Critical Status - no ping response has been obtained). When a ping reply has been received the Round Trip Time (RTT) will be shown in milliseconds.
The tick boxes next to each interface select which of the IP addresses are being polled by Mutiny. These can be selected or de-selected as required.
The overall Ping Status for the device is defined as the worst case of statuses of all of the pinged interfaces. When Mutiny is configured to ping more than 1 IP address and the result is that some respond but others do not, the overall Ping Status is set to Warning; whilst if none respond, the overall Ping Status is set to Critical. The overall Ping Status can also be set to be Ignored if "Node Polling" has been set to "Ignore Connectivity" in the node Configure panel (see later).
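The rules for the overall Ping Status can be summarised in a few lines of Python. This is a sketch only: the function name and the string status labels are assumptions, and returning "Not-polled" when no address is ticked is a guess rather than documented behaviour.

```python
def overall_ping_status(replies, ignore_connectivity=False):
    """Combine per-address ping results into one status.

    `replies` holds one boolean per polled IP address (True = reply received).
    """
    if ignore_connectivity:
        return "Ignored"
    if not replies:
        return "Not-polled"  # assumption: nothing is ticked for polling
    if all(replies):
        return "OK"
    if any(replies):
        return "Warning"     # some addresses respond, others do not
    return "Critical"        # no address responds


print(overall_ping_status([True, True]))    # OK
print(overall_ping_status([True, False]))   # Warning
print(overall_ping_status([False, False]))  # Critical
```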
Because the information shown in any Status panel can be as much as 1 minute old (longer if the polling cycle has been increased), a "Test" button is provided in each.
Clicking "Test" will instruct Mutiny to immediately perform the appropriate tests (in this case, ping all the known IP addresses on the device) and display the results.
If Data Collection is enabled for the node (described later in the manual), the IP address will be linked; pressing this link will load the quick graph of the round trip time with selectable time periods.
Also shown are 3 buttons marked "OK", "Warning" and "Critical" which are used to configure if and how Alerts are sent if a ping Event is detected. These are dealt with later in the "Events and Alerts" Section.
1.6. SNMP Status Panel
Mutiny checks that the SNMP agent built into the device is responding by asking it for its "uptime". This is usually a measure of how much time has elapsed since the device was last restarted as the SNMP agent is initiated at boot time. However, on more complex systems such as servers, it is normally possible to restart the SNMP service independently of the device. In most cases the actual uptime is not important to Mutiny as it is merely performing a simple request to check everything is OK.
Although it is possible in some circumstances for the SNMP agent to simply "die" and leave the rest of the device running normally, SNMP failure usually indicates that the device is struggling for resources and may be about to fail.
Note that obtaining an SNMP response at all requires that Mutiny knows the Community String for this device and that the internal security settings allow the device to return SNMP responses to Mutiny's SNMP requests (see "How SNMP Works" Section later).
As with all other Status panels, there is a "Test" button to obtain the current SNMP Status of the device and other more detailed SNMP information. There are also 2 buttons to configure the alerting when SNMP OK and Critical Events are detected.
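As an illustration of the uptime value mentioned above: assuming the check reads the standard `sysUpTime` object (an assumption, since this manual does not name the object), the agent returns a TimeTicks counter in hundredths of a second. The helper name below is hypothetical; the sketch only shows how such a counter converts to a readable duration.

```python
from datetime import timedelta

# sysUpTime.0 in the standard MIB-II system group (assumed to be the
# object queried; the manual only says "uptime").
SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"


def uptime_from_timeticks(ticks):
    """Convert SNMP TimeTicks (hundredths of a second) to a timedelta."""
    return timedelta(seconds=ticks / 100)


print(uptime_from_timeticks(8_640_000))  # 1 day, 0:00:00
```

A small uptime on a server that has not rebooted is consistent with the SNMP service having been restarted on its own.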
1.7. Interfaces, Status and Thresholds Panels
The Interfaces panel shows the status of all monitored network interfaces on the device. Mutiny uses SNMP to monitor the state of each interface (Up or Down), the current network usage in Bytes/s and as a percentage of the maximum possible bandwidth (based on the current speed of the interface), and the error rate (errors/s). Tick boxes next to each interface determine whether each individual interface is monitored or not.
At the time that the device is added into Mutiny, the properties of each interface are read via SNMP. They are displayed in a table in the Interfaces Status panel, with a row of properties for each interface as follows:
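The relationship between a data rate in Bytes/s and Usage as a percentage can be sketched as below. The function name is hypothetical, and the sketch assumes the interface speed is expressed in bits per second (as SNMP interface speeds normally are); Mutiny's exact calculation is not documented here.

```python
def usage_percent(bytes_per_second, interface_speed_bps):
    """Express a data rate as a percentage of the interface's bandwidth.

    Returns None when the speed is unknown or reported as zero.
    """
    if interface_speed_bps <= 0:
        return None
    bits_per_second = bytes_per_second * 8
    return bits_per_second / interface_speed_bps * 100


# 6,250,000 Bytes/s on a 100 Mbit/s interface is half of its capacity.
print(usage_percent(6_250_000, 100_000_000))  # 50.0
```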
The interface Index number.
The interface description.
A Status Indicator Symbol (OK, Warning, Critical, Not-polled or Ignored) showing the overall status of the interface.
The connectivity state of the interface ("Up", "Down" or "Unknown").
The input data rate in Bytes/s.
The input Usage shown as a percentage of maximum bandwidth based on the current interface speed.
The input error rate in errors/s.
The output data rate in Bytes/s.
The output Usage shown as a percentage of maximum bandwidth based on the current interface speed.
The output error rate in errors/s.
A tick box to define whether each particular interface is polled or not. All interfaces that are up at the time of discovery will have their tick boxes automatically checked.
At the bottom of the panel there is a tick box that turns on and off the polling of all interfaces en masse.
As with all other Status panels, there is a "Test" button to display the current state of all the interfaces on the device. There are also two sets of 3 buttons ("OK", "Warning" and "Critical") to configure the alerting when interface Status and Usage Events are detected.
Also at the bottom of the panel is a button to display the Interfaces Thresholds panel:
As with the Interfaces Status panel, a table of Usage thresholds is displayed with a row of properties for each interface as follows:
The interface Index number.
The interface description.
A Status Indicator Symbol (OK, Warning, Critical, Not-polled or Ignored) showing the overall status of the interface.
The connectivity state of the interface ("Up", "Down" or "Unknown").
The input Usage Critical threshold shown as a percentage of maximum bandwidth based on the current interface speed.
The input Warning threshold shown as a percentage of maximum bandwidth based on the current interface speed.
The output Usage Critical threshold shown as a percentage of maximum bandwidth based on the current interface speed.
The output Warning threshold shown as a percentage of maximum bandwidth based on the current interface speed.
A tick box to define whether the "Interface Down" connectivity state of each particular interface is ignored or not (ticked means ignore the connectivity state).
At the bottom of the panel there is a tick box that sets "Ignore Connectivity" on and off for all interfaces en masse.
The Status of each interface will be set to Critical if it is polled and the interface connectivity state is "Down" and the "Ignore Connectivity" tick box is not ticked.
The Status of each interface will be set to Critical if it is polled and the interface connectivity state is "Up" but either the input or output Usage is greater than or equal to its Critical threshold.
The Status of each interface will be set to Warning if it is polled and either the input or output error rate is greater than or equal to 60 errors per minute.
The Status of each interface will be set to Warning if it is polled and the interface connectivity state is "Up" but either the input or output Usage is greater than or equal to its Warning threshold.
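The four rules above can be combined into a single decision function. The following Python sketch is illustrative only: the function signature and the string labels are assumptions, as are the "Not-polled" result for unticked interfaces, the "OK" result when no rule matches, and the evaluation of the Critical rules before the Warning rules.

```python
ERROR_RATE_WARNING = 1.0  # errors/s, equivalent to 60 errors per minute


def interface_status(polled, state, ignore_connectivity,
                     usage, errors, warning, critical):
    """Derive an interface status from one poll.

    `usage`, `warning` and `critical` are (input, output) pairs in percent;
    `errors` is an (input, output) pair in errors/s.
    """
    if not polled:
        return "Not-polled"  # assumption: unticked interfaces are not rated
    if state == "Down" and not ignore_connectivity:
        return "Critical"
    if state == "Up" and any(u >= c for u, c in zip(usage, critical)):
        return "Critical"
    if any(e >= ERROR_RATE_WARNING for e in errors):
        return "Warning"
    if state == "Up" and any(u >= w for u, w in zip(usage, warning)):
        return "Warning"
    return "OK"  # assumption: no rule matched


# Input Usage of 75% against a 70% Warning and 90% Critical threshold.
print(interface_status(True, "Up", False,
                       usage=(75, 10), errors=(0, 0),
                       warning=(70, 70), critical=(90, 90)))  # Warning
```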
1.8. CPU Usage panel
The CPU usage panel contains Critical and Warning thresholds. As Mutiny can monitor both Windows and Unix-based operating systems, the values are displayed in 2 different ways. For Windows, the CPU values are shown as a decimal fraction, i.e. 0.85 means 85% utilisation. For Unix, the value represents the load average, so it may be any positive number representing the number of processes queuing for a processor.
Passing these thresholds will trigger the Warning or Critical events.
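Because both kinds of CPU value are plain numbers, the same comparison can serve for either. The sketch below is hypothetical: the function name and the example threshold values are illustrative, and treating a value equal to a threshold as having passed it is an assumption.

```python
def threshold_status(value, warning, critical):
    """Compare a reading with its Warning and Critical thresholds."""
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return "OK"


# Windows-style reading: 0.85 means 85% utilisation.
print(threshold_status(0.85, warning=0.80, critical=0.95))  # Warning
# Unix-style reading: a load average, which can exceed 1.
print(threshold_status(6.2, warning=4.0, critical=6.0))     # Critical
```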
As with the other panels, the [Show] button displays the current CPU utilisation; a dual or quad core processor will show as individual CPUs.
If data collection has been enabled and there is graphing data to display, the utilisation will be hyperlinked. Pressing this hyperlink opens the quick graphing panel which, by default, displays the last week's data.
Select a time period from the pull-down to redraw the graph.
1.9. Memory usage panel
The memory status panel allows you to set percentage thresholds for the memory used. Passing these thresholds will trigger the Warning or Critical events. As with the CPU panel above, if data collection has been enabled and there is graphing data to display, the utilisation will be hyperlinked. Pressing this hyperlink opens the quick graphing panel which, by default, displays the last week's data.
1.10. Disk Usage panel
During discovery, Mutiny automatically adds any fixed disks it finds to the disk status panel. Mutiny uses a table of disk sizes to set default Warning and Critical thresholds.
These thresholds can be edited to suit your needs. Passing these thresholds will trigger the disk events.
Pressing the [Show] button at the bottom of this panel will display all attached storage including removable devices.
1.11. Processes panel
During discovery, Mutiny looks for key processes from an internal table and adds any that are seen running to the list of monitored processes.
Name
The name of the process or service.
Status
The status indicator depicts the state of the process.
Process Count
The number of instances of this process that are currently running.
Should Run
This tick box sets whether the process should or should not be running. If it is ticked, then Maximum and Minimum boxes can be used to define the maximum and minimum number of each process that should be running. A Critical status will be set if the number of instances of the process found to be running is less than the Minimum value. A Warning status will be set if the number of instances of the process is found to be greater than the Maximum value. The default setting is Minimum of 1 and no maximum.
If the box is not ticked, then the Maximum and Minimum boxes cannot be used. A Critical status will be set if any instances of the named process are found to be running.
Minimum
This box is used to set the minimum number of each process that should be running.
Maximum
This box is used to set the maximum number of each process that should be running.
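The Should Run, Minimum and Maximum settings translate into a short rule, sketched here in Python. The function and parameter names are hypothetical; the defaults mirror the documented default of a Minimum of 1 and no Maximum.

```python
def process_status(count, should_run=True, minimum=1, maximum=None):
    """Rate a process from the number of running instances."""
    if not should_run:
        # The process must not run: any instance at all is Critical.
        return "Critical" if count > 0 else "OK"
    if count < minimum:
        return "Critical"
    if maximum is not None and count > maximum:
        return "Warning"
    return "OK"


print(process_status(0))                    # Critical: below the Minimum of 1
print(process_status(3, maximum=2))         # Warning: above the Maximum
print(process_status(1, should_run=False))  # Critical: should not be running
```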
1.12. Agents panel
Provides details on the state of the node's processes monitored through agents. Also permits their activation or deactivation from the monitoring procedure. N.B. If the agent is a Remote Agent, it can also be monitored separately as a process.
Agents are either local to Mutiny or remotely installed on the node. Agents are used to display additional data such as temperature or the status of a UPS. Agents can also contain other environmental data derived from manufacturers' system agents installed on the device, such as RAID or power supply status.
1.13. IP Services panel
A collection of service tests at the TCP level or higher, to determine whether a service is functioning or not.
Service tests include:
HTTP - Checks for content on a page.
IMAP - Tests an IMAP mailbox.
POP3 - Tests a POP3 mailbox.
MySQL - Tests a MySQL instance.
Open Port - Checks to see if a service port can be opened.
SMTP - Sets the parameters for polling a node's SMTP service.
DNS - Sets the parameters for polling a node's DNS service.
Each test has Warning and Critical thresholds as well as a graphed response time.
This graph shows the response of the Mutiny webserver over the period of a week.
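To give a feel for what an Open Port style test involves, here is a minimal Python sketch using the standard `socket` module. It is not Mutiny's implementation: the function name and the timeout default are assumptions, and the demonstration targets a throwaway listener on the local machine rather than a real service.

```python
import socket
import time


def open_port_test(host, port, timeout=5.0):
    """Try to open a TCP connection; return (opened, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            opened = True
    except OSError:
        # Refused, unreachable or timed out: the port could not be opened.
        opened = False
    return opened, time.monotonic() - start


# Demonstration against a temporary listener on the local machine.
with socket.socket() as listener:
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    opened, elapsed = open_port_test("127.0.0.1", demo_port)

print(opened, f"{elapsed * 1000:.1f} ms")
```

The elapsed time is the kind of response-time figure that could then be compared with the test's Warning and Critical thresholds and graphed.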
1.14. Status Panel Buttons
1.14.1. Configure
Establishes the node's identification and polling procedure details. Also allows the selection of the node's icon type and remote connection method.
Node Name
Allows the assignment of the name of the node that will be displayed in Mutiny.
IP Address
Allows the setting of the node's primary IP address. This address will be used by Mutiny to poll the node.
DNS Name
Displays the name that the Mutiny system retrieves from the local DNS server. (This assumes that a local DNS server has been added to the System Configuration page of the Admin section.)
sys Name
The node's system name as read by SNMP.
Set Name As
Up to 3 buttons will be displayed (if the data is available) that can be used to set the Node Name to either the IP address, the SNMP sysName or the DNS Name.
Icon Type
Allows the selection of the icon type that is used to depict the node in the Mutiny Views.
Connect Method
Allows the selection of the URL to allow remote connection to the node.
The following settings establish the details of the node's polling procedure.
Node Polling
Sets the polling type for the node. Options are:
On - Node will be polled.
Ignore Connectivity - Node will be polled, but Ping failure will not be propagated to the overall node status.
Off - Node will not be polled.
Node Alerts
Enables or disables Alerting from the node when an Event is logged. Options are:
On - Node Alerts will be generated and sent as defined by the Event/Alert setting for each contact.
Off - Node Alerts will be generated and queued (as above), but they will not be sent.
Poll Interval(mins)
Sets the minimum polling interval for the node (in minutes). The recommended value is 1 minute.
Ping Polling
Determines whether or not the node is pinged at the start of each polling cycle. It is recommended that this is set to "On".
SNMP Polling
Determines the parameters used to poll the node's basic SNMP information. Options are:
Always - Always poll the node with SNMP.
Only if Ping OK - Only poll the node with SNMP if the ping status is OK.
Off - Do not poll the node with SNMP.
SNMP Agent
Allows the setting of the parameters needed to poll the node's SNMP agent.
Port Number
Sets the UDP/IP port number for SNMP access (normally 161).
Community String
Allows the assignment of the password to access the node's SNMP information.
Retries
Sets the maximum number of SNMP retries in which to obtain a positive response before the SNMP agent is considered to be "down".
System Polling
Determines the parameters used to poll the node's host resources SNMP information. Options are:
Always - Always poll the node's system resources SNMP.
Only if Ping OK - Only poll the node's system resources if the ping status is OK.
Only if SNMP OK - Only poll the node's system resources if the SNMP status is OK.
Off - Do not poll the node's system resources.
System Agent
Allows the setting of the parameters needed to poll the node's host resources SNMP agent.
Same as SNMP
Check this box to mirror the values of the SNMP Agent settings (recommended).
Port Number
Sets the UDP/IP port number for SNMP access (normally 161).
Community String
Allows the assignment of the password to access the node's SNMP information.
Retries
Sets the maximum number of SNMP retries in which to obtain a positive response before the SNMP agent is considered to be "down".
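The SNMP Polling and System Polling options described above amount to a simple gate on the results of the earlier checks in the cycle. The Python sketch below is illustrative only; the function name is hypothetical, and the option names are taken from this section.

```python
def should_poll(mode, ping_ok=False, snmp_ok=False):
    """Decide whether to poll, given the option chosen in the Configure panel.

    "Only if SNMP OK" is offered for System Polling only.
    """
    rules = {
        "Always": True,
        "Only if Ping OK": ping_ok,
        "Only if SNMP OK": snmp_ok,
        "Off": False,
    }
    if mode not in rules:
        raise ValueError(f"unknown polling option: {mode!r}")
    return rules[mode]


print(should_poll("Only if Ping OK", ping_ok=False))               # False
print(should_poll("Only if SNMP OK", ping_ok=True, snmp_ok=True))  # True
```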
1.14.2. Configure panel sub buttons
Update SNMP Info
Updates any changes made to the node's SNMP based information. This information updates the node's properties panel.
Reset Polling
Sets the node's polling properties back to their default values. The action is similar to deleting and then re-adding the node, except that historical data is preserved.
Adapters
Accesses the panel to set the node's SNMP-Polling Adapters.
Save SNMP Walk
Accesses the panel to save an SNMP walk from the node. This walk file can then be Emailed to Mutiny so that the feasibility of writing specific SNMP adapters can be assessed.
1.14.3. Properties
Accesses the node's graphing and data analysis section. The values in the fields are populated at time of discovery or refreshed by pressing the [Reset Polling] Button.
1.14.4. SNMP Info
Executes a range of commands on the node that obtain detailed System information.
1.14.5. Collect Data
Accesses the panel for establishing the node's data collection details.
Collect System Data - Stores CPU, Memory and Disk historical data for graphing.
Collect Traffic Data - Stores Interface traffic data.
1.14.6. Connect
Activates a remote connection to the node.
(Telnet is recommended when connecting to a terminal with a Unix based operating system or VNC if Windows based.)
The number of available connect methods is defined in the [Connect Strings] screen under administration.
Please bear in mind that the methods for making a remote connection are constrained to protocols and applications that can be launched from a URL. (You are working in a browser!)
1.14.7. Graphing
Only visible if [Data Collection] is enabled. Accesses the node's graphing and data analysis section.
Depending on the available data, one or more buttons will be listed down the left-hand side to select between Traffic, ISDN, QoS Data and System Data.
For each group there is a pull-down to select between the available elements, a units selection and a time period pull-down.
After making your selections press the [Graph Data] button to generate the output. The graph image can be copied or saved by "right clicking" on the image.
To output the graphing data in CSV format, press the [Save Data] button.
1.15. Event Panels
Where a status panel has the event buttons, you can configure which alert action to perform when an event is triggered.
The image shows a contact enabled for email alerts for all 3 shifts with a 10 minute delay, but not enabled for Page/SMS. The additional action at the bottom can be used with or without a contact.
The available action types are:
Send V1 trap
Send V2c Inform
Custom
The list of custom actions is subject to bespoke agents being developed by Mutiny to end users' specifications.
For detailed operation of the alerting available in Mutiny, see the section on Alerting.
You should consider using the tracked views method for a simpler system-wide alerting policy; see: https://mutiny.freshdesk.com/support/solutions/articles/13000087793-tracked-views