Hello there guys, and welcome to another session of exploring knowledge.
“Comfort and prosperity have never enriched the world as much as adversity has.”
In this article, we will take a look at some of the monitoring tools you might wish to consider for your network or infrastructure. We will focus on open-source tools, looking briefly at their features and components. The tools are not arranged in any particular order, and each section gives a sneak preview of what the tool offers.
An important ability for an organization, or for anyone who administers systems, is being able to tell how the house is faring. Knowing when something is not working as expected can boost performance and reduce the time spent troubleshooting anomalies. To succeed in that, some tools have to become your best friends, because they will aid you in this quest. A number of them can gather and process what is taking place inside your network and servers.
From Cacti’s site, this tool “is a complete network graphing solution designed to harness the power of RRDTool’s data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with thousands of devices.” (Cacti.net, 2018).
Cacti harnesses the power of RRDtool, an open-source, industry-standard data logging and graphing system for time series data. This high-performance tool can be easily integrated into shell scripts and into Perl, Python, Ruby, Lua, or Tcl applications.
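To give a feel for how RRDtool slots into a script, here is a minimal Python sketch that only builds the `rrdtool create` and `rrdtool update` command lines. The file name `latency.rrd`, the data source name `latency`, and the step and retention values are assumptions picked for illustration, and actually running the commands requires the `rrdtool` binary to be installed.

```python
import subprocess


def rrd_create_cmd(path, ds_name, step=300):
    """Build an `rrdtool create` command for one GAUGE data source."""
    heartbeat = step * 2  # data is marked unknown if no update arrives for two steps
    return [
        "rrdtool", "create", path,
        "--step", str(step),
        f"DS:{ds_name}:GAUGE:{heartbeat}:U:U",  # U:U means no min/max limits
        "RRA:AVERAGE:0.5:1:288",  # 288 rows; one day at the default 300 s step
    ]


def rrd_update_cmd(path, value):
    """Build an `rrdtool update` command; `N` stands for 'timestamp = now'."""
    return ["rrdtool", "update", path, f"N:{value}"]


def run(cmd):
    """Execute a command; raises if rrdtool is missing or reports an error."""
    subprocess.run(cmd, check=True)


# Hypothetical file and data source names, printed rather than executed.
print(" ".join(rrd_create_cmd("latency.rrd", "latency")))
print(" ".join(rrd_update_cmd("latency.rrd", 12.5)))
```

Keeping the command construction separate from `run` makes it easy to inspect what would be executed before touching any RRD file.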
The features of Cacti include the following:
- Data Gathering: Cacti has a data input mechanism that gives users the freedom to develop custom scripts for gathering data from target devices. It also comes bundled with support for SNMP, an industry-standard protocol for gathering data from network devices. In addition, Cacti ships with a PHP-based poller that executes scripts, retrieves SNMP data, and updates the RRD files.
- User Management: Multiple users, each with their own account, can be set up. The administrator has the flexibility to grant each user only a specific set of privileges.
- Display of graphs: There are three different ways to view your graphs: tree view, list view, and preview view. Each has its benefits. The tree view lets users create hierarchies and place graphs on the tree, so a large number of graphs can be managed this way. The list view, as the name suggests, is simply a list of the available graphs, and each entry links to the actual graph when clicked. The preview view shows all of the graphs in one large list that you can quickly peruse.
- Templates: There are three different types of templates: Data Templates, Graph Templates, and Host Templates. Templates ease the burden of defining every data source and graph by hand, which can be quite painful. A Data Template provides a skeleton for an actual data source. A Host Template groups all Graph Templates and Data Queries for a given device type. What is more exciting is that you do not need to create all templates on your own. Some are available out of the box, and there is a very simple feature for importing templates into your Cacti platform.
- Alerting mechanisms: Cacti can be configured to send mail alerts when pre-defined thresholds are exceeded or not reached. This makes your nights awesome, since you do not have to start hunting for problems when those calls come in. The alert will pinpoint that a certain service is down or facing a particular anomaly.
- Reporting: Cacti can generate reports in accordance with your configuration.
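As an illustration of the custom data gathering scripts mentioned above, here is a small Python sketch. It assumes the convention that a Cacti data input script prints its results on one line as space-separated `name:value` pairs. The field names `load1`, `load5`, and `load15` are made up for this example and would have to match the output fields you define in Cacti.

```python
import os


def format_cacti_output(fields):
    """Render values as space-separated `name:value` pairs on one line."""
    return " ".join(f"{name}:{value}" for name, value in fields.items())


def read_load_averages():
    """Return the 1, 5 and 15 minute load averages (Unix-like systems only)."""
    one, five, fifteen = os.getloadavg()
    return {
        "load1": round(one, 2),
        "load5": round(five, 2),
        "load15": round(fifteen, 2),
    }


if __name__ == "__main__":
    # The poller would capture this single line from standard output.
    print(format_cacti_output(read_load_averages()))
```

The formatting function is kept separate from the measurement so that the same script skeleton can be reused for any other metric.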
I recently stumbled upon the LibreNMS monitoring tool, and I have to say that it has been developed really well. Its installation guide for Ubuntu 18.04 is on our site. LibreNMS is a community-based fork of the last GPL-licensed version of Observium, with plenty of features.
The tool is based on PHP/MySQL/SNMP and monitors the network together with your servers.
- What is cool about LibreNMS is that it discovers devices automatically. You do not have to tell it whether your device is Cisco, Juniper, Windows, or Linux based. It gathers this information like a charm using protocols such as CDP, FDP, LLDP, OSPF, BGP, SNMP, and ARP.
- It goes the extra mile and discovers the interfaces on your router or switch, which is pretty impressive. It also attempts to draw the connection details of your network, but this requires some assistance from you.
- LibreNMS can group interfaces based on the prefix of their description, for example “Transit:” or “Peering:”. The groups are shown under the “ports” drop-down.
- Alerts: Like most monitoring tools, LibreNMS has alerting functionality, which can be highly customized.
- Scalability: As your network grows, its distributed polling feature allows horizontal scaling of your system.
- LibreNMS has a billing system. Yes, this tool has one. It generates bandwidth bills for ports on your network according to usage or transfer.
- LibreNMS has Android and Apple apps that can be used to view and manage your network. This is such a breath of fresh air.
- Support for various authentication mechanisms such as RADIUS, LDAP, Active Directory, and more.
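The description-prefix grouping from the list above is easy to picture with a few lines of Python. This is only a sketch of the idea, not LibreNMS's own parser. The port descriptions and the `Ungrouped` fallback name are invented for the example.

```python
from collections import defaultdict


def group_ports_by_prefix(descriptions):
    """Group interface descriptions by the text before the first colon."""
    groups = defaultdict(list)
    for description in descriptions:
        prefix, sep, rest = description.partition(":")
        if sep:  # a colon was found, so the prefix names the group
            groups[prefix.strip()].append(rest.strip())
        else:  # no prefix at all, keep the port in a catch-all group
            groups["Ungrouped"].append(description.strip())
    return dict(groups)


# Hypothetical interface descriptions as they might be set on a router.
ports = [
    "Transit: Example Carrier A",
    "Peering: Example IX",
    "Transit: Example Carrier B",
    "uplink to core",
]
print(group_ports_by_prefix(ports))
```

The practical takeaway is that a consistent naming scheme for interface descriptions on your devices is what makes this kind of grouping possible.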
You can also integrate it with other systems through its API.
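Here is a hedged sketch of what talking to that API could look like from Python, using only the standard library. It assumes the `/api/v0/devices` route and the `X-Auth-Token` header. The host name `librenms.example.com` and the token are placeholders, so check the API documentation of your own installation before relying on it.

```python
import json
import urllib.request


def build_devices_request(base_url, token):
    """Prepare a GET request for the device list; nothing is sent yet."""
    url = base_url.rstrip("/") + "/api/v0/devices"
    return urllib.request.Request(url, headers={"X-Auth-Token": token})


def fetch_devices(base_url, token):
    """Send the request and decode the JSON body (needs a reachable server)."""
    with urllib.request.urlopen(build_devices_request(base_url, token)) as resp:
        return json.load(resp)


# Placeholder host and token; only the prepared request is shown here.
req = build_devices_request("https://librenms.example.com/", "YOUR_API_TOKEN")
print(req.get_method(), req.full_url)
```

Calling `fetch_devices` with a real URL and token would return the decoded JSON reply, which you can then feed into whatever system you are integrating with.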
This tool is a beast, and I encourage you to take a look at what is happening inside its engine. There is much more than this article can reveal, including security through
You can get started with the guide How to Install and Configure LibreNMS on Ubuntu 18.04 LTS with Nginx.
From nagios.org, “Nagios monitors your entire IT infrastructure to ensure systems, applications, services, and business processes are functioning properly. In the event of a failure, Nagios can alert technical staff of the problem, allowing them to begin remediation processes before outages affect business processes, end-users, or customers.”
Nagios began way back in 1999 and has since grown to include other products, all focused on monitoring. Let us have a look at the features it offers for your consideration.
- Monitoring of a large number of devices: Nagios is capable of monitoring applications, services, operating systems, network protocols, system metrics, and infrastructure components with a single tool. This makes it a jack of all trades, which can be quite beneficial if you want one tool to cover a wide range of services and devices.
- Multi-tenancy: Having many users logged into the interface simultaneously boosts efficiency and can even improve your business, since interested stakeholders get a real-time look at the status of the infrastructure. Views can also be limited per user, so that each user sees only the part of the network that belongs to them, which lets one platform accommodate more users.
- Reporting: Nagios helps ensure that Service Level Agreements are met by producing reports, which can be enhanced by plugins from third-party vendors. This makes it highly flexible and customizable.
- Visibility: With a centralized web interface where you can see everything, it can be easy to detect outages.
- Notifications: Nagios has alerting functionality. Alerts can be sent via SMS and mail, which simplifies the management of your infrastructure.
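Under the hood, Nagios checks are typically small plugins that print one status line and signal the result through their exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The sketch below shows that pattern for disk usage. The thresholds of 80 and 90 percent and the message wording are assumptions for the example, not Nagios defaults.

```python
import shutil
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_usage(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to a Nagios-style (exit_code, message) pair."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def disk_percent_used(path="/"):
    """Return how full the filesystem holding `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def main():
    """Run the check, print the status line and exit with the matching code."""
    try:
        code, message = check_usage(disk_percent_used("/"))
    except OSError as exc:
        code, message = UNKNOWN, f"DISK UNKNOWN - {exc}"
    print(message)
    sys.exit(code)
```

To use it as a stand-alone plugin you would call `main()` at the end of the file. It is left out here so the functions can be imported and tried out without the interpreter exiting.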
One interesting feature of Nagios is its event handlers, which allow failed applications and services to be restarted automatically. If you are intrigued by Nagios, please visit their site for more insights.
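To make the event handler idea concrete, here is a minimal Python sketch of the decision such a handler might make. It assumes the handler is given the service state, state type, and attempt number, which Nagios can pass as the `$SERVICESTATE$`, `$SERVICESTATETYPE$`, and `$SERVICEATTEMPT$` macros. The rule of acting on the third soft attempt is an assumption you would tune, and the restart command is whatever fits your service.

```python
import subprocess


def should_restart(state, state_type, attempt, soft_attempt=3):
    """Decide whether to restart, given the service state Nagios passes in."""
    if state != "CRITICAL":
        return False  # OK, WARNING and UNKNOWN are left alone
    if state_type == "HARD":
        return True  # all retries are exhausted, the problem is confirmed
    # SOFT state: act early, but only on one specific retry to avoid loops
    return state_type == "SOFT" and attempt == soft_attempt


def handle_event(state, state_type, attempt, restart_cmd):
    """Run `restart_cmd` when the rule above says so; return True if it ran."""
    if should_restart(state, state_type, int(attempt)):
        subprocess.run(restart_cmd, check=False)
        return True
    return False


print(should_restart("CRITICAL", "SOFT", 3))
print(should_restart("WARNING", "HARD", 1))
```

A call such as `handle_event("CRITICAL", "HARD", "4", ["systemctl", "restart", "httpd"])` would then trigger the restart, with `httpd` standing in for your own service name.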
Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud (prometheus.io, 2018). It works well for recording any purely numeric time series.
It fits both machine-centric monitoring and monitoring of highly dynamic service-oriented architectures (prometheus.io, 2018). For visualization, Prometheus works with tools such as Grafana, which handle data visualization and export.
Our site has installation and integration guides for Prometheus and Grafana.
Feel free to look at them if you are interested.
The main features of this tool include (prometheus.io, 2018):
- A multi-dimensional data model with time series data identified by metric name and key/value pairs
- A flexible query language to leverage this dimensionality
- No reliance on distributed storage; single server nodes are autonomous
- Time series collection happens via a pull model over HTTP
- Pushing time series is supported via an intermediary gateway
- Targets are discovered via service discovery or static configuration
- Multiple modes of graphing and dashboarding support
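The data model and the pull model from the list above meet in the plain-text page that a monitored target serves over HTTP, usually under `/metrics`. The sketch below renders samples in that text style, a metric name followed by key/value labels and a value. The metric and label names are invented for the example, and a real service would normally use an official Prometheus client library instead of hand-written formatting.

```python
def escape(value):
    """Escape backslash, double quote and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name, labels, value):
    """Render one sample as `name{key="value",...} value`."""
    if labels:
        pairs = ",".join(
            f'{key}="{escape(val)}"' for key, val in sorted(labels.items())
        )
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


def render_metrics(samples):
    """Render many samples, one per line, ending with a newline."""
    return "".join(render_sample(*sample) + "\n" for sample in samples)


# Hypothetical samples: same metric name, different label sets.
samples = [
    ("http_requests_total", {"method": "GET", "code": "200"}, 1027),
    ("http_requests_total", {"method": "POST", "code": "500"}, 3),
]
print(render_metrics(samples), end="")
```

Each distinct combination of metric name and labels is its own time series, which is what the query language then slices and aggregates.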
From its site, “Zabbix is the ultimate enterprise-level software designed for real-time monitoring of millions of metrics collected from tens of thousands of servers, virtual machines, and network devices.” It can monitor not only Linux but also Windows, Solaris, and IBM AIX.
Zabbix contains many features and we shall go over them in a nutshell.
- Collection of Metrics: Zabbix can collect the desired metrics through several methods: the multi-platform Zabbix agent (which runs on supported platforms including Linux, UNIX, and Windows, and collects data such as CPU, memory, disk, and network interface usage from a device), SNMP and IPMI agents, agentless monitoring of user services, custom methods, calculation and aggregation, and end-user web monitoring.
- Detection of anomalies in your set-up: Zabbix can automatically detect problem states within the incoming metric flow.
- Visualization: According to the Zabbix developers, the interface gives users multiple ways of presenting a visual overview of their infrastructure and environment. These include widget-based dashboards, graphs, network maps, and slideshows.
- Notifications: The server can send messages or mail. A lot more can be done as far as alerts are concerned. For example, messages can be customized based on the recipient’s role, or enriched with runtime and inventory information. Moreover, messages can be configured to focus on the root cause of a problem using the Zabbix event correlation mechanism.
- The use of templates: This feature allows you to use out-of-the-box templates for most of the popular platforms and to monitor thousands of similar devices by using configuration templates.
- Scalability: Zabbix proxies collect data in the environment where they sit and send it to a central Zabbix server. The use of proxies may greatly simplify the maintenance of an environment monitored by Zabbix and increase the performance of the central server. This shows how the monitoring system can scale in a distributed fashion.
- Zabbix has an API, so it can be integrated with other systems in the infrastructure.
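As a rough sketch of that integration path, the snippet below prepares a JSON-RPC 2.0 call to the Zabbix API using only the Python standard library. It assumes the API is served at `api_jsonrpc.php` under the frontend URL. The host name `zabbix.example.com` is a placeholder, and authentication is deliberately left out because the way a token is passed depends on your Zabbix version.

```python
import json
import urllib.request


def build_rpc_request(base_url, method, params, request_id=1):
    """Prepare a JSON-RPC 2.0 POST for the Zabbix API; nothing is sent yet."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api_jsonrpc.php",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json-rpc"},
        method="POST",
    )


def call(request):
    """Send a prepared request and return the decoded JSON reply."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# `apiinfo.version` is used here because it can be called without logging in.
req = build_rpc_request("https://zabbix.example.com", "apiinfo.version", [])
print(req.full_url)
print(req.data.decode("utf-8"))
```

Passing the prepared request to `call` against a real server would return a JSON object whose `result` field holds the answer, or an `error` field if something went wrong.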
We hope you have gained a lot from this article. There are other monitoring tools out there, such as Sensu, Systat, and many more, that fall outside the scope of this article. Cheers guys.