Protecting a network from external threats is an essential function of every organization. To do this, organizations use a wide range of tools and strategies. One such strategy is flow-based monitoring.
What is Flow-based Monitoring?
Typically, when you enter the URL of a website, you are essentially requesting information from a web server. This request is transmitted to the web server through a series of physical devices such as routers, switches, and cables. The information from the web server then travels back to your device, and your browser displays it in the format in which it was designed. This data communication happens in the form of packets: the data is split into multiple packets, which are sent across the network. The receiving device collects all the data packets and puts them in the right order based on the numbering assigned by the sending device. After all the packets are assembled, the complete information is displayed to you.
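The split-and-reorder idea can be sketched in a few lines of Python. This is only an illustration: the Packet class, the seq field, and both function names are hypothetical, and real transport protocols handle loss, retransmission, and much more.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # hypothetical sequence number assigned by the sender
    payload: bytes  # the slice of data carried by this packet


def split_into_packets(data: bytes, size: int) -> list[Packet]:
    """Cut the data into fixed-size chunks and number them in order."""
    return [
        Packet(seq=i, payload=data[offset:offset + size])
        for i, offset in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Put the packets back in sender order and join their payloads."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)
```

Even if the packets arrive in a different order, sorting by the sequence number restores the original data.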
The flow of these data packets can tell you a lot about the activity in your network. For example, if there's an unusual increase in the volume of data packets, it could signify a Denial of Service (DoS) attack. Likewise, below-normal data flows could indicate that some system is down or there's network latency.
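As a rough sketch of this idea, the snippet below compares the packet count of the current interval with a baseline built from earlier intervals. The function name, the threshold factor k, and the labels it returns are assumptions made for illustration; production tools use far more robust baselining.

```python
from statistics import mean, stdev


def classify_volume(history: list[float], current: float, k: float = 3.0) -> str:
    """Label the current packet count relative to past intervals.

    history needs at least two values so a standard deviation exists.
    k is an illustrative threshold, not a recommended setting.
    """
    baseline = mean(history)
    spread = stdev(history)
    if current > baseline + k * spread:
        return "spike"   # possible DoS attack or traffic burst
    if current < baseline - k * spread:
        return "drop"    # possible outage or latency problem
    return "normal"
```

A "spike" or "drop" label is only a prompt to investigate; it doesn't prove an attack or an outage on its own.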
Flow-based monitoring is a common way to detect changes in your network traffic, and it tells you a lot about:
- How your network is performing
- Imminent attacks
- Issues such as latency
- The rate of utilization of network resources
Due to these important insights, many organizations implement flow-based monitoring to better understand their network's performance. That said, there are many types of flow-based monitoring, and organizations can decide which of the following types they want to implement.
Types of Flow-based Monitoring
Many flow monitoring solutions are available, and the most popular of them are:
- NetFlow
- sFlow
- IPFIX
Let's take a detailed look into these different solutions.
NetFlow
Developed by Cisco in the 1990s, NetFlow was one of the first monitoring solutions that provided insights into your network's performance.
In NetFlow, a device like a router groups packets that share the same characteristics (for example, source and destination addresses, ports, and protocol) into a flow and records summary information taken from the packet headers. After a flow becomes dormant or a predefined amount of time has elapsed, the device exports the flow record to a flow collector for storage. Next, a flow analyzer reads these records and generates reports or graphs, as the case may be.
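The caching behavior of a NetFlow-style exporter can be sketched as follows. This is a simplified model, not Cisco's implementation: the class and field names are hypothetical, and the timeout values are placeholders you would tune on a real device.

```python
from dataclasses import dataclass

# Flow key: source IP, destination IP, source port, destination port, protocol
FlowKey = tuple[str, str, int, int, int]


@dataclass
class FlowRecord:
    packets: int = 0
    octets: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0


class FlowCache:
    """Toy flow cache that aggregates packets and expires finished flows."""

    def __init__(self, inactive_timeout: float = 15.0, active_timeout: float = 1800.0):
        self.inactive_timeout = inactive_timeout  # seconds without traffic
        self.active_timeout = active_timeout      # maximum lifetime of a record
        self.flows: dict[FlowKey, FlowRecord] = {}

    def observe(self, key: FlowKey, size: int, now: float) -> None:
        """Account for one packet of the given size seen at time now."""
        record = self.flows.get(key)
        if record is None:
            record = self.flows[key] = FlowRecord(first_seen=now)
        record.packets += 1
        record.octets += size
        record.last_seen = now

    def expire(self, now: float) -> list[tuple[FlowKey, FlowRecord]]:
        """Remove and return the records that should go to the collector."""
        exported = []
        for key, record in list(self.flows.items()):
            idle = now - record.last_seen >= self.inactive_timeout
            too_long = now - record.first_seen >= self.active_timeout
            if idle or too_long:
                exported.append((key, self.flows.pop(key)))
        return exported
```

Note that what leaves the device is a summary record (packet and byte counts per flow), not the packets themselves.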
sFlow
Sampled Flow, or sFlow in short, works differently when compared to NetFlow.
NetFlow groups sequential packets with similar characteristics into flows, while sFlow randomly samples packets at a preconfigured rate. In other words, sFlow picks, on average, one packet out of every N (say, one in every 400) and uses these samples to estimate the overall traffic on your network. Alongside packet samples, it also polls interface counters at a fixed interval.
Also, it copies the leading bytes of each sampled packet, which cover the headers and may include part of the payload, rather than only the summarized header fields that NetFlow records.
The samples are sent to a collector for analysis in almost real time, so changes in the state of the network can be seen as they occur. Unlike NetFlow, the device doesn't hold traffic in an intermediate flow cache before exporting it.
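sFlow packet sampling is usually described as picking, on average, one packet in every N and scaling the samples back up to estimate totals. The sketch below models that idea in software; the function names are hypothetical, and real agents typically sample in the switch hardware or kernel rather than in Python.

```python
import random


def sample_packets(packet_sizes, sampling_rate, seed=None):
    """Keep each packet with probability 1/sampling_rate."""
    rng = random.Random(seed)
    probability = 1.0 / sampling_rate
    return [size for size in packet_sizes if rng.random() < probability]


def estimate_totals(samples, sampling_rate):
    """Scale the samples up to estimate total packets and total bytes."""
    estimated_packets = len(samples) * sampling_rate
    estimated_bytes = sum(samples) * sampling_rate
    return estimated_packets, estimated_bytes
```

The estimates become more reliable as the number of samples grows, which is why the choice of sampling rate matters.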
IPFIX
The Internet Protocol Flow Information Export (IPFIX) is similar to NetFlow, except that it's an open IETF standard and, hence, is supported by non-Cisco vendors as well. Its format is based on NetFlow version 9, so the two are closely related, though not identical.
In this article, we will focus on sFlow, followed by the Host sFlow agent.
Features of sFlow
Before we jump into the agent that collects data packets based on sFlow, let's take a quick look at sFlow's features.
- Highly scalable.
- Uses open standards.
- Collects only samples of data packets.
- Captures the headers of sampled packets, along with partial packet payloads.
- Supports Ingress/Egress monitoring.
- Supports IPv6, MPLS, and VLAN.
- Works based on a preconfigured sampling rate.
- Is supported by many vendors, so your flow data is more interoperable than with the proprietary NetFlow.
- Reduces the impact of data packet collection on CPU utilization and bandwidth.
The downside of sFlow is that it can be hard to determine the appropriate sampling rate and polling interval, and a poor choice can reduce the accuracy of the information collected.
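One commonly quoted rule of thumb from the sFlow community is that, at 95% confidence, the percentage error of an estimate is at most about 196 * sqrt(1/c), where c is the number of samples collected for the traffic class you're measuring. The sketch below applies that approximation; the function names are hypothetical, and the bound is a statistical estimate rather than a guarantee.

```python
import math


def max_percent_error(samples: int) -> float:
    """Approximate 95%-confidence error bound, in percent, for c samples."""
    if samples <= 0:
        raise ValueError("at least one sample is required")
    return 196.0 * math.sqrt(1.0 / samples)


def samples_needed(percent_error: float) -> int:
    """Number of samples needed to stay within the given percent error."""
    if percent_error <= 0:
        raise ValueError("percent_error must be positive")
    return math.ceil((196.0 / percent_error) ** 2)
```

For example, about 10,000 samples of a traffic class bring the error bound down to roughly 2%. Busy links reach that count quickly even at a coarse sampling rate, while quiet links may need a finer rate or a longer measurement window.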
Now that you know all about sFlow, let's get into the data collection mechanism using the Host sFlow agent.
What's the Host sFlow Agent?
The Host sFlow agent is responsible for exporting the physical and virtual performance metrics through the sFlow protocol. This agent is highly scalable, works across multiple vendors and operating systems, and has minimal impact on your systems.
The biggest advantage is that Host sFlow is an open-source implementation of the sFlow monitoring standard. As a result, it works well across vendors and, in the process, simplifies deployment. It's also well suited to gaining visibility into the application layer, such as response times.
The Host sFlow agent works well on Windows, Linux, Solaris, AIX, and FreeBSD operating systems. You can use it on hypervisors like Hyper-V, Nutanix, XenServer, and KVM, and on switches such as Arista EOS, Cumulus Linux, DENT, OpenSwitch, and SONiC.
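On the receiving side, a collector sees the agent's output as UDP datagrams, conventionally on port 6343. The sketch below decodes only the fixed header fields of such a datagram, assuming the sFlow version 5 layout. The function name and the dictionary keys are hypothetical, and decoding the sample records that follow the header is left out.

```python
import ipaddress
import struct


def parse_sflow_v5_header(datagram: bytes) -> dict:
    """Decode the header of an sFlow version 5 datagram (big-endian fields)."""
    version, addr_type = struct.unpack_from("!II", datagram, 0)
    if version != 5:
        raise ValueError(f"unsupported sFlow version: {version}")
    if addr_type == 1:      # agent address is IPv4
        addr_len = 4
    elif addr_type == 2:    # agent address is IPv6
        addr_len = 16
    else:
        raise ValueError(f"unknown agent address type: {addr_type}")
    offset = 8
    agent = ipaddress.ip_address(datagram[offset:offset + addr_len])
    offset += addr_len
    sub_agent_id, sequence, uptime_ms, num_samples = struct.unpack_from(
        "!IIII", datagram, offset
    )
    return {
        "agent": str(agent),
        "sub_agent_id": sub_agent_id,
        "sequence": sequence,
        "uptime_ms": uptime_ms,
        "num_samples": num_samples,
    }
```

A collector could feed this function with the bytes returned by a UDP socket's recvfrom() call and use the sequence number to spot lost datagrams.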
Next, let's talk about installing and configuring Host sFlow.
Installing Host sFlow
Installation is fairly straightforward. Download the Host sFlow installer and use the System Center Virtual Machine Manager (SCVMM) to install it. Here are the steps.
- Download and run the installer on the SCVMM server, and then restart SCVMM.
- Navigate to Switch Extension Managers on SCVMM and add the extension manager for sFlow.
- SCVMM automatically configures the sFlow agent based on the DNS-SD standard. However, you can also configure it manually, especially if you want to override one or more of the DNS-SD settings. Note that this configuration can't be changed later.
- Enable the extension and deploy it on the hosts.
With this, you're all set to collect metrics through sFlow. Next, let's see how you can manually configure the Host sFlow agent.
Configuring the Host sFlow Agent
Let's now see how you configure the Host sFlow agent across different environments.
Manual Configuration in SCVMM
Earlier, we talked about automatic configurations in SCVMM, and here are the steps if you want to manually configure them to suit your specific requirements.
- Choose the sFlow agent under Switch Extension Managers in SCVMM, right-click, and select “Properties”.
- Choose the extensions tab and click on sFlow agent. Next, click on the “Add Property” button at the bottom of the dialog box.
Here are some properties you can configure manually.
- SFLOW_DNSSD: You must change this value to “off”; otherwise, the agent will revert to the DNS-SD configuration.
- SFLOW_COLLECTOR: Enter the hostname or the IP address of the sFlow collector. This can be a single collector or a comma-separated list of collectors.
- SFLOW_POLLING: As the name suggests, this is the counter polling interval. The default value is 20 seconds.
- SFLOW_SAMPLING: This is the packet sampling rate. The default value is 1 in every 256 data packets.
With this, your manual configuration is ready for use.
Installation on Windows
By default, the sFlow agent will be configured using the DNS-SD configuration. However, the convenient aspect of installing sFlow on Windows is that you can change the configuration at any time after the installation. Simply head to the registry settings and change the value of the parameters you want. Finally, restart the sFlow service and the new registry parameter values will take effect.
Here's a look at the registry values you can change. Note that these instructions apply only to the hsflowd-win-<version>-x64.msi or hsflowd-win-<version>-x86.msi installers.
- Collector: The host to which the agent must send the sFlow data. It can be a hostname or an IPv4 or IPv6 address. You can also have multiple collectors, and they must be separated by commas. This parameter is required for manual configuration; if it isn't set, the DNS-SD configuration will take effect.
- samplingRate: The sampling rate N means that, on average, one in every N packets is sampled and sent to the collector. The default value is 400, which means one in every 400 packets will be sampled. You can set this value to 0 to completely disable packet sampling. This is an optional parameter.
- pollingInterval: This is another optional parameter that sets the interval between counter exports. The default value is 30 seconds.
- agent address: This is the IPv4 or IPv6 address that the agent uses to uniquely identify itself.
Setting these parameters can give you more control over how often and where your sFlow data is exported. You can also check the logs at %SystemDrive%\ProgramData\Host sFlow Project\Host sFlow Agent\hsflowd.log if you're running hsflowd as a Windows service.
Installation on Linux
On Linux systems, you can configure Host sFlow in the /etc/hsflowd.conf file. Below are some of the parameters you can configure if the DNS-SD configuration is turned off.
- Polling – This is the counter polling interval, and the default value is 30 seconds.
- Sampling – This parameter denotes the rate at which packets are sampled. It can also be set per link speed, written as sampling.<speed> = N. The default value is 400.
- HTTP Sampling – This value applies to HTTP, and by default, the value is 10.
- Collector – This is the collector's hostname or IP address, along with an optional port value. You can have as many collectors as you want, and the values you have set will apply to all of them.
Make the necessary changes to the above parameters to customize your sFlow sampling rate and destination.
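To make these parameters concrete, here is a small Python helper that renders a configuration in the general style of hsflowd.conf. Treat it as a sketch under assumptions: the function name is hypothetical, and the exact keywords (DNSSD, polling, sampling, sampling.http, collector, ip, udpport) have varied between Host sFlow releases, so compare the output with the sample file shipped with your version before using it.

```python
def render_hsflowd_conf(collectors, polling=30, sampling=400, http_sampling=None):
    """Build hsflowd.conf-style text for manually configured collectors.

    collectors is a list of (ip, udp_port) tuples; udp_port may be None
    to leave the port at the agent's default.
    Keyword names are assumptions; verify them against your hsflowd version.
    """
    lines = [
        "sflow {",
        "  DNSSD = off",
        f"  polling = {polling}",
        f"  sampling = {sampling}",
    ]
    if http_sampling is not None:
        lines.append(f"  sampling.http = {http_sampling}")
    for ip, port in collectors:
        lines.append("  collector {")
        lines.append(f"    ip = {ip}")
        if port is not None:
            lines.append(f"    udpport = {port}")
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Calling render_hsflowd_conf([("10.0.0.50", 6343)]) returns text with one collector block, which you could review and then write to the configuration file before restarting the agent.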
Before we end this section, let's take a brief look at DNS-SD, the default standard mentioned earlier.
What's DNS-SD?
DNS-based Service Discovery (DNS-SD) is a standard method of using ordinary DNS records to advertise and locate services on a network. In the context of sFlow, it lets agents use DNS to find their collectors and even receive default configuration information, such as the sampling rate and polling interval.
The biggest reason for using DNS-SD is that it builds on standard DNS, which helps with interoperability. Also, using DNS-SD, you just have to add a few lines to the zone file of the relevant DNS domain. This configuration is then picked up automatically by all the agents in that domain, which comes in handy, especially when you want to configure all the servers in a data center.
DNS-SD is simple to use and acts as a good alternative to configuring each device through the CLI or SNMP, both of which are more laborious in comparison. Also, you can use DNS-SD across physical, cloud, and virtual environments, and it's highly scalable as well.
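With DNS-SD, settings are typically published as key=value strings in a DNS TXT record, next to an SRV record that names the collector. The sketch below parses such strings into a settings dictionary. The record contents shown in the usage note (txtvers, sampling, polling) are assumptions based on commonly published sFlow examples, and the function name is hypothetical; the DNS lookup itself is out of scope here.

```python
def parse_sflow_txt(strings):
    """Turn TXT record strings such as "sampling=400" into a settings dict."""
    settings = {}
    for item in strings:
        key, separator, value = item.partition("=")
        if not separator:
            continue  # skip strings that are not key=value pairs
        value = value.strip()
        # Store numeric settings as integers and everything else as text
        settings[key.strip().lower()] = int(value) if value.isdigit() else value
    return settings
```

For instance, parse_sflow_txt(["txtvers=1", "sampling=400", "polling=30"]) yields a dictionary with the sampling rate and polling interval that an agent could apply.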
With this, we have covered extensively how you can install and configure Host sFlow across different environments. Next, let's talk about a few services that are helpful to handle sFlow monitoring data.
Third-Party Tools for sFlow
The above-mentioned configuration steps may seem easy to people who are familiar with systems and registry files. But what about those who are still learning the ropes? Sometimes, even experienced sysadmins prefer to use apps with intuitive interfaces through which they can handle sFlow monitoring data.
To cater to these varying groups of people, many companies have developed platforms for handling sFlow monitoring, and below are some of the popular ones.
Paessler PRTG
PRTG's sFlow monitoring capabilities capture the data packets and display them on a dashboard for easy viewing. It even sends notifications when specific events occur or in case of any malfunction. In all, it optimizes your network and provides the visibility you need. You can start with a 30-day free trial and up to 100 free sensors indefinitely.
ManageEngine NetFlow Analyzer
ManageEngine's NetFlow Analyzer is another comprehensive choice for an in-depth network traffic analysis. It supports sFlow and other leading flow technologies and helps you get to the root cause of the problem quickly. Moreover, its forensics data also help with capacity planning and budgeting.
Wireshark
Wireshark is an open-source packet analyzer tool for identifying patterns in network traffic and using them for troubleshooting and analysis. It is simple to use and stores data for offline analysis when needed.
SolarWinds NetFlow Traffic Analyzer
SolarWinds NetFlow Traffic Analyzer monitors and analyzes the flow of network traffic to understand your network's performance. Using this information, you can better optimize traffic and bandwidth usage. Furthermore, its advanced reporting capabilities also help with internal auditing and compliance.
All the above tools provide deeper insights into your network using the sFlow technology. The exact choice depends largely on your organization's specific needs, environment, integration with the existing stack, and other pertinent factors.
Final Thoughts
In all, many organizations use flow-based monitoring to analyze and understand traffic patterns and, in the process, protect the network and devices from imminent attacks. Though there are different flow technologies, we focused on sFlow, where packets are randomly sampled and their contents are examined to understand network traffic patterns. These samples are sent to other tools for further analysis through the Host sFlow agent, an open-source implementation of the sFlow standard. Next, we talked in depth about this agent and how you can install and configure it across different environments. Lastly, we looked at third-party tools that can collect sFlow data and analyze it to provide the insights you need for troubleshooting and protection.
We hope all this information comes in handy for you to set up sFlow monitoring in your system and make the most of the data it provides.
For more such user guides, browse through www.ittsystems.com.