Publish date July 10, 2018
This is the second blog in the series on Azure E2E monitoring and its features. In this blog, we take a deep dive into some of the cool features of network performance monitoring, namely Network Performance Monitor (NPM). One essential feature, ExpressRoute monitoring, is now possible in NPM.
This blog covers what ExpressRoute is, its benefits and capabilities, why you should use it, and how easy it is to use.
Based on my experience, I have put together the following summary of the capabilities of NPM (Network Performance Monitor). NPM helps detect network issues such as traffic blocking and routing errors, issues that other network monitoring solutions usually fail to detect; I explain this in the following sections.
Azure ExpressRoute is used to create private connections between Azure data centers and infrastructure on-premises or in a co-location environment. For those new to ExpressRoute, the key point is that its connections do not go over the public Internet, so they offer faster speeds, lower latencies, and higher reliability than typical public Internet connections. In my experience, in some use cases, using an ExpressRoute connection to transfer data between on-premises systems and Azure can also give clients significant cost benefits.
With ExpressRoute, you establish a connection to Azure at an ExpressRoute location, or connect directly to Azure from your existing WAN, such as an MPLS (Multiprotocol Label Switching) VPN provided by a network service provider. In the picture below, you can see how a customer network connects to the ExpressRoute circuit over primary and secondary connections; on the other side, the circuit connects to Microsoft peering for public IPs and to Azure private peering for a virtual network.
Next, let me explain what Network Performance Monitor (NPM) is. NPM is a cloud-based hybrid network monitoring solution that helps you monitor network performance between various points in your network infrastructure, monitor network connectivity to service or application endpoints, and, most importantly, monitor the performance of your Azure ExpressRoute.
How to monitor connectivity to Azure VNets via ExpressRoute?
With the help of NPM, one can monitor the packet loss and network latency between on-premises locations. When we say on-premises locations, this typically includes office sites, branch offices, and data centers.
You select a threshold, typically a benchmark for acceptable loss or latency in your network. With NPM, you can then set proactive alerts that notify you when the threshold is reached or breached.
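The threshold idea can be sketched in a few lines of Python. This is only an illustration of the comparison behind such an alert, not NPM's API: the `Sample` type, the `breaches` function, and the threshold values are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One measurement for a network link (hypothetical structure)."""
    loss_percent: float
    latency_ms: float


def breaches(sample, max_loss_percent, max_latency_ms):
    """Return the names of the metrics that reached or exceeded their threshold."""
    breached = []
    if sample.loss_percent >= max_loss_percent:
        breached.append("loss")
    if sample.latency_ms >= max_latency_ms:
        breached.append("latency")
    return breached


# Assumed thresholds for illustration: 1% loss, 50 ms latency.
healthy = breaches(Sample(loss_percent=0.5, latency_ms=20.0), 1.0, 50.0)
degraded = breaches(Sample(loss_percent=2.0, latency_ms=80.0), 1.0, 50.0)
```

An alert would fire whenever the returned list is non-empty; in NPM itself you configure this in the portal rather than in code.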
The network state recorder lets you view both real-time values and historical trends. Suppose there was slowness or packet loss on a particular day; with the recorder, you can recall the network state from that day and see what went wrong. This is useful for root cause analysis (RCA) and for catching transient faults.
How can you get end-to-end visibility into your ExpressRoute connections?
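To make the idea of recalling an earlier day's state concrete, here is a minimal Python sketch. It assumes the history is simply a list of timestamped latency readings; the `samples_on` function and the data layout are hypothetical and do not reflect how NPM stores its data.

```python
from datetime import date, datetime


def samples_on(history, day):
    """Return the (timestamp, latency_ms) entries recorded on the given calendar day."""
    return [(ts, latency) for ts, latency in history if ts.date() == day]


# Made-up readings for two days.
history = [
    (datetime(2018, 7, 9, 10, 0), 12.0),
    (datetime(2018, 7, 9, 14, 30), 95.0),  # a transient spike
    (datetime(2018, 7, 10, 9, 0), 11.0),
]

yesterday = samples_on(history, date(2018, 7, 9))
worst = max(latency for _, latency in yesterday)
```

Looking at `yesterday` shows the spike even though the current reading is back to normal, which is the kind of transient issue a state recorder helps you catch.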
With an Azure workload, it is generally difficult to identify the latency bottleneck, since an ExpressRoute connection has many components: the on-premises network, ExpressRoute circuits, local edge routers, VNets, public peering, O365 services, and ISPs.
With NPM's interactive topology view, you get E2E visibility: you can view each component and the latency contributed by each hop to identify the troubled segment. I am sure your application's business owner always wants to know where the issue is and what it is. Once you identify the faulty piece, you can rectify it as soon as possible.
The image below gives a quick topology view of how the Azure VM on the left connects to the on-prem VM over the primary and secondary ExpressRoute connections. Nine on-premises hops (shown by dashed lines) are initially compressed.
You can expand the map and choose to view all on-premises hops to understand the latency introduced at each hop.
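Finding the troubled segment from per-hop latency can be sketched as follows. This assumes you already have a latency figure for each hop, as the topology view shows; the hop names, the numbers, and the functions `slowest_hop` and `latency_share` are hypothetical and only illustrate the reasoning.

```python
def slowest_hop(hops):
    """Return the (name, latency_ms) pair with the highest latency."""
    if not hops:
        raise ValueError("no hops given")
    return max(hops, key=lambda hop: hop[1])


def latency_share(hops):
    """Return each hop's fraction of the total path latency."""
    total = sum(latency for _, latency in hops)
    if total == 0:
        return {name: 0.0 for name, _ in hops}
    return {name: latency / total for name, latency in hops}


# Made-up path from an on-prem VM towards Azure.
path = [
    ("on-prem router", 2.0),
    ("edge router", 3.0),
    ("ExpressRoute circuit", 15.0),
]

culprit = slowest_hop(path)
shares = latency_share(path)
```

Here the circuit contributes three quarters of the total latency, so that is the segment to investigate first.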
NPM helps diagnose many circuit connectivity issues. A few recent ones I would like to point out are as follows:
The circuit is down; performance degrades due to peak utilization; sometimes traffic does not flow through the primary circuit at all.
As soon as connectivity between your on-prem resources and your VNet is lost, you get a notification. This lets you act quickly and proactively before any tickets are raised or end users escalate. In the illustration below, the red marks are unidentified networks whose traffic is not passing through any circuit.
In my experience, this usually happens in a network that has traffic routing issues. For example, your primary circuit may be down, so traffic that is set to route automatically starts flowing via the backup route. This is normally termed traffic not following the intended route. If the rerouting happens automatically, well and good; but if it is manual, it leads to downtime. With NPM, you can now set an alert and proactively resolve the configuration issue.
The alerts above help you understand bandwidth utilization on each VNet.
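The "traffic not following the intended route" check boils down to comparing the circuit a path actually uses with the one it is supposed to use. The Python sketch below illustrates that comparison only; the circuit names and the `route_status` function are hypothetical, and NPM performs this detection for you.

```python
def route_status(observed_circuits, primary, secondary):
    """Classify a path by which circuit its traffic was observed on."""
    if primary in observed_circuits:
        return "primary"
    if secondary in observed_circuits:
        return "failover"  # traffic is flowing, but not on the intended route
    return "no-circuit"    # traffic is not passing through any circuit


# Made-up circuit names for illustration.
normal = route_status({"er-primary"}, "er-primary", "er-secondary")
rerouted = route_status({"er-secondary"}, "er-primary", "er-secondary")
lost = route_status(set(), "er-primary", "er-secondary")
```

A status of "failover" or "no-circuit" is what you would want an alert on, so the configuration can be fixed before users notice.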
Similarly, you can get details on Performance Monitor (PM) and Service Endpoint Monitor (SEM) with these features. To know more about E2E monitoring of MS Azure, get in touch with the YASH advisory and cloud services team today.
Image credit: MS Azure.
Shiv Kishan Suthar -Technical Architect- IMS @ YASH Technologies