Performance Counters Added to MassTransit

One feature that is often overlooked in software development is the output of information that operations can observe once the application is in production. Fortunately, many open source projects use log4net to provide a configurable level of runtime information that helps in figuring out why a system is behaving a certain way (and face it, if you’re looking, it’s more than likely behaving badly). Logging, however, is only one view into an application, and one that might not deliver the appropriate information in a useful way.

Anyone who has used a computer with any interest is familiar with system monitoring tools. Task Manager (or, if you’re really cool, Process Explorer) is the first place Windows users look when their system starts to crawl, Mac users turn to Activity Monitor, and I’m sure Linux users have some really obscure command-line tool as well. These coarse-grained tools are usually enough for users. An operations team, however, needs a higher degree of visibility into the application, particularly if they are expected to determine how to tune it for better performance.

For operations on Windows, Performance Monitor provides detailed information for running applications in real-time. On a web server, it is easy to find out how many threads your ASP.NET application is using, as well as how many requests are queued. That information can be correlated with processor utilization to help determine if the bottleneck is the CPU, the network, or possibly even the database server. When it comes to troubleshooting issues on a live system, more information is always helpful to determine the source of the problem.

To support this level of visibility in MassTransit, performance counter support has been added. Performance counters in .NET are part of the System.Diagnostics namespace. There are various counter types that can be defined, including counts, rates, and averages. When an application wants to output performance counters, it has to create a category and specify the counters that are included in the category. For instance:

ConsumerThreadCount = new RuntimePerformanceCounter("Consumer Threads",
	"The current number of threads processing messages.",
	PerformanceCounterType.NumberOfItems32);

ReceiveRate = new RuntimePerformanceCounter("Received/s",
	"The number of messages received per second",
	PerformanceCounterType.RateOfCountsPerSecond32);

These are two of the counters defined by the MassTransit category. The first is a count that is updated when the number of threads in use changes. The second is a rate which gets incremented once for every message received. The actual calculation and display of the rate is handled by the performance monitoring tools – the application does not need to calculate it.
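To make that division of labour concrete, here is a minimal sketch of the arithmetic a monitoring tool performs for a rate counter: it samples the raw count twice and divides the change by the elapsed time. The `RateMath` class and `CountsPerSecond` method are hypothetical names used only for illustration; they are not part of MassTransit or System.Diagnostics.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class RateMath
{
    // Rate = (change in the raw count) / (seconds elapsed between the two samples).
    public static double CountsPerSecond(long olderCount, long newerCount, TimeSpan elapsed)
    {
        if (elapsed.TotalSeconds <= 0)
            return 0.0; // no meaningful interval, so report no rate

        return (newerCount - olderCount) / elapsed.TotalSeconds;
    }
}
```

If the raw count moves from 100 to 250 over five seconds, the tool would display 30 messages per second. The application only ever calls `Increment()`.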

ConsumerDuration = new RuntimePerformanceCounter("Average Consumer Duration",
	"The average time a consumer spends processing a message.",
	PerformanceCounterType.AverageCount64);

ConsumerDurationBase = new RuntimePerformanceCounter("Average Consumer Duration Base",
	"The average time a consumer spends processing a message.",
	PerformanceCounterType.AverageBase);

These two counters together report the average time a consumer spends on a message. An average requires a pair of counters: the counter itself, which accumulates the measured value, and a base, which counts occurrences. So for each message, the base is incremented once and the duration counter is incremented by the amount of time spent executing the consumer. The monitoring tool divides the change in the counter by the change in the base to display the average.
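As a rough sketch of how the counter and its base combine, the calculation a monitoring tool applies between two samples looks like the following. `AverageMath` and `AveragePerOperation` are hypothetical names chosen for this example, not an API from MassTransit or the .NET framework.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class AverageMath
{
    // Average = (change in the accumulated value) / (change in the base count).
    public static double AveragePerOperation(
        long olderTotal, long newerTotal, long olderBase, long newerBase)
    {
        long operations = newerBase - olderBase;
        if (operations <= 0)
            return 0.0; // nothing happened between the samples

        return (double)(newerTotal - olderTotal) / operations;
    }
}
```

For three messages that take 10, 20, and 30 milliseconds, the duration counter rises by 60 and the base by 3, so the displayed average is 20 milliseconds.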

In adding performance counter support, I wanted to do it in a way that didn’t leak the details of updating performance information throughout the framework. It was at this point that I turned to the Magnum Pipeline. Using the pipeline to publish the metrics allowed me to isolate the actual performance counter interface to a single method in a single class for the service bus. So instead of passing interfaces around all the components that make up the bus, a single event aggregator is passed instead. When you start up the bus, the performance counter code subscribes to the events as shown:

_eventAggregatorScope.Subscribe<MessageReceived>(message =>
	{
		_counters.ReceiveCount.Increment();
		_counters.ReceiveRate.Increment();
		_counters.ReceiveDuration.IncrementBy((long) message.ReceiveDuration.TotalMilliseconds);
		_counters.ReceiveDurationBase.Increment();
		_counters.ConsumerDuration.IncrementBy((long) message.ConsumeDuration.TotalMilliseconds);
		_counters.ConsumerDurationBase.Increment();
	});

Now, when the bus receives a message, it sends the event to the event aggregator (an instance of the Magnum Pipeline):

var message = new MessageReceived
	{
		MessageType = messageType,
		ReceiveDuration = receiveTime,
		ConsumeDuration = consumeTime,
	};

_eventAggregator.Send(message);

Since the Magnum Pipeline is publish/subscribe, additional consumers can also opt in to the MessageReceived event and perform other actions. I also plan to add counters per message type, allowing a finer-grained view of message counts and consumer durations.
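The publish/subscribe idea can be sketched without Magnum at all. The `SimpleEventAggregator` below is a hypothetical, single-threaded stand-in written for this example; it is not the Magnum Pipeline API. The `MessageReceived` properties follow the names used above, but their types are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Shape of the event as used in the post; the property types are assumed.
public class MessageReceived
{
    public Type MessageType { get; set; }
    public TimeSpan ReceiveDuration { get; set; }
    public TimeSpan ConsumeDuration { get; set; }
}

// Hypothetical stand-in for an event aggregator. NOT the Magnum Pipeline API,
// and not thread-safe.
public class SimpleEventAggregator
{
    private readonly Dictionary<Type, List<Delegate>> _handlers =
        new Dictionary<Type, List<Delegate>>();

    // Register a handler that runs for every event sent as type T.
    public void Subscribe<T>(Action<T> handler)
    {
        List<Delegate> list;
        if (!_handlers.TryGetValue(typeof(T), out list))
        {
            list = new List<Delegate>();
            _handlers[typeof(T)] = list;
        }
        list.Add(handler);
    }

    // Deliver the event to each handler registered for exactly type T.
    public void Send<T>(T message)
    {
        List<Delegate> list;
        if (!_handlers.TryGetValue(typeof(T), out list))
            return; // nobody subscribed, so the event is dropped

        foreach (Delegate handler in list)
            ((Action<T>)handler)(message);
    }
}
```

With this shape, the performance counter code and, say, an audit logger can each call `Subscribe<MessageReceived>` independently, and the bus only needs to know about the aggregator's `Send` method.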

While the main story behind this post is the new counters available in MassTransit, my hope is that this brief introduction to performance counters was useful as well. You can learn more about performance counters from various articles that have been posted (such as a good one on CodeProject). You can check out the Magnum Pipeline in the Magnum project, which is hosted on Google Code.


About Chris Patterson

Chris is a senior architect for RelayHealth, the connectivity business of the nation's leading healthcare services company. There he is responsible for the architecture and development of applications and services that accelerate care delivery by connecting patients, providers, pharmacies, and financial institutions. Previously, he led the development of a new content delivery platform for TV Guide, enabling the launch of a new entertainment network seen on thousands of cable television systems. In his spare time, Chris is an active open-source developer and a primary contributor to MassTransit, a distributed application framework for .NET. In 2009, he was awarded the Most Valuable Professional award by Microsoft for his technical community contributions.
  • John

    Thanks for the informational article – and also for pointing to the article on CodeProject. All the examples I’ve seen use Performance Counters and Performance Monitor to show currently executing performance.

    What if I’m interested in collecting the same performance data for a production system but I don’t want to watch PerfMon all the time? Is there a way of capturing and logging the data collected by my Performance Counters so that I can analyze my system’s performance data periodically as opposed to in real-time?

  • Chris Patterson

    @John
    I’m pretty sure there are monitoring suites that do just that – capture the performance data for playback at some time in the future.

    I think PerfMan is one, but there are likely many others such as HP OpenView and so forth.

  • Jeff Smith

    Is there an example available of an implementation of this? I’m attempting to add these performance counters to our monitor but it’s not clear how you’re accessing the RuntimePerformanceCounter objects from the _counters collection when subscribing to the bus. I’m new to both MassTransit and performance counters so any help would be greatly appreciated.