Saturday, November 12, 2011

Enterprise solution for Logging

This blog and its companion blog are about building example solutions for a distributed architecture. The whole idea behind a distributed architecture is to loosely couple expert systems together using a messaging infrastructure. As examples for expert systems we use Notifications and Logging because almost all projects include them, and as such they are great candidates for being transformed into expert systems. This blog focuses on Logging.
Almost all development projects include a log solution. Projects typically choose log4j or log4net or a custom solution. For large companies these log solutions can cause problems. The benefits to logging is that they contain information about the running of an application, the work it has done and the problems it has encountered along the way. This information is valuable to the development teams to learn how the application behaved and what needs to be done to improvement it. This great benefit can potentially become a liability. In the wrong hands this information can be used to discover the applications vulnerabilities. Log systems also typically generate volumes of content. In an enterprise with a great many applications under management, this volume of information can be excessive and very expensive to store, organize and generally manage. Just making sure that developers configure their application according to the corporations standards requires many meetings and a lot of coordination. Disk capacity for Logs has to be ensured to avoid failure. Ultimately the typical log solution tightly couples the system to a component that is not mission critical.
In conclusion, logging needs to be a managed process, that is standardized, secured and monitored and decoupled from the primary application, leaving it do what it needs to do without having to worry about something that is not central to its task. This way each environment can be set up according to the needs of that environment. A development environment will save all log events, successive environments will log less until the product reaches production at which point only the most important messages are logged. Having a centralized logging system ensures that all environments behave in the same way, there's no need for coordination, app developers are given components to integrate with and are not allowed to write to disk. An expert system is required. As this is series on distributed expert systems, let's look at a solution that uses an advanced messaging systems at it's core. As you can see from the image, the solution will end up including a notification system, but that will be described in another blog.

Applying the pattern to Logging

We've discussed advanced messaging before, as well as the ever so reliable product RabbitMQ that is based on the distributed telecom language Erlang. The message bus solution can be clustered and therefore scales well, it is very fast and reliable. It is also available as a cloud solution from VMWare. The advantage of using a message bus between an application and an expert system is that it completely decouples the two, there's one universal solution to logging, one standard that is set by the corporation and can include any number of components to ensure that the content is secured and far away from 'prying eyes'.
Putting such a system together is very easy. All it requires is a very simple Facade that simply hands off the logging functionality to the message bus to be picked up by the 'Expert' logging system. That expert system, at the very least is as complicated as the original solution (that was implemented umpteen times across all applications), but now that this is centralized we can focus a little more attention to making sure the content is secured - both in terms of content, as well as access.


Before I start to code a solution, I usually try to define my namespaces by looking at my layer diagram. Logging is a utility, and there will be many other utilities, notifications, exception management, session management, caching, wmi instance logging, the list goes on. Most of these utilities are candidates for standardization even going so far as to create dedicated expert systems for. So a utilities namespace makes sense at a root level. The MessageBus is part of the infrastructure, much like a Database server is or web services that we may consume. So the infrastructure namespace will also exist off the root. The expert systems that provide these type of plumbing services I have aggregated into a 'Monitor' namespace, not sure if that's the right name, but it will do for now.
The layer diagram looks like this.

the logging interface, for the purposes of this example we define an extremely simple interface
namespace YEA.Utilities.Logging
    public interface ILog
        void write(string message, LogType level);
The Interface is implemented by a Logger class that implements the write method as follows.
        public void write(string message, LogType level)
            var sc = new LogToMessageConverter(level);
            var publisher = _gf.GetPublisher(_publisherName, sc.ConvertObjectToMessage);
And the serialization of the log message implemented by the following:
public static class LogPropertiesStandards
        public const string LogLevelPropertyName = "LogLevel";
    public class LogToMessageConverter : YEA.Infrastructure.Gateway.Converter.StringToMessageConverter
        public LogType LogLevel { get; set; }
        public LogToMessageConverter(LogType level)
            LogLevel = level;
        public override Message ConvertObjectToMessage(RabbitMQ.Client.IModel channel, object packetToSend)
            Message message = base.ConvertObjectToMessage(channel, packetToSend);
            message.Properties.Headers.Add(LogPropertiesStandards.LogLevelPropertyName, (int)LogLevel);
            message.RoutingKey = UtilitiesHelper.getMyProgramFullName();
            return message;
The converter derives from the simple string to message converter (line 5) and we get the message by calling the base class conversion method ( line 14). All that remains is to set a few properties and the routing key appropriately. The routing key is essentially the application namespace of the calling executable. The log directories on the receiving side will replicate the application namespace. If the namespace follows industry standards, the log files of each department will be located in the same subtree and securing and granting access to the subtree is simplified.
namespace: Company.Organization.Department.BusinessUnit.ApplicationGroup.Application will result in
directory: <drive>:Company/Organization/Department/BusinessUnit/ApplicationGroup/Application.Log
The expert logging system uses log4net with dynamic configuration to save log entries to the directories show above. here's a code extract of the important bits.
using log4net;
using log4net.Repository.Hierarchy;
using log4net.Core;
using log4net.Appender;
using log4net.Layout;
using System;
using System.IO;
using System.Configuration;
using System.Text;

namespace YEA.Monitor.LogManager
    public static class Logger
        private const string LOG_PATTERN = "%d [%t] %-5p %m%n";
        private const string ConfigDriveKey = "LogDrive";
        private static char Drive;

        static Logger()
            Drive = GetDriveLetter();
            log4net.LogManager.GetRepository().Configured = true;
        private static char GetDriveLetter()
            var driveLetter = ConfigurationManager.AppSettings[ConfigDriveKey];
            if (driveLetter == null)
                throw new ConfigurationErrorsException(string.Format("Configuration key: {0} is expected", ConfigDriveKey));
            return Convert.ToChar(driveLetter);

        static void CreateDir(string name)
        private static string ConvertToPath(string name)
            if( string.IsNullOrWhiteSpace(name) )
                throw new ArgumentException("A valid name has to be supplied to the logging application");
            var names = name.Split('.');
            if (names.Length == 1)
                return Drive + @":\InCorrectNamespaceDump\" + name + ".log";

            var builder = new StringBuilder(Drive + @":");
            for (int i = 0; i < names.Length - 1; i++)
            return builder.ToString();
        static IAppender CreateAppender(string repositoryName)
            PatternLayout patternLayout = new PatternLayout();

            patternLayout.ConversionPattern = LOG_PATTERN;

            var path = ConvertToPath(repositoryName);
            var builder = new StringBuilder(path);
            var fileName = builder.ToString();

            RollingFileAppender roller = new RollingFileAppender();
            roller.Name = repositoryName;
            roller.Layout = patternLayout;
            roller.AppendToFile = true;
            roller.RollingStyle = RollingFileAppender.RollingMode.Size;
            roller.MaxSizeRollBackups = 4;
            roller.MaximumFileSize = "100KB";
            roller.StaticLogFileName = true;
            roller.File = fileName ;
            return roller;

        public static ILog Get(string name)
            //var h = CreateHierarchy(name);
            var log = log4net.LogManager.GetLogger(name);
            var l = (log4net.Repository.Hierarchy.Logger)log.Logger;

            if (l.GetAppender(name) == null)
                l.Level = Level.All;
            return log;
The interesting bit in this implementation of log4net is that it has to dynamically create an appender as a new application (represented by it's namespace) starts sending log messages. The method CreateAppender does that and the Get method assigns the appender if the Logger it has retreived doesn't have one.
So what happens in this scenario if the log directory accidentally get's full? The 'expert system' will likely fail. But because it is loosely coupled to the application that has sent the log request, thanks to the messagebus that sits between them, the application isn't affected. In fact it isn't even aware of the failure. The messagebus will continue to collect log requests, queueing them up until the 'expert system' has resolved the issue, at this point all log requests are processed and it's as if the failure never happened. The message bus will gives the system the buffer ( depending on it's disk capacity) it needs to fix the inevitable temporary outages. The other advantage is that maintenance windows can be implemented without forcing all dependent applications to stow working. All of this adds to a more stable system, that can tolerate failure without causing a chain reaction of failures downstream potentially affecting mission critical applications.
The added advantage of decoupling the logging system from the applications that use it, is that there's now no direct relationship between the location of the log files and the location of the applications that generate them. The two could be located on completely separate networks or domains with one not giving a clue to the location of the other. That makes it that much easier to prevent unwanted eyes from gaining access to an applications vulnerabilities.
Here is where you can find the entire source code solution discussed in this blog

No comments:

Post a Comment