Sunday, November 6, 2011

Introducing Advanced Messaging

History:
JPMorgan Chase developed the Advanced Message Queuing Protocol (AMQP) to reinvigorate messaging technologies that had become stagnant. This opened the door for many other companies to enter the industry: Red Hat, Microsoft, Novell, Credit Suisse, Deutsche Börse Systems and Goldman Sachs, as well as a number of smaller companies such as IONA Technologies, 29West and Rabbit Technologies.
AMQP has since been taken over by the AMQP working group, which includes companies both large and small. More information can be found at the AMQP website.


Advanced messaging is taking off. Judging by the list of large companies embracing this technology, I'd say that AMQP is here to stay and it's time to get familiar with it.


What is AMQP?
AMQP is a wire-level protocol: it describes the data going over the wire, much as SOAP or RMI do. The intention is for AMQP to promote interoperability between messaging systems.
It also provides advanced publish and subscribe features, which this blog is going to introduce.


What can we do with AMQP?





Advanced messaging introduces exchanges and routing/subscription keys to messaging. A publisher transmits messages that contain routing keys to an exchange, and a queue is bound to an exchange using subscription keys. This separates the publisher from the queue and from the consumer and allows for a number of interesting implementation variations.


Let's cover these new features in a little more detail.
Exchanges, Queues and Bindings
Exchanges are essentially routers: they determine which queue or queues will receive a message. Queues are bound to exchanges using subscription keys, and the exchange decides which queue to send a message to by comparing the message's routing key with the subscription key of each queue binding. When a match is made, the message is passed to the queue. This architecture allows copies of a message to be delivered to several queues at once, one for each bound queue whose key matches.
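
The exchange, queue and binding relationship can be sketched in a few lines of plain Python. This is only an in-memory toy, not an AMQP client: the names Exchange, bind and publish are made up for illustration, and it assumes exact key equality, leaving the wildcard matching for the section on keys.

```python
from collections import defaultdict, deque


class Exchange:
    """Toy exchange: copies a message to every queue bound with an equal key."""

    def __init__(self):
        # binding key -> list of queues bound with that key
        self.bindings = defaultdict(list)

    def bind(self, queue, key):
        self.bindings[key].append(queue)

    def publish(self, routing_key, body):
        # Each matching queue gets its own copy; with no match nothing is queued.
        for queue in self.bindings.get(routing_key, []):
            queue.append(body)


orders = Exchange()
warehouse, archive = deque(), deque()
orders.bind(warehouse, "order.UK")
orders.bind(archive, "order.UK")
orders.bind(archive, "order.US")

orders.publish("order.UK", "order 1")   # reaches both queues
orders.publish("order.US", "order 2")   # reaches the archive only
```

Note that the publisher never names a queue. It only knows the exchange and the routing key, which is what keeps it separate from the consumers.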


Routing and Subscription keys.
Routing keys are composed of dot-separated strings; each string is considered a level in the key hierarchy.
Subscription keys are composed the same way but may include two wildcard characters:
* matches exactly one string at one level. You can use more than one of these in a subscription key.
# matches zero or more levels from that point down.
Combinations of * and # are allowed.
For example, the following table illustrates how subscription keys match routing keys.



routing key                 matches order.#    matches order.UK.*
order.UK.store1             yes                yes
order.US.store1             yes                no
order.CA.store2             yes                no
order.UK.store2             yes                yes
order.UK.store2.shipment    yes                no
shipment.UK.store2          no                 no
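
The matching rules can be written down as a small function. This is a sketch in plain Python rather than anything taken from a broker, and topic_matches is a name I made up. It assumes that * stands for exactly one level and that # stands for zero or more levels.

```python
def topic_matches(subscription_key, routing_key):
    """Return True if a dot-separated subscription key matches a routing key."""

    def match(pattern, words):
        if not pattern:
            # Pattern used up: it is a match only if the key is used up too.
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' may absorb zero or more of the remaining levels.
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        # '*' accepts any single level; anything else must match literally.
        return (head == "*" or head == words[0]) and match(rest, words[1:])

    return match(subscription_key.split("."), routing_key.split("."))


for key in ("order.UK.store1", "order.UK.store2.shipment", "shipment.UK.store2"):
    print(key, topic_matches("order.#", key), topic_matches("order.UK.*", key))
```

The recursion tries every possible length for a #, which is fine for keys with a handful of levels. A real broker is free to use something smarter.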


A queue can be bound to multiple exchanges and to each exchange with multiple subscription keys. That allows us almost limitless variations in how we can bind a queue to exchanges.


E-Commerce Gateway.


Let's review an example of a retail system. The front end is a point of sale system, an e-commerce site or a wholesale order entry system. It's an international company with retail operations all over the world.


Regardless of the system, all orders are sent to the order exchange, and queues are bound to the exchange so that they pick up what they're interested in.








In the above example we have three e-commerce systems transmitting orders to the order exchange. The e-commerce applications are unaware of the queues that are bound to the exchange, and of what will be done with the orders once they're put in the exchange. Depending on the business need, some systems pick an order up immediately. The warehouse management system is one of them: the business wants to account for revenue as fast as possible and can only do so when the order ships, so the warehouse is set up to handle orders for its region as soon as they arrive on the queue. The accounting system doesn't have this sense of urgency. It recognizes revenue on a daily basis and picks the orders up twice a day, once at noon and once just before the end of the business day. Not shown is an email marketing system that listens to all orders and sends order confirmation emails on the hour. There is also an archiving system that archives all messages as they arrive, for auditing purposes.


The beauty and elegance of this system is that many systems can be tied together without impeding each other in any way. The failure of one component does not impact another: messages continue to be queued, and once the failed system comes online again, processing starts where it left off with no loss of information. Integration is simple and the system is robust.
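
To make the "starts where it left off" point concrete, here is a small in-memory sketch. It is not broker code: the Consumer class, the publish helper and the online flag are all hypothetical, and a deque stands in for a durable queue.

```python
from collections import deque


class Consumer:
    """Toy consumer that owns a queue and can be taken offline."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()    # messages wait here until they are processed
        self.online = True
        self.processed = []

    def drain(self):
        # Process everything waiting, oldest first; do nothing while offline.
        while self.online and self.queue:
            self.processed.append(self.queue.popleft())


def publish(consumers, order):
    # The publisher only hands the order over; each queue gets its own copy.
    for consumer in consumers:
        consumer.queue.append(order)


warehouse = Consumer("warehouse")
accounting = Consumer("accounting")
systems = [warehouse, accounting]

accounting.online = False             # accounting goes down for maintenance
for order in ["order 1", "order 2", "order 3"]:
    publish(systems, order)
    warehouse.drain()                 # the warehouse keeps working meanwhile

backlog = len(accounting.queue)       # orders held while accounting was away
accounting.online = True
accounting.drain()                    # picks up where it left off
```

The same drain call also covers the accounting system's twice-a-day schedule: nothing forces a consumer to read its queue the moment a message arrives.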


RPC Mode.








AMQP has another feature that allows us to use it in an 'RPC' mode. The client transmits a request to the exchange and multiple listeners pick it up. This allows us to distribute load amongst a variable number of programs and have them respond to the client using a 'reply_to' queue.


The 'reply_to' queue can be made exclusive to the client, so that it only lasts as long as the client is connected to the AMQP broker. Each request message names a 'reply_to' queue, and the remote process responds to the client by placing a message on that queue. The client associates the response with the request using a correlation_id. The remote process has to make sure that it copies the request's correlation_id onto its response message and responds using the 'reply_to' queue.


In the example above, the query is distributed amongst several search engines. For this design to be effective, the searchable space has to be divided into many smaller pieces. Each search engine queries the part of the searchable space it is responsible for, ranks the results and responds to the client. The client collects the responses from the various engines, sorts them appropriately and returns the results to its own client.
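
The reply queue and correlation_id mechanics of the search example can be sketched as follows. Everything here is an assumption made for illustration: the SHARDS data, the dictionary of reply queues standing in for the broker, and the search and handle_request functions. A real client would go through an AMQP library instead.

```python
import uuid
from collections import deque

# Hypothetical shards: each search engine owns part of the searchable space.
SHARDS = {
    "engine-a": {"amqp intro": 0.9, "soap basics": 0.2},
    "engine-b": {"amqp routing keys": 0.7},
}

reply_queues = {}  # reply queue name -> deque, standing in for the broker


def search(term):
    """Client side: send one request, then merge the replies that belong to it."""
    reply_to = "reply." + uuid.uuid4().hex       # exclusive queue for this client
    correlation_id = uuid.uuid4().hex
    reply_queues[reply_to] = deque()
    request = {"body": term, "reply_to": reply_to, "correlation_id": correlation_id}

    for engine in SHARDS:                        # every engine receives a copy
        handle_request(engine, request)

    hits = []
    while reply_queues[reply_to]:
        reply = reply_queues[reply_to].popleft()
        if reply["correlation_id"] == correlation_id:   # ignore unrelated replies
            hits.extend(reply["body"])
    del reply_queues[reply_to]                   # the queue goes away with the client
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def handle_request(engine, request):
    """Engine side: search its own shard and reply on the queue the request names."""
    matches = [(doc, score) for doc, score in SHARDS[engine].items()
               if request["body"] in doc]
    reply_queues[request["reply_to"]].append(
        {"body": matches, "correlation_id": request["correlation_id"]}
    )


results = search("amqp")
```

Because this toy runs in a single thread, every engine has answered by the time the client reads its reply queue. With real, independent processes that is not guaranteed.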


The only drawback that I can see is that the client doesn't know when the response is complete. There is no 'end of response' marker to let the client know that all respondents have answered. One way around that is for the client to know how many responses to expect and simply count until all have been received, but at that point the client is strongly coupled to the number of processes behind the message wall. Another way is for the apps to coordinate amongst each other and determine who is the last to respond. The process that responds last adds an 'end of response' marker that the client recognizes. This approach is overly complex for my taste. Perhaps the simplest way out is to have one process handle the client's request, so that its response is by definition the only response the client will receive. The problem with all of these solutions is that the system is no longer fault tolerant or robust, and implementing a web service call or something similar probably makes more sense.
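
For what it's worth, the counting approach looks roughly like this. The deadline is my own addition and not part of the approach as described: it keeps the client from waiting forever when a respondent is down. collect_replies is a hypothetical helper, and Python's queue.Queue stands in for the reply queue.

```python
import queue
import time


def collect_replies(reply_queue, expected, timeout):
    """Gather replies until `expected` have arrived or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    replies = []
    while len(replies) < expected:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                     # out of time, return what we have
        try:
            replies.append(reply_queue.get(timeout=remaining))
        except queue.Empty:
            break                     # nothing more arrived before the deadline
    # The second value tells the caller whether the answer is complete.
    return replies, len(replies) == expected


reply_queue = queue.Queue()
for engine in ("engine-a", "engine-b"):
    reply_queue.put(engine + ": results")

# Three engines were expected but only two answered.
replies, complete = collect_replies(reply_queue, expected=3, timeout=0.2)
```

The coupling problem is still there: expected has to come from somewhere, and the client is the one that has to know it.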


Parallel processing.
The simplest implementation of advanced messaging is one we're familiar with: parallel task distribution.
The producer sends messages to an exchange, a queue is bound to it, and multiple consumers listen to the queue and process the messages. The number of consumers listening to the queue can be varied. Each message goes to only one of the consumers, so they work through the queue in parallel and the system can make short work of it.
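
Here is a minimal sketch of that pattern using threads and Python's standard queue module in place of a broker. The run_workers function and its parameters are made up for this example.

```python
import queue
import threading


def run_workers(tasks, worker_count, handler):
    """Process `tasks` from one shared queue with `worker_count` consumers."""
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    results = []
    lock = threading.Lock()

    def consume():
        while True:
            try:
                task = work.get_nowait()   # each task goes to exactly one consumer
            except queue.Empty:
                return                     # queue drained, this consumer stops
            outcome = handler(task)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=consume) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


squares = run_workers(range(10), worker_count=3, handler=lambda n: n * n)
```

Changing worker_count is the only thing needed to add or remove consumers. The results come back in whatever order the consumers finish, so sort them if order matters.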

4 comments:

  1. Would it make sense to use messaging for the different components WITHIN the ecommerce system (catalog, interface with payment gateway, etc.)?

    1. I think it does. Messaging decouples those components from one another and that has a stabilizing effect. When one component is taken offline for maintenance, for example, the others aren't impacted because the middle layer (AMQP) stores those messages until the component comes back up. Without it, taking a component offline typically causes cascading failures, or requires all system owners to agree to take their systems offline for a specific amount of time.

      I hope this answers your question.

  2. It does. Nice articles and good timing. I'm researching building a payment system that interfaces with a payment gateway, and how messaging plays a role in that. Now I just need to find out what additional things NServiceBus/MassTransit provide on top of RabbitMQ. My devSense (similar to Spidey Sense) is telling me that RabbitMQ itself will suffice. Keep up the good work, someone is reading. :)

    1. Thanks. I agree that RabbitMQ will probably be enough. If you need to bounce ideas, feel free to do so.
