Agents, Objects and Guaranteed Delivery

Well designed and implemented systems do not require guaranteed delivery. Guaranteed delivery may have some marginal value in point-to-point implementations but if message delivery reliability is important at a business level, this should be implemented at the business level.

All modern computing platforms display the same characteristics: they are large, complex, distributed and heterogeneous.

These systems comprise sets of co-operating computational agents, each of which executes specific code. These agents:

  • have limited visibility of the overall process
  • display limited computational capabilities
  • access data from multiple data stores, in numerous locations
  • execute asynchronously

Agents and objects differ, although they share many characteristics. Objects are computational entities that perform actions, or “methods”, and communicate by message passing. Objects, however, do not have control over their behaviour. When an object (O) is invoked to perform a method (M), it performs the required action. Such invocation may occur at any time.

When building a system by grouping objects into sets of co-operating entities, each object is designed and implemented within an architecture designed to display a given behaviour, deliver a specified business function, or generate a specific output. In such a system, it may be assumed that when O receives an invocation from O(n) requiring it to perform M, such invocation results in M being performed. This cannot be taken for granted in multi-agent systems. Typically these systems comprise large numbers of co-operating agents designed by numerous organisations over many years of development. These systems are heterogeneous, opaque, fragmented and brittle, and often display behaviour that was neither intended, designed, anticipated nor even thought possible.

In such a system there is no guarantee that agent (A) will perform M simply because such action has been requested. The only assurance that can be given is that if a request is sent, it is received.

Guaranteed Delivery

Guaranteed delivery, sometimes referred to as “guaranteed messaging” or “reliable messaging”, is a middleware function that seeks to provide three assurances: that a published message is received by a consuming application; that the consuming application(s) receive the messages in a determined order; and that each message is delivered once and only once.

Message Ordering

The mechanism used by guaranteed delivery for message ordering is first in / first out (FIFO). This is managed by attaching a sequencing number to each message, the consuming application returning a receipt, and these receipts being married with the originating message. This requires a unique ID to be assigned to each message and the messages being persisted until all receipts have been returned.
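The mechanism described above can be sketched as follows. This is a minimal illustration only, not any vendor's implementation: the `Publisher` class, its `pending` store and the message field names are hypothetical, and an in-memory dictionary stands in for real message persistence.

```python
import itertools
import uuid


class Publisher:
    """Hypothetical transport-level publisher that tracks receipts."""

    def __init__(self):
        self._sequence = itertools.count(1)  # FIFO sequencing numbers: 1, 2, 3, ...
        self.pending = {}                    # message ID -> persisted message

    def publish(self, payload):
        message = {
            "id": str(uuid.uuid4()),         # unique ID assigned to the message
            "seq": next(self._sequence),     # sequencing number attached to it
            "payload": payload,
        }
        self.pending[message["id"]] = message  # persist until a receipt arrives
        return message

    def on_receipt(self, message_id):
        """Marry a receipt with its originating message and release it."""
        return self.pending.pop(message_id, None) is not None


publisher = Publisher()
first = publisher.publish("debit account")
second = publisher.publish("credit account")

publisher.on_receipt(first["id"])  # only the first receipt has come back
outstanding = sorted(m["seq"] for m in publisher.pending.values())
```

Here `outstanding` holds the sequencing numbers still awaiting a receipt. Every published message stays persisted until its receipt is matched, which is the bookkeeping the rest of this paper argues against carrying at the transport level.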

This can raise a number of complexities, particularly when used across multi-agent systems interlinked by a heterogeneous mix of messaging environments. In this circumstance there is no certainty that each transport layer will transmit the messages in any required order, unless such order is mandated by the process logic. If it is not mandated, FIFO will sequence the execution events in the order in which the messages are received by the message bus.

Should message ordering be critical to the process, the preferred approach is to embed the required ordering as a permanent feature in the message semantics, rather than adding it as a transient feature at the messaging level. The production environment may then be monitored to ensure correct behaviour and should variant behaviour be identified, a notification may be generated.
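As a rough sketch of this approach, assume each message body carries a `process_id` and a `step` number defined by the business process. Both field names, and the `out_of_order_processes` helper, are hypothetical and chosen only for illustration. A monitor can then compare the order in which steps were observed with the order the process mandates, and raise a notification where they differ.

```python
from collections import defaultdict


def out_of_order_processes(observed_messages):
    """Return the processes whose steps were observed out of the mandated order.

    `observed_messages` is listed in the order the monitor saw the messages.
    """
    steps_seen = defaultdict(list)
    for message in observed_messages:
        steps_seen[message["process_id"]].append(message["step"])
    # A process is compliant when its steps were observed in ascending order.
    return {
        process_id: steps
        for process_id, steps in steps_seen.items()
        if steps != sorted(steps)
    }


observed = [
    {"process_id": "order-17", "step": 1},
    {"process_id": "order-18", "step": 1},
    {"process_id": "order-17", "step": 3},  # step 3 observed before step 2
    {"process_id": "order-18", "step": 2},
    {"process_id": "order-17", "step": 2},
]

for process_id, steps in out_of_order_processes(observed).items():
    print(f"notification: {process_id} executed steps in order {steps}")
```

Because the ordering lives in the message body, it survives any change of transport between the agents involved.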

If sequencing information is already embedded in the message payload, then transient sequencing information at the message bus level introduces a further complexity. Invariably the circumstance will arise where the sequencing attribute in the message header differs from that specified in the message body. The question then becomes: which is the correct sequencing?

Further complexity arises when a single agent initiates multiple concurrent, parallel actions across n co-operating agents. In this circumstance FIFO cannot function as a message ordering property.

Assigning a unique ID to each message also raises a number of difficulties. In heterogeneous systems, there is no assurance that this ID will survive as a process migrates from one messaging environment to another. To the contrary, it is more likely that a unique ID assigned by one vendor’s guaranteed delivery software will not survive the migration to another vendor’s messaging infrastructure. The result is both a loss of end-to-end process visibility and a breakdown of the guaranteed delivery functionality.

If event ordering is critical to a business process, this ordering should be embedded in the business logic, not added as a transient feature at the message bus – and should ordering be permanently mandated by the process logic, no value is added by introducing guaranteed delivery functionality.

Once and Only Once Delivery

Similar difficulties arise with guaranteed delivery “once and only once” functionality. If it is important at a business level to ensure that process state is determined by once only/exact delivery, adding the assurance that a message has been delivered to a consuming application is of little value.

In multi-agent systems there is no assurance that any agent will perform any action simply because it has received a message. At a business level the receipt required is not that a message has been delivered but that the agent is performing the action(s) required.

This again is a question of system behaviour and, as with message ordering, guaranteed delivery should not be used as a proxy for ensuring correct system behaviour. If once only/exact delivery is critical to a process, this should be determined in the process semantics and production monitored for correct execution.

Further complications arise should syntactic errors be contained in a message. In this circumstance, guaranteed delivery will return a receipt confirming the message has been delivered but this is of little significance if the consuming application rejects the request.

Even if the message is syntactically and semantically correct, what happens if the receiving application does not respond? Should the message be sent a second time? Resends do nothing more than raise the question of whether the second message is a duplicate or a new request.

This may be addressed by using unique identifiers. If each message represents a unique business activity, each should be assigned its own unique ID. This, again, is a business-level function, not something to be managed at the transport level.

If unique IDs are assigned at the business level these can then be used to identify and reject duplicated messages. If duplicated messages are being removed at the business level, this is a function of the process semantics and not a subsystem function.
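A minimal sketch of duplicate rejection at the business level follows. It assumes each request carries a business-assigned `payment_id`; that field name, the `PaymentAgent` class and the in-memory `processed` store are all hypothetical stand-ins.

```python
class PaymentAgent:
    """Hypothetical consuming agent that rejects duplicated business requests."""

    def __init__(self):
        self.processed = {}  # business ID -> amount already applied

    def handle(self, request):
        business_id = request["payment_id"]
        if business_id in self.processed:
            # A resend of an activity already performed: reject, do not repeat it.
            return "duplicate-rejected"
        self.processed[business_id] = request["amount"]
        return "accepted"


agent = PaymentAgent()
results = [
    agent.handle({"payment_id": "PAY-001", "amount": 250}),
    agent.handle({"payment_id": "PAY-001", "amount": 250}),  # resend of the same payment
    agent.handle({"payment_id": "PAY-002", "amount": 250}),  # new payment, same amount
]
print(results)  # ['accepted', 'duplicate-rejected', 'accepted']
```

The business ID is what distinguishes a resend from a new request for the same amount. A transport-level message ID cannot make that distinction, since a resend may well be given a fresh ID by the messaging layer.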

As with message ordering, little value is added by introducing only once/exact delivery at the messaging level. As with message ordering, if only once/exact delivery is critical to the business process, this should be determined by the process semantics and not managed at the transport level.

Logging and Auditing

Assigning transient sequencing information at the messaging level means that once the returned receipts have been married with their respective messages, this information is lost. Whilst it is possible to assign sequencing information detailing the order in which the messages emanated from the message bus, unless guaranteed messaging is implemented across the entire infrastructure this is of little value for logging and auditing purposes.

A preferred approach is to continuously monitor the production environment and extract the AS-USED state of the implementation. If it is observed that sets of execution sequences, across the entire environment, differ from those required, the non-compliant sequences may be isolated for investigation and diagnosis. The value extracted by this approach is not the marginal enhancement of identifying with certainty that a message has been delivered, but an audit trail of the execution paths actually followed, thus enabling continuous process improvement and optimisation.

Performance Cost

Guaranteed delivery introduces multiple components of production environment overhead. As every published message requires the return of a receipt, this doubles the number of messages on the infrastructure.

The added overhead of assigning sequencing information, assigning unique message IDs, persisting every message, marrying the receipts to the published messages, conducting consistency checks and performing all the other functions necessary for guaranteed messaging consumes multiple layers of compute resource, with an inevitable negative impact on system performance.

Performing these functions at the transport level will also have a higher impact than anticipated, for two reasons. Firstly, since the transport may be common across large numbers of agents, many of which may not need guaranteed delivery, these agents will be impacted anyway. Secondly, the lower-level transport layer will invariably lack the information available at the higher business level and so cannot perform the required functions as efficiently.

Whilst measuring this may be complex, the cause is simple: identifying with certainty that a message has or has not been delivered is of marginal value. The important information that needs to be extracted is not certainty of message delivery but certainty that the consuming application is performing the actions required.

A preferred approach is to copy each message to an environment entirely decoupled from production, with the message copy being transmitted to this environment via a dedicated message queue.

Once the decoupled environment is furnished with a copy of the messages, these can be sorted into their execution sequences enabling an AS-USED representation of system behaviour to be extracted. In terms of performance cost, the only impact on production performance using this approach is a minimal amount of CPU consumed to copy and transmit each message.
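The copy-and-sort approach might look like the sketch below. It is illustrative only: a standard-library `queue.Queue` stands in for the dedicated message queue, and the `publish` and `drain_as_used` functions and the `process_id` and `action` fields are hypothetical names.

```python
import queue
from collections import defaultdict

audit_queue = queue.Queue()  # stands in for the dedicated message queue


def publish(message, production_handler):
    """Deliver a message to production and send a copy to the decoupled environment."""
    audit_queue.put(dict(message))  # the copy is the only added production cost
    production_handler(message)


def drain_as_used(copies):
    """Sort the copied messages into one execution sequence per process."""
    sequences = defaultdict(list)
    while not copies.empty():
        message = copies.get()
        sequences[message["process_id"]].append(message["action"])
    return dict(sequences)


delivered = []  # stands in for the production consumers
publish({"process_id": "claim-9", "action": "register"}, delivered.append)
publish({"process_id": "claim-10", "action": "register"}, delivered.append)
publish({"process_id": "claim-9", "action": "pay"}, delivered.append)  # "assess" never ran
publish({"process_id": "claim-10", "action": "assess"}, delivered.append)
publish({"process_id": "claim-10", "action": "pay"}, delivered.append)

as_used = drain_as_used(audit_queue)
required_path = ["register", "assess", "pay"]
non_compliant = {p: path for p, path in as_used.items() if path != required_path}
```

In this example `non_compliant` isolates `claim-9`, whose execution path skipped a required action. All of the sorting and comparison takes place outside production, so the production path pays only for the copy.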

Summary

Well designed and implemented systems do not require guaranteed delivery. Guaranteed delivery may have some marginal value in point-to-point implementations, but if message delivery reliability is important at a business level, this should be implemented at the business level.

When dealing with heterogeneous multi-agent systems, continuous production environment monitoring will ensure the behaviour of the AS-USED state remains in compliance with delivery expectations. If non-compliant execution sequences are evident, these may be identified and isolated.

Guaranteed delivery does not ensure that system behaviour is as desired, nor should it be used as a proxy for doing so. Simply ensuring that a message transmitted by agent (A) is received by consuming agent (B) is of marginal value. The value to be extracted is visibility over the end-to-end processes. This enables system behaviour to be identified and continuous process compliance to be implemented.

That is, the value lies in process visibility, not message delivery certainty.
