Tag Archives: BPM

Business Process Monitoring

BPM over-promised and under-delivered

One Saturday night the other week I was typing away on a book that I’m working on (probably called The new instability. How cloud computing, globalisation and social media enable you to create an unfair advantage) and I let out what was probably a quite involved tweet without any context to explain it.

[Embedded tweet]

Recently I’ve been thinking about the shift we’re seeing in the business environment. The world seems pretty unstable at the moment. Most business folk assume that this is simply a transition between two stable states, similar to what we’ve seen in the past. This time, however, business seems to be unable to settle into a new groove. The idea behind the book is that the instability we’re seeing is now the normal state of play.

Since Frederick Taylor’s time we’ve treated business – our businesses – as vast machines to be improved. Define the perfect set of tasks and then fit the men to the task. Taylor timed workers, measuring their efforts to determine the optimal (in his opinion) amount of work he could expect from a worker in a single day. The idea is that by driving our workers to follow optimal business processes we can ensure that we minimise costs while improving quality. LEAN and Six Sigma are the most visible of Taylor’s grandchildren, representing generations of effort to incrementally chip away at the inefficiencies and problems we keep finding in our organisations.

This is the same mentality – incremental and internally focused, intent on optimising each and every task in our organisations – that we’ve used to apply technology to business. Departmental applications were first deployed to automate small repetitive tasks, such as tracking stock levels or calculating payrolls. Then we looked at the interactions between these tasks, giving birth to enterprise software in the process. Business Process Management (BPM) is the pinnacle of our more recent efforts – pulling in everything from our customers through to our suppliers to create optimal straight-through processes for our organisation to rely on.

Some vendors have taken this approach to its logical extreme, imagining (and trying to get us to buy) a single technology platform which will allow us to programme our entire business: business operating platforms [1]. They’re aligning elements in the BPM technology stack with the major components found in most computers under the (mistaken) assumption that this will enable them to create a platform for the entire business. Business as programmable machine writ large.

The problem, as I’ve pointed out before [2], is that:

Programming is the automation of the known. Business processes, however, are the management and anticipation of the unknown.

Business is not a computer, with memory, CPUs and disks, and the hope of creating an Excel with which we can play what if with the entire business is simply tilting at windmills.

The focus of business is, and always has been, problems and the people who solve them. Technology is simply a tool we’ve used to amplify these people, starting with the invention of writing through to modern SaaS applications and BPM suites. While technology has had a previously unimaginable impact on business, it can’t (yet) replace the people who solve the problems which create all the value. People collaborate, negotiate, and smash together ideas to find new solutions to old problems. Computers simply replicate what they are told to do.

We’ve reached Taylorism’s use-by date. Define the perfect task and fit the man to the task no longer works. The pace of business has accelerated to the point that the environment we operate in has become perpetually unstable, and this is pushing us to become externally focused, rather than internally focused. We’ve stopped worrying about collecting resources and now focus on our reactions to problems and opportunities as they present themselves. Computing (calculating payrolls, invoices, or gunnery tables) is less important as it can be obtained on demand, and we’re more concerned with the connections between ourselves and our clients, partners, suppliers and even our competitors. And we’ve shifted our focus from collecting ever more data, as it becomes increasingly important to ask the questions which enable us to make the right decisions and drive our business forward.

Success in today’s unstable environment means matching the tactic – the process – to the goal we’re trying to achieve and to our current environment, with different tactics being used in different circumstances. Rather than supporting one true way, we need to support multiple ways.

There have been some half-steps in the right direction, with the emergence of Adaptive Case Management (ACM) [3] being the most obvious. A typical case study for ACM might be something like resolving SWIFT payment exceptions. When the ACM process is triggered, a knowledge worker creates a case and starts building a context, pulling data in and triggering small workflows or business processes to seek out data and resolve problems. At some stage the context will be complete, the exception resolved, and the final action is triggered. Contrast this with the standard BPM case study, which is typically a compliance story. (It’s no surprise that regulations such as SOX drove a lot of business process work.) BPM is a task dependency tool, making it very good at specifying the steps in a required process, but unable to cope with exceptions.
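
To make the contrast concrete, here’s a minimal sketch of the case-centric shape described above (Python, with hypothetical names and stubbed lookups; it isn’t drawn from any particular ACM product): a case is a goal plus an accumulating context, and the knowledge worker, not the engine, decides which small fragments to run and when.

```python
# A minimal, hypothetical sketch of the ACM shape: a case is a goal plus an
# accumulating context, not a fixed sequence of steps.

class Case:
    def __init__(self, goal):
        self.goal = goal          # e.g. "resolve SWIFT payment exception"
        self.context = {}         # data the knowledge worker pulls in
        self.resolved = False

    def add_context(self, key, value):
        """Pull a piece of data into the case."""
        self.context[key] = value

    def run_fragment(self, fragment):
        """Trigger a small workflow fragment to fetch data or fix something."""
        self.context.update(fragment(self.context))

    def close(self):
        self.resolved = True


# Two illustrative workflow fragments (stubs standing in for real lookups).
def lookup_counterparty(context):
    return {"counterparty": "ACME Bank"}

def reconcile_amount(context):
    return {"amount_confirmed": True}


# The knowledge worker, not the engine, decides what to run and when.
case = Case("Resolve SWIFT payment exception")
case.add_context("payment_reference", "unknown")
case.run_fragment(lookup_counterparty)
case.run_fragment(reconcile_amount)
if case.context.get("amount_confirmed"):
    case.close()
```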

So what do we replace Taylorism’s catch-cry with? The following seems to suit, rooted as it is in the challenge of winning in a rapidly changing environment.

Identify the goal and then assemble the perfect team to achieve the goal.

Note: This was also posted on noprocess.org.

References

1. Ismael Ghalimi (2009), Introducing the Business Operating Platform, IT|Redux.
2. Business is not a programming challenge @ PEG.
3. Keith D. Swenson (2010), Mastering the Unpredictable, Meghan-Kiffer Press.

Michelangelo’s approach to workflow discovery

Take any existing workflow — any people-driven business process — and I expect that most of the tasks within it could best be described as cruft.

cruft: /kruhft/
[very common; back-formation from crufty]

  1. n. An unpleasant substance. The dust that gathers under your bed is cruft; the TMRC Dictionary correctly noted that attacking it with a broom only produces more.
  2. n. The results of shoddy construction.
  3. vt. [from hand cruft, pun on ‘hand craft’] To write assembler code for something normally (and better) done by a compiler (see hand-hacking).
  4. n. Excess; superfluous junk; used esp. of redundant or superseded code.
  5. [University of Wisconsin] n. Cruft is to hackers as gaggle is to geese; that is, at UW one properly says “a cruft of hackers”.

The Jargon File, v4.4.7

Capturing and improving a workflow (optimising it, even) is a process of removing cruft to identify what really needs to be there. This is remarkably like Michelangelo’s approach to carving David. When asked how he created such a beautiful sculpture, everything just as it should be, Michelangelo responded (and I’m paraphrasing):


Michelangelo’s David

David was always there in the marble; I just carved away the bits that weren’t David.

Cruft is the result of the people — the knowledge workers engaged in the process — dealing with the limitations of last decade’s technology. Cruft is the work-arounds and compensating actions for a fragmented and conflicting IT environment, an environment which gets in the road more often than it supports the knowledge workers. Or cruft might be the detritus of quality control and risk management measures put in place some time ago (decades ago, in many instances) to prevent an expensive mistake that is no longer possible.

Most approaches to workflow automation are based on some sort of process improvement methodology, such as LEAN or Six Sigma. These methods work: I’ve often heard it stated that pointing Six Sigma at a process results in a 30% saving, each and every time. They do this by aggressively removing variation in the process — slicing away unnecessary decisions, as each decision is an opportunity for a mistake. These decisions might represent duplicated decisions, redundant process steps, or unnecessarily complicated handoffs.

There are a couple of problems with this, though, when dealing with workflow. Looking for what’s redundant doesn’t create an explicit link between business objectives and the steps in the workflow, a link that would justify each step’s existence, so it’s hard to ensure that we’ve caught all the cruft. And the aggressive removal of variation can strip a process’s value along with its cost.

Much of the cruft in a workflow process is there for historical reasons. These reasons range from “something bad happened a long time ago” through to “we don’t know why, but if we don’t do that then the whole thing falls over”. A good facilitator will challenge seemingly obsolete steps, identifying those that have served their purpose and should be removed. However, it’s not possible to justify every step without quickly wearing down the subject matter experts. Some obsolete steps will always leak through, no matter how many top-down and bottom-up iterations we do.

We can also reach the end of the process improvement journey only to find that much of the process’s value — the exceptions and variation that make the process valuable — has been cut out to make the process more efficient or easier to implement. In the quest for more science in our processes, we’ve eliminated the art that we relied on.

If business process management isn’t a programming challenge, then this holds even truer for human-driven workflow.


What we need is a way to chip away the cruft and establish a clear line of traceability between the goals of each stakeholder involved in the process, and each step and decision in the workflow. And we need to do this in a way that allows us to balance art and science.

I’m pretty sure that Michelangelo had a good idea of what he wanted to create when he started belting on the chisel. He was looking for something in the rock, the natural seams and faults, that would let him find David. He kept the things that supported his grand plan, while chipping away those that didn’t.

For a workflow process, these are the rules, tasks and points of variation that knowledge workers use to navigate their way through the day. Business rules and tasks are the basic stuff of workflow: decisions, data transformations and hand-offs between stakeholders. Points of variation let us identify those places in a workflow where we want to allow variation — alternate ways of achieving the one goal — as a way of balancing art and science.

Rather than focus on programming the steps of the process, worrying about whether we should send an email or a fax, we need to make this (often) tacit knowledge explicit. Working top-down, from the goals of the business owners, and bottom-up, from the hand-offs and touch-points with other stakeholders, we can chip away at the rock. Each rule, task or point of variation we find is measured against our goals to see if we should chip it away, or leave it to become part of the sculpture.
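
As a rough illustration of that traceability (a sketch only, with made-up goals and steps, not a prescribed method): if every rule, task and point of variation carries a reference to the stakeholder goal it serves, then the cruft candidates are simply the elements that trace back to no goal at all.

```python
# Sketch: each workflow element is linked to the stakeholder goal it serves.
# Elements that can't be traced to a goal are candidates for the chisel.

goals = {
    "G1": "Pay supplier invoices accurately",
    "G2": "Detect duplicate invoices",
}

workflow = [
    {"step": "Match invoice to purchase order", "serves": "G1"},
    {"step": "Check for duplicate invoice number", "serves": "G2"},
    {"step": "Print and file a paper copy", "serves": None},    # nobody remembers why
    {"step": "Fax confirmation to head office", "serves": None}, # last decade's technology
]

cruft_candidates = [s["step"] for s in workflow if s["serves"] not in goals]
print(cruft_candidates)
# ['Print and file a paper copy', 'Fax confirmation to head office']
```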

That which we need stays, that which is unnecessary is chipped away.

Business is like a train…

The following analogy popped up the other day in an email discussion with a friend.

Running a business is a bit like being the Fat Controller, running his vast train network. We spend our time trying to get the trains to run on time, with the all-too-frequent distraction of digging the Troublesome Trucks out of trouble.

Improvement often means upgrading the tracks to create smoother, straighter lines. After years of doing this, any improvement to the tracks can only provide a minor, incremental benefit.

What we really need is a new signalling system. We need to better utilise the tracks we already have, and this means making better decisions about which trains to run where, and better coordination between the trains. Our tracks are fine (as long as we keep up the scheduled maintenance), but we do need to better manage transit across and between them.

Swap processes for tracks, and I think that this paints quite a nice visual picture.

Years of process improvement (via LEAN, Six Sigma and, more recently, BPM) have straightened and smoothed our processes to the point that any additional investment has hit the law of diminishing returns. Rather than continue to try and improve the processes on my own, I’d outsource process maintenance to a collection of SaaS and BPO providers.

The greater scale of these providers allows them to invest in improvements which I don’t have the time or money for. Handing over responsibility also creates the time and space for me to focus on improving the decisions on which process to run where, and when: my signalling system.

This is especially important in a world where it is becoming rare to even own the processes we rely on.

We forget just how important a good signalling system is. Get it right and you get the German or Japanese train networks. Get it wrong and you rapidly descend into the second or third world, regardless of the quality of your tracks.

BPM is not a programming challenge

Get a few beers into a group of developers these days and it’s not uncommon for the complaints to start flowing about BPM (Business Process Management). BPM, they usually conclude, is more pain than it’s worth. I don’t think that BPM is a bad technology per se, but it does appear to be the wrong tool for the job. The root of the problem is that BPM is a handy tool for programming distributed systems, but the challenge of creating distributed systems is orthogonal to business process execution and management. We’re using a screwdriver to belt in a nail. It’s more productive to think of business process execution and management as a (realtime) planning problem.

Programming is the automation of the known. Take a stable, repeatable process and automate it; bake the process into silicon to make it go fast. This is the same tactic I was using back in my image processing days (and that was a long time ago). We’d develop the algorithms in C, experiment and tweak until they were right, and once they were stable we’d burn them into an ASIC (Application-Specific Integrated Circuit) to provide a speed boost. The ASICs were a lot faster than the C version: more than an order of magnitude faster.

Programmers, and IT folk in general, have a habit of treating the problems we confront as programming challenges. This has been outstandingly successful to date; just try to find a home appliance or service that doesn’t have a programme buried in it somewhere. (It’s not an unmitigated success though: our tumble dryer is driving us nuts with its overly frequent software errors.) It’s not surprising that we chose to treat business process automation and management as a programming problem once it appeared on our radar.

Don’t get me wrong: BPM is a solid technology. A friend of mine once showed me how he’d used his BPM stack to test its BPEL engine. Aside from being a nice example of eating your own dog food, it was a great example of using BPEL as a distributed programming tool to solve a small but complex problem.

So why do we see so many developers complaining about BPM? It’s not the technology itself: the technology works. The issue is that we’re using it to solve problems that it’s not suited for. The most obvious evidence of this is the current poor state of BPM support for business exception management. We’ve deployed a lot of technology to support exception management in business processes without really solving the problem.

Managing business exceptions is driving the developers nuts. I know of one example where managing a couple of not infrequent business exceptions was the major technical problem in a very significant project (well into eight figures). The problem is that business exceptions are not from the same family of beasts as programming exceptions. Programming exceptions are exceptional. Business exceptions are just a (slightly) different way to achieve the same goal. All our compensating actions and exception stacks just get in the way of solving the problem.
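
A toy contrast, purely illustrative (the order-handling names are invented for the sketch): handled as a programming exception, the alternate path becomes a compensating action bolted on after the fact; treated as just another route to the goal, it’s an ordinary branch like any other.

```python
# Illustrative stubs only; names are made up for the sketch.
class OutOfStock(Exception):
    pass

def stock_available(order):
    return order.get("on_hand", 0) >= order["qty"]

def allocate_stock(order):
    if not stock_available(order):
        raise OutOfStock(order["sku"])
    print("allocated", order["sku"])

def backorder(order):
    print("backordered", order["sku"])

# Exception-style: the business alternative is treated as a failure,
# with a compensating action bolted on after the fact.
def ship_exception_style(order):
    try:
        allocate_stock(order)
    except OutOfStock:
        backorder(order)

# Alternate-route style: both outcomes are ordinary routes to the same goal.
def ship_route_style(order):
    if stock_available(order):
        allocate_stock(order)
    else:
        backorder(order)

ship_exception_style({"sku": "A42", "qty": 5, "on_hand": 0})
ship_route_style({"sku": "A42", "qty": 5, "on_hand": 12})
```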

On PowerPoint, anything can look achievable. The BPMN diagram we shared with the business was extremely elegant: nice sharp angles and coloured bubbles. Everyone agreed that it was a good representation of what the business does. The devil is in the details though. The development team quickly becomes frustrated as they have to deal with the realities of implementing a dynamic and exception-rich business process. Exceptions pile up on top of exceptions, and soon that BPMN diagram covers a wall, littered as it is with branch and join operations. It’s not a complex process, but we’ve made it incredibly complicated.

Edward Tufte’s take on explaining complex concepts with PowerPoint: a military parade explained, a la PowerPoint

We can’t program our way out of this box by trying to pile on more features and patches. We can rip the complications out – simplifying the process to the point that it becomes tractable with our programming tools (which is what happened in my example above). But this removes all the variation which makes the processes so valuable. (This, of course, is the dirty secret of LEAN et al: you’re trading flexibility for cost savings, making your processes very efficient but also very fragile.)

Or we can try solving the problem a different way.

Don’t treat the automation of a business process as a programming task (and by this I mean the capture of imperative instructions for a computer to execute, no matter how unstructured or parallel). Programming is the automation of the known. Business processes, however, are the management and anticipation of the unknown. Modelling business processes should be seen as a (realtime) planning problem.

Which comes back to one of my common themes: push vs pull models, or the importance of what over how. Or, as a friend of mine with a better turn of phrase puts it, we need to stop trying to invent new technologies and work out how to use what we already have more effectively. Rather than trying to invent new technologies to solve problems that are already well understood elsewhere, pushing the technology into the problem, a more pragmatic approach is to leverage that existing understanding and then pull in existing technologies as appropriate.

Planning and executing in a rapidly changing environment is a well understood problem; just ask anyone who’s been involved with the military. If we view the management of a business process as a realtime planning problem, then what were business exceptions are reduced to simply alternate routes to the same goal, rather than problems which require compensating actions.

Battle of Gaugamela (Arbela) (331BC)
Take that hill!

One key principle is to establish a clear goal – Take that hill!, or Find that lost shipment! – articulate the tactics (the courses of action we might use to achieve that goal), and then defer deciding which course of action to take until the decision needs to be made. If we commit to a course of action too early, locking in a decision at design time, then it’s likely that we’ll be forced to manage an exception when we realise that we picked the wrong course of action. It’s better to wait until the moment when all relevant information and options are available to us, and then take decisive action.

From a modelling point of view, we need to establish the key events at which we need to make decisions in line with a larger strategy. The decision at each of these events needs to weigh the available courses of action and select the most appropriate, much like using a set of business rules to identify the applicable options. Each course of action, a scenario or business process fragment, will be semi-independent from the others in the applicable set, as each addresses a different business context. Nor can the scenario we pick be predetermined, as it depends on the business context. Short and sharp, each scenario will be simple, general and flexible, enabling us to configure it for the specific circumstances at hand, since we can’t anticipate all possible scenarios. And finally, we need to ensure that the scenarios we provide cover the situations we can anticipate, including the provision of a manual escape hatch.
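
Here’s a small sketch of that shape (hypothetical names and guards; one possible reading of the goal-rules-process idea rather than a reference implementation): a goal, a set of candidate courses of action each paired with an applicability rule, selection deferred until the decision point, and a manual escape hatch for the situations we didn’t anticipate.

```python
# Sketch of goal-directed selection: courses of action are chosen at the
# decision point from the current context, not fixed at design time.

def expedite_reship(ctx):    # one course of action
    return "reship via air freight"

def refund_customer(ctx):    # another route to the same goal
    return "refund and apologise"

def manual_escalation(ctx):  # escape hatch for the situations we didn't anticipate
    return "route to a human case manager"

# Each course of action is paired with an applicability rule (a guard).
courses_of_action = [
    (lambda ctx: ctx["stock_on_hand"] > 0 and ctx["customer_tier"] == "gold",
     expedite_reship),
    (lambda ctx: ctx["order_value"] < 100,
     refund_customer),
]

def achieve(goal, ctx):
    """Pick the first applicable course of action at the moment of decision."""
    for applicable, action in courses_of_action:
        if applicable(ctx):
            return action(ctx)
    return manual_escalation(ctx)

print(achieve("Find that lost shipment",
              {"stock_on_hand": 3, "customer_tier": "gold", "order_value": 250}))
# -> 'reship via air freight'
```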

Goals, rules and process: in that order, integrated rather than standalone engines. Pull these established technologies into a single platform and we might just be closer to a BPM solution in line with what we really need. (And we know there is nothing new under the sun, as this is essentially a build on Jim Sinur’s rules-and-process argument, and borrows a lot from STRIPS, PRS, dMARS and even the work I did at Agentis.)

As I mentioned at the start of this missive, BPM as a product category makes sense, and the current implementations are capable distributed programming tools. The problem is that business process management is not a distributed programming challenge. Business exceptions are not exceptional. I say we steal a page from the military strategy book – they, after all, have been successfully working on this problem for some time – and build our solutions around the ideas the military use to succeed in a rapidly changing environment: goals, rules and processes. The trick is to be pragmatic, rather than dogmatic, in our implementation, and to focus on solving the problem rather than trying to create a new technology.

Complexity isn’t (at least in enterprise IT)

There’s a lot of talk these days about complexity in enterprise IT. The heterogeneous solutions we’re building today seem more complex than the monolithic solutions of the past. But are they really? I’ll grant you that a lot of what is being built at the moment is complicated. Complex though? I don’t think so. The problem is that we’re building new things in old ways when we need to be building new things in new ways.

I’ve always used a simple rule of thumb when thinking about complexity. Some folk like to get fancy with two and three dimensional models that enable us to ascribe degrees of complexity to problems. While I find these models interesting, my focus has always been on how I solve the problem in front of me. What is the insight that will make the hard easy? For me, one simple distinction seems to provide the information I need. Is a solution complex? Or is it complicated?

Something is complicated if the model we use to understand the problem requires patches and exceptions to make it work. The model might be simple and well understood, but we’re forced to patch the model for it to succeed when confronted by the real world; we’re adding epicycles. It’s not a complex system, but it is complicated.

Adding epicycles didn't manage to keep the earth at the centre of the universe

On the other hand, something is complex if it’s difficult to develop a consistent model for the problem. Even when we have a well understood model, it’s definitely not simple, requiring a great deal of academic and tacit knowledge to understand. There are no epicycles, but there are a large number of variables involved, and their interactions are often non-linear.

While this binary separation might not be strictly true (the complicated can sometimes be complex), I find that the truly complex problems are rare enough that the rule of thumb is useful most of the time. After all, that’s what a rule of thumb is. The few times that it breaks down, experience comes to the rescue.

Distinguishing between the complex and the complicated is not hard; just look for the epicycles. Planning engines – such as material planning or crew scheduling – are a good example of complex solutions. Business process management is a good example of a complicated solution. Pi calculus – the model at the heart of a modern BPM engine – is well understood, and BPM engines work as described. However, managing business exceptions is a mess, as support for them is tacked on rather than an inherent part of the model. Smashing together pi calculus, transactions and a number of other models has resulted in epicycles.

Most of the problems we’re seeing in enterprise IT are complicated, but not complex. Take the current efforts to create IT estates integrating SaaS and public cloud with more traditional enterprise IT tools, such as on-premises applications and BPO. Conventional approaches to understanding and planning IT estates are creaking at the seams. The model – the enterprise integration patterns – which we’ve used for so long is well understood, but we keep bolting on more epicycles to cope with exceptions as they arise.

Dr. Khee Pang

A great piece of advice from a former lecturer of mine always comes to mind in situations like this. As I’ve mentioned before:

If you don’t like a problem, then change it.
KK Pang

Our solution is complicated because we’re trying to solve the wrong problem. We need to change it.

The problem with BPM is that business exceptions are not exceptional; they are simply alternative ways of achieving the same goals. To resolve the epicycles we need to shift the problem’s centre of gravity, moving the earth from the centre of the universe to a more stable orbit. If business exceptions are not exceptional, then we should simply treat them as different business scenarios, and use a scenario-based approach to capturing business processes. The epicycles then melt away.

I think we can use a similar approach to help us with the challenges we’re seeing in today’s IT estates, the same challenges which are triggering some of the discussion on complexity. The current approach to planning and provisioning IT is data-centric; most applications are, after all, just large data management engines. This data-centric approach is forcing us to create epicycles as we attempt to integrate SaaS, cloud, and a whole raft of new tools and techniques. The solution is to move from a data-centric approach to a decision-centric approach. But that’s a different blog post.

The value of information

We all know that data is valuable; without it, it would be somewhat difficult to bill customers and stay in business. Some companies have accumulated masses of data in a data warehouse, which they’ve used to drive organizational efficiencies or performance improvements. But do we ever ask ourselves when the data is most valuable?

Billing is important, but if we get the data earlier then we might be able to deal with a problem—a business exception—more efficiently. Resolving a short pick, for example, before the customer notices. Or perhaps even predicting a stock-out. And in the current hyper-competitive business environment where everyone is good, having data and the insight that comes with it just a little bit sooner might be enough to give us an edge.

A good friend of mine often talks about the value of information in a meter. This makes more sense when you know that he’s a utility/energy guru who’s up to his elbows in the U.S. smart metering roll out. Information is a useful thing when you’re putting together systems to manage distributed networks of assets worth billions of dollars. While the data will still be used to drive billing in the end, the sooner we receive the data the more we can do with it.

One of the factors driving the configuration of smart meter networks is the potential uses for the information the meters will generate. A simple approach is to view smart meters as a way to reduce the cost of meter reading: have meters automatically phone readings home rather than driving a truck past each customer’s premises to eyeball each meter. We might even use this reduced cost to read the meters more frequently, shrinking our billing cycle, and the revenue outstanding with it. However, the information we’re working from will still be months, or even quarters, old.

If we’re smart (and our meter has the right instrumentation) then we will know exactly which and how many houses have been affected by a fault. Vegetation management (tree trimming) could become proactive: by analyzing electrical noise on the power lines that the smart meters can see, we can determine where along a power line we need to trim the trees. This lets us go directly to where work needs to be done, rather than driving past every power line on a schedule—a significant cost and time saving, not to mention an opportunity to engage customers more closely and service them better.

If our information is a bit younger (days or weeks rather than months) then we can use it to schedule just-in-time maintenance. The same meters can watch for power fluctuations coming out of transformers, motors and so on, looking for the telltale signs of imminent failure. Teams rush out and replace the asset just before it fails, rather than working to a program of scheduled maintenance (maintenance which might be causing some of the failures).

When the information is only minutes old we can consider demand shaping. By turning off hot water heaters and letting them coast we can avoid spinning up more generators.

If we get to seconds or below, we can start using the information for load balancing across the network, managing faults and responding to disasters.
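
As a rough sketch of that continuum (thresholds and uses are illustrative only, not from any real metering system), you can think of it as the same reading supporting different uses depending on how old it is by the time we see it:

```python
# Sketch: the same meter reading supports different uses at different ages.
# Thresholds and handlers are illustrative, not from any real metering system.

def uses_for(reading_age_seconds):
    age = reading_age_seconds
    if age <= 5:
        return "load balancing, fault management, disaster response"
    if age <= 15 * 60:
        return "demand shaping (e.g. letting hot water heaters coast)"
    if age <= 14 * 24 * 60 * 60:
        return "just-in-time maintenance scheduling"
    if age <= 90 * 24 * 60 * 60:
        return "billing, proactive vegetation management"
    return "historical reporting"

print(uses_for(2))            # seconds old
print(uses_for(3 * 86400))    # days old
```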

I think we, outside the energy industry, are missing a trick. We tend to use a narrow, operational view of the information we can derive from our IT estate. Data is either considered transactional or historical; we’re either using it in an active transaction or we’re using it to generate reports well after the event. We typically don’t consider what other uses we might put the information to if it were available in shorter time frames.

I like to think of information availability in terms of a time continuum, rather than a simple transactional/historical split. The earlier we use the information, the more potential value we can wring from it.

The value of data decreases rapidly with age

There’s no end of useful purposes we can turn our information to between the billing and transactional timeframes. Operational excellence and business intelligence allow us to tune business processes to follow monthly or seasonal cycles. Sales and logistics are tuned on a weekly basis to adjust for the dynamics of the current holiday. Days-old information would allow us to respond in days, calling a client when we haven’t received their regular order (a non-event). Operations can use hours-old information for capacity planning, watching for something trending in the wrong direction and responding before everything falls over.

If we can use trending data—predicting stock-outs and watching trends in real time—then we can identify opportunities or head off business exceptions before they become exceptional. BAM (business activity monitoring) and real-time data warehouses take on new meaning when viewed in this light.

In a world where we are all good, being smart about the information we can harvest from our business environment (both inside and outside our organization) has the potential to make us exceptional.

Update: Andy Mulholland has a nice build on this idea over at Capgemini‘s CTO blog: Have we really understood what Business Intelligence means?