Tag Archives: Systems engineering

Some new rules for IT

The other week I had a go at capturing the rules of enterprise IT{{1}}. The starting point was a few of those beery discussions we all have after work, where we came to wonder how the game of enterprise IT was changing. It’s the common refrain of big-to-small, the Sieble to Saleforce.com transition which sees the need for IT services (internal or external) change dramatically. The rules of IT are definitely changing. Now that I’ve had a go at old rules, I thought I’d have a go at seeing what the new rules might be.

As I mentioned before, enterprise IT has historically been seen as an asset management function, a production line for delivering large IT assets into the IT estate and then maintaining them. The rules are the therefore rules of business operations. My attempt at capturing 4 ± 2 rules (with friends) produced the following (in no particular order):

[[1]]The rules of Enterprise IT @ PEG[[1]]

  • Keep the lights on. Much like being a trucker, the trick is to keep the truck rolling (and avoid spending money on tyres). Otherwise known as smooth running applications are the ticket to the strategy table.
  • Save money. Business IT was born as a cost saving exercise (out with the rooms full of people, in with the punch card machines), and most IT business cases are little different.
  • Build what you need. I wouldn’t be surprised if the team building LEO{{2}} blew their own valve tubes. You couldn’t buy parts of the shelf so you had to make everything. This is still with us in some organisations’ strong desire to build – or at least heavily customise – solutions.
  • Keep the outside outside. We trust whatever’s inside our four walls, while deploying security measures to keep the evil outside. This creates an us (employees) and them (customers, partners, and everyone else) mentality.

[[2]]LEO: Lyons Electronic Office. The first business computer. @ Wikipedia[[2]]

Things have changed since these rules were first laid down. From another post of mine on a similar topic{{3}} (somewhat trimmed and edited):

[[3]]The IT department we have today is not the IT department we’ll need tomorrow @ PEG[[3]]

The recent global financial criss has fundamentally changed the business landscape, with many are even talking about the emergence of a new normal{{4}}. We’ve also seen the emergence of outsource, offshore, cloud computing, SaaS, Enterprise 2.0 and so much more.

Companies are becoming more focused, while leaning more heavily on partners and services companies (BPO, out-sourcers, consultants, and so on) to cover those areas of the business they don’t want to focus on. We can see this from the global companies who have effectively moved to a franchise model, though to the small end of town where startups are using on-line services such as Amazon S3, rather than building their own internal capabilities.

We’re also seeing more rapid business change: what used to take years now takes months, or even weeks. The constant value-chain optimisation we’ve been working on since the 70s has finally cumulated in product and regulatory life-cycles that change faster than we can keep up.

Money is also becoming (or has become) more expensive, causing companies and deals to operate with less leverage. This means that there is less capital available for major projects, pushing companies to favour renting over buying, as well as creating a preference for smaller, incremental change over the major business transformation of the past.

And finally, companies are starting to take a truly global outlook and operate as one cohesive business across the globe, rather than as a family of cloned business who operate more-or-less independently in each region.

[[4]]The new normal @ McKinsey Quarterly[[4]]

So what are the new 4 ± 2 rules? They’re not the old rules of asset management. We could argue that they’re the rules of modern manoeuvre warfare{{5}} (which would allow me to sneak in one of my regular John Boyd references{{6}}), but that would be have the tail wagging the dog as it’s business, and not IT that has that responsibility.

[[5]]Maneuver warfare @ Wikipedia[[5]]
[[6]]John Boyd @ Wikipedia[[6]]

I think that the new rules cast IT as something like that of a pit crew. IT doesn’t make the parts (though we might lash together something when in a pinch), nor do we steer the car. Our job is to swap the tyres, pump the fuel, and straighten the fender, all in that sliver of time available to us, so that the driver can focus on their race strategy and get back out on track as quickly as possible.

With that in mind, the following seems to be a fair (4 ± 2) minimum set to start with.

  • Timeliness. A late solution is often worse than no solution at all, as you’ve spent the money without realising any benefit. Or, as a wise sage once told me, management is the art of making a timely decision, and then making it work. Where before we could take the time to get it right (after all, the solution will be in the field for a long time and needs to support a lot of people, so better to discover problems early rather than later), now we just need to make sure the solution is good enough in the time available, and has the potential to grow to meet future demand. The large “productionisation” efforts of the past need to be broken into a series of incremental improvements (à la Gmail and the land of perpeputal beta), aligning investment with both opportunity and realised value.
  • Availability. Not just up time, but ensuring that all stakeholders (both in and outside the company, including partners and clients) can get access to the solutions and data they need. There’s little value in a sophisticated knowledge base solution if the sales team can’t use it in the field to answer customer questions in real time. Once they’ve had to fire up the laptop, and the 3G card, and the VPN, the moment has passed and the sale lost. Or worse, forcing them to head back to the bricks and mortar office. As I pointed out the other week, decisions are more important than data{{7}}, and success in this environment means empowering stakeholders to make the best possible decisions by ensuring that the have the data and functions they need, where they need, when they need it, and in a format that make it easy to consume.
  • Agility. Agility means creating an IT estate that meet the challenges we can see coming down the road. It doesn’t mean creating an infinitely flexible IT estate. Every bit of flexibility we create, every flex point we add, comes at a cost. Too much flexibility is a bad thing{{8}}, as it weighs us down. Think of formula one cars: they’re fast and they’re agile (which is why driving them tends to be a young mans game), and they’re very stiff. Agility comes from keeping the weight down and being prepared to act quickly. This means keeping things simple, ensuring that we have minimum set of moving parts required. The F1 crowd might have an eye for detail, such as putting nitrogen{{9}} in the tyres, but unnecessary moving parts that might reduce reliability or performance are eliminated. Agility is the cross product of weight, speed, reliability and flexibility, and we need to work to get them all into balance.
  • Sustainability. Business is not a sprint (ideally), and this means that cost and reliability remain important factors, but not the only factors. While timeliness, availability and agility might be what drive us forward, we need still need to ensure that IT is still a smooth running operation. The old rules saw cost and reliability as absolutes, and we strived to keep costs as low, and reliability as high, as possible. The new rules see us balancing sustainability with need, accepting (slightly) higher costs or lower reliability to provide a more timely, available or agile solution while still meeting business requirements. (I wonder if I should have called this one “balance”.)

[[7]]Decisions are more important than data @ PEG[[7]]
[[8]]Having too much SOA is a bad thing (and what we might do about it) @ PEG[[8]]
[[9]]Understanding the sport: Tyres @ formula1.com[[9]]

While by no mean complete or definitive, I think that’s a fair set of rules to start the discussion.

BPM is not a programming challenge

Get a few beers into a group of developers these days and it’s not uncommon for the complaints to start flowing about BPM (Business Process Management). BPM, they usually conclude, is more pain than it’s worth. I don’t think that BPM is a bad technology, per se, but it does appear to be the wrong tool for the job. The root of the problem is that BPM is a handy tool for programming distributed systems, but the challenge of creating distributed systems is orthogonal to business process execution and management. We’re using a screw driver to belt in a nail. It’s more productive to think business process execution and management as a (realtime) planning problem.

Programming is the automation of the known. Take as stable, repeatable process and automate it; bake the process into silicone to make it go fast. This is the same tactic that I was using back in my image processing days (and that was a long time ago). We’d develop the algorithms in C, experiment and tweak until they were right, and once they were stable we’d burn them into an ASIC (Application-Specific Integrated Circuit) to provide a speed boost. The ASICs were a lot faster than the C version: more than an order of magnitude faster.

Programmers, and IT folk in general, have a habit of treating the problems we confront as programming challenges. This has been outstandingly successful to date; just try and find a home appliance or service that doesn’t have a programme buried in it somewhere. (It’s not an unmitigated success though, such as our tumble drier is driving us nuts if its overly frequent software errors.) It’s not surprising that we chose to treat business processes automation and management as a programming problem once it appeared on our radar.

Don’t get me wrong: BPM is a solid technology. A friend of mine once showed my how he’d used his BPM stack to test its BPEL engine. As side from being a nice example of eating your own dog food, it was a great example of using BPEL as a distributed programming tool to solve a small but complex problem.

So why do we see so many developers complaining about BPM? It’s not the technology itself: the technology works. The issue is that we’re using it to solve problems that it’s not suited for. The most obvious evidence of this is the current poor state of BPM support for business exception management. We’ve deployed a lot of technology to support exception management in business processes without really solving the problem.

Managing business exceptions is driving the developers nuts. I know of one example where managing a couple of not infrequent business exceptions was the major technical problem in a very significant project (well into eight figures). The problem is that business exceptions are not from the same family of beasts as programming exceptions. Programming exceptions are exceptional. Business exceptions are just a (slightly) different way to achieve the same goal. All our compensating actions and exception stacks just get in the way of solving the problem.

On PowerPoint, anything can look achievable. The BPMN diagram we shared with the business was extremely elegant: nice sharp angles and coloured bubbles. Everyone agreed that it was a good representation of what the business does. The devil is in the details though. The development team quickly becomes frustrated as they have to deal with the realities of implementing a dynamic and exception rich business processes. Exceptions pile up on top of exceptions, and soon that BPMN diagram covers a wall, littered as it is with branch and join operations. It’s not a complex process, but we’ve made it incredibly complicated.

Edward Tufte's take on explaining complex concepts with PowerPoint
A military parade explained, a la PowerPoint

We can’t program our way out of this box, trying to pile on more features and patches. We can rip the complications out – simplifying the process to the point that it becomes tractable with our programming tools (which is what happened in my example above). But this removes all the variation which which makes the processes so valuable. (This, of course, the dirty secret of LEAN et al: you’re trading flexibility for cost saving, making your processes very efficient but also very fragile.)

Or we can try solving the problem a different way.

Don’t treat the automation of a business processes as a programming task (and I by this I mean the capture of imperative instructions for a computer to execute, no matter how unstructured or parallel). Programming is the automation of the known. Business processes, however, are the management and anticipation of the unknown. Modelling business processes should be seen as a (realtime) planning problem.

Which comes back to one of my common themes: push vs pull models, or the importance of what over how. Or, as a friend of mine with a better turn of phrase puts it, we need to stop trying to invent new technologies and work out how to use what we already have more effectively. Rather than trying to invent new technologies to solve problems that are already well understood elsewhere, pushing the technology into the problem, a more pragmatic approach is to leverage that existing understanding and then pull in existing technologies as appropriate.

Planning and executing in a rapidly changing environment is a well understood problem. Just ask anyone who’s been involved with the military. If we view the management of a business processes as a realtime planning problem, then what were business exceptions are reduced to simply alternate routes to the same goal, rather than a problem which requires a compensating action.

Battle of Gaugamela (Arbela) (331BC)
Take that hill!

One key principle is to establish a clear goal – Take that hill!, or Find that lost shipment! – articulate the tactics, the courses of action we might use to achieve that goal, and then defer decisions on which course of action to take until the decision needs to be made. If we commit to a course of action too early, locking in a decision during design time, then it’s likely that we’ll be forced to manage the exception when we realise that we picked the wrong course of action. It’s better to wait until the moment when all relevant information and options are available to us, and then take decisive action.

From a modelling point of view, we need to establish where are the key events at which we need to make decisions in line with a larger strategy. The decisions at each of these events needs to weigh the available courses of action and select the most appropriate, much like using a set of business rules to identify applicable options. The course of action selected, a scenario or business process fragment, will be semi independent from the other in the applicable set, as it addresses a different business context. Nor can the scenario we pick cannot be predetermined, as it depends on the business context. Short and sharp, each scenario will be simple, general and flexible, enabling us to configure it for the specific circumstances at hand, as we can’t anticipate all possible scenarios. And finally, we need to ensure that the scenarios we provide cover the situations we can anticipate, including the provision of a manual escape hatch.

Goals, rules and process: in that order. Integrated rather than as standalone engines. Pull pull these established technologies into a single platform and we might just be closer to a BPM solution inline with what we really need. (And we know there is nothing new under the sun, as this essentially a build on Jim Sinurs rules-and-process argument, and borrows a lot from STRIPS, PRS, dMARS and even the work I did at Agentis.)

As I mentioned at the start of this missive, BPM as a product category makes sense and the current implementations are capable distributed programming tools. The problem is that business process management is not a distributed programming challenge. Business exceptions are not exceptional. I say steal a page from the military strategy book – they, after all, have been successfully working on this problem for some time – and build our solutions around ideas the military use to succeed in a rapidly changing environment. Goals, rules and processes. The trick is to be pragmatic, rather than dogmatic in our implementation, and focus on solving the problem rather then trying to create a new technology.