Most interesting business decisions seem to be a synthesis process. We take a handful of data and fuse them to create an insight. The invention of breath strips is a case in point. We can rarely break our problem down to a single (computed) metric, the world just doesn’t work that way.
Most business decisions rest on small number of data points. It’s just one of our cognitive limits: our working memory is only large enough to hold (approximately) four things (concepts and/or data points) in our head at once. This is one reason that I think Andrew McAfee’s cut-down business case works so well; it works with our human limitations rather than against them.
I was watching an interesting talk the other day — Peter Norvig was providing some gentle suggestions on what features should be beneficial in a language to support scientific computing. Somewhere in the middle of the talk he mentioned the Curse of dimensionality, which is something I hadn’t thought of for a while. This is the problem caused by the exponential increase in volume associated with each additional dimension of (mathematical) space.
In terms of the problem we’re considering, this means that if you are looking for n insights to a problem in a field of data (the n best data points to drive our decision), then finding them becomes exponentially harder for each data set (dimension) we add. More isn’t necessarily better. While the addition of new data sets (such as sourcing data from social networks) enables us to create new correlations, we’re also forced to search an exponentially larger area to find them. It’s the law of diminishing returns.
Our inbuilt cognitive limit only complicates this. When we hit our cognitive limit — when n becomes as large as we can usefully use — any additional correlations can become a burden rather than a benefit. In today’s rich and varied information environment, the problem isn’t to consider more data, or to find more correlations, its to find the best three or features in the data which will drive our decision in the right direction.
How do we navigate from the outside in? From the decision we need, to the data that will drive it. This is the problem I hope the Value of Information discussion addresses.
We’re drowning in a sea of data and ideas, with huge volumes of untapped information available both inside and outside our organization. There is so much information at our disposal that it’s hard to discern Arthur from Martha, let alone optimize the data set we’re using. How can we make sense of the chaos around us? How can we find the useful signals which will drive us to the next level of business performance, from amongst all this noise?
Traditional Business Intelligence (BI) tackles this problem by enabling us to mine for correlations in the data tucked away in our data warehouse. These correlations provide us with signals to help drive better decisions. Managing stock levels based on historical trends (Christmas rush, BBQs in summer …) is good, but connecting these trends to local demographic shifts is better.
Unfortunately this approach is inherently limited. Not matter how powerful your analytical tools, you can only find correlations within and between the data sets you have in the data warehouse, and this is only a small subset of the total data available to us. We can load additional data sets into the warehouse (such as demographic data bought from a research firm), but in a world awash with (potentially useful) data, the real challenge is deciding on which data sets to load, and not in finding the correlations once they are loaded.
What we really need is a tool to help scan across all available data sets and find the data which will provide the best signals to drive the outcome we’re looking for. An outside-in approach, working from the outcome we want to the data we need, rather than an inside-out approach, working from the data we have to the outcomes it might support. This will provide us with a repeatable method, a system, for finding the signals needed to drive us to the next level of performance, rather than the creative, hit-and-miss approach we currently use. Or, in geekier terms, a methodology which enables us to proactively manage our information portfolio and derive the greatest value from it.
I was doodling on the tram the other day, playing with the figure I created for the Inside vs. Outside post, when I had a thought. The figure was created as a heat map showing how the value of information is modulated by time (new vs. old) and distance (inside vs. outside). What if we used it the other way around? (Kind of obvious in hindsight, I know, but these things usually are.) We might use the figure to map from the type of outcome we’re trying to achieve back to the signals required to drive us to that outcome.
This addresses an interesting comment (in email) by a U.K. colleague of mine. (Jon, stand up and be counted.) As Andy Mulholland pointed out, the upper right represents weak confusing signals, while the lower left represents strong, coherent signals. Being a delivery guy, Jon’s first though was how to manage the dangers in excessively focusing on the upper right corner of the figure. Sweeping a plane’s wings forward increases its maneuverability, but at the cost of decreasing it’s stability. Relying too heavily on external, early signals can, in a similar fashion, could push an organization into a danger zone. If we want to use these types of these signals to drive crucial business decisions, then we need to understand the tipping point and balance the risks.
My tram-doodle was a simple thing, converting a heat map to a mud map. For a given business decision, such as planning tomorrow’s stock levels for a FMCG category, we can outline the required performance envelope on the figure. This outline shows us the sort of signals we should be looking for (inside good, outside bad), while the shape of the outlines provides us with an understanding (and way of balancing) the overall maneuverability and stability of the outcome the signals will support. More external predictive scope in the outline (i.e. more area inside the outline in the upper-right quadrant) will provide a more responsive outcome, but at the cost of less stability. Increasing internal scope will provide a more stable outcome, but at the cost of responsiveness. Less stability might translate to more (potentially unnecessary) logistics movements, while more stability would represent missed sales opportunities. (This all creates a little deja vu, with a strong feeling of computing Q values for non-linear control theory back in university, so I’ve started formalizing how to create and measure these outlines, as well as how to determine the relative weights of signals in each area of the map, but that’s another blog post.)
Given a performance outline we can go spelunking for signals which fit inside the outline.
Luckily the mud map provides us with guidance on where to look. An internal-historical signal is, by definition driven by historical data generated inside the organization. Past till data? An external-reactive signal is, by definition external and reactive. A short term (i.e. tomorrow’s) weather forecast, perhaps? Casting our net as widely as possible, we can gather all the signals which have the potential to drive us toward to the desired outcome.
Next, we balance the information portfolio for this decision, identifying the minimum set of signals required to drive the decision. We can do this by grouping the signals by type (internal-historical, …) and then charting them against cost and value. Cost is the acquisition cost, and might represent a commercial transaction (buying access to another organizations near-term weather forecast), the development and consulting effort required to create the data set (forming your own weather forecasting function), or a combination of the two, heavily influenced by an architectural view of the solution (as Rod outlined). Value is a measure of the potency and quality of the signal, which will be determined by existing BI analytics methodologies.
Plotting value against cost on a new chart creates a handy tool for finding the data sets to use. We want to pick from the lower right – high value but low cost.
It’s interesting to tie this back to the Tesco example. Global warming is making the weather more variable, resulting in unseasonable hot and cold spells. This was, in turn, driving short-term consumer demand in directions not predicted by existing planning models. These changes in demand represented cost, in the from of stock left on the shelves past it’s use-by date, or missed opportunities, by not being able to service the demand when and where it arises.
The solution was to expand the information footprint, pulling in more predictive signals from outside the business: changing the outline on the mud map to improve closed-loop performance. The decision to create an in-house weather bureau represents a straight forward cost-value trade-off in delivering an operational solution.
These two tools provide us with an interesting approach to tackling a number of challenges I’m seeing inside companies today. We’re a lot more externally driven now than we were even just a few years ago. The challenge is to identify customer problems we can solve and tie them back to what our organization does, rather than trying to conceive offerings in isolation and push them out into the market. These tools enable us to sketch the customer challenges (the decisions our customers need to make) and map them back to the portfolio of signals that we can (or might like to) provide to them. It’s outcome-centric, rather than asset-centric, which provides us with more freedom to be creative in how we approach the market, and has the potential to foster a more intimate approach to serving customer demand.