I’m not a big fan of Semantic Web{{1}}. For something that has been around for just over ten years, and which has been aggressively promoted by the likes of Tim Berners-Lee{{2}}, very little of real substance has come of it.
Taxonomies, on the other hand, are going gangbusters, with solutions like GovDirect{{3}} showing that there is a real need for this sort of data-relationship-driven approach{{4}}. Given this need, if the flexibility provided by Semantic Web (and, more recently, Linked Data{{5}}) were really needed, we would expect someone to have invested in building significant solutions that use the technology.
While the technology behind Semantic Web and Linked Data is interesting, it seems that most people don’t think it’s worth the effort.
All this makes me think: the future of data management and standardisation is ad hoc, with communities or vendors scratching specific itches, rather than formal, top-down, theory-driven approaches such as Semantic Web and Linked Data, or even other formal standardisation efforts of old.
[[1]]SemanticWeb.org[[1]]
[[2]]Tim Berners-Lee on Twitter[[2]]
[[3]]GovDirect[[3]]
[[4]]Peter Williams on The Power of Taxonomies @ the Australian Government’s Standard Business Reporting Initiative[[4]]
[[5]]LinkedData.org[[5]]
The technologies behind the likes of Semantic Web and Linked Data have a long heritage. You can trace them back to at least the seventies, when ontology- and logic-driven approaches to data management faced off against relational methodologies. Relational methods won that round; just ask Oracle or the nearest DBA.
That said, there have been a small number of interesting solutions built in the intervening years. I was involved in a few in one of my past lives{{6}}, and I’ve heard of more than a few built by colleagues and friends. The majority of these solutions used ontology management as a way to streamline service configuration, and therefore ease the pain of business change. Rather than being forced to rebuild a bunch of services, you could change some definitions, and off you go.
[[6]]AAII[[6]]
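To make the "change some definitions, and off you go" idea concrete, here is a minimal sketch in Python. It is not how any of the solutions mentioned above were built; the `DEFINITIONS` structure, the `validate` function and the field names are all hypothetical, invented purely for illustration. The point is only that the service logic reads its rules from data, so a business change touches the definitions and not the code.

```python
# Hypothetical definitions: the entity name and required fields are
# invented for illustration, not taken from any real system.
DEFINITIONS = {
    "customer": {"required": ["name", "email"]},
}


def validate(kind, record, definitions=DEFINITIONS):
    """Return the required fields that are missing from the record."""
    required = definitions[kind]["required"]
    return [field for field in required if field not in record]


record = {"name": "Ada", "email": "ada@example.com"}
before = validate("customer", record)

# A business change: customers must now supply a phone number.
# Only the definition changes; validate() itself is untouched.
updated = {"customer": {"required": ["name", "email", "phone"]}}
after = validate("customer", record, updated)
```

Here `before` is empty and `after` reports the missing `phone` field, without anyone rebuilding the service.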
What we haven’t seen is a well-placed Semantic Web SPARQL{{7}} query which makes all the difference. I’m still waiting for that travel website where I can ask for a holiday somewhere warm, within my budget, and without too many tourists who use beach towels to reserve lounge chairs at six in the morning, and get a sensible result.
The flexibility which we could justify in the service delivery solutions just doesn’t appear to be justifiable in the data-driven solution. A colleague showed me a Semantic Web solution that consumed a million or so pounds’ worth of taxpayers’ money to build a semantic-driven database for a small art collection. All this sophisticated technology would allow the user to ask all sorts of sophisticated questions, if they could navigate the (necessarily) complicated user interface, or if they could construct an even more daunting SPARQL query. A more pragmatic approach would have built a conventional web application, one which would easily satisfy 95% of users, for a fraction of the cost.
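As a rough illustration of what the pragmatic approach might look like, here is a minimal Python sketch of a fixed-filter search over a small catalogue. Everything in it is an assumption made for the example: the `Artwork` fields, the sample records and the `search` function are invented, and a real application would sit on a database behind a web form. It simply shows that a handful of fixed filters covers the common questions without any query language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artwork:
    title: str
    artist: str
    year: int
    medium: str


# Invented sample records, purely for illustration.
CATALOGUE = [
    Artwork("Harbour at Dusk", "A. Example", 1890, "oil"),
    Artwork("Study in Blue", "B. Sample", 1925, "watercolour"),
    Artwork("Untitled No. 3", "A. Example", 1931, "oil"),
]


def search(catalogue, artist=None, medium=None, year_from=None, year_to=None):
    """Return the works matching every filter that was supplied."""
    results = []
    for work in catalogue:
        if artist is not None and work.artist != artist:
            continue
        if medium is not None and work.medium != medium:
            continue
        if year_from is not None and work.year < year_from:
            continue
        if year_to is not None and work.year > year_to:
            continue
        results.append(work)
    return results


# The kind of question most visitors would actually ask.
oils_after_1900 = search(CATALOGUE, medium="oil", year_from=1900)
```

The trade-off is the one described above: this cannot answer arbitrary questions, but the few filters it does offer are easy to put behind a simple form.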
When you come down to it, the sort of power and flexibility provided by Semantic Web and Linked Data could only be used by a tiny fraction of the user population. For most people, something which gets them most of the way (with a little bit of trial and error) is good enough. Fire and forget. While the snazzy solution with the sophisticated technology might demo well (making it good TED{{8}} fodder), it’s not going to improve the day-to-day travail for most of the population.
[[8]]TED[[8]]
Then we get solutions like GovDirect. As the website puts it:
GovDirect® facilitates reporting to government agencies such as the Australian Tax Office via a single, secure online channel enabling you to reduce the complexity and cost of meeting your reporting obligations to government.
which makes it, essentially, a Semantic Web solution. Except it’s not, as GovDirect is built on XBRL{{9}} with a cobbled-together taxonomy.
[[9]]eXtensible Business Reporting Language[[9]]
Taxonomy-driven solutions such as GovDirect might not offer the power and sophistication of a Semantic Web-driven solution, but they do get the job done. These taxonomies are also more likely to be ad hoc (codifying a vendor’s solution, or accreted whilst on the job) than the result of some formal, top-down ontology{{10}} development methodology (such as those buried in the Semantic Web and Linked Data).
[[10]]Ontology defined in Wikipedia[[10]]
Take Salesforce.com{{11}} as an example. If we were to develop a taxonomy to exchange CRM data, then the most likely source would be other vendors reverse engineering{{12}} whatever Salesforce.com is doing. The driver, after all, is to enable clients to get their data out of Salesforce.com. Or the source might be whatever a government working group publishes, given a government’s dominant role in its geography. By extension we can also see the end of the formal standardisation efforts of old, as they devolve into the sort of information frameworks represented by XBRL, which accrete attributes as needed.
[[11]]SalesForce.com[[11]]
[[12]]Reverse engineering defined in Wikipedia[[12]]
The general trend we’re seeing is a move away from top-down, tightly defined and structured definitions of data interchange formats, as they’re replaced by bottom-up, looser definitions.
Funny, innit? W3C and the like are like enterprises stuck in time, being crowded out by hive-minded organisations that are driven by unilateral or bilateral incentives.
We need taxonomies before we can have a Semantic Web. Let me tell you though, as soon as there are taxonomies that make global-ish sense we will have a Semantic Web, and it will be far from what the tin-foil hats have cooked up over the last 10 years.
Are there taxonomies? Sure there are, all over the place. We call them dictionaries, and they describe perfectly what we agree certain words and definitions mean. The problem is, there never is only one description for one word: there are dozens of them.
On top of that, everyone's inventing his own taxonomy over and over again, and usually it's IT people that do so. They mold it into XML or XBRL, creating names and tags that are over 100 characters at times, without specifying much of the length or datatype and, most importantly, without giving a well-described definition of the item.
We can all talk English, and we do on Twitter and Facebook, but that doesn't mean we understand each other. W3C is a tech troop great at inventing stuff from scratch without having to comply with existing standards, customs or people. Semantics is way, way beyond them, because it's based on thousands of years of mostly unwritten agreements and disagreements between people.
Indeed, see http://www.regnet.org/demonstrationarea.html – there's not one glint of a SPARQL to be seen there. The topic maps are great and that's just taxonomy, ontology and XML.
Hey Peter,
Excellent post on the ebizQ Forum, and I would like to make you an official member. If interested, please email me at pschooff(at)gmail.com.
Thanks,
Pete
It's interesting how the old standardisation process is rapidly becoming irrelevant. I'm a big believer that our future standards will either be accreted, like OpenStack, or ad hoc, such as adopting Salesforce.com's CSV data format as an interchange standard just because they're the biggest player in the market. The days of research-driven standardisation and technology efforts seem to be behind us.