wwHadoop: Analytics without Borders


This week at EMC World 2012, the EMC technical community is launching a community program called World Wide Hadoop. I am really excited to be part of a collaboration across the EMC technical community that has been looking to extend the "borders" of our Big Data portfolio by building on the success of our Greenplum Hadoop distribution in offering the open source community the ability to federate their HBase…

Big Data Universe: “Too Big to Know”


I was surfing to WGBH on Saturday when I came across a lecture by David Weinberger (on his new book, Too Big to Know). I was sucked in when he alluded to brick-and-mortar libraries as yesterday's public commons, and pointed to the discontinuous and disconnected nature of books and paper. The epitaph may read something like this: "book killed by hyperlink, the facts of the matter are whatever…

The Greenplum “Big Data” Cloud Warehouse


The Data Warehouse space has been red hot lately. Everyone knows the top-tier players, as well as the emergents. The substantial issues now are the complexity of scale and growth in enterprise analytics (every department needs one) and the increasing management burden that business data warehouses are placing on IT. As in the Wild West, a business technology selection is made for "local" reasons, and the more "global" concerns are left to…

Fallacies of Enterprise Information Management (part deux)…


With some hearty comments from Tom Maguire, I've been forced to adjust some of these fallacies: 1. Data quality is perfect - data is correct, complete and coherent across all enterprise contexts - People will remediate bad data - if inaccuracies are found (contrary to the axiom above) users will willingly and proactively make changes, and all users will agree with those changes 2. Relationships are Known - The linkages…