Dale Vile, originally published on CIO
I couldn’t say this for certain, but I get the impression that the marketing noise around ‘big data’ has now exceeded that of ‘cloud computing’. Either way, there is no doubt that the ‘next big thing’ is now pretty well established as vendors clamour for position in this rapidly moving area.
Unlike cloud, however, which started out largely as a re-hashing of familiar ideas around hosting, SOA, data centre automation and business service management, the big data movement is introducing net new capability to the business mainstream from the outset. This was confirmed in a recent Freeform Dynamics research study (122 IT pro respondents, November 2011, Figure 1).
That’s not to say that everything talked about in terms of big data technology is new in absolute terms, but until recently, there weren’t that many offerings in some key big data areas that you would call genuinely ‘enterprise ready’. This has been especially true in the areas of distributed indexing and search, and large scale distributed analytics, where it has often been a case of hand-crafting solutions based on a combination of open source and commercial components to get the desired result; fine if you are Yahoo!, Facebook or a big bank with lots of resource to throw at it, but not really tenable in a busy and resource-constrained mainstream IT department.
With this in mind, vendors like IBM and EMC have for a while been bringing open solutions together with their own proprietary technology to form coherent offerings, or at least to provide out-of-the-box integration between the pieces required. This has been necessary because of the shortcomings of environments such as Apache Hadoop in the areas of resilience, security, management and development tooling.
In an announcement this week, however, the daddy of the high end database world, Oracle, has declared its hand. Having already been dabbling in the area of distributed indexing and search (with the Oracle NoSQL Database), it is now getting into bed with Cloudera, arguably the most established independent specialist provider in the Hadoop world.
The end result is the Oracle Big Data Appliance, a Hadoop stack underpinned by Sun/Linux servers and other platform components from Oracle, and augmented with Cloudera’s enhanced Hadoop management environment. Oracle has also announced a portfolio of what it calls ‘Big Data Connectors’, which provide ease of integration between the Hadoop Distributed File System (HDFS) or Oracle NoSQL Database, and a traditional relational database environment.
These announcements are especially interesting given Oracle’s existing strong presence in the high end data management and analytics space. The Cloudera guys are extremely capable and have been doing some good stuff, but the Hadoop distribution at the centre of their activities is strengthened by the Oracle platform pieces. Furthermore, rightly or wrongly, enterprise IT departments often prefer to work with an established incumbent when introducing new ideas and capability into the mix.
Oracle’s broader database management pedigree is also important when we consider that big data technology will, on the whole, complement rather than replace traditional database and storage capability. Indeed there are many scenarios in which it makes sense to exploit both together, e.g. with preliminary exploration and analysis on large data sets with a poor signal to noise ratio taking place in Hadoop, then a more compact and structured derived data set being extracted into a traditional warehouse or BI environment. This is one of the reasons why the connectors Oracle is providing make absolute sense.
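To make the two-stage pattern concrete, here is a minimal sketch of the idea rather than anything Oracle or Cloudera ships. It assumes a hypothetical set of noisy log lines, mimics the map and reduce steps in plain Python instead of running on Hadoop, and uses an in-memory SQLite database as a stand-in for the traditional warehouse. The record format, the function names and the `sales_by_region` table are all invented for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw input: mostly noise, with a few purchase events mixed in.
RAW_LINES = [
    "2011-11-01 heartbeat ok",
    "2011-11-01 purchase region=EMEA amount=120.50",
    "garbled ###",
    "2011-11-01 purchase region=APAC amount=80.00",
    "2011-11-02 purchase region=EMEA amount=30.25",
    "2011-11-02 debug cache miss",
]


def map_line(line):
    """Map step: emit (region, amount) for purchase events, nothing for noise."""
    parts = line.split()
    if len(parts) != 4 or parts[1] != "purchase":
        return []
    try:
        fields = dict(p.split("=", 1) for p in parts[2:])
        return [(fields["region"], float(fields["amount"]))]
    except (KeyError, ValueError):
        # Malformed purchase records are treated as noise and dropped.
        return []


def reduce_pairs(pairs):
    """Reduce step: sum the amounts per region into a compact summary."""
    totals = defaultdict(float)
    for region, amount in pairs:
        totals[region] += amount
    return dict(totals)


def load_into_warehouse(conn, totals):
    """Load the derived data set into a relational table (warehouse stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region "
        "(region TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


def run_pipeline(lines, conn):
    """Filter and aggregate the raw lines, then hand the summary to SQL."""
    pairs = [pair for line in lines for pair in map_line(line)]
    totals = reduce_pairs(pairs)
    load_into_warehouse(conn, totals)
    return totals


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    run_pipeline(RAW_LINES, connection)
    query = "SELECT region, total FROM sales_by_region ORDER BY region"
    for row in connection.execute(query):
        print(row)
```

The point of the sketch is the division of labour: the bulk of the raw data never reaches the relational side, which only receives a small, structured summary that ordinary SQL and BI tools can work with.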
The co-existence of big data with traditional database and storage technologies was confirmed during the aforementioned research, which shows quite clearly that, with the exception of legacy systems, IT professionals anticipate growth across all of the technology categories explored (Figure 2).
And if you ask the question explicitly, most people confirm that they don’t anticipate big data solutions replacing traditional options in any significant way (Figure 3).
However, turning to hard practicality, we also see a couple of calls to action for vendors on this chart. IT professionals are not convinced that suppliers can back up all of the big data hype with tangible support and services at the moment to help customers realise the potential, and they also have concerns about licensing and commercial arrangements as data related needs become more demanding.
So, despite the technology advances, there is still some work to be done, and it will be interesting to see how Oracle deals with these issues as its big data activities continue to develop.