Easy web services on top of a Hadoop data lake

Posted on 24 May 2016

As it has entered the enterprise world, Hadoop has become increasingly associated with the concept of the data lake. The idea is actually very simple (hence its success!): with its schema-free file system, capable of handling structured, semi-structured and new unstructured data alike, Hadoop helps overcome the traditional organization of enterprise data into silos.

With a wide range of tools to import data, the use of commodity hardware for underlying storage and the availability of parallel processing technologies for ...

Read more →

Bigtop: The unsung hero of Hadoop’s development

Posted on 21 Jul 2014

With the release of Hadoop 2.0 and YARN, MapReduce has been demoted to one of many execution engines running on the platform, and the combination of HDFS and YARN represents the new core of Hadoop. Or more precisely (to take into account the specifics of the MapR and IBM distributions), the combination of the HDFS API and YARN has become Hadoop’s kernel.

However, above this kernel, the Hadoop ecosystem remains a conglomerate of open-source projects often initiated ...

Read more →