When and when not to use Hadoop

When businesses interested in leveraging big data and analytics ask how to get started, they are often advised to begin with Hadoop, the Apache Software Foundation’s open source data storage and processing framework.

There are a number of reasons Hadoop is an appealing option. Not only does the platform offer both distributed storage and computational capabilities at a relatively low cost, it can also scale to meet the anticipated exponential increase in data generated by mobile technology, social media, the Internet of Things, and other emerging digital technologies.

These benefits, together with strong word of mouth and prominent implementations by companies such as Facebook, Yahoo, and various Fortune 50 giants, are driving adoption of Hadoop.

In March, research firm Researchbeam forecast that the worldwide Hadoop market would grow from $1.5 billion in 2012 to $50 billion in 2020. Most of that money will be spent on services offered by commercial Hadoop specialists such as Cloudera, Hortonworks, and MapR Technologies.

But not all data scientists are getting on board the Hadoop train. In fact, many have jumped off. In a recent survey of data scientists on the barriers to big data analytics, vendor Paradigm4 reports that more than three-quarters (76%) of the respondents who said they have used Hadoop or Spark (the computational framework built on top of the Hadoop distributed file system) cite “significant limitations” to their use.

Specifically, 39% of respondents said Hadoop takes too much effort to program, while 37% said it was “too slow for interactive, ad hoc queries.” Another 30% knocked Hadoop as being too slow for real-time analytics. And more than one-third (35%) of the data scientists surveyed who have used Hadoop or Spark said they have stopped using them.

Granted, this survey comes from a vendor that is offering “even more” than Hadoop. However, the reasons respondents gave for their dissatisfaction with Hadoop are grounded in real issues rather than vendor hype.

Take response time. If you’re planning to produce complex analytics or real-time analytics, Hadoop most likely isn’t the platform for you. It is, however, a reasonable fit for analytical services in which response time takes a back seat to accuracy and long-term insights.

Some of the data scientists who stopped using Hadoop may simply have chosen it for the wrong job, such as real-time analytics, in the first place. For them, moving on only makes sense.

Another potential source of dissatisfaction with Hadoop (one that wasn’t reflected in the Paradigm4 survey) is cost. Enterprises that go into Hadoop thinking it will be free or cheap because it’s open source usually get a big surprise. They typically end up paying, either by contracting with a Hadoop services vendor or hiring qualified Hadoop developers and experts to work in-house, or by launching misguided Hadoop projects that cause them to fall behind competitors.

Early adopters of Hadoop who ended up disappointed may have been victims of the first wave of Hadoop hype. The gradual maturation of big data and analytics technologies, along with better-educated customers, should make it easier for businesses to select the best analytics option.

Ultimately, what you are trying to do determines whether the tool is right for the job.

GeoViz is a team of experienced technical and business professionals that helps our customers achieve their ‘Operations and Maintenance Performance Management’ goals. Our experts take a 360-degree approach to minimizing inefficiencies, focusing on assets, processes, technology, materials, people, infrastructure, and energy. GeoViz serves clients across North America, specifically the USA and Canada, with a physical presence in Seattle, Toronto, Buffalo, Ottawa, Montreal, London, Kitchener, Windsor, and Detroit. Feel free to contact us for any help or assistance.

 
