How much does a Chief Data Officer earn?
Compensation for the Chief Data Officer (CDO) role varies by nearly an order of magnitude. There are two reasons. First, the responsibilities of the role still vary widely. Second, CDOs have become common: they are found in all types of companies, from startups to blue chips, whose financial capabilities differ greatly.

What is a Chief Data Officer?
The Chief Data Officer (CDO) role has been around for some years, and most organisations have a CDO position today. Surprisingly, there is little agreement on the responsibilities of a Chief Data Officer. Forbes highlighted this in an article earlier this year. As I am looking at the job market and specifically Chief Data Officer postings I ...

Big Data Case Study: How a FTSE100 FinTech adopted Data Science
To stay competitive and innovate, large organisations need to adopt Big Data, Advanced Analytics and Data Science capabilities and technologies. Ideally, that happens quickly and safely, with manageable costs and early business outcomes. The requirements and necessary capabilities are uncertain, though, and the appropriate technologies and implementation plan unresolved. In that situation, many organisations jump ...

Virtualizing Hadoop with NAS
A recent question in the Hortonworks Community mentioned someone using Hadoop in a virtualized environment with EMC’s Isilon NAS (Network Attached Storage). While this may be a valid use case for some, anyone who is looking at Hadoop as more than a small number-crunching cluster will have to reflect on this approach. Here are some ...

Star Schema in Hive and Impala
Someone on the Hortonworks Community asked how to design a star schema with Hive. This is a question I hear in one form or another from various stakeholders in the large enterprises we work with at Big Data Partnership. I usually answer it by taking a step back, and I did that when answering the community ...

6 Steps to Big Data success: Think business not technology
In this year’s Gartner hype cycle analysis, big data is approaching the trough of disillusionment. So is it unlikely that organizations will succeed with big data adoption, and should they focus elsewhere? No, of course not. Marketing has done its job, and everybody knows of big data, though few know what it means to their ...

Big Data talks in Dhaka
I am spending a few days in Dhaka and am taking the opportunity to meet some companies working on Big Data and give some talks.

The four types of Big Data as a Service (BDaaS)
The popularity of Big Data lies in its broad definition: employing high-volume, high-velocity, and high-variety data sets that are difficult to manage and to extract value from. Unsurprisingly, most businesses can identify themselves as facing Big Data challenges and opportunities, now or in the future. This, therefore, is not a new issue, yet it has a ...

Full Metal Hadoop as a Service with Altiscale
Hadoop, known to be powerful and challenging to manage, is increasingly becoming available as a service in numerous varieties. Initially, do-it-yourself distributions like Cloudera, MapR, and Hortonworks made up a great part of the market. In recent years, following the success of Amazon Web Services Elastic MapReduce (EMR), Hadoop/data services like Qubole are becoming popular. Last year, quietly, another entrant in the field ...

Lambda Architecture: Achieving Velocity and Volume with Big Data
Big data architecture paradigms are commonly separated into two (supposedly) diametrically opposed models: the more traditional batch processing and (near) real-time processing. The most popular technologies representing the two are Hadoop with MapReduce and Storm. However, a hybrid solution, the Lambda Architecture, challenges the idea that these approaches have to exclude each other. The Lambda Architecture combines ...