GraphChi: How a Mac Mini outperformed a 1,636 node Hadoop cluster

Last year GraphChi, a spin-off of GraphLab (a distributed, graph-based, high-performance computation framework), did something remarkable: it outperformed a 1,636-node Hadoop cluster at processing a 2010 Twitter graph with 1.5 billion edges, using a single Mac Mini. The task was triangle counting and the Hadoop cluster required over 7 hours while ... read more →

4 Free DIY Twitter Visualisations: The Shahbag Protest

Earlier this year a mass movement occurred in Bangladesh, which received little global news coverage. It was an immensely important event to Bangladeshis at home and abroad. This prompted me to try to illustrate the event with Twitter data myself, using only some free web services and a few hours' time. Amazingly the results are ... read more →

9+ Free Online Data Science Resources You Should Know

Data Science is a hot topic and there are plenty of courses and resources available for anyone interested. Try out these 9 free resources to get started if you are new to the topic or want to refresh on one of the subjects. read more →

Big Data at Mendeley

Big Data at Mendeley is about similarity measures and comparing documents, groups, and users for search, deduplication, recommendations and classification. read more →

Voronoi Tessellation

The Voronoi Tessellation (or Voronoy Tessellation) by Georgy Feodosevich Voronoy/Вороной Георгий Феодосьевич (1908) is a technique that enables the division of multi-dimensional spaces into subspaces. Its application defines geometric areas equivalent to subspaces by defining several vectors as centres of subspaces. Any other vector in space can then be attributed to the closest centre ... read more →
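
The closest-centre assignment described in this excerpt can be sketched in a few lines of Python. This is only an illustration, not code from the post: it assumes Euclidean distance, and the names `nearest_centre` and `assign_cells` as well as the sample coordinates are made up for the example.

```python
import math


def nearest_centre(point, centres):
    """Return the index of the centre closest to `point`.

    Assumption: distance is Euclidean (the excerpt does not name a metric).
    """
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))


def assign_cells(points, centres):
    """Group points by the centre they are closest to (one group per cell)."""
    cells = {i: [] for i in range(len(centres))}
    for p in points:
        cells[nearest_centre(p, centres)].append(p)
    return cells


# Hypothetical sample data: three centres and four vectors in 2-D space.
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(1.0, 1.0), (9.0, 2.0), (2.0, 8.0), (6.0, 1.0)]
cells = assign_cells(points, centres)

for index, members in cells.items():
    print(f"centre {centres[index]} -> {members}")
```

Each key of `cells` corresponds to one subspace of the tessellation, and the same functions work unchanged for vectors with more than two dimensions.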