Data science is a hot topic, and there are plenty of courses and resources available for anyone interested. Try out these 9 free resources to get started if you are new to the topic or want to brush up on one of the subjects.
A Coursera course specifically about data science, due to start in April 2013. I am very curious about it since its broad syllabus appears to capture much of the experience data scientists need. Until now, much of that had to be gathered in the field. Having a dedicated course for it is an appealing idea.
Course Syllabus – Specific Topics
- Data modeling: relations, key-value, trees, graphs, images, text
- Relational algebra and parallel query processing
- NoSQL systems, key-value stores
- Tradeoffs of SQL, NoSQL, and NewSQL systems
- Algorithm design in Hadoop (and MapReduce in general)
- Basic statistical analysis at scale: sampling, regression
- Introduction to data mining: clustering, association rules, decision trees
- Case studies in analytics: social networking, bioinformatics, text processing
The academy is due to start in early 2013 with some interesting workshops:
- Dive into Cloudera Impala
- NumPy for Data Scientists
- Couchbase for Data Scientists
- MapReduce Algorithm Design
- Integrating SAP HANA with R
- Scikit-learn: Machine Learning with Python
The School of Data recently started its first course, Data Fundamentals. It is a great starting point for anyone interested in (big) data and data science, and it lays the foundations for more serious work.
“The mission of the School of Data is to promote data literacy and data ‘wrangling’ skills – the ability to find, clean, retrieve, manipulate, analyse, interpret and represent different types of data – across the world. The more people who have the skills to understand and work with data effectively, the greater its value and impact, and the more likely it is that data will be able to bring about positive social benefits.”
This book is available under a Creative Commons licence, so you can download and read it for free. It uses R and plenty of examples to introduce the topic.
Data science and machine learning are closely related, so machine learning should be of interest to any data science enthusiast. The Coursera machine learning course by Stanford Associate Professor Andrew Ng comes highly recommended to anyone interested in a solid introduction to machine learning with a hands-on approach and great lecture material and videos.
The California Institute of Technology ran a free online machine learning course with video lectures earlier in 2012. The lectures are still online for anyone to watch and another course will start in January 2013.
An important aspect of data science is data visualisation. The best analytics and models are not effective if the information and insight gained cannot be easily and transparently shared with your client, consumer, or customer. The Knight Center is running its second massive open online course, about infographics and data visualisation, in early 2013.
Statistics and data analysis are, of course, the bread and butter of data science. This fall 2012 Carnegie Mellon University course is not as fancy as a Coursera one. In fact, it is little more than a page with all the lecture slides, homework, lab sheets and solutions. But it is free and comprehensive, so give it a try.
I know I wrote 9 resources, but as I come across something good I might just append it here at the end.
This is a fun way to get started with R. It is a website that teaches you R interactively. Not much more to say than give it a go.
Head over to Wikibooks to read ‘Data Science: An Introduction’. There is already some significant material. Nevertheless, it is a work in progress and you can contribute.
Nearly complete is ‘Statistics’, a book, you guessed it, about statistics.