Big Data

In case you missed it: My Webinar on Model-Based Machine Learning

In case you missed my free webinar on “Model-Based Machine Learning”, here is the recording. If you have any questions, please do not hesitate to contact me.

In case you missed it: My Talk at the United Nations Global Pulse Workshop

In case you missed my talk at the 2016 Data Science Africa Workshop organized by the United Nations Global Pulse Lab, here is the recording. My talk was titled “Sustainable Urban Transport Planning using Big Data from Mobile Phones”.

A Preview of My Talk for the Data Science Africa Workshop organized by the United Nations

I am excited to be invited by the United Nations Global Pulse Lab to speak at the 2nd Data Science Africa Workshop, scheduled to take place in Kampala, Uganda from 30th June to 1st July. Specifically, I will be speaking on “Data Science for Sustainable Cities”. My talk is titled “Sustainable Urban Transport Planning using Big Data from Mobile Phones”.

Using Apache SparkR to Power Shiny Applications: Part I

The objective of this blog post is to demonstrate how to use Apache SparkR to power Shiny applications. I have been curious about what the use cases for a “Shiny-SparkR” application would be and how to develop and deploy such an app.

Launch Apache Spark on AWS EC2 and Initialize SparkR Using RStudio

In this blog post, we shall learn how to launch a Spark standalone cluster on Amazon Web Services (AWS) Elastic Compute Cloud (EC2) for analysis of Big Data. This is a continuation of our previous blog post, which showed how to download Apache Spark and start SparkR locally on Windows OS and RStudio.

Installing and Starting SparkR Locally on Windows OS and RStudio

With the recent release of Apache Spark 1.4.1 on July 15th, 2015, I wanted to write a step-by-step guide to help new users get up and running with SparkR locally on a Windows machine using the command shell and RStudio. SparkR provides an R frontend to Apache Spark and, by using Spark’s distributed computation engine, allows R users to run large-scale data analysis from the R shell.