My colleague Regunathan and I presented our work on using data science to predict four-year student graduation rates at a large public university in the US, by building an institutional data lake b...
Data Science at Scale for IoT on Pivotal Platform
Along with my colleagues Gautam and Rashmi, I presented our work on how to build scalable data science pipelines for IoT on the Pivotal stack. You can watch a recording of this talk below.
All Things Python @ Pivotal
I presented all the glorious things our team of data scientists and machine learning engineers at Pivotal build with Python. From our massively parallel processing database - Greenplum MPP, our in-d...
Data Driven Action - A Primer on Data Science (Spring One GX 2015)
Along with my colleagues Sarah Aerni and Jarrod Vawdrey, I presented how we leverage open source tools from the Spring ecosystem to solve data science problems in NLP at scale. You can watch a reco...
Big Data vs Climate Change - Strata Hadoop World 2015
My colleague from EMC, John Cardente, and I presented our work on how a climate data lake could empower citizen scientists studying climate change in Acadia National Park. This Strata + Hadoop World...
How to Scale Native (C/C++) Applications on Pivotal's MPP Platform - An Edge Detection Example
My colleague Gautam Muralidhar and I describe how we can leverage native C/C++ applications on Massively Parallel Processing (MPP) databases like Greenplum using procedural languages such as PL/C o...
Big Data in Education - Analyzing Student Clusters to Influence Success and Retention
We used Greenplum MPP, Apache MADlib and PL/Python libraries to analyze a variety of data sources pertaining to academic programs of students at a large public university to determine the factors t...
From Sea to Trees, Pivotal Data Science Looks at Climate Change in Acadia National Park
In November 2014, I spent a week with some of my colleagues from Pivotal/VMware/EMC at the beautiful Acadia National Park to understand the effect of climate change on the flora and fauna in the pa...
Distributed Pipeline for Topic and Sentiment Analysis of Tweets
I returned to Data Day Texas 2014 this year to present a distributed topic and sentiment analysis pipeline based on open source in-database machine learning tools. Here’s a demo of this pipeline i...
Python Powered Data Science at Pivotal - PyData NYC 2013
My colleague Ian Huston and I presented how we leverage Python for data science projects at Pivotal at PyData New York - 2013. Here is a summary of our talk; you can view the slides below. The...