I'm an N-1 year database graduate student in CSAIL at MIT. My advisor is Sam Madden. I previously studied at UC Berkeley.

I'm on the job market!

A very kind biography of me by @mstem. I dream of living a wild and crazy life.

I Enjoy

Visualizing and understanding lots of data


Ultimate frisbee, biking, running

Trying not to injure myself

Current Projects

Big data course @ MIT

I am co-developing the "big data" course at MIT. The class surveys techniques and systems for ingesting, efficiently processing, analyzing, and visualizing large data sets. Topics will include data cleaning, data integration, scalable systems (relational databases, NoSQL, Hadoop, etc.), analytics (data cubes, scalable statistics and machine learning), and scalable visualization of large data sets.

The goal is for students to gain hands-on experience with the topics and systems that are covered.

Scorpion: Outlier Explanation

A system that generates sensible explanations for outliers in the results of analytic queries.

Provenance + Declarative Visualization

An implementation of a declarative, JavaScript-based visualization system with a provenance system integrated out of the box.

Inspired by Wilkinson's Grammar of Graphics and Wickham's ggplot2 for R.

Source on GitHub

VLDB conference trends

A history of databases through keyword trends in VLDB publication titles


A tool to import your data into whatever data store you want, as painlessly as possible.

See article for motivation

Other Projects


MEET strives to bridge the gap between future Israeli and Palestinian leaders by immersing them together for 3 full years of fun and education. MIT business and technical instructors work in the Middle East for a month-long intensive session during the summer. I was one of four Year 3 technical instructors in 2010, and helped head the curriculum team for the past 3 years.


A look at optimizing human computation through a database lens. Qurk is a database prototype that lets users write queries whose results are computed by both machines and humans. With Adam Marcus.

Introduction to Data Literacy

I co-taught a heavily lab-based IAP class called Introduction to Data Literacy that introduced students to many basic data cleaning, analysis, and visualization techniques. The course was added to OCW. With my buddy Adam Marcus.


A look into the properties of structured data at internet scale. With Michael Cafarella, Yang Zhang, Nodira K., Daisy Wang, and Alon Halevy.


An experimental course scheduling system. Tries to make the user experience not suck by using JS. This was around the time Google Calendar came out. With Sukhchander Khanna.


A system for declaratively filtering and correlating streams of events from sensor and RFID devices. Extends YFilter's core query processing engine. With Yanlei Diao and Daniel Gyllstrom.

HiFi @ Berkeley

A Cascading Stream Architecture for Large-Scale Receptor-Based Networks. With the Berkeley database group, notably Shawn Jeffrey and Shariq Rizvi.

Past Jobs

