What is the paper about?
•Overview of the Dremel system
•Columnar storage format for nested data
•Dremel's query language and execution
•Execution trees used in web search systems
•Experimental results
What is Dremel?
•System for interactive analysis of data
•Uses data sitting on different storage systems
•Data modeled in a columnar, semi-structured (Protocol Buffers) format
•Offers a SQL-like query language
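To make the "columnar" point concrete, here is a minimal sketch of striping row-oriented records into per-field columns. It is a simplification under stated assumptions: the records are flat and every record has the same fields, whereas Dremel's actual format also handles nested and repeated fields, which this sketch does not attempt. The record contents and the name `to_columns` are hypothetical.

```python
# Hypothetical flat records; field names and values are illustrative only.
records = [
    {"url": "http://a.example", "lang": "en", "signal": "s1"},
    {"url": "http://b.example", "lang": "de", "signal": "s2"},
    {"url": "http://c.example", "lang": "en", "signal": "s1"},
]


def to_columns(rows):
    """Stripe row-oriented records into one list of values per field.

    Assumes flat records that all share the same fields (no nesting,
    no repeated or missing fields).
    """
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(records)

# A query that only needs `signal` reads this one list and never
# touches the `url` or `lang` values.
signal_column = columns["signal"]
```

The design point is that a query touching one field reads only that field's values, which is what makes column-oriented layouts attractive for interactive analysis.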
Example: Data Exploration
•Runs a MapReduce to extract billions of signals from web pages
•Ad hoc SQL against Dremel:
    DEFINE TABLE t AS /path/to/data/*
    SELECT TOP(signal, 100), COUNT(*) FROM t
•More MR-based processing on the data (FlumeJava, Sawzall)
•Can register the new dataset in a project
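As a rough illustration of what the ad hoc query asks for, the sketch below assumes that `TOP(signal, 100)` paired with `COUNT(*)` returns the most frequent `signal` values together with their occurrence counts. It runs exactly and in memory on a tiny hypothetical list; a system at Dremel's scale may compute such results approximately and in a distributed fashion. The function name `top_with_counts` and the sample values are assumptions made for illustration.

```python
from collections import Counter


def top_with_counts(values, n):
    """Return the n most frequent values with their counts, most frequent first.

    Exact, single-machine stand-in for a TOP(field, n), COUNT(*) style query.
    """
    return Counter(values).most_common(n)


# Hypothetical stand-in for the `signal` column of table t.
signals = ["a", "b", "a", "c", "a", "b"]

# Analogue of: SELECT TOP(signal, 2), COUNT(*) FROM t
result = top_with_counts(signals, 2)
print(result)  # [('a', 3), ('b', 2)]
```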