aster - SQL/MR Peter Pawlowski Member of Technical Staff...

SQL/MR
Peter Pawlowski, Member of Technical Staff
January 16, 2009
ASTER BACKGROUND
Our Founders
3 PhD students from Stanford C.S.
Cool ideas… but no funding, no product, no clients!
OK, they had $10,000…
Our Product: nCluster
A massively scalable database designed for analytics.
Runs on a cluster of commodity nodes.
Scales from GBs to 100s of TBs and beyond.
Standard SQL interface (via a command line tool, JDBC, ODBC, etc.).
Supports MR-like functionality via user-defined SQL/MR functions.
Our Approach: Commodity Nodes
[Diagram labels: Query; Queen; Server nodes (Processing + Storage); Results]
SQL/MR
What are SQL/MR functions?
SQL/MR functions:
Are Java functions meeting a particular API.
Are compiled outside the database, installed via a command line tool, and then invoked via SQL.
Take a database table of one schema as input and output rows back into the database.
Are polymorphic. During initialization, a function is told the schema of its input (for example, (key, value)) and needs to return its output schema.
Accept zero or more argument clauses (parameters), which can modify their behavior.
Are designed to run on a massively parallel system by allowing the user to specify which slice of the data a particular instance of the function sees.
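The polymorphism and argument-clause points above can be illustrated with a small Java sketch. Everything named here is hypothetical: `RowFunctionSketch`, `UppercaseColumn`, and the `COLUMN` argument clause are stand-ins invented for illustration and are not the actual SQL/MR API, whose interfaces are not shown in this deck. The sketch only shows the shape of the contract: the function learns its input schema at initialization, returns its output schema, and then processes whatever slice of rows it is handed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical contract for illustration only; NOT the real SQL/MR API. */
interface RowFunctionSketch {
    /** Called once: receives the input schema and argument clauses, returns the output schema. */
    List<String> init(List<String> inputSchema, Map<String, String> argumentClauses);

    /** Called with the slice of rows this instance sees; returns the output rows. */
    List<List<Object>> run(List<List<Object>> inputRows);
}

/** Polymorphic example: accepts any input schema and uppercases one named column. */
class UppercaseColumn implements RowFunctionSketch {
    private int target = -1; // index of the column chosen by the (assumed) COLUMN clause

    @Override
    public List<String> init(List<String> inputSchema, Map<String, String> argumentClauses) {
        String column = argumentClauses.get("COLUMN");
        target = inputSchema.indexOf(column);
        if (target < 0) {
            throw new IllegalArgumentException("no such column: " + column);
        }
        // The output schema is decided here, based on the input schema.
        return new ArrayList<>(inputSchema);
    }

    @Override
    public List<List<Object>> run(List<List<Object>> inputRows) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> row : inputRows) {
            List<Object> copy = new ArrayList<>(row);
            copy.set(target, String.valueOf(row.get(target)).toUpperCase(Locale.ROOT));
            out.add(copy);
        }
        return out;
    }
}
```

Because the output schema is computed from the input schema rather than fixed in advance, the same function could be applied to a (key, value) table or to any other table that has the named column.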
First Example: Word Count
Problem: Count the word frequency distribution across a set of documents.
Input: A database table containing the documents in question.
Map Phase: For each word in each document, outputs a row of the form (word, 1).
Shuffle Phase:
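As a rough illustration of the map phase only, the following Java sketch emits one (word, 1) row per word in each document. The class and method names (`WordCountMapSketch`, `WordRow`, `map`) are hypothetical, the documents are modeled as plain strings rather than database rows, and splitting on whitespace is an assumed tokenization rule; none of this is taken from the SQL/MR API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the word-count map phase; not SQL/MR API code. */
class WordCountMapSketch {

    /** One output row of the form (word, 1). */
    static final class WordRow {
        final String word;
        final int count;

        WordRow(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    /** Emits a (word, 1) row for every whitespace-separated word in every document. */
    static List<WordRow> map(List<String> documents) {
        List<WordRow> out = new ArrayList<>();
        for (String doc : documents) {
            // Assumption: words are separated by runs of whitespace.
            for (String token : doc.trim().split("\\s+")) {
                if (!token.isEmpty()) { // skips the empty token produced by a blank document
                    out.add(new WordRow(token, 1));
                }
            }
        }
        return out;
    }
}
```

Repeated words are emitted as separate rows, each with a count of 1; combining them is left to the later phases.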

This document was uploaded on 03/04/2012.

