Approximation &
Randomization
Approximation and Randomization
Approximation
Return $\hat{f}(A)$ instead of $f(A)$, where $\hat{f}(A)$ with
$|f(A) - \hat{f}(A)| \le \epsilon \cdot f(A)$
is a $(1 + \epsilon)$-approximation of $f(A)$.
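As a small illustration of the approximation guarantee, the sketch below checks whether an estimate is a (1 + eps)-approximation, assuming the intended condition is |f(A) - f_hat(A)| <= eps * f(A). The function name and its parameters are hypothetical and do not come from the slides.

```python
def is_one_plus_eps_approx(estimate: float, truth: float, eps: float) -> bool:
    """Check the assumed condition |truth - estimate| <= eps * truth.

    `truth` plays the role of f(A), `estimate` the role of the returned
    approximate value, and `eps` the accuracy parameter.
    """
    if eps < 0 or truth < 0:
        raise ValueError("eps and truth are assumed to be non-negative")
    return abs(truth - estimate) <= eps * truth


# Usage: an estimate of 105 for a true value of 100 is within 10%,
# while an estimate of 120 is not.
within = is_one_plus_eps_approx(105.0, 100.0, 0.1)
outside = is_one_plus_eps_approx(120.0, 100.0, 0.1)
```

Under this reading the error is measured relative to the true value, so the same eps tolerates a larger absolute error when f(A) is large.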
B669 Sublinear Algorithms
for Big Data
Qin Zhang
Now about Big Data
Big data is everywhere.
* over 2.5 petabytes of sales transactions
* an index of over 19 billion web pages
* over 40 billion pictures
B669 Sublinear Algorithms
for Big Data
Qin Zhang
Part 1: Sublinear in Space
The model and challenge
The data stream model (Alon, Matias and Szegedy 1996)
[Figure: a stream of items a1, a2, ..., an arriving at a CPU with a small RAM]
Why hard? Cannot store everything.
Applications: Internet router, stock data
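To make the "cannot store everything" constraint concrete, here is a minimal sketch of a one-pass algorithm that keeps only k items in memory, no matter how long the stream is. It uses reservoir sampling, a standard streaming technique that is not taken from these slides; the function name and parameters are illustrative assumptions.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Keep a uniform random sample of k items from a stream in one pass.

    Memory use is O(k): items are seen once and then discarded, which
    mirrors the small-RAM restriction of the data stream model.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items fill the reservoir.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a stored item with probability k/(i+1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Usage: sample 5 items from a stream of 1000 without storing the stream.
sample = reservoir_sample(iter(range(1000)), 5, random.Random(0))
```

The stream is consumed through an iterator, so nothing beyond the k-item reservoir and a counter is retained.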
Sublinear Algorithms
for Big Data
Part 4: Random Topics
Qin Zhang
Topic 2: Subspace embedding and Lp Regression
(based on a paper with David Woodruff in COLT 2013)
Subspace embeddings
Subspace embeddings:
A distribution over linear maps $\Pi : \mathbb{R}^n \to \mathbb{R}^m$, s.
Data Streams & Communication Complexity
Lecture 3: Communication Complexity and Lower Bounds
Andrew McGregor, UMass Amherst
Basic Communication Complexity
Three friends Alice, Bob, and Charlie each have some information x, y, z and Charlie wants to
Notes for # Connected components
Qin Zhang
1 The Algorithm
1. Sample a random set of $r = c_0/\epsilon^2$ vertices $u_1, \ldots, u_r$.
2. For each sampled vertex $u_i$, we grow a BFS tree $T_{u_i}$ rooted at $u_i$ as follows. Set i = 0 and f = 0.
* Flip a coin. Set f = f + 1.
If (he
Sublinear Algorithms
for Big Data
Part 4: Random Topics
Qin Zhang
Topic 1: Compressive sensing
Compressive sensing
The model (Candes-Romberg-Tao 04; Donoho 04)
Applications
Medical imaging reconstruction
Single-pixel camera
Compressive sensor n
Massachusetts Institute of Technology
Lecturer: Piotr Indyk
6.895: Sketching, Streaming and Sub-linear Space Algorithms
September 17, 2007
Scribe: Anastasios Sidiropoulos
Lecture 4
1 Introduction
Most algorithms that we have seen so far operate by maintai
Sublinear Algorithms
for Big Data
Qin Zhang
Part 3: Sublinear in Time
Sublinear in time
Given a social network graph, if we have no time to ask
everyone, can we still compute something non-trivial?
For example, the average number of friends per individual
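One way to picture the average-friends question is the naive sampling estimator sketched below: query the degree of a few random vertices and average the answers. This is an illustrative assumption and not necessarily the algorithm the slides go on to present; the degree oracle `degree`, the vertex labelling 0..n-1, and the function name are all hypothetical. Plain uniform sampling carries no worst-case guarantee when a few vertices have very high degree.

```python
import random
from typing import Callable, Optional


def estimate_average_degree(n: int, degree: Callable[[int], int],
                            num_samples: int,
                            rng: Optional[random.Random] = None) -> float:
    """Estimate the average degree from `num_samples` degree queries.

    Vertices are assumed to be labelled 0..n-1 and `degree(v)` is assumed
    to answer a single query in constant time, so the running time depends
    on num_samples rather than on the size of the graph.
    """
    if n <= 0 or num_samples <= 0:
        raise ValueError("n and num_samples must be positive")
    rng = rng or random.Random()
    total = 0
    for _ in range(num_samples):
        v = rng.randrange(n)  # uniform random vertex
        total += degree(v)
    return total / num_samples


# Usage: on a cycle every vertex has exactly 2 neighbours, so any sample
# of vertices yields the exact average.
cycle_estimate = estimate_average_degree(10**6, lambda v: 2, 50,
                                         random.Random(0))
```

Only 50 of the one million vertices are queried in the usage example, which is the sense in which the computation is sublinear in time.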
Sublinear Algorithms
for Big Data
Qin Zhang
Part 2: Sublinear in Communication
Sublinear in communication
The model
x1 = 010011
x2 = 111011
x3 = 111111
xk = 100011
They want to jointly compute f (x1 , x2 , . . . , xk )
Goal: minimize total bits of
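As a baseline for the cost being minimized, the sketch below simulates the trivial protocol in which every party sends its whole input to one place and counts the bits transferred. The coordinator setup, the function names, and the choice of f as a bitwise AND are assumptions made for illustration only; the slide does not specify them. The inputs are the four strings shown in the model.

```python
from typing import Callable, List, Tuple


def naive_protocol(inputs: List[str],
                   f: Callable[[List[str]], str]) -> Tuple[str, int]:
    """Simulate the trivial protocol: each party ships its full input.

    Returns the value of f and the total number of bits communicated,
    which is simply the sum of the input lengths.
    """
    bits_sent = sum(len(x) for x in inputs)
    return f(inputs), bits_sent


def bitwise_and(strings: List[str]) -> str:
    """Hypothetical target function: position-wise AND of bit strings."""
    length = len(strings[0])
    return "".join(
        "1" if all(s[i] == "1" for s in strings) else "0"
        for i in range(length)
    )


# Usage with the four example inputs from the slide.
xs = ["010011", "111011", "111111", "100011"]
value, cost = naive_protocol(xs, bitwise_and)
```

Any protocol that aims to be sublinear in communication has to beat this baseline, whose cost grows with the total input size.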