CSOR W4246 Fall, 2015
Homework 1
1. (10 points)
(a) Let $f_1(n) = n$. Then $2n \le c \cdot n$ for all $n \ge n_0 = 1$ and $c = 2$.
(b) Let $f_1(n) = 2^n$. Then $\lim_{n \to \infty} \frac{2^{2n}}{2^n} = \lim_{n \to \infty} 2^n = \infty$. Hence $2^{2n} = \omega(2^n)$.
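As a quick numerical sanity check, and reading the two parts as comparing f1(2n) against f1(n), the sketch below prints the ratio f(2n)/f(n) for the linear and the exponential choice. The helper name `doubling_ratio` is an assumption introduced for illustration; it is not part of the original solution.

```python
def doubling_ratio(f, n):
    """Return f(2n) // f(n); the quotient is exact for the two functions below.

    Hypothetical helper, not taken from the homework solution.
    """
    return f(2 * n) // f(n)


def linear(n):
    return n


def exponential(n):
    return 2 ** n


if __name__ == "__main__":
    # The linear ratio stays at 2, while the exponential ratio equals 2^n.
    for n in (1, 5, 10, 20):
        print(n, doubling_ratio(linear, n), doubling_ratio(exponential, n))
```

A bounded ratio is consistent with part (a), and a ratio that grows without bound is consistent with part (b).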
2. (20 points)
(a) $T(n) = O(n^2 \log n)$
(b) $T(n) = O(n^3 \log n)$
(c) T (n)

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NP-Hard Problems\n",
"\n",
"The purpose of this assignment is to familiarize yourself with different approaches to solving NP-hard problems in practice, especially via

Midterm Exam
STAT W4240: Data Mining
Instructor: Dr. Rahul Mazumder
March 9, 2015 (M/W Section)
Explanation
This exam is to be done in class. You have 75 minutes to complete it in its entirety. All solutions should be
written in the accompanying blue book. No o

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Connected Components\n",
"\n",
"The purpose of this assignment is to familiarize yourself with the handling of graph data structures. You will implement the algorithm f

STAT W4240 Sec1 Midterm2 Solution
April 17, 2015
1. (a) Split the data randomly into 5 folds. For each k = 1, 2, 3, ..., 20, do best subset selection
5 times. On iteration i, hold out fold i as validation data and use the other 4 folds as the training set. For
the training se
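
The fold-splitting step described above can be sketched as follows. This is a minimal illustration using only the standard library; the function name `k_fold_splits` and its parameters are assumptions, and the model-fitting step (best subset selection for each k) is deliberately left out.

```python
import random


def k_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, validation_indices) pairs, one per fold.

    Hypothetical helper: indices are shuffled once, dealt into n_folds
    folds, and fold i is held out as validation data on iteration i.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most 1.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


if __name__ == "__main__":
    for train, validation in k_fold_splits(10):
        print(sorted(validation), len(train))
```

For each subset size k, one would fit on `train`, score on `validation`, and average the 5 validation errors before comparing across k.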

CSOR W4246 Fall, 2015
HW1 Theoretical part
Out: Thursday, September 17, 2015
Due: 8pm, Monday, September 28, 2015
Please keep your answers clear and concise. For all algorithms you suggest, you must give the best upper
bound that you can for the running t

CSOR W4246 Fall, 2015
Homework 3 Theoretical part
Out: Thursday, October 22, 2015
Due: 1pm, Wednesday, November 4, 2015
Please keep your answers clear and concise. For all algorithms you suggest, you must give the best upper
bound that you can for the run

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Max Flow Applications\n",
"\n",
"The purpose of this assignment is to investigate applications of finding a Max Flow. The problem asks you to design and implement an al

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Tuesday, October 21, 2014
Outline
1 Recap
2 Review of flows
3 Correctness of the Ford-Fulkerson algorithm
4 An application of Max-Flow: Bipartite Matching
Re

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Thursday, October 9, 2014
Outline
1 Recap
2 Hashing
Review of the last lecture
Negative cycle detection
All-pairs shortest paths (Floyd-Warshall)
Hashing
T

CSOR W4246 Fall, 2015
Homework 2 Theoretical part
Out: Monday, October 5, 2015
Due: 9pm, Monday, October 19, 2015
Please keep your answers clear and concise. For all algorithms you suggest, you must give the best upper
bound that you can for the running t

CSOR W4246 Fall, 2014
Homework 1
Out: Monday, September 8, 2014
Due: 6pm, Monday, September 22, 2014
Please keep your answers clear and concise, and make sure that your hand-writing is legible and
that your name is clearly written on your homework if you

CSOR W4246 Fall, 2015
Homework 4 Theoretical part
Out: Thursday, November 12, 2015
Due: 1pm, Wednesday, November 25, 2015
Please keep your answers clear and concise. For all algorithms you suggest, you must prove correctness and give the best upper bound

CSOR W4246 Fall, 2014
Homework 2 Theoretical part
Out: Monday, September 22, 2014
Due: 6pm, Monday, October 6, 2014
Please keep your answers clear and concise, and make sure that your hand-writing is legible and
that your name is clearly written on your h

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Thursday, October 29, 2015
Outline
1 Recap
2 Taking the dual of an LP
3 Examples of formulating LPs
4 Interpreting the dual LP
Today
1 Recap
2 Taking the d

Midterm Exam
STAT W4240: Data Mining
Instructor: Dr. Rahul Mazumder
April 8, 2015 (M/W Section)
Explanation
This exam is to be done in class. You have 75 minutes to complete it in its entirety. All solutions should be
written in the accompanying blue book. No o

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Thursday, December 3, 2015
Outline
1 The structure of the WWW
2 Identifying important pages via link analysis
3 Hubs and authorities
4 PageRank
Review of l

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Tuesday, November 24, 2015
Outline
1 Recap
2 Hashing
3 Time/space analysis of chain hashing
Balls and bins models
Expected & worst-case analysis of Lookup

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Tuesday, December 1, 2015
Outline
1 Recap
Balls and bins
2 On randomized algorithms
3 Saving space: hashing-based fingerprints
4 Bloom filters
Today
1 Recap
Ba

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Thursday, November 12, 2015
Outline
1 Review of last lecture
The class NP
The class of NP-complete problems
2 Satisfiability: a fundamental NP-complete pr

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Tuesday, November 17, 2015
Outline
1 Review of last lecture
2 Representative NP-complete problems
3 Integer Programming
4 Minimum-weight Set Cover
Today
1

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
October 20-22, 2015
Outline
1 Recap
2 Flow networks
Applications
3 The residual graph and augmenting paths
4 The Ford-Fulkerson algorithm for max flow
5 Corr

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Thursday, November 19, 2015
Outline
1 Recap
2 More representative NP-complete problems
3 Integer programming
4 Minimum-weight set cover
5 An approximation

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Thursday, November 5, 2015
Outline
1 Recap
2 Taking the dual of an LP
3 Examples of formulating LPs
4 Interpreting the dual LP
Today
1 Recap
2 Taking the d

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Tuesday, October 13, 2015
Outline
1 Recap
2 Shortest paths in graphs with non-negative edge weights
(Dijkstra's algorithm)
Graphs with negative edge weights

Algorithms for Data Science
CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Tuesday, October 27, 2015
Outline
1 Recap
Feasibility problems
2 The structure of a linear program
3 Duality
4 Examples
Today
1 Recap
Feasibility problems

Another Variant of 3sat
Proposition 32 3sat is NP-complete for expressions in
which each variable is restricted to appear at most three
times, and each literal at most twice. (3sat here requires
only that each clause has at most 3 literals.)
Consider a g

CSOR W4246
HW4
Due: Nov 25th
Date: Oct 21st
Name: Yeyun Chen
UNI: yc3070
(a) Variables: We introduce one variable $F_{ij}$ for each edge $(i, j) \in E$, which stands for the flow on
edge (i, j) in the solution.
Objective Function:
min
(,)
Constraints:
Capacity con

CSOR W4246
HW3
Name: Yeyun Chen
UNI: yc3070
Due: Nov 04th
Date: Oct 29th
So the value of the maximum flow and the capacity of the minimum cut are both 11, and the minimum cut is:
S = {s, a, b, c}, T = {d, t}
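
One way to check a max-flow value against a min-cut is to run an augmenting-path algorithm and read the source side of the cut off the final residual graph. The sketch below uses BFS augmenting paths (the Edmonds-Karp variant of Ford-Fulkerson), which is a choice made here and not necessarily the method used in the homework. The function name and the toy network are hypothetical; the homework's actual network is not reproduced in this text.

```python
from collections import deque


def max_flow_min_cut(capacity, s, t):
    """Return (max flow value, set S of nodes on the source side of a min cut).

    capacity: dict of dicts, capacity[u][v] = capacity of edge (u, v).
    Hypothetical helper; augmenting paths are found with BFS.
    """
    residual = {s: {}, t: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)  # reverse edge for undoing flow

    def reachable_from_source():
        # BFS over edges with positive residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            # No augmenting path: nodes still reachable from s form S.
            return flow, set(parent)
        # Find the bottleneck capacity along the path, then augment.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


if __name__ == "__main__":
    # Toy network invented for illustration only.
    toy = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
    print(max_flow_min_cut(toy, "s", "t"))
```

By the max-flow min-cut theorem, the total capacity of edges leaving the returned set S equals the returned flow value.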
1.

CSOR W4246
HW2
Name: Yeyun Chen
UNI: yc3070
Due: Oct 18th
Date: Oct 11th
Since the graph is undirected and unweighted, we can use BFS to solve this
problem. Let N_SP[t] denote the number of shortest paths from s to t.
1.
First add all nod
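
A BFS that counts shortest paths, in the spirit of the approach described above, might look like the following sketch. The remaining steps of the original solution are cut off, so this is one standard way to do it and not necessarily the author's exact procedure; the function name and the adjacency-list input format are assumptions, and `n_sp` mirrors the N_SP[t] notation.

```python
from collections import deque


def count_shortest_paths(adj, s):
    """Return (dist, n_sp) for an unweighted, undirected graph.

    adj: dict mapping each node to an iterable of its neighbours.
    dist[v] is the BFS distance from s; n_sp[v] is the number of
    distinct shortest s-v paths. Hypothetical helper.
    """
    dist = {s: 0}
    n_sp = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is seen: u lies on a shortest path to v.
                dist[v] = dist[u] + 1
                n_sp[v] = n_sp[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # Another shortest path reaches v through u.
                n_sp[v] += n_sp[u]
    return dist, n_sp


if __name__ == "__main__":
    # Toy 4-cycle: two shortest paths from s to t.
    square = {"s": ["a", "b"], "a": ["s", "t"], "b": ["s", "t"], "t": ["a", "b"]}
    print(count_shortest_paths(square, "s"))
```

Each edge is examined a constant number of times, so the running time is O(n + m), the same as plain BFS.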

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Connected Components\n",
"\n",
"The purpose of this assignment is to familiarize yourself with the handling
of graph data structures. You will implement the algorithm for ident