# Example the function nonduplicates checks if a set of


### Example

The function non_duplicates checks whether the two points in a pair are distinct.

**Input:** Pair (A,B) where A and B are of form (Ax, Ay) and (Bx, By) respectively.

**Example Code**

    my_input = ((0,0),(1,2))
    non_duplicates(my_input)

**Output:** Returns True if A != B, False otherwise.

**Example Output**

    True

**Hint:** The above example is given just to provide the input and output format. This function may be used to "filter" out duplicates inside the get_cartesian() function.

### Definition

In [8]:

    ## Insert your answer in this cell. DO NOT CHANGE THE NAME OF THE FUNCTION.
    def non_duplicates(x):
        """
        Use this function inside the get_cartesian() function to 'filter' out pairs with duplicate points
        """
        #
        # YOUR CODE HERE
        #
        a, b = x
        return a != b

### Unit Tests

In [9]:

    assert type(non_duplicates(((0,0),(1,2)))) == bool, "Incorrect Return type: Function should return a boolean value"

In [10]:

    assert non_duplicates(((0,0),(1,2))) == True, "No duplicates are present"

In [11]:

    assert non_duplicates(((0,0),(0,0))) == False, "Duplicates exist: (0,0)"

## Exercise 3: get_cartesian
### Example

The function get_cartesian computes the cartesian product of an RDD with itself and returns an RDD containing only pairs of DISTINCT points (pairs of the form (A, A) are filtered out).

**Input:** An RDD containing the given list of points

**Output:** An RDD containing the cartesian product of the RDD with itself

**Example Code**

    test_rdd = sc.parallelize([(1,0), (2,0), (3,0)])
    get_cartesian(test_rdd).collect()

**Example Output**

    [((1, 0), (2, 0)), ((1, 0), (3, 0)), ((2, 0), (1, 0)), ((2, 0), (3, 0)), ((3, 0), (1, 0)), ((3, 0), (2, 0))]

Refer: ()

### Definition

In [12]:

    ## Insert your answer in this cell. DO NOT CHANGE THE NAME OF THE FUNCTION.
    def get_cartesian(rdd):
        return rdd.cartesian(rdd).filter(non_duplicates)

### Unit Tests

In [13]:

    test_rdd = sc.parallelize([(1,0), (2,0), (3,0)])
    l = [((1, 0), (2, 0)), ((1, 0), (3, 0)), ((2, 0), (1, 0)), ((2, 0), (3, 0)), ((3, 0), (1, 0)), ((3, 0), (2, 0))]
    assert isinstance(get_cartesian(test_rdd), RDD) == True, "Incorrect Return type: Function should return an RDD"
    assert set(get_cartesian(test_rdd).collect()) == set(l), "Incorrect Return Value: Value obtained does not match"

In [14]:

    ##Hidden test cases here
    #
    # AUTOGRADER TEST - DO NOT REMOVE
    #
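The cartesian-then-filter logic can also be tried locally, without a Spark context. The sketch below is a plain-Python stand-in, not the Spark solution: `get_cartesian_local` is a hypothetical helper that uses `itertools.product` in place of `rdd.cartesian(rdd)` and the built-in `filter` in place of the RDD's `filter`. It assumes the input points are already unique.

```python
from itertools import product


def non_duplicates(x):
    """Return True when the two points of the pair differ."""
    a, b = x
    return a != b


def get_cartesian_local(points):
    """Hypothetical list-based stand-in for get_cartesian (no Spark needed)."""
    # product(points, points) yields every ordered pair, including (A, A).
    all_pairs = product(points, points)
    # Keep only the pairs whose two points are different.
    return list(filter(non_duplicates, all_pairs))


pairs = get_cartesian_local([(1, 0), (2, 0), (3, 0)])
print(pairs)
```

For n unique points this keeps n * (n - 1) ordered pairs, so both (A, B) and (B, A) appear, which matches the six pairs in the example output above.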
In [15]:

    ##Hidden test cases here
    #
    # AUTOGRADER TEST - DO NOT REMOVE
    #

## Exercise 4: find_slope

### Example

The function find_slope computes the slope between points A and B and returns it in the format specified below.

**Input:** Pair (A,B) where A and B are of form (Ax, Ay) and (Bx, By) respectively.

**Example Code**

    my_input = ((1,2),(3,4))
    find_slope(my_input)

**Output:** Pair ((A,slope), B) where A and B have the same definition as in the input and slope is the slope of the line segment connecting points A and B.

**Example Output**

    (((1, 2), 1.0), (3, 4))

**Note:** If Ax == Bx, use slope as "inf".

**Hint:** The above example is given just to provide the input and output format. This function is called in a different way in the Spark exercise.

### Definition

In [16]:

    ## Insert your answer in this cell
    def find_slope(x):
        (ax, ay), (bx, by) = x
        # Vertical segment (Ax == Bx): use the "inf" marker instead of dividing by zero.
        slope = "inf" if ax == bx else (by - ay) / (bx - ax)
        return (((ax, ay), slope), (bx, by))

### Unit Tests

In [17]:

    assert type(find_slope(((1,2),(3,4)))) == tuple, "Function must return a tuple"
In [18]:

    assert find_slope(((1,2),(-7,-2)))[0][1] == 0.5, "Slope value should be 0.5"

In [19]:

    assert
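To illustrate the output format and the vertical-line rule from the note, the sketch below applies a copy of `find_slope` to a small hand-made list of pairs with the built-in `map`. The sample pairs and the name `sample_pairs` are made up for illustration, and Python 3 division is assumed (under Python 2, `/` on two integers would truncate).

```python
def find_slope(x):
    (ax, ay), (bx, by) = x
    # A vertical segment has no finite slope, so use the "inf" marker instead.
    slope = "inf" if ax == bx else (by - ay) / (bx - ax)
    return (((ax, ay), slope), (bx, by))


# Hypothetical sample input: one ordinary pair and one vertical pair.
sample_pairs = [((1, 2), (3, 4)), ((2, 0), (2, 5))]
results = list(map(find_slope, sample_pairs))

for r in results:
    print(r)
```

The first pair reproduces the example output `(((1, 2), 1.0), (3, 4))`; the second has Ax == Bx and so carries the `"inf"` marker in place of a number.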