A4 - Homework#4 Jorge D. Fernandez jdf54 CS 4320...

Info iconThis preview shows page 1. Sign up to view the full content.

View Full Document Right Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: Homework #4 Jorge D. Fernandez jdf54 CS 4320 Part A: Sequence S1: T1:R(X), T2:W(X), T2:W(Y), T3:W(Y), T1:W(Y), T1:Commit, T2:Commit, T3:Commit Sequence S2: T1:R(X), T2:W(Y), T2:W(X), T3:W(Y), T1:W(Y), T1:Commit, T2:Commit, T3:Commit A.5) Sequence S1: It begins with T1 trying to read object X. It performs a check to see whether TS(T1)<TS(X). Since X has yet to be written, the condition fails and T1, therefore, reads X and RTS(X) is updated to 1. It continues with T2 trying to write to X. We check to see that both TS(T2) < RTS(X) and TS(T2) < WTS(X) are not satisfied. Because, that is indeed the case T2 proceeds to write X to its private workspace and WTS(X) gets updated. Then T2 tries to write to Y. It checks both conditions and sees that is able to write Y, therefore it writes it to its private workspace and updates WTS(Y) to 2. Then T3 tries to write to Y and checks that both TS(T3) < RTS(Y) and TS(T3) < WTS(Y) are not satisfied. This is indeed the case since T2, which has already written to them in its private workspace, has a timestamp of 2 which is less than 3. T3 proceeds to write to Y in its private workspace and WTS(Y) gets updated to 3. Similarly, T1 tries to write to Y and checks both conditions and sees that TS(T1) < WTS(Y) evaluates to true. Therefore, by the Thomas Write Rule we ignore this write. Then T2 commits, and the changes that it made to both X and Y in its private workspace are used to update both X and Y in the database. Then T3 commits and the changes to Y get updated to the database. Finally, T1 commits but since its write was ignored it does not update anything on the database. Sequence S2: It begins with T1 trying to read object X. It performs a check to see whether TS(T1)<TS(X). Since X has yet to be written, the condition fails and T1, therefore, reads X and RTS(X) is updated to 1. It continues with T2 trying to write to Y. We check to see that both TS(T2) < RTS(Y) and TS(T2) < WTS(Y) are not satisfied. Because, that is indeed the case T2 proceeds to write Y to its private workspace and WTS(Y) gets updated. Then T2 tries to write to X. It checks both conditions and sees that is able to write X, therefore it writes it to its private workspace and updates WTS(X) to 2. Then T3 tries to write to Y and checks that both TS(T3) < RTS(Y) and TS(T3) < WTS(Y) are not satisfied. This is indeed the case since T2, which has already written to them in its private workspace, has a timestamp of 2 which is less than 3. T3 proceeds to write to Y in its private workspace and WTS(Y) gets updated to 3. Similarly, T1 tries to write to Y and checks both conditions and sees that TS(T1) < WTS(Y) evaluates to true. Therefore, by the Thomas Write Rule we ignore this write. Then T2 commits, and the changes that it made to both X and Y in its private workspace are used to update both X and Y in the database. Then T3 commits and the changes to Y get updated to the database. Finally, T1 commits but since its write was ignored it does not update anything on the database. A.6) Sequence S1: It begins with T1 reading X and therefore changing the RTS(X) from 0 to 1. Then T2 tries to write to X. It checks to see whether TS(T2) < RTS(X), since this is false it proceeds to make a new copy of X, call it X_1, and set RTS(X_1) and WTS(X_1) to 2. Then T2 tries to write to Y and checks to see whether TS(T2) < RTS(Y) holds. Since the condition does not hold T2 creates a new copy of Y, call it Y_1, and changes RTS(Y_1) and WTS(Y_1) to 2. It continues with T3 trying to write to Y_1. It checks to see whether TS(T3) < RTS(Y_1), which it turns out to be false. Therefore, T3 makes a new copy of the object Y_1, call it Y_2, and changes RTS(Y_2) and WTS(Y_2) to 3. Then, T1 tries to write to Y_2, it checks the following condition TS(T1) < RTS(Y_2), which evaluates to true. Therefore, T1 gets aborted and restarted with a higher timestamp of 4. Then T2 and T3 commit. T1 gets restarted and reads X_1, since TS(T1) < RTS(X_1) fails, and sets RTS(X_1) to 4 (Notice that if the condition had not failed it would read the version of X for which the condition was satisfied). Then it tries to write Y_2 and checks the following TS(T1) < RTS(Y_2) which turns out to be false. Therefore, T1 makes a new copy of Y_2, call it Y_3 and changes RTS(Y_3) and WTS(Y_3) to 4. T1 then commits. Sequence S2: It begins with T1 reading X and therefore changing the RTS(X) from 0 to 1. Then T2 tries to write to Y. It checks to see whether TS(T2) < RTS(Y), since this is false it proceeds to make a new copy of Y, call it Y_1, and set RTS(Y_1) and WTS(Y_1) to 2. Then T2 tries to write to X and checks to see whether TS(T2) < RTS(X) holds. Since the condition does not hold T2 creates a new copy of X, call it X_1, and changes RTS(X_1) and WTS(X_1) to 2. It continues with T3 trying to write to Y_1. It checks to see whether TS(T3) < RTS(Y_1), which it turns out to be false. Therefore, T3 makes a new copy of the object Y_1, call it Y_2, and changes RTS(Y_2) and WTS(Y_2) to 3. Then, T1 tries to write to Y_2, it checks the following condition TS(T1) < RTS(Y_2), which evaluates to true. Therefore, T1 gets aborted and restarted with a higher timestamp of 4. Then T2 and T3 commit. T1 gets restarted and reads X_1, since TS(T1) < RTS(X_1) fails, and sets RTS(X_1) to 4 (Notice that if the condition had not failed it would read the version of X for which the condition was satisfied). Then it tries to write Y_2 and checks the following TS(T1) < RTS(Y_2) which turns out to be false. Therefore, T1 makes a new copy of Y_2, call it Y_3 and changes RTS(Y_3) and WTS(Y_3) to 4. T1 then commits. Part B: B.1) Suppose that we had a query that looked searched for every employee named Santa and printed the ids of those whose salary was higher than 40 dollars/hour. If it was the case that there were two employees named Santa whose salary was slightly less than 40 dollars/hours, say 38 and 39 dollars/hour, then we would arrive at a problem if the updating command was running concurrently with the query. In other words, suppose the update command looks at the first employee named Santa and changes his salary from 38 to 41.8 dollars/hours. Then the query would find that same Santa whose salary was just updated and his id would get printed out. However, suppose then the query finds the other Santa whose salary is 39 dollars/hour, this Santa’s id would not get printed since his salary has not yet been updated. Once the update is performed we clearly see that his id should have been printed since his new salary is 42.9 dollars/hour, but this was not the case. If we used locking instead, it would solve the problem since we would get to a deadlock that will be resolved by aborting and restarting one of the transactions. Using locking the following will happen: The updating command finds the first Santa whose salary is slightly less than 40, say 39 dollars/hour, obtains a lock on this Santa and updates his salary. Then the query finds this same Santa and tries to print his/her id, but instead it has to wait for the update command to release this lock. Then the query finds the other Santa whose salary is 38 dollars/hour, acquires a lock on it and prints it. Finally the update command tries to update the salary of the Santa whose salary is 38 dollars/hour but fails since it is locked by the query. As seen, we arrive at a deadlock that gets resolved by aborting the query and restarting it. After aborting the query all of its locks are release and the update command can perform its operations. Then both of the Santas ids get printed out. B.2) Consider the case in which there are two transactions T1 and T2, where T1 performs and insertion of two Santa’s S1 and S2 and T2 updates the salary of all Santas. Next, consider the case in which T1 inserts its first Santa S1, then T2 performs and update of all Santas and then T1 finishes by inserting S2. We can clearly see that this will result in a conflict with the update command. In this particular schedule T1 will insert S1 and obtain a lock on it. Then T2 will update the salary of all Santas in the database. This will happen sequentially, scanning all pages and updating one by one. Then T1 will finish by inserting S2 on a page that has already been seen by the update command and that is, therefore, not locked. In such schedule can see that T2 will finish the update of all Santas except Santa S2 since it was inserted in a page, which was already scanned by the update command. Therefore, both transactions will finish and commit but there will be one Santa in the database that will not have his salary updated, which is clearly an issue. This will only be resolved by using index locking. B.3) Index locking consists of locking the index page that contains the data that is to be updated, or in the case that there is no data, we would lock those index pages that would contain the data to be updated. This way we prevent transactions from inserting new records for those specific data entries. For the previous example, we fix the conflict by using index locking. As we previously mentioned, we have a schedule that performs two transactions T1 and T2, where T1 simply inserts two new Santas S1 and S2 and T2 updates the salaries of all the Santas in the database. Using index locking the following will happen. T1 will insert the first Santa S1 and will have a lock on it. Then T2 will update the salaries of all Santas currently in the database. T2 will have to wait as soon as it tries to update Santa S1 since T1 has a lock on it. Then T1 will try to insert S2 but it will have to wait for T2 since T2 has a lock on all pages that contain Santas. As a result we will arrive at a deadlock, which can only be resolved by aborting and restarting one of the transactions. No matter which transaction we choose to abort and restart we will get a serial schedule, therefore resolving the conflict that we had in the previous problem. If T2 is aborted and restarted, T1 will successfully insert Santas S1 and S2 and then T2 will update all of the Santas successfully. In the other case, T2 will update all Santas currently in the database and then T1 will insert S1 and S2, therefore resolving the conflict. Part C: A ­>B BC ­>E ED ­>A C.1) All of the possible keys are the following: ACD, EDC, BCD ACD  ­> BCD  ­> ED EDC  ­> AC ­> BC BCD  ­> ED  ­> A C.2) The relation is 3NF because the following condition satisfies for all functional dependencies. A ­>B, B is part of the key BCD BC ­>E, E is part of the key EDC ED ­>A, A is part of the key ACD C.3) The relation is not BCNF because none of the following conditions satisfy: A ­>B, is not trivial BC ­>E, is not trivial ED ­> A, is not trivial None of A, BC or ED are a superkey. Part D R = ABCDEGH F = {AB ­>C, AC ­>B, AD ­>E, B ­>D, BC ­>A, B ­>G} D.1.a) (i) ABC F = {AB ­>C, AC ­>B, BC ­>A} Minimal cover set for set ABC is either: {AB ­>C, AC ­>B, BC ­>A} Each of these relations is a duplicate of the other ones. (ii) The strongest normal form is therefore BCNF because: AB is a superkey because AB is a key, AB ­>AB, AB ­>C AC is a superkey because AC is a key, AC ­>AC, AC ­>B BC is a superkey because BC is a key, BC ­>BC, BC ­>A (iii) It is already in BCNF D.1.c) (i)ABCEG F = {AB ­>C, AC ­>B, BC ­>A, E ­>G} Minimal cover set for set ABCEG is either: {AB ­>C, E ­>G, AC ­>B, E ­>G, BC ­>A, E ­>G} As in the previous problem, each of the relations are duplicates of the other ones, except for E ­>G which is necessary for each minimal cover set. (ii) Both BCNF and 3NF normal ­forms are violated by this relation because G is not part of a key and none of AB, AC, BC and E are a superkey. (iii) R = ABCEG FD = {AB ­>C, AC ­>B, BC ­>A, E ­>G} ABCEG / E ­>G \ EG ABCE /AB ­>C \ ABC ABE As we can see from the tree above, all the leaves are in BCNF form because: EG, in E ­>G E is superkey since E is a key. ABC, in AB ­>C, BC ­>A, AC ­>B, all of AB, BC and AC are a superkey since AB, BC and AC are a key. ABE, in A ­>A, B ­>B, E ­>E are trivial relations. D.1.e) (i) ACEH The initial intuition is to think that the set of functional dependencies is empty, however, we need to find additional implied functional dependencies from the rules. AC ­>B, B ­>D, therefore, AC ­>D which by augmentation we get AC ­>AD ­>E. F = {AC ­>E} (ii) Both BCNF and 3NF and are violated by this relation. (iii) ACEH /AC ­>E\ ACE ACH As one can, the leaves of the decomposition above are in BCNF because: In ACE, AC ­>E, so AC is a superkey since AC is a key. In ACH, ACH is a superkey since ACH is a key. Part F R = ABCDEG F = {AB ­>C, AC ­>B, AD ­>E, B ­>D, BC ­>A, E ­>G} F.a) {AB, BC, ABDE, EG} a) The functional dependencies are the following: FAB = {} FBC = {} FABDE = {AD ­>E, B ­>D} FEG = {E ­>G} In order for it to be dependency ­preserving then (FAB ∪ FABDE ∪ FEG ∪ FBC)+ must be equal to F+. Because the closure of the union of FAB, FEG, FBC and FABDE cannot be equal to F+, given that we are missing too many original functional dependencies, therefore this decomposition is not dependency ­preserving. b) The decomposition above is not lossless. We can decompose ABCDEG without loss of generality for one of AB, BC, ABDE or EG. After that we get that we cannot decompose them further using lossless ­join. We can decompose ABCDEG the following way: ABCDEG / \ AB ABCDEG (This is lossless because AB is a key) Notice that since decomposing AB gives us the original relation we only need to prove that all decompositions from this point forward are not lossless. ABCDEG / \ BC ABDEG (Not lossless because B is not a key) / \ EG ABDE ABCDEG / \ EG ABCDE (Lossless because E is a key) / \ BC ABDE (Not lossless because B is not a key) ABCDEG / \ ABDE BCEG (Not lossless because B or E are not a key) / \ EG BC We have shown that for every possible decomposition it is not lossless. F.b) {ABC, ACDE, ADG} a) The functional dependencies are the following: FABC = {AB ­>C, BC ­>A, AC ­>B} FACDE = {AC ­>D, AD ­>E} FADG = {AD ­>G} In order for it to be dependency ­preserving then (FABC ∪ FACDE ∪ FADG)+ must be equal to F+. Because in order to get the functional dependency E ­>G or B ­>D we must join FACDE and FADG, FABC and FACDE, therefore this decomposition is not dependency ­preserving. In other words, the closure of the union of the functional dependencies does not equal the closure of the original one. b) The decomposition above is lossless. If we split the relation in the following way we can see that is lossless: ABCDEG / \ ADG ABCDE (Lossless join because AD is a key) / \ ABC ACDE (Lossless join because AC is a key) Part G R(A,B,C,D) G.1) B ­>C, D ­>A, decompose into BC and AD. BD is a key The decomposition into the smaller relations BC and AD is bad because even though is dependency ­preserving, it is not lossless. That is, closure of the union of B ­>C and D ­>A gives us the close of the original functional dependencies. Furthermore, it is bad because as we can see BC ∩ AD is empty. G.3) A ­>BC, C ­>AD, decompose into ABC and AD A and C are keys FDABC = {A ­>BC, C ­>A} FDAD = {A ­>D} The decomposition into the smaller relations ABC and AD is good. We can see that the decomposition is lossless and dependency ­preserving. For instance, ABC ∩ AD = A  ­> BC, therefore A is a key and the decomposition is lossless. Furthermore, taking the closure of FDABC ∪ FDAD does gives us the closure of the original functional dependency. As we can see, the closure of the union of the above dependencies gives us all of A ­>BC and C ­>AD. G.4) A ­>B, B ­>C, C ­>D, decompose into AB and ACD A is a key The decomposition into the smaller relations AB and ACD is bad because although it is lossless it is not dependency ­preserving and it is neither BCNF nor 3NF. However, we see that is not dependency ­preserving since the closure of FDAB ∪ FDACD is not equal to the closure of the original functional dependencies. That is, there is no way of getting B ­>C from the closure of FDAB ∪ FDACD. Moreover, in B ­>C, B is not a superkey. Also, in B ­>C, C is not part of a key. Therefore, the decomposition is bad. G.5) A ­>B, B ­>C, C ­>D, decompose into AB, AD and CD A is a key The decomposition into the smaller relations AB, AD and CD is bad. We know that is lossy because for all possible decompositions we get it to be lossy. If we decompose it in the following way: ABCD / \ AB ACD (Lossless because A ­>B, so A is a key) / \ AD CD (Lossy because D is not a key) ABCD / \ AD ABC (Lossless because A ­>D, so A is a key) / \ AB CD (Lossy AB ∩ CD is empty) ABCD / \ CD ABD (Lossy because D is not a key) Moreover, we can see that is not dependency ­preserving since there is no way of getting B ­>C from the closure of FDAB ∪ FDAC ∪ FDCD. From the closure of these functional dependencies we can only get A ­B, A ­>C, and C ­>D. ...
View Full Document

This note was uploaded on 01/10/2010 for the course CS 4320 taught by Professor Koch during the Fall '07 term at Cornell.

Ask a homework question - tutors are online