Lecture 13 - Today: More Canonical Problems

Computer Science, CS677: Distributed OS, Lecture 13

Today: More Canonical Problems
- Termination detection
- Leader election
- Mutual exclusion

Termination Detection
- Goal: detect the end of a distributed computation
- Notation: for each message, let the sender be the predecessor and the receiver be the successor
- Two types of markers: Done and Continue
- After finishing its part of the snapshot, process Q sends a Done or a Continue to its predecessor
- Q sends a Done only when both of the following hold:
    - All of Q's successors have sent a Done
    - Q has not received any message between the point it checkpointed its local state and the point it had received a marker on all incoming channels
- Otherwise, Q sends a Continue
- The computation has terminated if the initiator receives Done messages from everyone
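The Done/Continue rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's implementation: the `Process` class, the `received_since_checkpoint` flag, and the functions `reply` and `terminated` are hypothetical names, and the sketch assumes the predecessor/successor relation forms a tree rooted at the initiator so that replies can be computed recursively.

```python
from dataclasses import dataclass, field

DONE, CONTINUE = "Done", "Continue"

@dataclass
class Process:
    name: str
    # Assumed flag: True if a message arrived after the local checkpoint
    # and before markers were received on all incoming channels.
    received_since_checkpoint: bool = False
    successors: list = field(default_factory=list)

def reply(q):
    """Marker that process q sends back to its predecessor."""
    successor_replies = [reply(s) for s in q.successors]
    if all(r == DONE for r in successor_replies) and not q.received_since_checkpoint:
        return DONE
    return CONTINUE

def terminated(initiator):
    """The initiator declares termination only if every successor replied Done."""
    return all(reply(s) == DONE for s in initiator.successors)

# Usage: one leaf saw a message after its checkpoint, so a Continue
# propagates up and the initiator cannot declare termination yet.
leaf_a = Process("A")
leaf_b = Process("B", received_since_checkpoint=True)
middle = Process("M", successors=[leaf_a, leaf_b])
root = Process("Initiator", successors=[middle])
print(terminated(root))
```

When the initiator sees a Continue, it would normally start another snapshot round later; that retry loop is left out of the sketch.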

Election Algorithms
- Many distributed algorithms need one process to act as coordinator
    - It doesn't matter which process does the job; we just need to pick one
- Election algorithms: techniques to pick a unique coordinator (aka leader election)
- Examples: take over the role of a failed process, pick a master in the Berkeley clock synchronization algorithm
- Types of election algorithms: Bully and Ring algorithms

Bully Algorithm
- Each process has a unique numerical ID
- Processes know the IDs and addresses of every other process
- Communication is assumed reliable
- Key idea: select the process with the highest ID
- A process initiates an election if it has just recovered from a failure or if the coordinator has failed
- 3 message types: Election, OK, I won
- Several processes can initiate an election simultaneously
    - Need a consistent result
- O(n^2) messages required with n processes

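The O(n^2) message bound can be checked with a quick count. The sketch below is a back-of-the-envelope calculation under stated assumptions, not part of the lecture: IDs are 1..n, all n processes are alive, the lowest-ID process initiates, every Election message gets an OK reply, and each process starts its own election at most once. The function name `worst_case_messages` is hypothetical.

```python
def worst_case_messages(n):
    """Count bully-algorithm messages when the lowest of n live processes initiates."""
    # The process with the i-th lowest ID sends Election to the n - i higher ones.
    election = sum(n - i for i in range(1, n + 1))  # n(n-1)/2
    ok = election        # every Election message is answered with an OK
    i_won = n - 1        # the winner notifies all lower-ID processes
    return election + ok + i_won

for n in (2, 4, 8):
    print(n, worst_case_messages(n))
```

Under these assumptions the total works out to n(n-1) + (n-1) = n^2 - 1, which grows quadratically in n.
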
Bully Algorithm Details
- Any process P can initiate an election
- P sends Election messages to all processes with higher IDs and awaits OK messages
- If it receives no OK messages, P becomes coordinator and sends I won messages to all processes with lower IDs
- If it receives an OK, it drops out and waits for an I won
- If a process receives an Election message, it returns an OK and starts an election
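The steps above can be traced with a small single-threaded simulation. This is a minimal sketch under simplifying assumptions, not a networked implementation: message delivery is instantaneous and reliable, a crashed process simply never answers (standing in for a timeout), the initiator is alive, and each process starts its own election at most once. The function `bully_election` and the message labels are hypothetical names chosen for illustration.

```python
from collections import deque

def bully_election(initiator, all_ids, alive):
    """Simulate one bully election; return (coordinator, message log)."""
    log = []                     # (sender, receiver, message type) tuples
    started = set()              # processes that already ran their own election
    pending = deque([initiator])
    coordinator = None
    while pending:
        p = pending.popleft()
        if p in started:
            continue
        started.add(p)
        got_ok = False
        for q in sorted(i for i in all_ids if i > p):
            log.append((p, q, "ELECTION"))
            if q in alive:       # a live higher-ID process answers OK ...
                log.append((q, p, "OK"))
                got_ok = True
                pending.append(q)  # ... and starts its own election
        if not got_ok:           # no OK received: p wins and announces it
            coordinator = p
            for q in sorted(i for i in all_ids if i < p):
                log.append((p, q, "I_WON"))
    return coordinator, log

# Usage: process 5 (the old coordinator) has crashed and process 2 notices.
coordinator, log = bully_election(2, all_ids=[1, 2, 3, 4, 5], alive={1, 2, 3, 4})
print("coordinator:", coordinator)
for msg in log:
    print(msg)
```

Only the highest-ID live process can fail to receive an OK, so the simulation always ends with exactly one coordinator regardless of which live process initiates.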

