
Fault Tolerance - Scheduling in Distributed Systems

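Scheduling in Distributed Systems

There is not much to say about scheduling in a distributed system. Each processor does its own local scheduling (assuming that it has multiple processes running on it), without regard to what the other processors are doing. However, when a group of related, heavily interacting processes are all running on different processors, independent scheduling is not always the most efficient approach. See the example below, in which processes A and B share processor 0 and processes C and D share processor 1:

Time slot    Processor 0    Processor 1
    0             A              C
    1             B              D
    2             A              C
    3             B              D
    4             A              C
    5             B              D

With this schedule, A and D never run in the same time slot, so a message sent by A to D cannot be processed until D's next slot.

[Second table: a scheduling matrix with time slots 0 to 5 as rows and columns numbered 0 to 7, in which an X marks an occupied slot. Every row contained X marks, but their column positions did not survive extraction.]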
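The mismatch in the first table can be checked mechanically. The following is a minimal sketch, assuming the columns of that table are processors 0 and 1; the names `SCHEDULE` and `shared_slots` are hypothetical and are not part of the original notes.

```python
# Schedule from the first table: one tuple per time slot,
# holding (process on processor 0, process on processor 1).
SCHEDULE = [
    ("A", "C"),
    ("B", "D"),
    ("A", "C"),
    ("B", "D"),
    ("A", "C"),
    ("B", "D"),
]


def shared_slots(schedule, p, q):
    """Return the time slots in which processes p and q are both running."""
    return [t for t, running in enumerate(schedule) if p in running and q in running]


print(shared_slots(SCHEDULE, "A", "C"))  # slots where A and C overlap
print(shared_slots(SCHEDULE, "A", "D"))  # slots where A and D overlap
```

A and C overlap in every even slot, while A and D never overlap, which is why heavy A-to-D communication suffers under independent scheduling.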
Scheduling in Distributed Systems (continued)

Although it is difficult to determine inter-process communication patterns dynamically, in many cases a group of related processes will be started together. We can assume that processes are created in groups and that intra-group communication is much more prevalent than inter-group communication. We can further assume that enough processors are available to handle the largest group, and that each processor is multi-programmed with N process slots.

Ousterhout's co-scheduling: The idea is to have each processor use a round-robin scheduling algorithm, with all processors first running the process in slot 0 for a fixed period, then all processors switching to slot 1 and running for a fixed period, and so on. A broadcast message could be used to tell each processor when to switch processes, keeping the time slices synchronized.

By putting all the members of a process group in the same slot number, but on different processors, one has the advantage of N-fold parallelism, with a guarantee that all the processes will run at the same time, to maximize communication throughput.
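The placement rule described above can be sketched as a small scheduling matrix. This is only an illustration under stated assumptions: the first-fit choice of slot, and the names `make_matrix`, `place_group`, and `timeline`, are hypothetical and are not taken from the notes or from Ousterhout's work. The synchronizing broadcast is modeled simply as a shared tick counter.

```python
def make_matrix(num_slots, num_processors):
    """Rows are time slots, columns are processors; None means a free entry."""
    return [[None] * num_processors for _ in range(num_slots)]


def place_group(matrix, group):
    """Place all members of a group in one slot, each on a different processor.

    Uses first-fit over the slots (an assumption made for this sketch).
    Returns the chosen slot index.
    """
    for slot, row in enumerate(matrix):
        free = [cpu for cpu, proc in enumerate(row) if proc is None]
        if len(free) >= len(group):
            for cpu, proc in zip(free, group):
                row[cpu] = proc
            return slot
    raise RuntimeError("no slot has enough free processors for this group")


def timeline(matrix, rounds=1):
    """Yield (tick, slot, running processes) for synchronized round-robin."""
    tick = 0
    for _ in range(rounds):
        for slot, row in enumerate(matrix):
            yield tick, slot, [p for p in row if p is not None]
            tick += 1


matrix = make_matrix(num_slots=3, num_processors=4)
place_group(matrix, ["A1", "A2", "A3"])  # fits in slot 0
place_group(matrix, ["B1", "B2"])        # slot 0 has one free CPU, so slot 1
place_group(matrix, ["C1"])              # takes the remaining CPU in slot 0

for tick, slot, running in timeline(matrix):
    print(tick, slot, running)
```

Because every member of a group sits in the same row, all of them run during the same time slice, which is the property co-scheduling is meant to guarantee.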
Fault Tolerance

A system is said to fail when it does not meet its specification. Examples of systems in which failures matter: a supermarket's distributed ordering system, a distributed air traffic control system (a safety-critical system).

Component Faults: Computer systems can fail due to a fault in some component, such as a processor, memory, I/O device, cable, or software. A fault is a malfunction, possibly caused by a design error, a manufacturing error, a programming error, physical damage, deterioration over time, harsh environmental conditions, unexpected inputs, operator error, or many other causes.

Faults are generally classified as:

Transient fault: occurs once and then disappears. If the operation is repeated, the fault does not recur on the second try.

Intermittent fault: occurs, then vanishes, then reappears, and so on. A loose contact on a connector is a typical example. Intermittent faults cause a great deal of aggravation because they are difficult to diagnose.

Permanent fault: continues to exist until the faulty component is repaired or replaced. Examples are burnt-out chips, software bugs, and disk head crashes.
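The difference between a transient and a permanent fault can be illustrated with a retry loop: repeating the operation clears a transient fault but not a permanent one. This is a minimal sketch; `TransientFault`, `retry`, and `make_flaky` are hypothetical names introduced here, and the faults are simulated rather than real.

```python
class TransientFault(Exception):
    """Hypothetical exception standing in for a fault seen during an operation."""


def retry(operation, attempts=3):
    """Run operation, repeating it after a fault, up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except TransientFault as err:
            last_error = err  # remember the fault and try again
    raise last_error  # the fault persisted through every attempt


def make_flaky(failures):
    """Build a simulated operation that fails `failures` times, then succeeds."""
    remaining = {"count": failures}

    def operation():
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise TransientFault("simulated glitch")
        return "ok"

    return operation


# A fault that occurs once disappears on the second try.
print(retry(make_flaky(1)))

# A fault that outlasts every attempt behaves like a permanent fault here.
try:
    retry(make_flaky(10), attempts=3)
except TransientFault:
    print("still failing after 3 attempts")
```

Retrying does not help with a permanent fault, which persists until the component is repaired or replaced.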
The goal of designing and building fault-tolerant systems is to ensure that the system as a whole continues to function correctly, even in the presence of faults.