Hadoop is organized around the following three components 1 A distributed file


Hadoop is organized around the following three components:

1. A distributed file system that manages data and files across distributed computing nodes.
2. The YARN ("Yet Another Resource Negotiator") framework, which manages resources within the cluster as well as scheduling tasks on nodes in the cluster.
3. The MapReduce system, which allows parallel processing of data across nodes in the cluster.

Hadoop is designed to run on Linux systems, and Hadoop applications can be written using several programming languages, including scripting languages such as PHP, Perl, and Python. Java is a popular choice for developing Hadoop applications, as Hadoop has several Java libraries that support MapReduce. More information on MapReduce and Hadoop can be found at tutorial.html and [...]

[...] start executing that system. To accomplish this goal, the bootstrap program must locate the operating-system kernel and load it into memory.

Once the kernel is loaded and executing, it can start providing services to the system and its users. Some services are provided outside of the kernel by system programs that are loaded into memory at boot time to become system daemons, which run the entire time the kernel is running. On Linux, the first system program is "systemd," and it starts many other daemons. Once this phase is complete, the system is fully booted, and the system waits for some event to occur.

If there are no processes to execute, no I/O devices to service, and no users to whom to respond, an operating system will sit quietly, waiting for something to happen. Events are almost always signaled by the occurrence of an interrupt. In Section 1.2.1 we described hardware interrupts. Another form of interrupt is a trap (or an exception), which is a software-generated interrupt caused either by an error (for example, division by zero or invalid memory access) or by a specific request from a user program that an operating-system service be performed by executing a special operation called a system call.
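The two causes of a trap named above, an error and an explicit request for an operating-system service, can be loosely illustrated from user code. The following is a minimal Python sketch, not a description of kernel internals. The function names `safe_divide` and `write_via_syscall` are hypothetical. Note also that the Python interpreter detects integer division by zero itself and raises `ZeroDivisionError`, so the first function is only an analogy to a hardware-level trap. The second function uses `os.write`, which is a thin wrapper over the operating system's write system call.

```python
import os


def safe_divide(a, b):
    """Integer-divide a by b, returning None if the division faults.

    Analogy only: Python raises ZeroDivisionError in software, whereas
    compiled code would typically trigger a trap handled by the kernel.
    """
    try:
        return a // b
    except ZeroDivisionError:
        # The error is reported to the program instead of crashing it.
        return None


def write_via_syscall(fd, text):
    """Ask the operating system to write text to file descriptor fd.

    os.write passes the request to the kernel through a system call and
    returns the number of bytes actually written.
    """
    return os.write(fd, text.encode("utf-8"))


if __name__ == "__main__":
    print(safe_divide(6, 3))
    print(safe_divide(1, 0))
    write_via_syscall(1, "written through a system call\n")  # fd 1 is standard output
```

In both cases control leaves the normal flow of the program: in the first because something went wrong, in the second because the program deliberately asked for a service it cannot perform on its own.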
1.4 Operating-System Operations

1.4.1 Multiprogramming and Multitasking

One of the most important aspects of operating systems is the ability to run multiple programs, as a single program cannot, in general, keep either the CPU or the I/O devices busy at all times. Furthermore, users typically want to run more than one program at a time as well. Multiprogramming increases CPU utilization, as well as keeping users satisfied, by organizing programs so that the CPU always has one to execute. In a multiprogrammed system, a program in execution is termed a process.

The idea is as follows: The operating system keeps several processes in memory simultaneously (Figure 1.12). The operating system picks and begins to execute one of these processes. Eventually, the process may have to wait for some task, such as an I/O operation, to complete. In a non-multiprogrammed system, the CPU would sit idle. In a multiprogrammed system, the operating system simply switches to, and executes, another process. When [...]
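The effect of multiprogramming on CPU utilization can be made concrete with a small simulation. The sketch below is a simplified model under stated assumptions, not how a real scheduler works. Each job is a hypothetical list of alternating burst lengths that starts with a CPU burst (`[cpu, io, cpu, ...]`), there is a single CPU, I/O operations of different jobs overlap freely, and switching between jobs costs nothing. The function names are made up for this example.

```python
import heapq
from collections import deque


def serial_utilization(jobs):
    """CPU utilization when jobs run one after another with no overlap.

    The CPU sits idle during every I/O burst.
    """
    busy = sum(sum(job[0::2]) for job in jobs)  # even positions are CPU bursts
    total = sum(sum(job) for job in jobs)
    return busy / total


def multiprogrammed_utilization(jobs):
    """CPU utilization when the CPU switches to another ready job
    whenever the running job starts waiting for I/O."""
    ready = deque((job_id, 0) for job_id in range(len(jobs)))
    waiting = []       # min-heap of (I/O completion time, job id, next burst index)
    clock = 0          # current simulated time
    busy = 0           # total time the CPU spent executing
    last_io_done = 0   # completion time of the latest trailing I/O burst

    while ready or waiting:
        # Jobs whose I/O has completed become ready again.
        while waiting and waiting[0][0] <= clock:
            _, job_id, idx = heapq.heappop(waiting)
            ready.append((job_id, idx))
        if not ready:
            # Nothing to run: the CPU idles until the next I/O completes.
            clock = waiting[0][0]
            continue
        job_id, idx = ready.popleft()
        bursts = jobs[job_id]
        clock += bursts[idx]
        busy += bursts[idx]
        if idx + 1 < len(bursts):
            done_at = clock + bursts[idx + 1]
            if idx + 2 < len(bursts):
                heapq.heappush(waiting, (done_at, job_id, idx + 2))
            else:
                # The job ends with an I/O burst and needs no more CPU time.
                last_io_done = max(last_io_done, done_at)

    return busy / max(clock, last_io_done)


if __name__ == "__main__":
    jobs = [[2, 4, 2], [3, 3, 1]]
    print(f"serial:          {serial_utilization(jobs):.3f}")
    print(f"multiprogrammed: {multiprogrammed_utilization(jobs):.3f}")
```

For the two sample jobs, running them back to back keeps the CPU busy for 8 of 15 time units, while overlapping one job's I/O with the other job's computation finishes the same work in 9 time units, with the CPU idle for only one of them.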