Amazon Virtual Private Cloud Amazon VPC to configure the virtual network in

Amazon virtual private cloud amazon vpc to configure

This preview shows page 11 - 12 out of 395 pages.

Amazon Virtual Private Cloud (Amazon VPC) to configure the virtual network in which you launch your instances Amazon S3 to store input and output data Amazon CloudWatch to monitor cluster performance and configure alarms AWS Identity and Access Management (IAM) to configure permissions AWS CloudTrail to audit requests made to the service AWS Data Pipeline to schedule and start your clusters Deployment Your EMR cluster consists of EC2 instances, which perform the work that you submit to your cluster. When you launch your cluster, Amazon EMR configures the instances with the applications that you choose, such as Apache Hadoop or Spark. Choose the instance size and type that best suits the processing needs for your cluster: batch processing, low-latency queries, streaming data, or large data storage. For more information about the instance types available for Amazon EMR, see Configure Cluster Hardware and Networking (p. 104) . Amazon EMR offers a variety of ways to configure software on your cluster. For example, you can install an Amazon EMR release with a chosen set of applications that can include versatile frameworks, such as Hadoop, and applications, such as Hive, Pig, or Spark. You can also install one of several MapR distributions. Amazon EMR uses Amazon Linux, so you can also install software on your cluster manually using the yum package manager or from the source. For more information, see Configure Cluster Software (p. 100) . Scalability and Flexibility Amazon EMR provides flexibility to scale your cluster up or down as your computing needs change. You can resize your cluster to add instances for peak workloads and remove instances to control costs when peak workloads subside. For more information, see Manually Resizing a Running Cluster (p. 336) . Amazon EMR also provides the option to run multiple instance groups so that you can use On-Demand Instances in one group for guaranteed processing power together with Spot Instances in another group to have your jobs completed faster and for lower costs. You can also mix different instance types to take advantage of better pricing for one Spot Instance type over another. For more information, see When Should You Use Spot Instances? (p. 132) . Additionally, Amazon EMR provides the flexibility to use several file systems for your input, output, and intermediate data. For example, you might choose the Hadoop Distributed File System (HDFS) which runs on the master and core nodes of your cluster for processing data that you do not need to store beyond your cluster’s lifecycle. You might choose the EMR File System (EMRFS) to use Amazon S3 as a data layer for applications running on your cluster so that you can separate your compute and storage, and persist data outside of the lifecycle of your cluster. EMRFS provides the added benefit of allowing you to scale up or down for your compute and storage needs independently. You can scale your compute needs by resizing your cluster and you can scale your storage needs by using Amazon S3. For more information, see Work with Storage and File Systems (p. 38) .
Image of page 11
Image of page 12

You've reached the end of your free preview.

Want to read all 395 pages?

  • Spring '12
  • LauraParker
  • Amazon Web Services, Amazon Elastic Compute Cloud

What students are saying

  • Left Quote Icon

    As a current student on this bumpy collegiate pathway, I stumbled upon Course Hero, where I can find study resources for nearly all my courses, get online help from tutors 24/7, and even share my old projects, papers, and lecture notes with other students.

    Student Picture

    Kiran Temple University Fox School of Business ‘17, Course Hero Intern

  • Left Quote Icon

    I cannot even describe how much Course Hero helped me this summer. It’s truly become something I can always rely on and help me. In the end, I was not only able to survive summer classes, but I was able to thrive thanks to Course Hero.

    Student Picture

    Dana University of Pennsylvania ‘17, Course Hero Intern

  • Left Quote Icon

    The ability to access any university’s resources through Course Hero proved invaluable in my case. I was behind on Tulane coursework and actually used UCLA’s materials to help me move forward and get everything together on time.

    Student Picture

    Jill Tulane University ‘16, Course Hero Intern

Stuck? We have tutors online 24/7 who can help you get unstuck.
A+ icon
Ask Expert Tutors You can ask You can ask ( soon) You can ask (will expire )
Answers in as fast as 15 minutes
A+ icon
Ask Expert Tutors