   2. Type a Name for the security configuration. You use this name to specify the security configuration when you create a cluster.
   3. Under AWS Lake Formation integration, select Enable fine-grained access control managed by AWS Lake Formation.
   4. Select an IAM role for AWS Lake Formation to apply.
      Note: For more information, see Overview of the IAM Roles for Lake Formation (p. 247).
   5. Select an IAM role for other AWS services to apply.
   6. Upload your identity provider (IdP) metadata by specifying the S3 path where the metadata is located.
      Note: For more information, see Configure Trust Relationship Between your IdP and Lake Formation (p. 249).
   7. Set up other security configuration options as appropriate and choose Create. You must enable Kerberos authentication using the cluster-dedicated KDC. For more information, see Configure EMR Security Features (p. 252).
2. Launch a cluster with the security configuration that you specified in the previous step. For more information, see Specify a Security Configuration for a Cluster.

Launch an Amazon EMR cluster integrated with Lake Formation using the CLI

The following procedure demonstrates how to launch an Amazon EMR cluster with Zeppelin integrated with AWS Lake Formation.

1. Create a security-configuration.json file for the security configuration with the following content.
   • Specify the full path to the IdP metadata file uploaded to S3.
   • Replace account-id with your AWS account ID.
   • Specify a value for TicketLifetimeInHours to determine the period for which a Kerberos ticket issued by the KDC is valid.

    {
      "LakeFormationConfiguration": {
        "IdpMetadataS3Path": "s3://mybucket/myfolder/idpmetadata.xml",
        "EmrRoleForUsersARN": "arn:aws:iam::account-id:role/IAM_Role_For_AWS_Services",
        "LakeFormationRoleForSAMLPrincipalARN": "arn:aws:iam::account-id:role/IAM_Role_For_Lake_Formation"
      },
      "AuthenticationConfiguration": {
        "KerberosConfiguration": {
          "Provider": "ClusterDedicatedKdc",
          "ClusterDedicatedKdcConfiguration": {
            "TicketLifetimeInHours": 24
          }
        }
      }
    }

2. Run the following command to create a security configuration. The --security-configuration option takes the file that you created in the previous step.

    aws emr create-security-configuration \
    --security-configuration file://./security-configuration.json \
    --name security-configuration

3. Create a configurations.json file that configures the Hive Metastore. Replace account-id with your AWS account ID.

    [
      {
        "Classification": "spark-hive-site",
        "Properties": {
          "hive.metastore.glue.catalogid": "account-id"
        }
      }
    ]

4. Run the following command to launch an Amazon EMR cluster.
   • Replace subnet-00xxxxxxxxxxxxx11 with your subnet ID.
   • Replace EC2_KEY_PAIR with the name of your EC2 key pair for this cluster. The EC2 key pair is optional and is required only if you want to use SSH to access your cluster.
   • Replace cluster-name with the name of your cluster.

    aws emr create-cluster --region us-east-1 \
    --release-label emr-5.26.0 \
    --use-default-roles \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.xlarge \
    InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.xlarge \
    --applications Name=Zeppelin Name=Livy \
    --kerberos-attributes Realm=EC2.INTERNAL,KdcAdminPassword=MyClusterKDCAdminPassword
