Amazon EMR Facebook Presto Meetup

  • Published on
    21-Apr-2017

Transcript

Using Amazon Elastic MapReduce as Your Scalable Data Warehouse
March 19, 2015 | Facebook Presto Meetup
Interactive SQL on Amazon S3 using Presto on Amazon EMR
Steve McPherson
© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

AMAZON EMR

Amazon EMR makes cluster management easy

  • Setup and configuration
  • Node monitoring and replacement
  • Log aggregation
  • CloudWatch integration
  • Expand and shrink on demand
  • Integration with Spot
  • AWS Support

Data Warehousing on Amazon EMR

  (Diagram; labels listed in slide order)
  • Stages: Extract, Transform & Load; Data Warehouse; Report Generation & Ad Hoc Analysis
  • Services: Amazon S3, Amazon EMR
  • Tools: MapReduce API, Sqoop, Spark, Cascading, Pig, MR, Hive, Spark, Cascading, Pig, Presto, Hive, Spark-SQL, Lingual
  • File formats: Parquet, ORC, SEQ, Text
  • Arrows: write, read

Different clusters for different workloads

  (Diagram; labels listed in slide order)
  • Hive, Pig, Cascading
  • Presto
  • Spark
  • HBase
  • Amazon S3

Why do our customers like Presto?

  • It works directly on S3
  • It integrates with Hive
  • It's fast
  • It's Java

Demo: Launch a cluster

    #> aws emr create-cluster \
         --name="PRESTO-0-95" \
         --ami-version=3.5.0 \
         --applications Name=hive \
         --ec2-attributes KeyName=[KEY_NAME] \
         --instance-groups \
           InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge \
           InstanceGroupType=CORE,InstanceCount=1,InstanceType=m3.xlarge \
         --bootstrap-action Name="install presto",Path="s3://github-emr-bootstrap-actions/presto/0.95/install-presto",Args="[-p,8989,-m,1024,-n,128]"
    # wait 5 minutes
    #> emrscreen

Run a Query

    #> hive
    CREATE EXTERNAL TABLE test(id int, name string, surname string, emails string, country string, ip string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION "s3://support.elasticmapreduce/bootstrap-actions/presto/0.95/Query_Sample/";

    #> presto-cli --catalog hive
    show tables;
    SELECT name, COUNT(name) FROM test GROUP BY name;

What's next

  • Formal packaging of Presto
  • Graceful shrink
  • CloudWatch integration
  • Identity and Authorization integration with AWS services

Get started today

Amazon EMR
http://aws.amazon.com/elasticmapreduce/
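
To see what the demo query returns without launching a cluster, the same GROUP BY can be tried locally. The sketch below is only an illustration and not part of the talk: it uses Python's built-in sqlite3 module in place of Presto, and the three sample rows are made up to match the comma-delimited column layout declared for the test table (id, name, surname, emails, country, ip). The helper name count_by_name is hypothetical, and an ORDER BY is added so the output order is stable.

```python
import csv
import io
import sqlite3

# Made-up sample rows (not the real contents of the S3 sample file), in the
# comma-delimited layout the demo table declares:
# id, name, surname, emails, country, ip
SAMPLE = """\
1,Ana,Silva,ana@example.com,BR,192.0.2.1
2,Bob,Jones,bob@example.com,US,192.0.2.2
3,Ana,Costa,ana.c@example.com,PT,192.0.2.3
"""


def count_by_name(text):
    """Load comma-delimited rows into an in-memory table and count rows per name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test (id INTEGER, name TEXT, surname TEXT, "
        "emails TEXT, country TEXT, ip TEXT)"
    )
    # Each parsed CSV row has six fields, one per column.
    rows = csv.reader(io.StringIO(text))
    conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?, ?)", rows)
    # Same aggregation as the demo query; ORDER BY is added here only to
    # make the result order deterministic.
    result = conn.execute(
        "SELECT name, COUNT(name) FROM test GROUP BY name ORDER BY name"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, n in count_by_name(SAMPLE):
        print(name, n)
```

On the cluster, Presto runs the same SQL against the files in S3 through the Hive table definition; this local version only shows the shape of the result (one row per distinct name with its count).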