Apache Hadoop YARN: Past, Present and Future

  • Published on
    07-Jan-2017

Transcript

Apache Hadoop YARN: Past, Present and Future
Melbourne, Aug. 31, 2016
Junping Du
© Hortonworks Inc. 2011-2016. All Rights Reserved

Who.JSON

    {
      "name": "Junping Du",
      "job_title": "Lead Software Engineer @ Hortonworks YARN core team",
      "experiences": [
        {
          "software_industry_years": 10,
          "hadoop_experience": "Hadoop contributor before YARN came out, Apache Hadoop committer & PMC, Release Manager for Apache Hadoop 2.6",
          "non_hadoop_experience": "Architect in cloud computing and enterprise software"
        }
      ],
      "email": "junping_du@apache.org"
    }

What is Apache Hadoop YARN?
  • YARN is short for "Yet Another Resource Negotiator"
  • Big Data Operating System
      • Resource management and scheduling
      • Support for a wide variety of applications: batch, interactive, real-time, etc.
  • Enterprise adoption accelerating
      • Secure mode becoming more widespread
      • Multi-tenant support
      • Diverse workloads
  • SLAs
      • Tolerance for slow-running jobs decreasing
      • Consistent performance desired

Past

A brief Timeline
  • Milestones: June-July 2010, August 2011, May 2012, August 2013
  • Sub-project of Apache Hadoop
  • Releases tied to Hadoop releases
  • Alphas and betas

GA Releases
  • Release dates: Oct. 2013, Feb. 2014, Apr. 2014, Aug. 2014
  • 1st GA
  • MR binary compatibility
  • YARN API cleanup
  • Testing!
  • 1st post-GA
  • Bug fixes
  • Alpha features
  • Load simulator
  • LCE enhancements
  • RM fail-over
  • CS preemption
  • Timeline Service V1
  • Writable REST APIs
  • Timeline Service V1 security

GA Releases (Recent + Planning)
  • Release dates: Nov. 2014, Apr. 2015, 2nd half of 2016 (estimated), TBD
  • KMS
  • Long-running service support
  • Rolling Upgrade
  • Node Label Support
  • Docker Container
  • Pluggable Authorization
  • Shared Resource Cache
  • Timeline Service V1.5
  • Graceful Decommission
  • Log CLI Enhancement
  • Timeline Service V2

Outstanding YARN Features released in 2.6/2.7
  • Rolling Upgrade
  • Node Label
  • Pluggable ACLs
  • [Diagram labels: Default Partition; Partition B, GPUs; Partition C, Windows; JDK 8, JDK 7, JDK 7]

Recent Maintenance Releases Updates
  • 2.6 and 2.7 maintenance releases are being carried out
  • Only blockers and critical fixes are added
  • Apache Hadoop 2.6
      • 2.6.4 released in Feb. 2016
      • 2.6.3 released in Dec. 2015
      • 2.6.2 released in Oct. 2015
  • Apache Hadoop 2.7
      • 2.7.3 released in Aug. 2016
      • 2.7.2 released in Jan. 2016
      • 2.7.1 released in Jul. 2015

Present

YARN in Modern Data Architecture
  • Modern Data Architecture
      • Enables applications to access all your enterprise data through an efficient centralized platform
      • Supported by a centralized approach to governance, security and operations
      • Versatile enough to handle any application and dataset, no matter the size or type
  • YARN's evolution: the core of Modern Data Architecture
      • Centralized resource management, highly efficient scheduling, a flexible resource model, isolation in security and performance, support for a wide variety of applications, etc.

Apache Hadoop YARN
  • ResourceManager (active), ResourceManager (standby)
  • NodeManager1, NodeManager2, NodeManager3, NodeManager4
      • Resources: 128G, 16 vcores
      • Label: SAS
  • Auto-calculate node resources
  • Dynamically update node resources

NodeManager Resource Management
  • Options to report NM resources based on node hardware
      • YARN-160
      • Restart of the NM required to enable the feature
  • Alternatively, admins can use the rmadmin command to update a node's resources
      • YARN-291
      • Looks at dynamic-resources.xml
      • No restart of the NM or the RM required
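
As a rough illustration of the dynamic update path on the NodeManager Resource Management slide, the following Python sketch models an RM-side table of node capacities that an admin can change while allocations stay in place. It is a toy model under stated assumptions, not YARN code: the names `NodeCapacity`, `ClusterView` and `update_node_resource`, the megabyte units, and the clamping of overcommitted nodes to zero free resources are all hypothetical and only mirror the idea behind YARN-291.

```python
from dataclasses import dataclass


@dataclass
class NodeCapacity:
    """Memory and vcores for one node (hypothetical model)."""
    memory_mb: int
    vcores: int


class ClusterView:
    """Toy stand-in for the RM's view of node resources; not a YARN API."""

    def __init__(self):
        self._capacity = {}  # node id -> total NodeCapacity
        self._used = {}      # node id -> NodeCapacity currently allocated

    def register(self, node_id, memory_mb, vcores):
        # Models a NodeManager reporting its resources at startup.
        self._capacity[node_id] = NodeCapacity(memory_mb, vcores)
        self._used[node_id] = NodeCapacity(0, 0)

    def available(self, node_id):
        cap, used = self._capacity[node_id], self._used[node_id]
        # Assumption: a node shrunk below its current usage is treated as
        # overcommitted and has nothing left to hand out.
        return NodeCapacity(max(cap.memory_mb - used.memory_mb, 0),
                            max(cap.vcores - used.vcores, 0))

    def allocate(self, node_id, memory_mb, vcores):
        # Grant the request only if it fits in what is still available.
        free = self.available(node_id)
        if memory_mb > free.memory_mb or vcores > free.vcores:
            return False
        used = self._used[node_id]
        used.memory_mb += memory_mb
        used.vcores += vcores
        return True

    def update_node_resource(self, node_id, memory_mb, vcores):
        # Admin-driven update: only the capacity record changes; existing
        # allocations are untouched and nothing is restarted.
        self._capacity[node_id] = NodeCapacity(memory_mb, vcores)


cluster = ClusterView()
cluster.register("node1", memory_mb=128 * 1024, vcores=16)
cluster.allocate("node1", memory_mb=4 * 1024, vcores=1)
cluster.update_node_resource("node1", memory_mb=64 * 1024, vcores=8)
free_after_update = cluster.available("node1")
```

The point of the design is that capacity and usage are tracked separately, so changing the capacity record does not disturb containers that are already running.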

Apache Hadoop YARN Scheduler
  • ResourceManager (active)
  • User submits an application: Queue A, 4G, 1 vcore
  • Queues: Queue A 50%, Queue B 25%, Queue C 25%
  • Label: SAS (non-exclusive)
  • Queue ordering: Priority/FIFO, Fair
  • Inter-queue pre-emption
  • Improvements to pre-emption
  • Support for application priority
  • Reservation for application
  • Support for cost-based placement agent

Capacity scheduler
  • Support for application priority within a queue
      • YARN-1963
      • Users can specify application priority
      • Specified as an integer; a higher number means higher priority
      • Application priority can be updated while the application is running
  • Improvements to reservations
      • YARN-2572
      • Support for a cost-based placement agent added, in addition to greedy
  • Queue allocation policy can be switched to fair sharing
      • YARN-3319
      • Containers allocated on a fair-share basis instead of FIFO

Capacity scheduler
  • Support for non-exclusive node labels
      • YARN-3214
      • Improvement over the partitions that existed earlier
      • Better for cluster utilization
  • Improvements to pre-emption

Apache Hadoop YARN Application Lifecycle
  • ResourceManager (active)
      • Launch Application 1 AM
      • Request containers
      • Allocate containers
      • Support added to resize containers
  • Node 1: NodeManager, 128G, 16 vcores
      • Support added for graceful decommissioning
      • AM process/Docker container (alpha): launch the AM process via ContainerExecutor (DCE, LCE, WSCE). Monitor/isolate memory and CPU. Support added for disk and network isolation via CGroups (alpha).
      • Container 1 and Container 2, process/Docker container (alpha): launch containers on the node using DCE, LCE, WSCE. Monitor/isolate memory and CPU. Support added for disk and network isolation using CGroups (alpha).
  • History Server (ATS 1.5: leveldb + HDFS; JHS: HDFS)
  • HDFS: log aggregation
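
The ordering rule from the Capacity scheduler slide (integer priority, higher number first, changeable while the application runs) can be sketched in a few lines of Python. This is an illustrative model only: the `App` class, the `submit_order` field used as a FIFO tie-breaker among equal priorities, and the `schedule_order` helper are assumptions made for the example and do not correspond to YARN classes.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical application record; not a YARN class."""
    name: str
    priority: int      # higher number means higher priority
    submit_order: int  # lower number means submitted earlier


def schedule_order(apps):
    # Sort by descending priority, then by submission order (FIFO) as an
    # assumed tie-breaker among applications with equal priority.
    return sorted(apps, key=lambda a: (-a.priority, a.submit_order))


queue_a = [
    App("etl", priority=1, submit_order=0),
    App("report", priority=5, submit_order=1),
    App("adhoc", priority=1, submit_order=2),
]
before = [a.name for a in schedule_order(queue_a)]

# Raising the priority of a running application changes its position the
# next time the queue is ordered.
queue_a[2].priority = 10
after = [a.name for a in schedule_order(queue_a)]
```

With these inputs, `before` is report, etl, adhoc and `after` is adhoc, report, etl.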

Apache Hadoop YARN
  • Graceful decommissioning of NodeManagers
      • YARN-914
      • Drains a node that's being decommissioned to allow running containers to finish
  • Resource isolation support for disk and network
      • YARN-2619, YARN-2140
      • Containers get a fair share of disk and network resources using CGroups
      • Alpha feature
  • Docker support in LinuxContainerExecutor
      • YARN-3853
      • Support to launch Docker containers alongside process containers
      • Alpha feature

Apache Hadoop YARN
  • Support for container resizing
      • YARN-1197
      • Allows applications to change the size of an existing container
  • ATS 1.5
      • YARN-4233
      • Store timeline events on HDFS
      • Better scalability and reliability

Operational support
  • Improvements to existing tools (like yarn logs)
  • New tools added (yarn top)
  • Improvements to the RM UI to expose more details about running applications

Future

Packaging
  • Containers
      • Lightweight mechanism for packaging and resource isolation
      • Popularized and made accessible by Docker
      • Can replace VMs in some cases
      • Or more accurately, VMs got used in places where they didn't need to be
  • Native integration ++ in YARN
      • Support for Container Runtimes in LCE: YARN-3611
      • Process runtime
      • Docker runtime

APIs
  • Applications need simple APIs
  • Need to be deployable easily
  • Simple REST API layer fronting YARN
      • https://issues.apache.org/jira/browse/YARN-4793
      • [Umbrella] Simplified API layer for services and beyond
  • Spawn services and manage them

YARN as a Platform
  • YARN itself is evolving to support services and complex apps
      • https://issues.apache.org/jira/browse/YARN-4692
      • [Umbrella] Simplified and first-class support for services in YARN
  • Scheduling
      • Application priorities: YARN-1963
      • Affinity / anti-affinity: YARN-1042
      • Services as first-class citizens: preemption, reservations, etc.

YARN as a Platform (Contd)
  • Application and services upgrades
      • "Do an upgrade of my Spark / HBase apps with minimal impact to end-users"
      • YARN-4726
  • Simplified discovery of services via DNS mechanisms: YARN-4757
  • YARN Federation, to infinity and beyond: YARN-2915

YARN Service Framework
  • Platform is only as good as the tools
  • A native YARN framework
      • https://issues.apache.org/jira/browse/YARN-4692
      • [Umbrella] Native YARN framework layer for services and beyond
  • Slider supporting a DAG of apps
      • https://issues.apache.org/jira/browse/SLIDER-875

Operational and User Experience
  • Modern YARN web UI: YARN-3368
  • Enhanced shell interfaces
  • Metrics: Timeline Service V2, YARN-2928
  • Application and services monitoring, integration with other systems
  • First-class support for YARN-hosted services in Ambari
      • https://issues.apache.org/jira/browse/AMBARI-17353

Use-cases.. Assemble!
  • YARN and other platform services: Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts
  • Holiday Assembly: HBase, Web Server
  • IOT Assembly: Kafka, Storm, HBase, Solr
  • Governance
  • MR, Tez, Spark

Future Work List (I)
  • Arbitrary resource types
      • YARN-3926
      • Admins can decide what resource types to support
      • Resource types read via a config file
  • New scheduler features
      • YARN-4902
      • Support richer placement strategies such as affinity and anti-affinity
  • Distributed scheduling
      • YARN-2877, YARN-4742
      • NMs run a local scheduler
      • Allows faster scheduling turnaround
  • YARN federation
      • YARN-2915
      • Allows YARN to scale out to tens of thousands of nodes
      • A cluster of clusters that appears as a single cluster to an end user
  • Better support for disk and network isolation
      • Tied to supporting arbitrary resource types

Future Work List (II)
  • Simplified and first-class support for services in YARN
      • YARN-4692
      • Container restart (YARN-3988): allow container restart without losing the allocation
      • Service discovery via DNS (YARN-4757): running services can be discovered via DNS
      • Allocation re-use (YARN-4726): allow AMs to stop a container but not lose resources on the node
  • Enhance Docker support
      • YARN-3611
      • Support to mount volumes
      • Isolate containers using CGroups
  • ATS v2 Phase 2
      • YARN-2928 (Phase 1), YARN-5355 (Phase 2)
      • Run the timeline service on HBase
      • Support for more data, better performance
  • Also in the pipeline
      • Switch to Java 8 with Hadoop 3.0
      • Add support for GPU isolation
      • Better tools to detect limping nodes
      • New RM UI: YARN-3368

HDP Evolution with Apache Hadoop YARN
  • [Diagram labels: 1.x, 2.x, Beyond]

Thank you!