Hadoop Infrastructure @Uber: Past, Present and Future

  • Published on
    31-Dec-2016

Transcript

U B E R | Data

Hadoop Infrastructure @Uber: Past, Present and Future
Mayank Bansal

Uber's Mission
"Transportation as reliable as running water, everywhere, for everyone"
75+ Countries, 500+ Cities, and growing

How Uber works

Data Driven Decisions

Data Infra Once Upon a Time.. (2014)
[Architecture diagram; labels: Kafka Logs, Key-Val DB, RDBMS DBs, S3, EMR, Vertica Data Warehouse. Applications: ETL, Business Ops, A/B Experiments, Adhoc Analytics, City Ops, Data Science]

Data Infrastructure Today
[Architecture diagram; labels: Kafka8 Logs, Schemaless DB, SOA DBs, Service Accounts, HDFS, Hive, Spark | Presto. Consumers: ETL, Machine Learning, Experimentation, Data Science, Adhoc Analytics, Ops/Data Science, City Ops]

Few Takeaways
  • Strict Schema Management
  • Because our largest data audience is SQL savvy! (1000s of Uber Ops!)
  • SQL = Strict Schema
  • Big Data Processing Tools Unlocked: Hive, Presto and Spark
  • Migrate SQL-savvy users from Vertica to Hive & Presto (1000s of Ops & 100s of data scientists & analysts)
  • Spark for more advanced users: 100s of data scientists

Hadoop Evolution @Uber

    Year | Nodes     | Data
    2014 | 1X Nodes  | 1X PB
    2015 | 10X Nodes | 4X PB Data
    2016 | 90X Nodes | 40X PB Data

    3000+ node, 30,000+ cores, 50+ PB
    [Slide also carries the label "Hadoop Evolution @ebay"]

Hadoop Cluster Utilization
  • Overprovisioning for the peak loads.
  • Overcapacity in anticipation of future growth

Mesos Evolution @Uber

    Year | Nodes
    2014 | 0 Nodes
    2015 | X Nodes
    2016 | 300X Nodes

    [Slide also carries the label "Hadoop Evolution @ebay"]

Mesos Cluster Utilization
  • Overprovisioning for the peak loads
  • Overcapacity in anticipation of future growth

End Goal
[Diagram; labels: Online, Presto]

What we need?
GLOBAL VIEW OF RESOURCES

Available Resource Managers

Mesos vs YARN

    YARN                             | MESOS                                                     | Note
    Single Level Scheduler           | Two Level Scheduler                                       | Scales better
    Uses cgroups for isolation       | Uses cgroups for isolation                                | Similar isolation
    CPU, Memory as a resource        | CPU, Memory and Disk as a resource                        | Disk is better
    Works well with Hadoop workloads | Works well with longer running services                   | This is important
    Supports time-based reservations | Does not support reservations                             | Important for batch SLAs
    Dominant resource scheduling     | Scheduling is done by frameworks, on a case-by-case basis | Better for batch

In a Nutshell
  • YARN is good for Hadoop
  • Mesos is good for Longer Running Services
  • Let's tie them together

Myriad
  • Myriad is a Mesos framework for Apache YARN
  • Mesos manages data center resources
  • YARN manages Hadoop workloads
  • Myriad gets resources from Mesos and launches NodeManagers

Myriad's Limitations: Static Resource Partitioning
  • YARN will handle the resources handed over to it.
  • Mesos will work on the rest of the resources

Myriad's Limitations: Resource Over-Subscription
  • YARN will never be able to do oversubscription.
  • NodeManager will go away
  • Fragmentation of resources
  • Mesos oversubscription can kill YARN too

Myriad's Limitations
  • No Global Quota Enforcement
  • No Global Priorities

Myriad's Limitations
  • Elastic Resource Management
  • Bin Packing
  • Stability
  • Long List

Unified Scheduler

High Level Characteristics
  • Global Quota Management
  • Central Scheduling policies
  • Oversubscription for both Online and Batch
  • Isolation and bin packing
  • SLA guarantees at Global Level

Unified Scheduler

Few Takeaways
  • We need one scheduling layer across all workloads
  • Partitioning resources is not good
  • Can save at least 30% of resources
  • Stability and simplicity win in Production
  • Multiple levels of resource management and scheduling will not be scalable

Questions?
mabansal@uber.com
mayank@apache.org

Thank You!!!
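The takeaway that partitioning resources wastes capacity can be illustrated with a small sketch. It compares sizing one cluster per workload for that workload's own peak (static partitioning) against sizing a single shared pool for the peak of the combined demand (a unified scheduler). The demand figures and function names below are hypothetical and chosen only for illustration; they do not come from the talk, and the sketch ignores headroom, fragmentation, and isolation overhead.

```python
# Hypothetical hourly demand (in nodes) for two workload classes.
# These numbers are illustrative only; they are not from the talk.
online_demand = [80, 100, 90, 40, 20, 30]   # online services peak early
batch_demand = [20, 30, 40, 90, 100, 70]    # batch jobs peak later


def static_partition_capacity(*workloads):
    """Each workload gets its own cluster, sized for its own peak."""
    return sum(max(w) for w in workloads)


def unified_capacity(*workloads):
    """One shared pool, sized for the peak of the combined demand."""
    return max(sum(hour) for hour in zip(*workloads))


static_nodes = static_partition_capacity(online_demand, batch_demand)
unified_nodes = unified_capacity(online_demand, batch_demand)
savings = 1 - unified_nodes / static_nodes

print(f"static partitions: {static_nodes} nodes")
print(f"unified pool:      {unified_nodes} nodes")
print(f"savings:           {savings:.0%}")
```

With these made-up numbers the shared pool needs 130 nodes instead of 200, because the two workloads do not peak in the same hour. The real saving depends entirely on how far apart the peaks of the actual workloads are.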