Hadoop @ eBay: Past, Present, and Future
An overview of eBay's experience with Hadoop in the past and present, as well as directions for the future. Given by Ryan Hennig at the Big Data Meetup at eBay in Netanya, Israel, on Dec 2, 2013.
1. Hadoop @ eBay: Past, Present and Future
- Ryan Hennig
- Hadoop Platform Team

2. ABOUT ME

3. RYAN HENNIG
- Born and raised in Seattle, WA
- Studied Computer Science at University of Washington in Seattle
- Worked on Microsoft SQL Server 2006-2012
  - Shipped SQL Server 2008, 2008 R2, 2012
- Joined eBay Hadoop team in early 2012
  - Based in Bellevue, suburb of Seattle
COMPUTE AND DATA INFRASTRUCTURE

4. AGENDA
- Past: Growth of Hadoop at eBay
- Present: Hadoop Use Cases, Operations Tools
- Future: Hadoop 2.0

5. HADOOP AT EBAY: PAST
- Growth of Hadoop at eBay
- Adventures in Forking
- Partnership with Hortonworks

6. HADOOP EVOLUTION @ eBay
[Timeline figure. Years shown: 2007, 2009, 2010, 2011, 2012, 2013. Label: Search. Stages in order of growth:]
- Single-digit nodes
- 10s of nodes
- Shared cluster: 100s of nodes, 1000s+ cores, PB, CDH2
- Shared clusters: 1000s of nodes, 10,000+ cores, 10s of PB, Wilma (0.20)
- Shared clusters: 1000s of nodes, 10,000+ cores, 10s of PB, Argon (0.22)
- Shared clusters: 4k+ nodes, 40,000+ cores, 50s of PB, HDP 1.x

7. ADVENTURES IN FORKING
- 2007-2010: eBay runs shared clusters on Cloudera Distribution of Hadoop
- 2010-2012: eBay runs shared clusters on custom Hadoop versions
  - 2010: Wilma (based on 0.20)
  - 2011: Argon (based on 0.22)
  - 2012: Custom branch abandoned
- Lessons Learned
  - Forking a fast-changing open source project is difficult and risky
  - Balancing development and operations needs
  - Development team size
    - Facebook had 100
    - eBay had 15
  - Coordination with open source community = lots of overhead
  - Divergence from open source: push changes early and often

8. [Image-only slide]

9. EBAY AND HORTONWORKS
- 2012: eBay enters partnership with Hortonworks
- Goals
  - Focus on eBay-specific development internally
  - Leverage Hortonworks expertise for general Hadoop development
  - Avoid source code divergence by making open source contribution a priority
- Benefits to Hortonworks
  - Credibility enhanced by having a well-known customer
  - Ability to test at large scale

10. HADOOP AT EBAY: PRESENT
- Shared and Dedicated Clusters
- Job Distribution
- Use Case Examples
- eBay Data Platform Overview

11. SHARED AND DEDICATED CLUSTERS
- Shared clusters
  - 10s of PB and 10s of thousands of slots per cluster
  - Used primarily for analytics of user behavior and inventory
  - Mix of production and ad-hoc jobs
  - Mix of MR, Hive, Pig, Cascading, etc.
  - Hadoop and HBase security enabled
- Dedicated clusters
  - Very specific use cases like index building
  - Tight SLAs for jobs (on the order of minutes)
  - Immediate revenue impact
  - Usually smaller than our shared clusters, but still big (100s of nodes)

12. JOB DISTRIBUTION BY TYPE
[Chart not included in transcript]

13. USE CASE EXAMPLES
- Cassini, eBay's new search engine:
  - Use MR to build full and incremental near-real-time indexes
  - Raw data is stored in HBase for efficient updates and random reads
  - Strong SLAs: < 10 minutes
  - Run on dedicated clusters
- Related and similar items recommendations:
  - Use transactional data, click stream data, search index, etc.
  - Production MR jobs on a shared cluster
- Analytics dashboard:
  - Run Mobius MR jobs to join click stream data and transactional data
  - Store summary data in HBase
  - Web application to query HBase

14. HADOOP OPERATIONS
- LDAP Integration
  - All users stored in Active Directory, accessed via LDAP
  - Access to MapReduce Queues granted via MapReduce queues
  - Batch users: shared by a group of users
- Security
  - Kerberos as implemented by Microsoft Active Directory
  - One domain for users, another for service/server principals
  - Batch users authenticated via keytabs, not passwords
- Misc
  - 10s of slave nodes are broken at any given time
  - Often need to add several racks of machines at a time

15. HADOOP OPERATIONS
- Team has Development and Operations Responsibilities
  - 2 huge shared clusters
  - 1800+ users, exponential growth
  - About 10 Hadoop developers
  - Recently: operations work moved to dedicated team
- Developed several tools to manage operations
  - Hadoop Management Console: user-facing web app
  - ldap-admin: Swiss Army knife style tool for Hadoop admins
  - Puppet: for adding machines to the clusters, many racks at a time
  - Decom/Recom scripts: automatic detection, repair, decommission, and recommission of slave nodes

16. HADOOP MANAGEMENT CONSOLE
- Custom web application built on Ruby on Rails
- Self-service tools are continually added to reduce support load
- User Management
  - Access Requests
  - Group Membership
- Batch User Management
  - New Requests
  - Sudoer management
- Dataset Management
  - Explore Datasets
  - Request new dataset transfer between Teradata and Hadoop
- Metadata tools
  - Each dataset is stored in custom XML format
  - Code Generation: Hive Tables, Java POJOs

17-23. [Image-only slides]

24. ldap-admin
- Command-line tool written in Ruby
- Swiss Army knife tool, features added on demand for support issues
- Often used features:
  - Add a user to a group
  - View key details for LDAP users and groups
  - List all users, batch users, Hadoop groups
  - Reset batch user passwords and keytabs
  - Show/add/remove sudoers for a batch account
  - Run user diagnostics: check permissions, keytabs, etc.

25. HADOOP AT EBAY: FUTURE
- HDFS Federation
- YARN
- New Scenarios
- Storage and Operational Efficiency

26. HDFS HA and Federation
- HDFS High-Availability for Reliability
  - NameNode in Hadoop 1.0 is a Single Point of Failure
  - Automated failover to hot standby
  - Depends on ZooKeeper
- HDFS Federation for Scalability and Isolation
  - Hadoop 1.0: Single NameNode service
    - Secondary NameNode is not for failover
    - Storage scales horizontally, but Namespace scales vertically
    - No isolation for different tenants or applications
  - Hadoop 2.0: HDFS Federation
    - Partition the HDFS Namespace
    - Many independent NameNodes
    - Allows direct access to Block Storage w/o going through HDFS interface

27-29. HDFS HA
[Diagram slides]

30. HDFS Federation
- Horizontal Scalability of HDFS Namespace
- Multiple independent NameNodes, each serving a subtree of the namespace
- Example: NN1 provides /users, NN2 provides /reports

31. YARN
- Hadoop 1.0: MapReduce
  - JobTracker and TaskTracker services
  - Handles Resource Management, Job Execution
- Hadoop 2.0: YARN
  - Refactoring responsibilities of JobTracker and TaskTracker into a more general platform
  - Global ResourceManager
    - Cluster-wide resource management
  - Per-application ApplicationMaster
    - Application-specific job control

32-35. YARN
[Diagram slides]

36. New Scenarios
- Iterative Query
  - Stinger (Hive), Impala, etc.
  - Rapid data exploration and analysis
- Graph Databases
  - TitanDB, Giraph
  - Billions of vertices and edges
  - Complex graph traversals
  - Applications: PayPal fraud detection, Social Graph Analysis
- Real-Time Processing
  - Storm (Twitter), Apache S4
  - Reinforcement Learning, Monitoring

37. Efficiency and Reliability
- Storage Efficiency
  - HDFS introduces a 3x storage cost for its replicas
  - HDFS-RAID: more reliability for 1.5x storage cost
    - Reed-Solomon
    - Locally Repairable Codes (Project Xorbas)
  - Tradeoff: the cost of repairing lost data is much higher
- Operational Efficiency
  - More automation
  - More self-service tools
  - Better monitoring

38. Open Source
- HMC Metadata
  - Long term goal: standardize on open source technologies (HCatalog)
  - Short term: explore what should be open sourced
- Hadoop Management Console
  - Hadoop Access Request Automation
  - Batch user creation and management
  - Metadata management
  - Code generation of dataset to Hive tables and Java POJOs
- ldap-admin tools
  - Very useful but tightly coupled to eBay's LDAP configuration
  - Willing to open source if there is interest

39. THANK YOU
Questions?
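
As a rough illustration of the storage-efficiency tradeoff on slide 37 (3x replication versus HDFS-RAID at about 1.5x, with costlier repairs), the sketch below compares the two approaches with simple arithmetic. It is not taken from the talk: the `Scheme` class, the helper names, and the Reed-Solomon parameters (10 data blocks, 5 parity blocks) are assumptions chosen only because they yield the 1.5x figure quoted on the slide. The repair cost assumes a plain Reed-Solomon code, where rebuilding one lost block requires reading as many blocks as there are data blocks in the stripe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Hypothetical model of a redundancy scheme, for illustration only."""
    name: str
    data_blocks: int    # blocks holding original data per stripe
    extra_blocks: int   # additional replicas or parity blocks per stripe
    repair_reads: int   # blocks read to rebuild one lost block

    @property
    def storage_overhead(self) -> float:
        # Total bytes stored divided by bytes of original data.
        return (self.data_blocks + self.extra_blocks) / self.data_blocks

    @property
    def tolerated_losses(self) -> int:
        # Blocks per stripe that can be lost without losing data.
        return self.extra_blocks


def replication(copies: int) -> Scheme:
    # One data block plus (copies - 1) replicas; repair copies one survivor.
    return Scheme(f"{copies}x replication", 1, copies - 1, 1)


def reed_solomon(k: int, m: int) -> Scheme:
    # k data blocks plus m parity blocks; plain RS repair reads k blocks.
    return Scheme(f"RS({k},{m})", k, m, k)


if __name__ == "__main__":
    # RS(10, 5) is an assumed parameter choice, not a figure from the slides.
    for s in (replication(3), reed_solomon(10, 5)):
        print(
            f"{s.name}: {s.storage_overhead:.1f}x storage, "
            f"survives {s.tolerated_losses} lost blocks per stripe, "
            f"reads {s.repair_reads} block(s) per repair"
        )
```

Under these assumptions the erasure-coded layout stores half as much as triple replication and survives more lost blocks per stripe, but each repair reads ten blocks instead of one, which is the kind of repair-cost tradeoff the slide mentions.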