Presto Meetup Talk @ FB (03/19/15)


Presto @ Netflix: Interactive Queries at Petabyte Scale
Nezih Yigitbasi and Zhenxiao Luo
Big Data Platform

Reviewer comments (from the deck):
  • Nezih Yigitbasi: add Spark
  • Nezih Yigitbasi: check re:Invent and Tom's slides
  • Daniel Weeks: We should start calling Franklin "Metacat" (the open source name). Check with Charles for a logo.

Outline
  • Big data platform @ Netflix
  • Why we love Presto?
  • Our contributions
  • What are we working on?
  • What else we need?

Our Data Pipeline
[Diagram labels: Cloud Apps, S3, Suro, Ursula, SSTables, Cassandra, Aegisthus, Event Data, 15m, Daily, Dimension Data]

Notes:
  • Data comes from apps/services.
  • Event data: 200b events: app logs, user activity (search events, movie detail clicks from the website, etc.), system operational data.
  • Ursula demultiplexes the events into event types (~150 event types right now). The latency of this Ursula pipeline is 15m.
  • Dimension data: subscriber data. Aegisthus extracts data from Cassandra, which is the online backing store for Netflix, and writes it to S3.

Big Data Platform Architecture
[Diagram labels: Data Warehouse, Service, Tools, Gateways, Prod, Clients, Clusters, VPC, Query, Prod, Test, Bonus, Prod]

Notes:
  • Mention that we have a single DW on S3 and spin up multiple clusters.
  • Little performance difference on S3 vs. HDFS, as we are mostly CPU bound.
  • Reporting.
  • Charlotte: lineage.

Our Use Cases
  • Batch jobs (Pig, Hive): ETL jobs; reporting and other analysis
  • Ad-hoc queries: interactive data exploration
  • Looked at Impala, Redshift, Spark, and Presto

Notes:
  • Impala: no S3 support.
  • Spark: loads all data, doesn't stream, plus stability issues at that time. It couldn't even handle an hour's worth of data (~2013). Spark SQL recently graduated from alpha with the Spark 1.3 release.
  • Redshift: need to copy data from S3 to Redshift.

Presto @ Netflix
Deployment
  • v 0.86
  • 1 coordinator (r3.4xlarge)
  • 250 workers (m2.4xlarge)
Numbers
  • ~2.5K queries/day against our 10PB Hive DW on S3
  • 230+ Presto users out of 300+ platform users
Tooling
  • presto-cli, Python, R, BI tools (ODBC/JDBC), etc.
  • Atlas/Suro for monitoring/logging

Notes:
  • r3.4xlarge and m2.4xlarge are both memory-optimized instances; m2 is a previous-generation instance type.
  • 5PB of our 10PB Hive DW is in Parquet format.

Why we love Presto?
  • Open source
  • Fast
  • Scalable
  • Works well on AWS
  • Good integration with the Hadoop stack
  • ANSI SQL

Notes:
  • Single warehouse on S3; spin up multiple test/prod Presto clusters and query live data, etc.

Our Contributions
  • 24 open PRs, 60+ commits
  • S3 file system: multipart upload, IAM roles, retries, monitoring, etc.
  • Functions for complex types
  • Parquet: name/index-based access, type coercion, etc.
  • Query optimization
  • Various other bug fixes

Notes:
  • S3 FS: exponential backoff, exposed various configs for the AWS SDK, multipart upload, IAM roles, and monitoring of PrestoS3FileSystem and the AWS SDK.
  • Parquet: better tooling/community support; good integration with existing tools (Hive, Spark, etc.).
  • Several bug fixes and new functions to manipulate complex types, to close the gap between Hive and Presto.
  • DDL: alter/create table.
  • Optimization: (2085) Rewrite Single Distinct Aggregation into GroupBy, and (1937) Optimize joins with similar subqueries.
  • Complex types: array: contains, concat, sort; map: map_agg and map constructors, map_keys, map_values, etc.
  • Bridge the gap between Hive and Presto.

What are we Working on?
Parquet Optimizations
  • Vectorized reader*: read based on column vectors
  • Predicate pushdown: use statistics to skip data
  • Lazy load: postpone loading the data until needed
  • Lazy materialization: postpone decoding the data until needed
* PARQUET-131

Netflix Integration
  • BI tools integration: ODBC driver, Tableau web connector, etc.
  • Better monitoring: Ganglia, Atlas
  • Data lineage: Presto, Suro, Charlotte

Notes:
  • We log queries to our internal data pipeline (Suro), and another internal tool (Charlotte) analyzes data lineage.

What else we need?
  • Graceful cluster shrink
  • Better resource management
  • Dynamic type coercion for all file formats
  • Support for more Hive types (e.g., decimal)
  • Predictable metastore cache behavior
  • Big table joins similar to Hive

Notes:
  • We are pushing reporting to Presto with our Tableau/MS work. Not for ETL.
  • Monitoring, scheduling improvements.
  • Presto's distributed join is still memory-limited, as there is no spilling.
  • Hive decimal type: at least be able to read it; still open.

THANK YOU
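
Appendix sketch: the "use statistics to skip data" item under Parquet Optimizations can be illustrated with a small, self-contained example. This is not Presto's or Parquet's reader code. `RowGroup` and `scan_greater_than` are hypothetical names, and the per-group min/max values are an assumed stand-in for the column statistics a columnar file keeps. The idea shown is only that a filter such as `value > threshold` can rule out a whole group from its statistics before any of its rows are decoded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for a chunk of column data plus its min/max statistics."""
    min_value: int
    max_value: int
    values: List[int]


def scan_greater_than(row_groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Evaluate `value > threshold`; return (matching values, number of groups skipped)."""
    matches: List[int] = []
    skipped = 0
    for group in row_groups:
        # If even the largest value in the group is not above the threshold,
        # no row can match, so the group is skipped without reading its values.
        if group.max_value <= threshold:
            skipped += 1
            continue
        matches.extend(v for v in group.values if v > threshold)
    return matches, skipped


groups = [
    RowGroup(1, 10, [1, 5, 10]),
    RowGroup(11, 20, [11, 15, 20]),
    RowGroup(21, 30, [21, 25, 30]),
]
result, skipped = scan_greater_than(groups, 18)
print(result, skipped)  # [20, 21, 25, 30] 1
```

In this sketch the first group is never scanned because its maximum (10) cannot satisfy the filter. The benefit grows when data is sorted or clustered on the filtered column, since more groups then fall entirely outside the predicate's range.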