ABR Algorithms Explained (from Streaming Media East 2016)
ADAPTIVE BITRATE ALGORITHMS: HOW THEY WORK AND HOW TO OPTIMIZE YOUR STACK
Streaming Media East, Track D. Tuesday, May 10, 2016, 1:45 to 2:30 pm

Streamroot: who are we?
Client-accelerated streaming. Streamroot combines the best of a controlled, centralized network with the resilience and scalability of a widely distributed delivery architecture.

Presentation Outline
- Introduction: what are we trying to accomplish, and why does it matter?
- The basics of how ABR algorithms work: constraints, parameters, process
- Example: hls.js
- Possible improvements to basic ABR algorithms: smoothing, quantizing, scheduling
- Example: dash.js
- Going further. Another approach: buffer levels
- The key to improving: testing and iterating

I. Why ABR?
- A multiplicity of network conditions and devices means the player must select a resolution dynamically.
- With streaming over HTTP/TCP, congestion logic is removed from the transport protocol, so bandwidth estimation and quality decisions move to the client.
Source: FESTIVE diagram of HTTP streaming

I. Design Goals
1. Maximize efficiency: stream at the highest bitrate possible
2. Minimize rebuffering: avoid buffer underrun and playback stalls
3. Encourage stability: switch only when necessary
(4. Promote fairness across network bottlenecks)

I. Why This Matters
- Viewers watch 24 minutes longer when the buffer ratio is below 0.2% for live content.
- View time drops 40% past the 0.4% buffer-ratio mark.
Source: buffer ratio vs. play time, NPAW aggregated data for a set of European live broadcasters

II. The Basics: Constraints and Parameters
Constraints:
- Screen size / player size
- CPU and dropped-frame threshold
- Startup time / rebuffering recovery
Tradeoff parameters:
- Buffer size
- Bandwidth and possible bitrates
- (Bonus: P2P bandwidth)

II. The Basics: Constraints
- Screen and player size: the bitrate should never correspond to a rendition larger than the actual size of the video player.
- CPU and dropped-frame rate: downgrade when too many frames are dropped per second.
- Startup time: always fetch the lowest quality first whenever the buffer is empty.

II. The Basics: Tradeoff Parameters
- Maximize bitrate: estimate the available bandwidth based on prior segment downloads.
  Available bandwidth = size of chunk / time taken to download
- Minimize the rebuffering ratio: manage the buffer size.
  Buffer ratio = buffering time / (buffering time + playback time)
- Abandon strategy
Source: BOLA

Example: hls.js
- HTML5 (MSE-based) media engine open-sourced by Dailymotion: https://github.com/dailymotion/hls.js
- Very modular: the different controllers are passed in as option parameters, so you can replace the rules without even forking the media engine.

Example: hls.js player-size level capping
https://github.com/dailymotion/hls.js/blob/master/src/controller/cap-level-controller.js#L68
- Every 1000 ms, checks the maximum capping level corresponding to the current player size.
- You can also add manual level caps on initialization.
- If the new cap level is higher than the previous one (meaning the player size has grown, e.g. going fullscreen), the current buffer is flushed and the new quality is requested right away.

Example: hls.js dropped-frame rule
https://github.com/dailymotion/hls.js/blob/master/src/controller/fps-controller.js#L33
- Calculates the dropped-frames-per-second ratio over a monitoring period (parameters: fpsDroppedMonitoringThreshold, fpsDroppedMonitoringPeriod).
- If the ratio exceeds 0.2, the level is banned for good and goes into the restricted capping levels.
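As a concrete illustration of the "size of chunk / time taken to download" estimate described above, here is a minimal sketch. This is not the actual hls.js code; all names are hypothetical, and the harmonic-mean smoothing and 0.8 safety factor are illustrative choices.

```javascript
// Minimal sketch of segment-based bandwidth estimation (hypothetical names,
// not the actual hls.js implementation).
function estimateBandwidth(samples) {
  // samples: [{ bytes, durationMs }] for the last few segment downloads.
  // Per-sample bandwidth in bits per second:
  const bps = samples.map(s => (s.bytes * 8) / (s.durationMs / 1000));
  // Harmonic mean is robust against one unusually fast download skewing the estimate.
  return bps.length / bps.reduce((acc, v) => acc + 1 / v, 0);
}

// Pick the highest bitrate that fits under the estimate, with a safety factor.
function selectLevel(levels, estimatedBps, safetyFactor = 0.8) {
  // levels: available bitrates in bps, sorted ascending.
  let chosen = 0; // fall back to the lowest level if nothing fits
  levels.forEach((bitrate, i) => {
    if (bitrate <= estimatedBps * safetyFactor) chosen = i;
  });
  return chosen;
}
```

For example, two 1 MB segments each downloaded in 1 s give an estimate of about 8 Mbps; with the 0.8 safety factor, the player would choose the highest rendition at or below 6.4 Mbps.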
(Note: this dropped-frame rule is not activated in production.)

Example: hls.js startup strategy
https://github.com/dailymotion/hls.js/blob/master/src/controller/stream-controller.js#L131
- The first segment is always loaded from the first level in the playlist; playback then continues with the normal ABR rule. A very simple rule in practice!
- Another optimization is to load only that level and its playlist at startup, rather than waiting for all the other level playlists to load.

Example: hls.js bandwidth-based ABR controller
https://github.com/dailymotion/hls.js/blob/master/src/controller/abr-controller.js
- A simple algorithm, inspired by Android's AVController ABR algorithm.

Example: hls.js P2P bandwidth estimation
- At Streamroot, estimation is even harder because segment data comes from different sources: the CDN and the P2P network.
- The estimation still hooks into request.onProgress and request.onLoad (the classic estimation), but it must include a P2P bandwidth metric; otherwise a segment served instantly from a local P2P cache would look like infinite speed.
- P2P throughput is not the same for different peers, so it is averaged and smoothed.

Example: hls.js fragmentLoad abort rule
https://github.com/dailymotion/hls.js/blob/master/src/controller/abr-controller.js#L51
- One of the most important rules here: what happens if you start a request and then bandwidth drops? Especially with long fragments, this can very easily lead to a buffer underrun!
- Once half of the expected download time has elapsed, the rule compares the estimated time of arrival of the segment against the time of buffer underrun.
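The abort check just described can be sketched roughly as follows. This is a simplified illustration with hypothetical names, not the actual hls.js implementation: it estimates the in-flight request's throughput, computes the segment's expected time of arrival, and checks whether a lower level could still arrive before the buffer drains.

```javascript
// Rough sketch of an emergency-abort check for an in-flight segment request
// (hypothetical names, not the actual hls.js code).
function shouldAbort(req, bufferAheadSec, levels, currentLevel) {
  // req: { bytesLoaded, bytesTotal, elapsedSec } for the current download.
  // levels: available bitrates in bps, sorted ascending.
  if (req.bytesLoaded === 0) return false; // no throughput estimate yet
  const bps = (req.bytesLoaded * 8) / req.elapsedSec;
  const remainingBits = (req.bytesTotal - req.bytesLoaded) * 8;
  const etaSec = remainingBits / bps; // estimated time to finish this download
  // Segment will arrive before the buffer drains: keep going.
  if (etaSec <= bufferAheadSec) return false;
  // Otherwise, look for a lower level whose (proportionally smaller) segment
  // could still be fetched in time at the current throughput.
  for (let level = currentLevel - 1; level >= 0; level--) {
    const scaledBits = req.bytesTotal * 8 * (levels[level] / levels[currentLevel]);
    if (scaledBits / bps < bufferAheadSec) return true; // abort and switch down
  }
  return false; // even the lowest level would be late; keep the request
}
```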
If another, lower level could still arrive before the underrun, the request is aborted and that level is fetched instead.

Example: hls.js sum-up
Strong points:
- Very simple and understandable
- Handles CPU and player-size constraints
- Conservative bandwidth adjustment to avoid oscillation
- Sound emergency abort mechanism
Could be improved:
- Add a history parameter to bandwidth estimation and adjustment
- The startup-time constraint could be improved to fetch the lowest level first
Overall, a simple algorithm with better performance in practice than native implementations:
- Pros: a simple implementation that takes a lot of different parameters into account, and works as well as Dailymotion's other implementations (Flash HLS, Android, iPhone, etc.).
- Cons: still a naive bandwidth estimation, with possible overestimation and possible oscillation between bitrates.

Example: hls.js how to improve
1. Tweak the parameters: https://github.com/dailymotion/hls.js/blob/master/API.md#fine-tuning
- Dropped FPS: capLevelOnFPSDrop: false, fpsDroppedMonitoringPeriod: 5000, fpsDroppedMonitoringThreshold: 0.2
- Player size: capLevelToPlayerSize: false
2. Write your own rules! The controllers are option parameters, so they can easily be replaced: abrController: AbrController, capLevelController: CapLevelController, fpsController: FPSController.

III. Improvements: the pitfalls of bandwidth estimation
- Not resilient to sudden network fluctuations
- Often leads to bitrate oscillations
- Biased by other HTTP/TCP calls on the same device or network
It is difficult to correlate a single segment's download time to the device's real available bandwidth, for several reasons:
- Bandwidth can change very quickly, especially on a mobile network, and can drop unexpectedly.
- The request may be running in parallel with other TCP requests (HTTP or otherwise) on the user's device.
This can lead to frequent estimation oscillations.

III. Improvements: better bandwidth estimation
A new 4-step approach:
1. Estimation
2. Smoothing
3. Quantizing
4. Scheduling
Source: block diagram for PANDA

III. Improvements: estimation & smoothing
- Estimation: take history into account!
- Smoothing: apply a smoothing function to the range of values obtained. Possible functions: average, median, EWMA, harmonic mean.
- How many segments? 3? 10? 20?

III. Improvements: quantizing
- Quantizing: quantize the smoothed bandwidth to a discrete bitrate. Good for minimizing oscillations!
- Additive increase, multiplicative decrease: conservative when switching up, more aggressive when switching down.
- You can also scale the decision by the bitrate (and its utility).
Source: FESTIVE

III. Improvements: scheduling (bonus)
- Continuous and periodic download scheduling can cause oscillation and over- or underused resources.
- Randomize the target buffer level to avoid startup bias and increase stability.
- Also extremely useful for promoting fairness!
Source: FESTIVE

Example 2: dash.js
Dash.js is the reference DASH player developed by DASH-IF: https://github.com/Dash-Industry-Forum/dash.js/wiki
It uses 4 different rules:
- 2 main: ThroughputRule, AbandonRequestsRule
- 2 secondary: BufferOccupancyRule, InsufficientBufferRule

Example 2: dash.js main rules
Source: DASH-IF, Maxdome
- ThroughputRule: calculates bandwidth with some smoothing. No real quantizing (it keeps a raw estimate rather than discrete values).
- AbandonRequestsRule: cancels a request if it takes more than 1.5x the expected download time.
- BufferOccupancyRule: avoids switching down if the buffer is large enough (RICH_BUFFER_THRESHOLD).
- InsufficientBufferRule

Example 2: dash.js sum-up
Strong points:
- Smoothes bandwidth
- Segment abort mechanism to avoid buffering during network drops
- Rich-buffer threshold to avoid bandwidth oscillations
Could be improved:
- No quantization of bitrates
- Doesn't handle CPU and player-size constraints

Example 2: dash.js how to improve
1. Tweak the parameters:
- ThroughputRule: AVERAGE_THROUGHPUT_SAMPLE_AMOUNT_LIVE = 2; AVERAGE_THROUGHPUT_SAMPLE_AMOUNT_VOD = 3;
- AbandonRequestsRule: GRACE_TIME_THRESHOLD = 500; ABANDON_MULTIPLIER = 1.5;
- BufferOccupancyRule: RICH_BUFFER_THRESHOLD = 20
2. Write your own rules:
https://github.com/Dash-Industry-Forum/dash.js/wiki/Migration-2.0#extending-dashjs
https://github.com/Dash-Industry-Forum/dash.js/blob/development/src/streaming/rules/abr/ABRRulesCollection.js
You can easily take the best of hls.js here!
- Write a player-size rule or an FPS-drop rule, or change the AbandonRequestsRule. It's all very easy to do!

IV. Going further: BOLA in dash.js, another approach
- Based ONLY on buffer size: no more bandwidth estimation.
- Uses utility theory to make decisions, with a configurable tradeoff between rebuffering potential and bitrate maximization:
  Maximize Vn + γSn
  where Vn is the bitrate utility, Sn the playback smoothness, and γ the tradeoff weight parameter.
- Supposed to be much more efficient, since there is no need to estimate bandwidth.
- BUT: not fully implemented in dash.js, and some optimization constants depend heavily on the use case (target buffer, live vs. VOD).
- Today it does not work well with small segment sizes AND small buffer sizes (though apparently good with buffers of 1+ minute).
- Still a work in progress, but an interesting approach!

IV. Going further: test and iterate!
Tweaking algorithms is easy, and so is creating your own forks. You've got the power!
- Know what is important to you: rebuffering, maximum bitrate, bandwidth savings. Most use cases are specific (segment size, playlist size, latency), and so is the metric that matters most to you: buffer ratio? best bitrate (not so useful if you know most of your users have plenty of bandwidth anyway)? number of switches?
- Compare and cross-reference with QoS analytics to understand your audience.
- Test and iterate: A/B testing on a 50/50 split of the population lets you compare changes in real time and quickly see what happens when you tweak even a single parameter. The results can be quite stunning: significant improvements without even changing your workflow!

QUESTIONS?

Further Reading
- Probe and Adapt: Rate Adaptation for HTTP Video Streaming at Scale. Zhi Li, Xiaoqing Zhu, Josh Gahm, Rong Pan, Hao Hu, Ali C. Begen, Dave Oran, Cisco Systems, 7 Jul 2013.
- Improving Fairness, Efficiency, and Stability in HTTP-based Adaptive Video Streaming with FESTIVE. Junchen Jiang, Carnegie Mellon University; Vyas Sekar, Stony Brook University; Hui Zhang, Carnegie Mellon University / Conviva Inc., 2012.
- ELASTIC: a Client-side Controller for Dynamic Adaptive Streaming over HTTP (DASH). Luca De Cicco, Vito Caldaralo, Vittorio Palmisano, and Saverio Mascolo.
- BOLA: Near-Optimal Bitrate Adaptation for Online Videos. Kevin Spiteri, Rahul Urgaonkar, Ramesh K. Sitaraman, University of Massachusetts Amherst, Amazon Inc., Akamai Technologies Inc.

Contact us:
Nikolay Rodionov, Co-Founder and CPO, email@example.com
Erica Beavers, Head of Partnerships, firstname.lastname@example.org