Tuesday, December 1, 2015

Caching Content for Holiday Streaming

'Tis the season for holiday binging. How do seasonal viewing patterns affect how Netflix stores and streams content?

Our solution for delivering streaming content is Open Connect, a custom-built system that distributes and stores the audio and video content our members download when they stream a Netflix title (a movie or episode). Netflix has a unique library of large files with relatively predictable popularity, and Open Connect's global, distributed network of caching servers was designed with these attributes in mind. This system localizes content as close to our members as possible to achieve a high-quality playback experience through low-latency access to content over optimal internet paths. A subset of highly watched titles makes up a significant share of total streaming. Caching, the process of storing content based on how often it’s streamed by our members, is therefore critical to ensuring that enough copies of a popular title are available in a particular location to support the demand of all the members who want to stream it.

We curate rich, detailed data on what our members are watching, giving us a clear signal of which content is popular today. We enrich that signal with algorithms to produce a strong indicator of what will be popular tomorrow. As a title increases in popularity, more copies of it are added to our caching servers, replacing other, less popular content on a nightly cadence when the network is least busy. Deleting files from servers and adding new ones comes with overhead, however, so we perform these content swaps with the help of algorithms designed to balance cache efficiency against the network cost of replacing content.
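To make the trade-off between cache efficiency and replacement cost concrete, here is a minimal sketch in Python. It is not Open Connect's actual algorithm: the function name `plan_nightly_swaps`, the `swap_margin` threshold, the popularity scores, and the simplifying assumption that every title occupies the same amount of space are all hypothetical, chosen only for illustration.

```python
def plan_nightly_swaps(cached, candidates, popularity, swap_margin=0.2):
    """Return (evict, add) pairs worth performing tonight.

    Hypothetical sketch: a swap happens only when the predicted popularity
    gain exceeds swap_margin, which stands in for the network cost of
    moving a file. Assumes all titles are the same size.
    """
    # Consider evicting the least popular cached titles first.
    evictable = sorted(cached, key=lambda t: popularity.get(t, 0.0))
    # Consider adding the most popular uncached titles first.
    incoming = sorted(
        (t for t in candidates if t not in cached),
        key=lambda t: popularity.get(t, 0.0),
        reverse=True,
    )

    swaps = []
    for old, new in zip(evictable, incoming):
        gain = popularity.get(new, 0.0) - popularity.get(old, 0.0)
        if gain <= swap_margin:
            # Gains only shrink from here on, so stop early.
            break
        swaps.append((old, new))
    return swaps


# Made-up popularity scores for five placeholder titles.
popularity = {"A": 0.9, "B": 0.3, "C": 0.1, "D": 0.8, "E": 0.35}
swaps = plan_nightly_swaps({"A", "B", "C"}, ["D", "E"], popularity)
print(swaps)
```

With these made-up scores, title D displaces title C because the gain is large, while title E does not displace title B because the small gain would not justify the transfer. A larger `swap_margin` makes the cache more stable at the expense of reacting more slowly to popularity changes.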

How does this play out for titles with highly seasonal patterns? Metadata assembled by human taggers and reviewed by our internal enhanced content team tells us which titles are holiday-related, and using troves of streaming data we can track their popularity throughout the year. Holiday titles ramp up in popularity starting in November, so more copies of these titles are distributed across the network from then through their popularity peak at the end of December. The cycle comes full circle when the holiday content is displaced by relatively more popular titles in January.

[Figure: Weekly holiday streaming]

Holiday viewing follows a predictable annual pattern, but we also have to deal with less predictable scenarios, such as introducing new shows without any viewing history or external events that suddenly drive up the popularity of certain titles. For new titles, we model a combination of external and internal data points to create a predicted popularity, allowing us to appropriately cache that content before the first member ever streams it. For unexpected spikes in popularity driven by events like actors popping up in the news, we are designing mechanisms to let us quickly push content to our caches outside of the nightly replacement schedule, while those caches are actively serving members. We're also exploring ways to evaluate popularity more locally; what's popular in Portland may not be what's popular in Philadelphia.
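As a rough illustration of what evaluating popularity more locally could mean, the sketch below ranks titles per region from a list of play events. The function name `top_titles_by_region`, the `(region, title)` event shape, and the sample data are assumptions made for this example; they do not describe Netflix's actual data or methods.

```python
from collections import Counter, defaultdict


def top_titles_by_region(plays, k=1):
    """Rank titles by play count within each region.

    `plays` is a hypothetical iterable of (region, title) events.
    """
    counts = defaultdict(Counter)
    for region, title in plays:
        counts[region][title] += 1
    # Keep the k most-played titles for every region.
    return {
        region: [title for title, _ in counter.most_common(k)]
        for region, counter in counts.items()
    }


# Made-up play events with placeholder titles.
plays = [
    ("Portland", "Title X"),
    ("Portland", "Title X"),
    ("Portland", "Title Y"),
    ("Philadelphia", "Title Y"),
    ("Philadelphia", "Title Y"),
    ("Philadelphia", "Title Z"),
]
local_top = top_titles_by_region(plays)
print(local_top)
```

In this toy data, a single global count would be a tie between Title X and Title Y, while the per-region view shows that each city has a clear favorite.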

Whether your tastes run toward Love Actually or The Nightmare Before Christmas, your viewing this holiday season provides valuable information to help optimize how Netflix stores its content.

Love data, measurement and Netflix? Join us!