Tomorrow we'll release another much-anticipated new series, The Get Down. Before you can hit “Play”, we have to distribute this new title to our global network of thousands of Open Connect appliances. Fortunately, this is now a routine exercise for us, ensuring our members around the world will have access to the title whenever they choose to watch it.
In a previous company blog post, we talked about content distribution throughout our Open Connect network at a high level. In this post, we’ll dig a little deeper into the complex reality of global content distribution. New titles come onto the service, titles increase and decrease in popularity, and sometimes faulty encodes need to be rapidly fixed and replaced. All of this content needs to be positioned in the right place at the right time to provide a flawless viewing experience. So let’s take a closer look at how this works.
When a new piece of content is released, the digital assets that are associated with the title are handed off from the content provider to our Content Operations team. At this point, various types of processing and enhancements take place including quality control, encoding, and the addition of more assets that are required for integration into the Netflix platform. At the end of this phase, the title and its associated assets (different bitrates, subtitles, etc.) are repackaged and deployed to our Amazon Simple Storage Service (S3). Titles in S3 that are ready to be released and deployed are flagged via title metadata by the Content Operations team, and at this point Open Connect systems take over and start to deploy the title to the Open Connect Appliances (OCAs) in our network.
We deploy the majority of our updates proactively during configured fill windows. An important difference between our Open Connect CDN and other commercial CDNs is the concept of proactive caching. Because we can predict with high accuracy what our members will watch and what time of day they will watch it, we can make use of non-peak bandwidth to download most of the content updates to the OCAs in our network during these configurable time windows. By reducing disk reads (content serving) while we are performing disk writes (adding new content to the OCAs), we are able to optimize our disk efficiency by avoiding read/write contention. The predictability of off-peak traffic patterns helps with this optimization, but we still only have a finite amount of time every day to get our content pre-positioned to where it needs to be before our traffic starts to ramp up and we want to make all of the OCA capacity available for content serving.
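To make the fill window idea concrete, here is a minimal sketch of the kind of check an appliance might make before starting a fill. The function name, window bounds, and midnight-wrapping behavior are our illustrative assumptions, not Netflix's actual implementation:

```python
from datetime import time

# Hypothetical sketch: deciding whether an OCA may fill right now.
# The window bounds are illustrative, not real configuration values.

def in_fill_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls inside the configured fill window.

    Handles windows that wrap past midnight (e.g. 22:00-04:00 local),
    since off-peak hours usually straddle the date boundary.
    """
    if start <= end:
        return start <= now <= end
    return now >= start or now <= end

# Example: an off-peak window from 02:00 to 06:00 local time
print(in_fill_window(time(3, 30), time(2, 0), time(6, 0)))   # True
print(in_fill_window(time(12, 0), time(2, 0), time(6, 0)))   # False
```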
To understand how our fill patterns work, it helps to understand how we architect OCAs into clusters, whether they are in an internet exchange point (IX) or embedded into an ISP’s network.
OCAs are grouped into manifest clusters, each of which distributes one or more copies of the catalog, with content placement driven by title popularity. Each manifest cluster gets configured with an appropriate content region (the group of countries that are expected to stream content from the cluster), a particular popularity feed (which in simplified terms is an ordered list of titles, based on previous data about their popularity), and how many copies of the content it should hold. We compute independent popularity rankings by country, region, or other selection criteria. For those who are interested, we plan to go into more detail about popularity and content storage efficiency in future posts.
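The per-cluster configuration described above could be pictured as something like the following. The field names and values here are purely illustrative, not the real schema:

```python
# Illustrative sketch of a manifest cluster's configuration as described
# in the text. All names and values are hypothetical.
manifest_cluster = {
    "name": "ix-ams-001",
    "content_region": ["NL", "BE", "LU"],   # countries expected to stream from it
    "popularity_feed": "nl-region-daily",   # ordered list of titles by popularity
    "catalog_copies": 2,                    # how many copies of the content to hold
}

print(manifest_cluster["content_region"])   # ['NL', 'BE', 'LU']
```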
We then group our OCAs one step further into fill clusters. A fill cluster is a group of manifest clusters that have a shared content region and popularity feed. Each fill cluster is configured by the Open Connect Operations team with fill escalation policies (described below) and number of fill masters.
The following diagram shows an example of two manifest clusters that are part of the same fill cluster:
Fill Source Manifests
OCAs do not store any information about other OCAs in the network, title popularity, etc. All of this information is aggregated and stored in the AWS control plane. OCAs communicate at regular intervals with the control plane services, requesting (among other things) a manifest file that contains the list of titles they should be storing and serving to members. If there is a delta between the list of titles in the manifest and what they are currently storing, each OCA will send a request, during its configured fill window, that includes a list of the new or updated titles that it needs. The response from the control plane in AWS is a ranked list of potential download locations, aka fill sources, for each title. The determination of the list takes into consideration several high-level factors:
- Title (content) availability - Does the fill source have the requested title stored?
- Fill health - Can the fill source take on additional fill traffic?
- A calculated route cost - Described in the next section.
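The manifest-delta step can be sketched in a few lines: the OCA compares the control plane's manifest against its local store and requests only what is missing or stale. This is a simplified illustration, with hypothetical names, of the comparison described above:

```python
# Minimal sketch of the manifest-delta computation: the OCA requests only
# titles that are missing locally or whose stored version is out of date.
# Data shapes and names are assumptions for illustration.

def titles_to_fetch(manifest: dict, stored: dict) -> list:
    """Return title IDs that are absent locally or whose version is stale.

    Both `manifest` and `stored` map title_id -> version number.
    """
    return sorted(
        title for title, version in manifest.items()
        if stored.get(title) != version
    )

manifest = {"title-a": 3, "title-b": 1, "title-c": 2}
stored   = {"title-a": 3, "title-b": 1}   # title-c not yet stored
print(titles_to_fetch(manifest, stored))  # ['title-c']
```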
Calculating the Least Expensive Fill Source
It would be inefficient, in terms of both time and cost, to distribute a title directly from S3 to all of our OCAs, so we use a tiered approach. The goal is to ensure that the title is passed from one part of our network to another using the most efficient route possible.
To calculate the least expensive fill source, we take into account network state and some configuration parameters for each OCA that are set by the Open Connect Operations team. For example:
- BGP path attributes and physical location (latitude / longitude)
- Fill master (number per fill cluster)
- Fill escalation policies
A fill escalation policy defines:
- How many hops away an OCA can go to download content, and how long it should wait before doing so
- Whether the OCA can go to the entire Open Connect network (beyond the hops defined above), and how long it should wait before doing so
- Whether the OCA can go to S3, and how long it should wait before doing so
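A fill escalation policy, as outlined in the three bullets above, might look something like this in code. The field names and the specific hop counts and wait times are assumptions chosen only to show the shape of the policy, not real operational values:

```python
from dataclasses import dataclass

# Hedged sketch of a fill escalation policy; all fields and timings
# are illustrative assumptions, not Netflix's actual configuration.

@dataclass
class FillEscalationPolicy:
    max_hops: int              # how many hops away an OCA may reach for content
    hops_wait_s: int           # how long to wait before reaching that far
    allow_network_wide: bool   # may fall back to the entire Open Connect network
    network_wait_s: int        # how long to wait before doing so
    allow_s3: bool             # may fall back to a direct S3 download
    s3_wait_s: int             # how long to wait before doing so

# Masters typically reach farther with less delay than non-masters,
# so they can grab content early and share it locally.
master_policy     = FillEscalationPolicy(3, 60, True, 300, True, 600)
non_master_policy = FillEscalationPolicy(1, 600, True, 1800, False, 0)
```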
The control plane elects the specified number of OCAs as masters for a given title asset. The fill escalation policies that are applied to masters typically allow them to reach farther with less delay in order to grab that content and then share it locally with non-masters.
Given all of the input to our route calculations, rank order for fill sources works generally like this:
- Peer fill: Available OCAs within the same manifest cluster or the same subnet
- Tier fill: Available OCAs outside the manifest cluster configuration
- Cache fill: Direct download from S3
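The rank ordering above can be sketched as a simple sort: peer fill beats tier fill, which beats a direct S3 cache fill, with route cost breaking ties within a tier. The cost values and host names below are hypothetical; the real route cost calculation uses BGP path attributes and physical location, as noted earlier:

```python
# Simplified sketch of fill source ranking: peer < tier < s3, then by
# route cost within a tier. Names and costs are illustrative only.

TIER_RANK = {"peer": 0, "tier": 1, "s3": 2}

def rank_fill_sources(sources):
    """Order candidate fill sources from least to most expensive.

    Each source is a dict like {"host": ..., "tier": ..., "route_cost": ...}.
    """
    return sorted(sources, key=lambda s: (TIER_RANK[s["tier"]], s["route_cost"]))

candidates = [
    {"host": "s3-origin",    "tier": "s3",   "route_cost": 100},
    {"host": "oca-remote-1", "tier": "tier", "route_cost": 10},
    {"host": "oca-local-2",  "tier": "peer", "route_cost": 2},
]
print([s["host"] for s in rank_fill_sources(candidates)])
# ['oca-local-2', 'oca-remote-1', 's3-origin']
```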
In a typical scenario, a group of OCAs in a fill cluster request fill sources for a new title when their fill window starts. Assuming this title only exists in S3 at this point, one of the OCAs in the fill cluster that is elected as a fill master starts downloading the title directly from S3. The other OCAs are not given a fill source at this point, because we want to be as efficient as possible by always preferring to fill from nearby OCAs.
After the fill master OCA has completed its S3 download, it reports back to the control plane that it now has the title stored. The next time the other OCAs communicate with the control plane to request a fill source for this title, they are given the option to fill from the fill master.
When the OCAs in the second tier complete their downloads, they report back their status, other OCAs can then fill from them, and so on. This process continues during the fill window. If there are titles being stored on an OCA that are no longer needed, they are put into a delete manifest and then deleted after a period of time that ensures we don’t interrupt any live sessions.
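The deletion step can be pictured as a grace-period check: a title dropped from the manifest is only purged once enough time has passed that no live session could still be streaming it. The grace period and names below are illustrative assumptions:

```python
# Sketch of delete-manifest handling: titles removed from the manifest
# are marked and only purged after a grace period, so active streaming
# sessions are never interrupted. All values are hypothetical.

GRACE_PERIOD_S = 24 * 3600  # assumed grace period for illustration

def purgeable(delete_manifest: dict, now: float) -> list:
    """Return titles whose grace period has fully elapsed.

    `delete_manifest` maps title_id -> timestamp (seconds) when the
    title was marked for deletion.
    """
    return sorted(
        title for title, marked_at in delete_manifest.items()
        if now - marked_at >= GRACE_PERIOD_S
    )

marked = {"old-title": 0.0, "fresh-title": 90_000.0}
print(purgeable(marked, now=100_000.0))   # ['old-title']
```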
As the sun moves west and more members begin streaming, the fill window in this time zone ends, and the fill pattern continues as the fill window moves across other time zones - until enough of the OCAs in our global network that need to be able to serve this new title have it stored.
When there are a sufficient number of clusters with enough copies of the title to serve it appropriately, the title can be considered to be live from a serving perspective. This liveness indicator, in conjunction with contractual metadata about when a new title should be released, is used by the Netflix application - so the next time you hit “Play”, you have access to the latest and greatest Netflix content.
We are always making improvements to our fill process. The Open Connect Operations team uses internal tooling to constantly monitor our fill traffic, and alerts are set and monitored for OCAs that do not contain a threshold percentage of the catalog that they are supposed to be serving to members. When this happens, we correct the problem before the next fill cycle. We can also perform out-of-cycle “fast track” fills for new titles or other fixes that need to be deployed quickly - essentially following these same fill patterns while reducing propagation and processing times.
Now that Netflix operates in 190 countries and we have thousands of appliances embedded within many ISP networks around the world, we are even more obsessed with making sure that our OCAs get the latest content as quickly as possible while continuing to minimize bandwidth cost to our ISP partners.
For more information about Open Connect, take a look at the website.
If these kinds of large-scale network and operations challenges are up your alley, check out our latest job openings!

By Michael Costello and Ellen Livengood