Thursday, March 30, 2017

The Netflix HERMES Test: Quality Subtitling at Scale

Since Netflix launched globally, the scale of our localization efforts has increased dramatically.  It’s hard to believe that just 5 years ago, we only supported English, Spanish and Portuguese.  Now we’ve surpassed 20 languages - including languages like Korean, Chinese, Arabic and Polish - and that number continues to grow.  Our desire to delight members in “their” language, while staying true to creative intent and being mindful of cultural nuances, is important to ensuring quality.  It’s also fueling a need to rapidly add great talent who can help provide top-notch translations for our global members across all of these languages.

The need for localization quality at an increasing scale inspired us to build and launch HERMES, the first online subtitling and translation test and indexing system by a major content creator.  Before now, there was no standard test for media translation professionals, even though their work touches millions of people’s lives on a daily basis.  There is no common registration through a professional organization which captures the total number of professional media translators worldwide, no license numbers, accreditations, or databases for qualified professionals.  For instance, the number of working, professional Dutch subtitlers is estimated to be about 100 - 150 individuals worldwide.  We know this through market research Netflix conducted during our launch in the Netherlands several years ago, but this is a very anecdotal “guesstimate” and the actual number remains unknown to the industry.  


In the absence of a common registration scheme and standardized test, how do you find the best resources to do quality media translation?  Netflix does this by relying on third parties to source and manage localization efforts for our content.  But even this method often lacks the precision needed to drive constant improvement and innovation in the media translation space.  Each of these vendors recruits, qualifies and measures its subcontractors (translators) differently, so it’s nearly impossible for Netflix to maintain a standard across all of them and ensure consistent quality at the reliability and scale we need to support our continued international growth.  We can measure a company’s success through metrics like rejection rates, on-time rates, etc., but we can’t measure the individual.  This is like trying to win the World Cup in soccer while only being able to look at your team’s win/loss record: not knowing how many errors your players are making, blindly creating lineups without scoring averages, and having no idea how big your roster is for the next game.  It’s difficult and frustrating to try to “win” in this environment, yet this is largely how Netflix has had to operate in the localization space for the last few years, while still trying to drive improvement and quality.


HERMES is emblematic of Hollywood meeting Silicon Valley at Netflix.  It was developed internally by the Content Localization and Media Engineering teams, in collaboration with renowned academics in the media translation space, as a five-part test for subtitlers.  The test is designed to be highly scalable and consists of thousands of randomized combinations of questions, so that no two tests should be the same.  The rounds consist of multiple choice questions given at a specifically timed pace, designed to test the candidate’s ability to:

  • Understand English
  • Translate idiomatic phrases into their target language
  • Identify both linguistic and technical errors
  • Subtitle proficiently

Idioms are expressions that are often specific to a certain language (“you’re on a roll”, “he bought the farm”) and can be a tough challenge to translate into other languages.  There are approximately 4,000 idioms in the English language, and being able to translate them in a culturally accurate way is critical to preserving the creative intent for a piece of content.  Here’s an example from the HERMES test for translating English idioms into Norwegian:

[Screenshot: a HERMES test question translating an English idiom into Norwegian]

Upon completion, Netflix will have a good idea of a candidate’s skill level and can use this information to match projects with high-quality language resources.  The real long-term value of the HERMES platform is in the issuance of HERMES numbers (H-Numbers).  This unique identifier is issued to each applicant upon sign-up for the test and will stick with them for the remainder of their career supplying translation services to Netflix.  By looking at the quantity of H-Numbers in a given language, Netflix can start to more precisely estimate the size of the potential resource pool for a given language and better project the time needed to localize libraries.  Starting this summer, all subtitles delivered to Netflix will be required to have a valid H-Number tied to them.  This will allow Netflix to better correlate the metrics associated with a given translation to the individual who did the work.


Over time, we’ll be able to use these metrics in concert with other innovations to “recommend” the best subtitler for specific work based on their past performance to Netflix.  Much like we recommend titles to our members, we aim to match our subtitlers in a similar way.  Perhaps they consider themselves a horror aficionado, but they excel at subtitling romantic comedies - theoretically, we can make this match so they’re able to do their best quality work.  


Since we unveiled our new HERMES tool two weeks ago, thousands of candidates around the world have already completed the test, covering all represented languages.  This is incredible to us because of the impact it will ultimately have on our members as we focus on continually improving the quality of the subtitles on the service.  We’re quickly approaching an inflection point where English won’t be the primary viewing experience on Netflix, and HERMES allows us to better vet the individuals doing this very important work so members can enjoy their favorite TV shows and movies in their language.

If you're a professional subtitler interested in taking the test, you can take it here.

Update: The volume of applicants has exceeded even our most optimistic expectations, so we have scaled up support resources. If you're encountering any issues taking the test, please escalate all questions and feedback to and someone will be in touch soon.

By Chris Fetner and Denny Sheehan

Tuesday, March 21, 2017

Update on HTML5 Video for Netflix

About four years ago, we shared our plans for playing premium video in HTML5, replacing Silverlight and eliminating the extra step of installing and updating browser plug-ins.  

Since then, we have launched HTML5 video on Chrome OS, Chrome, Internet Explorer, Safari, Opera, Firefox, and Edge on all supported operating systems.  And though we do not officially support Linux, Chrome playback has worked on that platform since late 2014.  Starting today, users of Firefox can also enjoy Netflix on Linux.  This marks a huge milestone for us and our partners, including Google, Microsoft, Apple, and Mozilla that helped make it possible.

But this is just the beginning.  We launched 4K Ultra HD on Microsoft Edge in December of 2016, and look forward to high-resolution video being available on more platforms soon.  We are also looking ahead to HDR video.  Netflix-supported TVs with Chromecast built-in—which use a version of our web player—already support Dolby Vision and HDR10.  And we are working with our partners to provide similar support on other platforms over time.

Netflix’s adoption of HTML5 has resulted in us contributing to a number of related industry standards, including:
  • MPEG-DASH, which describes our streaming file formats, including fragmented MP4 and common encryption.  
  • WebCrypto, which protects user data from inspection or tampering and allows us to provide our subscription video service on the web.  
  • Media Source Extensions (MSE), which enable our web application to dynamically manage the playback session in response to ever-changing network conditions.
  • Encrypted Media Extensions (EME), which enable playback of protected content, with hardware acceleration on capable platforms.

We intend to remain active participants in these and other standards over time.  This includes areas that are just beginning to formulate, like the handling of HDR images and graphics in CSS being discussed in the Color on the Web community group.

Our excitement about HTML5 video has remained strong over the past four years.  Plugin-free playback that works seamlessly on all major platforms helps us deliver compelling experiences no matter how you choose to watch.  This is apparent when you venture through Stranger Things in hardware accelerated HD on Safari, or become transfixed by The Crown in Ultra HD on Edge.  And eventually, you will be able to delight in the darkest details of Marvel’s Daredevil in stunning High Dynamic Range.

Monday, March 13, 2017

Netflix Security Monkey on Google Cloud Platform

Today we are happy to announce that Netflix Security Monkey has BETA support for tracking Google Cloud Platform (GCP) services. Initially we are providing support for the following GCP services:

  • Firewall Rules
  • Networking
  • Google Cloud Storage Buckets (GCS)
  • Service Accounts (IAM)

This work was performed by a few incredible Googlers whose mission is to take open source projects and add support for Google’s cloud offerings. Thank you for the commits!

GCP support is available in the develop branch and will be included in release 0.9.0. This work helps to fulfill Security Monkey’s mission as the single place to go to monitor your entire deployment.

To get started with Security Monkey on GCP, check out the documentation.

See Rae Wang, Product Manager on GCP, highlight Security Monkey in her talk, “Gaining full control over your organization's cloud resources (presented at Google Cloud Next '17)”.

Security Monkey’s History

We released Security Monkey in June 2014 as an open source tool to monitor Amazon Web Services (AWS) changes and alert on potential security problems. In 2014 it was monitoring 11 AWS services and shipped with about two dozen security checks. Now the tool monitors 45 AWS services, 4 GCP services, and ships with about 130 security checks.

Future Plans for Security Monkey

We plan to continue decomposing Security Monkey into smaller, more maintainable, and reusable modules. We also plan to use new event-driven triggers so that Security Monkey will recognize updates more quickly. With Custom Alerters, Security Monkey will transform from a purely monitoring tool into one that allows for active response.

More Modular:
  • We have begun the process of moving the service watchers out of Security Monkey and into CloudAux. CloudAux currently supports the four GCP services and three (of the 45) AWS services.
  • We have plans to move the security checks (auditors) out of Security Monkey and into a separate library.
  • Admins may change polling intervals, enable/disable technologies, and modify issue scores from within the settings panel of the web UI.
Event Driven:
  • On AWS, CloudTrail will trigger CloudWatch Event Rules, which will then trigger Lambda functions. We have a working prototype of this flow.
  • On GCP, Stackdriver Logging and Audit Logs will trigger Cloud Functions.
  • As a note, CloudSploit has a product in beta that implements this event-driven approach.
Custom Alerters:
  • These can be used to provide new notification methods or correct problems.
  • The documentation describes a custom alerter that sends events to Splunk.
We’ll be following up with a future blog post to discuss these changes in more detail. In the meantime, check out Security Monkey on GitHub, join the community of users, and jump into conversation in our Gitter room if you have questions or comments.

Special Thanks

We appreciate the great community support and contributions for Security Monkey and want to specially thank:

  • Google: GCP Support in CloudAux/Security Monkey
  • Bridgewater Associates: Modularization of Watchers, Auditors, Alerters. Dozens of new watchers. Modifying the architecture to abstract the environment being monitored.

Wednesday, March 8, 2017

Netflix Downloads on Android

By Greg Benson, Francois Goldfain, and Ashish Gupta

Netflix is now a global company, so we wanted to provide a viewing experience that is truly available everywhere, even when the Internet is not working well. This led to three prioritized download use cases:
  1. Better, uninterrupted videos on unreliable Internet
  2. Reducing mobile data usage
  3. Watching Netflix without an Internet connection (e.g. on a train or plane)

... So, What Do We Build?

From a product perspective, we had many initial questions about how the feature should behave: What bitrate & resolution should we download content at? How much configuration should we offer to users? How will video bookmarks work when offline? How do we handle profiles?

We adopted some guiding principles based on general Netflix philosophies about what kind of products we want to create: the Downloads interface should not be so prominent that it's distracting, and the UX should be as simple as possible.

We chose an aggressive timeline for the feature since we wanted to deliver the experience to our members as soon as possible. We aimed to create a great experience with just the right amount of scope, and we could iterate and run A/B tests to improve the feature later on. Fortunately, our Consumer Insights team also had enough time to qualify our initial user-experience ideas with members and non-members before they were built.

How Do Downloads Work?

From an organizational perspective, the downloads feature was a test of coordination between a wide variety of teams. A technical spec was created that represented a balancing act of meeting license requirements, member desires, and security requirements (protecting from fraud). For Android, we used the technical spec to define which pieces of data we'd need to transfer to the client in order to provide a single 'downloaded video':
  • Content manifest (URLs for audio and video files)
  • Media files:
    • Primary video track
    • 2 audio tracks (one primary language plus an alternate based on user language preferences)
    • 2 subtitle tracks (based on user language preferences)
    • Trick play data (images while scrubbing)
  • DRM licenses
  • Title-level metadata and artwork (cached to disk)
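
The pieces listed above can be sketched as a simple container class. This is an illustrative sketch only; the class and field names are hypothetical and do not reflect Netflix's actual data model:

```java
import java.util.List;

// Hypothetical container for everything needed to play one downloaded title.
public class OfflineTitle {
    final List<String> manifestUrls;   // content manifest: URLs for audio/video files
    final String videoTrack;           // primary video track
    final List<String> audioTracks;    // 2 tracks: primary language plus an alternate
    final List<String> subtitleTracks; // 2 tracks, based on user language preferences
    final String trickPlayData;        // images shown while scrubbing
    final byte[] drmLicense;           // offline DRM license
    final String metadataJson;         // title-level metadata and artwork references

    public OfflineTitle(List<String> manifestUrls, String videoTrack,
                        List<String> audioTracks, List<String> subtitleTracks,
                        String trickPlayData, byte[] drmLicense, String metadataJson) {
        this.manifestUrls = manifestUrls;
        this.videoTrack = videoTrack;
        this.audioTracks = audioTracks;
        this.subtitleTracks = subtitleTracks;
        this.trickPlayData = trickPlayData;
        this.drmLicense = drmLicense;
        this.metadataJson = metadataJson;
    }

    // A title is playable offline only once every piece has been transferred.
    public boolean isComplete() {
        return !manifestUrls.isEmpty() && videoTrack != null
                && audioTracks.size() == 2 && subtitleTracks.size() == 2
                && trickPlayData != null && drmLicense != null && metadataJson != null;
    }
}
```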

Download Mechanics

We initially looked at Android's DownloadManager as the mechanism to actually transfer files and data to the client. This component was easy to use and handled some of the functionality we wanted, but it ultimately didn’t allow us to create the UX we needed.

We created the Netflix DownloadManager for the following reasons:
  • Download Notifications: display download progress in a notification as an aggregate of all the files related to one 'downloadable video'.
  • Pause/Resume Downloads: provide a way for users to temporarily halt downloading.
  • Network Handling: dynamic network selection criteria in case the user changes this preference during a download (WiFi-only vs. any connection).
  • Analytics: understanding the details of all user behavior and the reasons why a download was halted.
  • Change of URL (CDN switching): Our download manifest provides multiple CDNs for the same media content. In case of failures to one CDN we wanted the ability to failover to alternate sources.
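
The CDN-switching behavior can be illustrated with a small failover loop. This is a hypothetical sketch of the idea, not Netflix's implementation; the fetch attempt is injected so the policy is easy to test:

```java
import java.util.List;
import java.util.function.Predicate;

// Try each CDN URL from the manifest in order until one download attempt succeeds.
public class CdnFailover {
    // 'attemptDownload' returns true on success; injected for testability.
    public static String downloadFrom(List<String> cdnUrls, Predicate<String> attemptDownload) {
        for (String url : cdnUrls) {
            if (attemptDownload.test(url)) {
                return url; // success: this CDN served the content
            }
            // failure: fall through and try the next CDN in the manifest
        }
        throw new IllegalStateException("All CDNs failed for this download");
    }
}
```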

Storing Metadata

To store metadata for downloaded titles, our first implementation was a simple solution of serializing and deserializing JSON blobs to files on disk. We knew there would be problems with this approach (many objects created, GC churn, not developer-friendly), so while it wasn't our desired long-term solution, it met our needs to get a prototype off the ground.

For our second iteration of managing stored data, we looked at a few possible solutions, including built-in SQLite support. We’d also heard a lot about Realm lately, including from a few companies that had success using it as a fast and simple data-storage solution. Because we had limited experience with Realm and the downloads metadata case was relatively small and straightforward, we thought it would be a great opportunity to try Realm out.

Realm turned out to be easy to use and has a few benefits we like:
  • Zero-copy IO (by using memory mapping)
  • Strong performance profile
  • It's transactional and has crash safety (via MVCC)
  • Objects are easy to implement
  • Easy to query, no SQL statements
Realm also provides straightforward support for versioning of data, which allows data to be migrated from schema to schema if changed as part of an application update. In this case, a RealmMigration can be created which allows for mapping of data.

The challenges we had that most impacted our implementation included single thread access for objects and a lack of support for vectors such as List<>.

Now that the stability of Realm has been demonstrated in the field with downloads metadata, we are moving forward with adopting it more broadly in the app for generalized video metadata storage.

Updating Metadata

JobScheduler was introduced in Lollipop and allows us to be more resource-efficient in our background processing and network requests; the OS can batch jobs together for an overall efficiency gain. Longer term, we wanted to build up our experience with this system component, since Google will encourage developers ever more strongly to use it in future releases (e.g. Android ‘O’).

For our download use cases, it provided a great opportunity to get low-cost (or effectively free) network usage by creating jobs that would only activate when the user was on an unmetered network. What can our app do in the background?

1. Maintenance jobs:
  • Content license renewals
  • Metadata updates
  • Sync playback metrics: operations, state data, and usage
2. Resume downloads when connectivity restored

There were two major issues we found with JobScheduler. The first was how to provide the functionality we needed on pre-Lollipop devices, where JobScheduler does not exist. For those devices, we wrote an abstraction layer on top of the job-scheduling component; below Lollipop, we use the system's network-connectivity receiver and the AlarmManager service to schedule background tasks manually at set times.
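
That abstraction layer can be sketched as a small factory that picks a scheduling strategy by API level. The names here are hypothetical; on a real device the two branches would wrap the system JobScheduler and AlarmManager respectively:

```java
// Minimal sketch of choosing a background-scheduling strategy by API level.
public class SchedulerFactory {
    interface BackgroundScheduler {
        String name();
        void schedule(Runnable task); // run task under this strategy's constraints
    }

    static final int LOLLIPOP = 21; // android.os.Build.VERSION_CODES.LOLLIPOP

    public static BackgroundScheduler forApiLevel(int sdkInt) {
        if (sdkInt >= LOLLIPOP) {
            // Real impl: delegate to the system JobScheduler with network constraints.
            return new BackgroundScheduler() {
                public String name() { return "JobScheduler"; }
                public void schedule(Runnable task) { task.run(); }
            };
        }
        // Real impl: register a connectivity receiver and set AlarmManager alarms.
        return new BackgroundScheduler() {
            public String name() { return "AlarmManager"; }
            public void schedule(Runnable task) { task.run(); }
        };
    }
}
```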

The second major problem we encountered with JobScheduler was its issue of crashing in certain circumstances (public bug report filed here). While we weren't able to put in a direct fix for this crash, we were able to determine a workaround whereby we avoided calling JobService.onJobFinished() altogether in certain cases. The job ultimately times out on its own so the cost of operating like this seemed better than permitting the app to crash.

Playback of Content

There are a number of methods of playing video on Android, varying in their complexity and level of control:

  • MediaPlayer: High-level API with hardware decoding, but limited format support (e.g. no DASH) and no easy way to apply DRM. Netflix usage: never used.
  • In-app solution: Bundle everything in the app, including media playback and DRM, with no use of Android system components. Everything runs on the CPU, meaning more battery drain and potentially lower-quality playback. Netflix usage: early versions of SD playback.
  • OpenMAX AL: Introduced in ICS; low-level platform APIs that opened many doors, but native-C interfaces that Google quickly deprecated. SD only, since all DRM was done in-app. Netflix usage: ICS/JB.
  • MediaCodec: Low-level but in Java, so playback can be built on top of system components. The first version only supported in-app DRM, which was complex and SD-only. In Android 4.3, Google introduced a modular framework that let us use built-in/platform Widevine DRM support, providing Widevine L1, which allows HD playback with a hardware dependency. Netflix usage: 4.2 and above, with HD on some 4.3+ devices.

Further, playback of offline (non-streaming) content is not supported by the Android system DASH player. ExoPlayer wasn't the only option, but we felt that downloads were a good opportunity to try Google’s new Android ExoPlayer. The features we liked were:
  • Support for DASH, HLS, Smooth Streaming, and local sources
  • Extremely modular design, extensible and customizable
  • Used by Google/OEMs/SoC vendors as part of Android certification (goes to device support and fragmentation)
  • Great documentation and tutorials
The modularity of ExoPlayer was attractive for us since it allowed us to plug in a variety of DRM solutions. Our previous in-app DRM solution did not support offline licenses so we also needed to provide support for an alternate DRM mechanism.
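
That pluggability can be sketched as a player that accepts any DRM module behind a common interface. The interface names here are illustrative only; ExoPlayer's real extension points differ in detail:

```java
// Sketch of a player wired to a pluggable DRM module, in the spirit of
// ExoPlayer's modular design. All names here are hypothetical.
public class ModularPlayer {
    public interface DrmModule {
        byte[] decrypt(byte[] encryptedSample);
    }

    private final DrmModule drm;

    public ModularPlayer(DrmModule drm) {
        this.drm = drm;
    }

    // Each encrypted sample passes through whichever DRM module was plugged in,
    // so swapping DRM solutions never touches the playback pipeline itself.
    public byte[] renderSample(byte[] encryptedSample) {
        return drm.decrypt(encryptedSample);
    }
}
```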

Supporting Widevine

Widevine was selected due to its broad Android support, ability to work with offline licenses, a hardware-based decryption module with a software fallback (suitable for nearly any mobile device), and validation required by Android's Compatibility Test Suite (CTS).

However, this was a difficult migration due to Android fragmentation. Some devices that should have had L3 didn’t, some devices had insecure implementations, and other devices had Widevine APIs that failed whenever we called them. Support was therefore inconsistent, so we had to have reporting in place to monitor these failure rates.

If we detect this kind of failure during app init then we have little choice but to disable the Downloads feature on that device since playback would not be possible. This is unfortunate for users but will hopefully improve over time as the operating system is updated on devices.
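
The init-time check can be sketched as a probe that disables the feature when the DRM stack misbehaves. This is an illustrative sketch with a hypothetical interface; the real check would exercise Android's MediaDrm APIs:

```java
// Probe the device's DRM stack at app init; if the probe fails, disable downloads
// up front rather than letting playback fail later. Hypothetical sketch only.
public class DrmCapabilityCheck {
    public interface DrmProbe {
        void openSession() throws Exception; // e.g. would wrap MediaDrm session creation
    }

    public static boolean downloadsEnabled(DrmProbe probe) {
        try {
            probe.openSession();
            return true;  // Widevine appears to work on this device
        } catch (Exception e) {
            // Real impl: also report the failure so rates can be monitored.
            return false; // disable the Downloads feature on this device
        }
    }
}
```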

[Diagram: complete block diagram of playback components used for downloads]

Improving Video Quality

Our encoding team has written previously about the specific work they did to enable high-quality, low-bandwidth mobile encodes using VP9 for Android. However, how did we decide to use VP9 in the first place?

Most mobile video streams for Netflix use H.264/AVC with the Main Profile (AVCMain). Downloads were a good opportunity for us to migrate to a new video codec to reduce downloaded content size and pave the way for improved streaming bitrates in the future. The advantages of VP9 encoding for us included:
  • Encodes produced using libvpx are ~32% more efficient than our x264 encodes.
  • A software decoder has been required since Android KitKat, i.e. 100% coverage for the current Netflix app deployment
  • Fragmented but growing hardware support: 33% of phones and 4% of tablets using Netflix have a chipset that supports VP9 decoding in hardware
Migrating to support a new video encode had some up-front and ongoing costs, not the least of which was an increased burden placed on our content-delivery system, specifically our Open-Connect Appliances (OCAs). Due to the new encoding formats, more versions of the video streams needed to be deployed and cached in our CDN which required more space on the boxes. This cost was worthwhile for us to provide improved efficiency for downloaded content in the near term, and in the long term will also benefit members streaming on mobile as we migrate to VP9 more broadly.
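
As a rough worked example of what that efficiency gain means for members (the episode size below is made up for illustration), an encode that is ~32% more efficient needs only ~68% of the bits for comparable quality:

```java
// Illustrative arithmetic: a VP9 encode ~32% more efficient than an x264 encode
// needs ~68% of the bits for comparable quality. Sizes here are invented.
public class Vp9Savings {
    public static double vp9SizeMb(double avcSizeMb, double efficiencyGain) {
        return avcSizeMb * (1.0 - efficiencyGain);
    }
}
```

So a hypothetical 250 MB AVC download would shrink to roughly 170 MB as VP9, a meaningful saving on metered mobile connections.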


Many teams at Netflix were aligned to work together and release this feature under an ambitious timeline. We were pleased to bring lots of joy to our members around the world and give them the ability to take their favorite shows with them on the go.

The biggest proportion of downloading has been in Asia where we see strong traction in countries like India, Thailand, Singapore, Malaysia, Philippines, and Hong Kong.

The main suggestion we received for Android was around the lack of SD card support, which we quickly addressed in a subsequent release in early 2017. We have now established a baseline experience for downloads, and will be able to A/B test a number of improvements and feature enhancements in the coming months.