Wednesday, November 12, 2014

ZeroToDocker: An easy way to evaluate NetflixOSS through runtime packaging


The NetflixOSS platform and related ecosystem services are extensive.  While we make every attempt to document each project, the sheer breadth makes it hard for most users to evaluate NetflixOSS quickly, and even harder to understand its individual parts.

Another part of the challenge relates to how NetflixOSS was designed for scale. Most services are intended to be set up as multi-node, auto-recoverable clusters.  While this is great once you are ready for production, it is prohibitively complex for new users who just want to try out NetflixOSS at a smaller scale.

A final part of the challenge is that, in order to keep the platform a collection of services and libraries that users can adopt wherever they make sense, the runtime artifacts are distributed in forms meant to be assembled later in different ways.  Many of the Java libraries are in Maven Central, some of the complete services are assembled as distribution zips and wars in our CI environment on CloudBees, and others are published through file distribution services.  None of these distributions gives you a single command line that is guaranteed to work across the many places people might want to run the NetflixOSS technologies.

A simple solution:

Recently it has become popular to quickly demonstrate technology through the use of Docker containers.  If you search GitHub there are hundreds of projects that include Dockerfiles, the image build description files for Docker containers.  By including such Dockerfiles, the developer shows you exactly how to not only build the code on GitHub, but also assemble and run the compiled artifacts as a full system.

ZeroToDocker is a project that solves the above problems.  Specifically, it allows anyone with a Docker host (on a laptop, on a VM in the cloud, etc.) to run a single node of any NetflixOSS technology with a single command.  If you have the network bandwidth to download 500-700MB images, you can now run each part of the NetflixOSS platform with a single command.  For example, here is the command to run a single node of Zookeeper managed through NetflixOSS Exhibitor:
  • docker run -d --name exhibitor netflixoss/exhibitor:1.5.2
This command tells Docker to pull the image “exhibitor” at version “1.5.2” from the official NetflixOSS account and run it as a daemon with a container name of “exhibitor”.  Thanks to Docker’s process model, it starts up just as quickly as the Java processes would on a base OS; it does not start a separate OS instance.  It also “containerizes” the environment, meaning the Exhibitor and Zookeeper processes are isolated from other containers and processes running on the same Docker host.  Finally, if you examine the Dockerfile that builds this image, you will see that it exposes the Zookeeper and Exhibitor ports 2181, 2888, 3888, and 8080, so you can access these ports via standard Docker networking.  In fact, you can load up the following URL:
  • http://EXHIBITORIPADDRESS:8080/exhibitor/v1/ui/index.html
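If you are not sure what address to use for the container, one way to find it is via `docker inspect`.  A minimal sketch, assuming a local Docker host and the `exhibitor` container started with the command above:

```shell
# Look up the container's IP address on the default Docker bridge network.
# "exhibitor" is the container name given in the docker run command above.
EXHIBITOR_IP=$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' exhibitor)

# Confirm the Exhibitor UI is answering on port 8080.
curl -s "http://${EXHIBITOR_IP}:8080/exhibitor/v1/ui/index.html" | head -n 5
```

On setups where Docker runs inside a VM (such as boot2docker on a laptop), you would instead publish the ports with `-p` and hit the VM's address.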
All of this can be done in seconds beyond the initial image download time with very little starting knowledge of NetflixOSS.  We expect this should reduce the learning curve of starting NetflixOSS by at least an order of magnitude.

Images so far:

We decided to focus on the platform foundation of NetflixOSS, but we are already in discussions with our chaos testing and big data teams to create Docker images for other aspects of the NetflixOSS ecosystem.  For now, we have released:

  • Asgard
  • Eureka
  • A Karyon based Hello World Example
  • A Zuul Proxy used to proxy to the Karyon service
  • Exhibitor managed Zookeeper
  • Security Monkey
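Each of these images launches the same way as the Exhibitor example.  As a sketch, here is how a couple of them might be started; the image names follow the `netflixoss/` account convention shown above, but the version tags here are illustrative, so check the netflixoss account on Docker Hub for the current ones:

```shell
# Start a single-node Eureka service registry as a daemon
# (tag is illustrative; see Docker Hub for actual versions).
docker run -d --name eureka netflixoss/eureka:1.1.141

# Start Asgard, again as a single evaluation node.
docker run -d --name asgard netflixoss/asgard:1.5

# List the running containers to confirm both came up.
docker ps
```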

Can you trust these images:

Some of our great OSS community members have already created Docker images for aspects of NetflixOSS.  While we don’t want to take anything away from these efforts, we wanted to take them a step further.  Recently, Docker announced Docker Hub.  You can think of Docker Hub as a ready-to-run image repository, similar to how you think of GitHub for your code, or CloudBees for your deployment artifacts.  Docker Hub creates an open community around images.

Additionally, Docker Hub has the concept of a trusted build.  Anyone can point their Docker Hub account at GitHub and tell Docker Hub to build trusted images on their behalf.  After these builds complete, the images are exposed through the standard cloud registry, from which anyone can pull and run them.  Because the images are built by Docker in a trusted, isolated environment, and because any user can trace an image build back to the exact Dockerfiles and source on GitHub and Maven Central, you can see exactly where all the running code originated and make stronger trust decisions.  With the exception of Oracle Java, Apache Tomcat, and Apache Zookeeper, all of the code in the images originates from trusted NetflixOSS builds.  Even Java (cloned from Feng Honglin’s Java 7 Dockerfile), Tomcat, and Zookeeper are easy to trust, as you can read the Dockerfile to trace exactly where they came from.

Can you learn from these images:

If you go to the Docker Hub image you are now running, you can navigate back to the GitHub project that hosts the Dockerfiles.  Inside the Dockerfile you will find the exact commands required to assemble the running image.  Files with the exact properties needed for a functioning single-node service are also included.
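To give a flavor of what you will find there, here is a minimal Dockerfile sketch of the general pattern; the base image, file names, and paths are illustrative, not the exact contents of the real ZeroToDocker Dockerfiles:

```dockerfile
# Illustrative sketch only -- see the ZeroToDocker GitHub project
# for the actual Dockerfiles.
FROM netflixoss/java:7

# Add the built service artifact and the single-node configuration
# (artifact and property file names here are hypothetical).
ADD exhibitor-1.5.2.jar /exhibitor/
ADD exhibitor.properties /exhibitor/

# Expose the Zookeeper and Exhibitor ports mentioned above.
EXPOSE 2181 2888 3888 8080

# Run the service as the container's main process.
CMD ["java", "-jar", "/exhibitor/exhibitor-1.5.2.jar"]
```

Reading the real files this way shows both how the artifact is assembled and which properties matter for a working single-node setup.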

This means you can get up and running quickly with a simple NetflixOSS technology, learn how the images work, and then progress to your own production deployment using what you learned was under the covers of the running single instance.  While we have tried to document this for each project on the GitHub wiki in the past, it is much easier to teach through a running technology than to document it fully in prose on a wiki.

A final note on accelerated learning vs. production usage:

As noted on the ZeroToDocker Wiki, we are not recommending the use of these Docker images in production.  We designed these images to be as small as possible in scope so you can get the minimum function running as quickly as possible.  That means they do not address production concerns such as multi-host networking, security hardening, operational visibility, storage management, and high availability with automatic recovery.

We also want to make it clear that we do not run these images in production.  We continue to run almost all of our systems on the EC2 virtual-machine-based IaaS, because the EC2 environment along with Netflix additions provides all of the aforementioned production requirements.  We are starting to experiment with virtual machines running Docker hosting multiple containers in EC2, but those experiments are limited to classes of workloads that get unique value out of a container-based deployment model while being managed globally by EC2 IaaS.

Because these images are not production ready, we have decided to keep ZeroToDocker in our Netflix-Skunkworks account on GitHub.  However, we believe helping people get up and running on NetflixOSS is valuable, so we wanted to make the images available.


We started small.  Over time, you can expect more images representing a larger slice of the NetflixOSS platform and ecosystem.  We may also expand the complexity, showing how to set up clusters or more tightly secure the images.  We have built specific versions of each service, but in the future we will need a continuous integration system for building our images.

If you enjoy helping with open source and want to build future technologies like those we’ve just demonstrated, check out some of our open jobs.  We are always looking for excellent engineers to extend the NetflixOSS platform and ecosystem.