Friday, May 3, 2013

Denominating Multi-Region Sites

Multi-Region in Context

Netflix has built a very dynamic cloud-native application, but the network configuration that connects users and their streaming devices to the cloud has been managed as a static configuration. Each time a change was needed to the Domain Name Service (DNS) configuration, we would send a ticket to a Network Administrator asking them to log in to the DNS vendor and make the change. We now have a large and complex DNS configuration that connects customers to tens of distinct service end-points.
During 2013 we have been working on the components needed to automate DNS configuration changes. This will let us send customers to more than one region in the cloud, and switch customer traffic between cloud regions. The Denominator project provides a command line tool and a Java library that manages DNS, and can interface to several different DNS vendors.
One of the options provided by some DNS vendors is called "directional routing". This looks up the location of a customer, and can automatically send customers on the west coast to a cloud region in Oregon, while sending customers on the east coast to a cloud region in Virginia. Each state can be configured separately. This feature is now supported by Denominator 1.1, which was released a few days ago. Denominator can create the configuration and dynamically switch customers from one coast to the other to balance traffic, or work around maintenance downtime and outages.
Each DNS lookup has a time-to-live (TTL) that tells the customer that the information is valid for a specific time period before it has to be re-checked. In normal operation, to minimize DNS lookups, this might be set to a few hours. A few hours in advance of a planned maintenance operation or rebalance, Denominator can be used to reduce the TTL to a few minutes. This speeds up the switchover.
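The lead time matters because a resolver that cached a record just before the TTL was lowered may keep it for the full old TTL. The following is a minimal sketch of that arithmetic, not part of Denominator: the class and method names (`TtlPlanner`, `latestTimeToLowerTtl`, `worstCaseSwitchoverComplete`) and the sample durations are assumptions chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative only: timing arithmetic for lowering a TTL ahead of a planned switch. */
final class TtlPlanner {
    private TtlPlanner() {}

    /**
     * A resolver may have cached the record just before the TTL was lowered,
     * so the old TTL must fully elapse before every cache holds the short one.
     */
    static Instant latestTimeToLowerTtl(Instant plannedSwitch, Duration oldTtl) {
        return plannedSwitch.minus(oldTtl);
    }

    /** After the switch, caches can keep serving the old answer for up to the current TTL. */
    static Instant worstCaseSwitchoverComplete(Instant switchTime, Duration currentTtl) {
        return switchTime.plus(currentTtl);
    }

    public static void main(String[] args) {
        Instant plannedSwitch = Instant.parse("2013-05-03T12:00:00Z");
        Duration normalTtl = Duration.ofHours(3);     // assumed value for "a few hours"
        Duration loweredTtl = Duration.ofMinutes(5);  // assumed value for "a few minutes"

        System.out.println("Lower the TTL no later than: "
                + latestTimeToLowerTtl(plannedSwitch, normalTtl));
        System.out.println("Old answers expire by:       "
                + worstCaseSwitchoverComplete(plannedSwitch, loweredTtl));
    }
}
```

With a three-hour normal TTL, the TTL has to be lowered at least three hours before the switch; once it is down to five minutes, the switch itself takes at most about five minutes to reach caches that honor the TTL.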
We built Denominator to support multiple DNS vendors. The initial vendors used by Netflix for various purposes are AWS Route53, Neustar UltraDNS and DynECT. In Denominator 1.1, code was contributed to the project to add Rackspace DNS support. The architecture is pluggable, so other DNS vendors are encouraged to add interfaces to their own products.
The diagram below shows the kind of end result we are working towards, where users in California could be routed to US-West and users in New York could be routed to US-East. Using Denominator, this can be configured using either Neustar UltraDNS or DynECT. We use AWS Route53 for other purposes, but it doesn't currently support directional routing by state.

Geo in code

Denominator supports geo (directional) record sets. In the normal case there is a single record set per name and type; geo record sets must also be differentiated by group. The group designates the territories that can view the set, such as Japan. In practice, only one geo record set is visible to a given resolver. Which one is chosen depends on the well-known location of the subnet of the ISP in use. As such, you can think of geo record sets as server-side profiles, containing policies used to serve up names conditionally based on client information.
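To make the "server-side profile" idea concrete, here is a small self-contained model. It is a sketch only, and it is neither Denominator's data model nor any vendor's selection algorithm: `GeoRecordSet`, `GeoResolver`, the `example.com` names, and the territory lists are all hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical model: one record set per (name, type, group), visible to a set of territories. */
final class GeoRecordSet {
    final String name;
    final String type;
    final String group;
    final String target;
    final Set<String> territories;

    GeoRecordSet(String name, String type, String group, String target, String... territories) {
        this.name = name;
        this.type = type;
        this.group = group;
        this.target = target;
        this.territories = new LinkedHashSet<>(Arrays.asList(territories));
    }
}

final class GeoResolver {
    private GeoResolver() {}

    /** Returns the single record set visible to a client in the given territory, if any. */
    static Optional<GeoRecordSet> visibleTo(
            List<GeoRecordSet> sets, String name, String type, String territory) {
        return sets.stream()
                .filter(s -> s.name.equals(name) && s.type.equals(type))
                .filter(s -> s.territories.contains(territory))
                .findFirst();
    }

    public static void main(String[] args) {
        List<GeoRecordSet> sets = Arrays.asList(
                new GeoRecordSet("www.example.com.", "CNAME", "US", "us.lb.example.com.",
                        "California", "New York"),
                new GeoRecordSet("www.example.com.", "CNAME", "EU", "eu.lb.example.com.",
                        "France", "Germany"));

        // A client located in France sees only the EU group's record set.
        System.out.println(visibleTo(sets, "www.example.com.", "CNAME", "France")
                .map(s -> s.group + " -> " + s.target)
                .orElse("no geo record set visible"));
    }
}
```

The name and type are the same for both sets; only the group, and the territories attached to it, decide which answer a client receives.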
For the examples below, consider the following tuples:
[ CNAME US], [ CNAME EU], [ CNAME Others]
Before changing region mappings, we lower the TTL of the affected geo groups. This tells clients to check for changes more often, at the expense of higher load on the DNS servers. Here's an example CLI command to reduce the TTL for the US group to 5 minutes.
$ denominator -p mock geo -z applyttl -n -t A -g US 300
;; in zone applying ttl 300 to rrset A US
;; ok
If you were writing Java to do the same thing, it would look like this:
geoApi.applyTTLToNameTypeAndGroup(300, "", "CNAME", "US");
Here's an example that splits California off from the load balancer serving "US". This type of operation could be used as part of a transition such as product testing.
// select the existing territories in US
existing = geoApi.getByNameTypeAndGroup("", "CNAME", "US");
// refine to exclude California (filterValues, not and equalTo are static imports from Guava)
Geo existingGeo = toProfile(Geo.class).apply(existing);
Multimap<String, String> update = filterValues(existingGeo.getRegions(), not(equalTo("California")));
// apply the update
geoApi.applyRegionsToNameTypeAndGroup(update, "", "CNAME", "US");
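For readers without Guava on the classpath, the filtering step alone can be sketched with plain `java.util` collections. This stands in for the `Multimap` filtering above and does not call Denominator at all; `RegionFilter`, `withoutTerritory`, and the sample region data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: remove one territory from a region -> territories mapping. */
final class RegionFilter {
    private RegionFilter() {}

    /**
     * Returns a copy of the mapping with the given territory removed.
     * Regions left with no territories are dropped, mirroring how a
     * multimap has no key without at least one value.
     */
    static Map<String, List<String>> withoutTerritory(
            Map<String, List<String>> regions, String territory) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : regions.entrySet()) {
            List<String> kept = new ArrayList<>(entry.getValue());
            kept.removeIf(territory::equals);
            if (!kept.isEmpty()) {
                result.put(entry.getKey(), kept);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> existing = new LinkedHashMap<>();
        existing.put("United States", Arrays.asList("California", "Oregon", "Virginia"));

        // The input map is left untouched; the update is a filtered copy.
        Map<String, List<String>> update = withoutTerritory(existing, "California");
        System.out.println(update);
    }
}
```

Returning a filtered copy rather than mutating the input keeps the existing mapping available, which is convenient if the change needs to be rolled back.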


Getting the design right portably required insight into how these systems operate in real life. Many thanks to Colm from Route53 and Jeff from UltraDNS for sharing their years of experience as DNS engineers. Our geo design evolved through discussions on IRC, mailing lists, and even Twitter! There are still many features to add, models to define and vendors to support. Please join us to make the world of DNS a better-managed place.