Friday, December 13, 2013

STAASH - STorage As A Service over Http - A multi-storage abstraction layer

by Christos Kalantzis and Shyam Singh


Netflix’s Astyanax project, and the recipes contained therein, have been a tremendous tool in helping Java developers adopt Apache Cassandra (C*), both within Netflix and outside it. A common request from developers who do not work in Java or on the JVM (e.g. Python, Ruby, Bash, JavaScript) is to take advantage of the recipes provided in Astyanax, such as All-Rows-Query or Chunked-Object-Store.

STAASH’s short-term goal is to recreate the most popular Astyanax recipes as a service, providing a REST-based API to Cassandra. This would allow any language to consume those recipes.

STAASH’s long-term ambitions are much greater. We would like STAASH to provide an abstraction layer over multiple storage engines, allowing each engine to be accessed through the same API. If you ever want to migrate an application from one database technology to another, STAASH can help with that migration. We also foresee the ability to join data across storage engine types (think of joining a MySQL table with a C* column family). We currently have a proof of concept (POC) of this feature.

STAASH is being open sourced at a very early stage. We have tested it, and it is already being adopted by some teams within Netflix. We want to evolve this project in the open, building a community around it and taking feedback and contributions early in the project’s life cycle.

Below is a graphical representation of STAASH:

What is available today?

Although STAASH is being released early, there is plenty of functionality in it already. Currently you can perform the following actions:
  • Create or register Storages, Databases and Tables
  • List available Storages, Databases and Tables
  • Create, read and update data in the Tables
  • Use Apache Cassandra and MySQL as storage engines
  • Join Tables, even across different Storages

Below is a sample of how to use STAASH to create a Keyspace and Column Family in C* that holds user information for a social network. For each user we will store a username, last name, first name, email and whether the user has paid.

Storage API

Create storage using a curl request
A storage is defined by its physical hardware information and its persistence technology, given by the “cluster” and “type” attributes respectively. (For more details about the API, please visit the STAASH wiki.)

PUT http://host:port/staash/v1/admin/storage

Create a database called socialnetwork

PUT http://host:port/staash/v1/admin

Create a table called users

PUT http://host:port/staash/v1/admin/socialnetwork
{"name":"users","columns":"username, lastname, firstname, email, paid","primarykey":"username",


Add a user

PUT http://host:port/staash/v1/data/socialnetwork/users
{"columns":"username, lastname, firstname, email, paid", "values":"rfederer,federer,roger,,N"}

Read a user’s information

GET http://host:port/staash/v1/data/socialnetwork/users/rfederer
{"columns":"username, lastname, firstname, email, paid", "values":"rfederer,federer,roger,,N"}
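
The add and read calls above can be wrapped in a small client. The following Python sketch only builds the requests and decodes a response body; it does not contact a server. The host and port, the Content-Type header and the helper names (build_row_payload, build_insert_request, build_read_request, parse_row) are assumptions made for illustration and are not part of STAASH.

```python
import json
import urllib.request

# Assumption: placeholder host and port; replace with your STAASH endpoint.
BASE_URL = "http://localhost:8080/staash/v1"


def build_row_payload(row):
    """Encode a dict as the comma-separated columns/values body shown above."""
    columns = ",".join(row.keys())
    values = ",".join(str(value) for value in row.values())
    return {"columns": columns, "values": values}


def build_insert_request(database, table, row, base_url=BASE_URL):
    """Build (but do not send) the PUT request that adds one row."""
    body = json.dumps(build_row_payload(row)).encode("utf-8")
    url = f"{base_url}/data/{database}/{table}"
    # Assumption: the service accepts a JSON content type.
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


def build_read_request(database, table, key, base_url=BASE_URL):
    """Build (but do not send) the GET request that reads one row by key."""
    url = f"{base_url}/data/{database}/{table}/{key}"
    return urllib.request.Request(url, method="GET")


def parse_row(body):
    """Decode a columns/values response body into a dict."""
    document = json.loads(body)
    columns = [name.strip() for name in document["columns"].split(",")]
    values = [value.strip() for value in document["values"].split(",")]
    return dict(zip(columns, values))


if __name__ == "__main__":
    user = {"username": "rfederer", "lastname": "federer",
            "firstname": "roger", "email": "", "paid": "N"}
    # "users" is the table created in the earlier step.
    request = build_insert_request("socialnetwork", "users", user)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

To send a request against a running instance, pass it to urllib.request.urlopen. Note that the comma-separated encoding means this sketch cannot represent values that themselves contain commas.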

For more information on STAASH APIs please refer to the STAASH wiki.

Planned Road-map

Here are the features we plan on adding in the very short term.

Event Series API

At Netflix, we have many use cases where we persist events and then read those events in chronological order. This API is meant to hide the sharding complexity from the app developers and make time-series reading and writing easy and efficient.
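
One common way to shard a time series is to bucket events by time window, so that no single row grows without bound. The sketch below illustrates that idea with an in-memory store; it is not the planned STAASH API. The bucket size, the key format and the EventSeriesStore class are assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: one shard per hour of events; a real system would tune this.
BUCKET_SECONDS = 3600


def shard_key(series, timestamp):
    """Map a series name and an integer timestamp to the shard that holds it."""
    return f"{series}:{timestamp // BUCKET_SECONDS}"


class EventSeriesStore:
    """In-memory stand-in for a store whose rows are time-bucketed shards."""

    def __init__(self):
        self._shards = defaultdict(list)

    def write(self, series, timestamp, event):
        """Append an event to the shard covering its timestamp."""
        self._shards[shard_key(series, timestamp)].append((timestamp, event))

    def read(self, series, start, end):
        """Return (timestamp, event) pairs with start <= timestamp < end,
        in chronological order, visiting only the shards that can match."""
        results = []
        first_bucket = start // BUCKET_SECONDS
        last_bucket = (end - 1) // BUCKET_SECONDS
        for bucket in range(first_bucket, last_bucket + 1):
            rows = self._shards.get(f"{series}:{bucket}", [])
            matching = [row for row in rows if start <= row[0] < end]
            # Buckets are visited in time order, so sorting within each
            # bucket is enough to keep the overall result chronological.
            results.extend(sorted(matching, key=lambda row: row[0]))
        return results
```

The caller only supplies a series name and a time range; the bucket arithmetic stays inside the store, which is the kind of complexity the Event Series API is meant to hide.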

Out-of-the-box Reverse Indexing

When accessing data, developers often want to filter on a key other than the primary key. In a distributed database, this requires the developer to store those keys in an inverted index, so as to avoid the overhead (a full cluster scan) of a secondary index. This new feature aims to automate the reverse indexing usage pattern within distributed databases.
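
The reverse indexing pattern can be sketched as a table that maintains its own inverted index on every write. The example is an in-memory illustration, not STAASH code; the IndexedTable class and its method names are hypothetical.

```python
from collections import defaultdict


class IndexedTable:
    """Rows keyed by primary key, plus an inverted index kept in step on write."""

    def __init__(self, primary_key, indexed_columns):
        self.primary_key = primary_key
        self.indexed_columns = tuple(indexed_columns)
        self.rows = {}                  # primary key -> row
        self.index = defaultdict(set)   # (column, value) -> primary keys

    def put(self, row):
        """Insert or overwrite a row and update the inverted index."""
        key = row[self.primary_key]
        previous = self.rows.get(key)
        if previous is not None:
            # Drop stale entries so lookups never return outdated matches.
            self._remove_index_entries(key, previous)
        self.rows[key] = dict(row)
        for column in self.indexed_columns:
            if column in row:
                self.index[(column, row[column])].add(key)

    def _remove_index_entries(self, key, row):
        for column in self.indexed_columns:
            if column in row:
                self.index[(column, row[column])].discard(key)

    def find_by(self, column, value):
        """Look up rows by an indexed column without scanning every row."""
        if column not in self.indexed_columns:
            raise KeyError(f"column {column!r} is not indexed")
        keys = self.index.get((column, value), set())
        return [self.rows[key] for key in sorted(keys)]
```

With a table built as IndexedTable("username", ["paid"]), calling find_by("paid", "N") reads one index entry and then fetches only the matching rows, instead of visiting every row in the table.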

Astyanax Chunked-Object-Store Recipe

Please refer to the Astyanax GitHub page for more information on recipes.

How can you contribute?

We are looking for feedback, for input on which Astyanax recipes you would like to see implemented first, and for code contributors to the project. The STAASH GitHub page can be found here.

We plan on holding Meetups, both physical and virtual, in early 2014 to dive deeper into STAASH and its direction. We are very excited about STAASH and the community that can be built around it.