The last thing I want to do is write yet another timeseries database. Yet here we are.

There are a lot of TSDBs (timeseries databases) out there: Graphite's whisper, OpenTSDB, Elasticsearch, KairosDB, InfluxDB, Cyanite, Prometheus, Riak TS, Druid, Blueflood, Dalmatiner, Akumuli and OpenNMS' newTS, just to name "a few". I'm sure I've seen about another dozen, ranging from one-person pet projects to systems supporting the production monitoring of various large companies.

Graphite, despite its limited storage system and its lack of support for tags (and partly because of it), has a tremendously powerful yet simple data processing API. Despite its age, its query API is mostly unrivaled, except maybe for these two:

It’s an interesting exercise to study all these projects and uncover their varying pros and cons.

At raintank we’re building a stellar open source monitoring stack with Grafana at the center. The open source ecosystem is already very rich: there are many projects out there, and some of them are very good. So it makes a lot of sense for us to work with the existing communities and projects, and integrate with them where possible. That’s why we integrate with several TSDBs (the upcoming Grafana 3 will take this concept to all kinds of integrations, such as alerting and online services).

The other side of raintank is providing said open source monitoring stack as a hosted platform, on Grafana.net. Luckily for us, there is a plethora of TSDB projects available; surely we could just try some, pick the one that suits us best, and maybe modify it a bit. The last thing we thought we needed to do was spend our time writing yet another TSDB.

Then why are we doing just that?

Driven by different goals and requirements, people use different datasources in Grafana (and often combine several). Likewise, the requirements for our hosted offering are also fairly unique:

  1. resource efficiency, in particular disk space: we need good data compression to reduce cost
  2. support growth from small to medium-scale multi-tenant SaaS. We need solid clustering for availability and load balancing.
  3. whatever we use must be fully free/open source (a must for our business model: everyone must be able to run the stack)
  4. operational simplicity
  5. a powerful / flexible querying API, preferably Graphite-compatible, because Graphite has a great data processing / querying API that a lot of people already know and love.

Intermezzo: both the Graphite and its alternative Graphite-api projects support plugging in third-party databases through a plugin API (see the docs for Graphite and Graphite-api). This is a great way to bolt Graphite's powerful API on top of a database whose own API is more limited. I've used this in the past to write a graphite-influxdb plugin, and at raintank we made one for KairosDB, since we ran on that for a while.

This largely takes care of the 5th requirement.

Unfortunately for us, while some of the mentioned TSDBs have absolutely great properties, none of them could check off all of our particular requirements.

We at raintank don’t have the resources (nor the desire) to build a general purpose TSDB that solves everything for everyone; however, we can be very pragmatic and stand on the shoulders of giants:

  1. Graphite already has a great API, which mostly alleviates the API concern.
  2. We need performant storage and solid clustering. Cassandra has this pretty well covered.
  3. Elasticsearch does text/tags searching quite well. Using it as metadata store can take us pretty far.
  4. When I read Facebook’s “Gorilla” float compression paper I got very excited. The data compression numbers were looking great. Then Damian Gryski implemented it in Go with the go-tsz library. Data compression was very important to us, and I felt this could be the key to a good solution. In retrospect this was confirmed: about a dozen projects are based on this library, including the new InfluxDB TSM storage engine. We’re seeing space usage of a few bytes per point, easily 10x less than the uncompressed storage most of the other TSDBs use. (See the sketch after this list for what using go-tsz looks like.)
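
To give a feel for it, here’s a minimal sketch of compressing and reading back a chunk with go-tsz, based on the library’s public API (exact byte counts will of course vary with your data):

```go
package main

import (
	"fmt"

	"github.com/dgryski/go-tsz"
)

func main() {
	t0 := uint32(1450000000) // chunk start, unix seconds
	s := tsz.New(t0)

	// push a point every 10s; regular intervals and slowly changing
	// values are exactly what delta-of-delta + XOR compression likes
	for i := uint32(0); i < 120; i++ {
		s.Push(t0+i*10, 100.0+float64(i%5))
	}
	s.Finish() // seal the chunk

	b := s.Bytes()
	fmt.Printf("120 points in %d bytes (%.2f bytes/point)\n",
		len(b), float64(len(b))/120)

	// reading back is a simple iterator
	it := s.Iter()
	for it.Next() {
		t, v := it.Values()
		fmt.Println(t, v)
	}
}
```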

Meet Metric-tank

We built a service that ingests timeseries streams, compresses them using go-tsz, and stores the chunks in Cassandra. It also keeps a configurable amount of data in RAM for fast retrieval of hot data (for alerting queries and some dashboards), in addition to querying from Cassandra.
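
In essence, the write path looks something like the following sketch. The names (AggMetric, the table schema) and the eviction policy are made up for illustration, not the actual metric-tank code; Cassandra access is via gocql:

```go
// Illustrative sketch only: metric-tank's real types differ.
package store

import (
	"sync"

	"github.com/dgryski/go-tsz"
	"github.com/gocql/gocql"
)

// AggMetric (illustrative) holds the hot, in-memory chunks for one
// series so queries for recent data never need to touch Cassandra.
type AggMetric struct {
	sync.Mutex
	key    string
	chunks []*tsz.Series // recent chunks, newest last (assumes >= 1 open chunk)
	maxRAM int           // how many chunks to keep in memory
}

func (a *AggMetric) Add(t uint32, v float64) {
	a.Lock()
	defer a.Unlock()
	a.chunks[len(a.chunks)-1].Push(t, v)
}

// Seal finishes the current chunk, starts a new one at t0, and returns
// the sealed chunk so a writer goroutine can persist it to Cassandra.
func (a *AggMetric) Seal(t0 uint32) *tsz.Series {
	a.Lock()
	defer a.Unlock()
	old := a.chunks[len(a.chunks)-1]
	old.Finish()
	a.chunks = append(a.chunks, tsz.New(t0))
	if len(a.chunks) > a.maxRAM {
		a.chunks = a.chunks[1:] // evict oldest chunk from RAM
	}
	return old
}

// persist writes a sealed chunk to Cassandra (the schema is made up here).
func persist(s *gocql.Session, key string, t0 uint32, chunk *tsz.Series) error {
	return s.Query(
		"INSERT INTO metric_chunks (key, t0, data) VALUES (?, ?, ?)",
		key, t0, chunk.Bytes(),
	).Exec()
}
```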

Later we also added:

  • basic clustering, so we can run several instances for redundancy, hot upgrades and read load balancing (no write sharding yet, though)
  • runtime consolidation, to offload Graphite (see the sketch after this list)
  • support for rollup bands (loading raw data from Cassandra and consolidating at runtime was fast enough, but decoding all the raw points became a bottleneck)
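
Runtime consolidation just means collapsing many stored points into fewer output points at query time. A minimal average-based sketch (illustrative only; the real code handles timestamps, nulls and multiple consolidation functions):

```go
// consolidate averages groups of `every` points into one point:
// a minimal illustration of read-time (runtime) consolidation.
func consolidate(points []float64, every int) []float64 {
	out := make([]float64, 0, (len(points)+every-1)/every)
	for i := 0; i < len(points); i += every {
		end := i + every
		if end > len(points) {
			end = len(points)
		}
		sum := 0.0
		for _, p := range points[i:end] {
			sum += p
		}
		out = append(out, sum/float64(end-i))
	}
	return out
}
```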

While we’re at it, we can also fix some of Graphite’s shortcomings. In particular, we store several rollup bands per metric (min, max, avg, etc.), so that consolidated views can still show the real minima and maxima instead of hiding spikes behind averages.
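
Conceptually, each rollup window accumulates several aggregates at once, along these lines (an illustrative sketch, not the actual metric-tank code):

```go
// RollupBands (illustrative) accumulates one rollup window: instead of
// keeping a single consolidated value like whisper does, we track
// several bands so no single function is lossy by decree.
type RollupBands struct {
	Min, Max, Sum float64
	Cnt           uint32
}

func (r *RollupBands) Add(v float64) {
	if r.Cnt == 0 || v < r.Min {
		r.Min = v
	}
	if r.Cnt == 0 || v > r.Max {
		r.Max = v
	}
	r.Sum += v
	r.Cnt++
}

func (r *RollupBands) Avg() float64 {
	if r.Cnt == 0 {
		return 0
	}
	return r.Sum / float64(r.Cnt)
}
```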

We were able to lower our disk usage by about 10x and significantly shrink our Cassandra cluster. We’ve done tests where we kill Cassandra instances, and everything just seems to work and recover fine.

If you want to try it out

The current code name is metric-tank, which may change. It is currently not for the faint of heart. If you want to run it, you’ll need Cassandra, Elasticsearch and NSQ.

It takes in messagepack-encoded metrics 2.0 style data via NSQ.
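
If you want a starting point for consuming (or producing) that data yourself, a consumer in Go would look roughly like this. The struct is a simplified stand-in for the metrics 2.0 payload, and the topic/channel names and lookupd address are assumptions, not the real configuration:

```go
package main

import (
	"log"

	"github.com/nsqio/go-nsq"
	"github.com/vmihailenco/msgpack"
)

// MetricData is a simplified stand-in for the metrics 2.0 payload;
// the real schema carries more fields.
type MetricData struct {
	Name     string            `msgpack:"name"`
	Interval int               `msgpack:"interval"`
	Value    float64           `msgpack:"value"`
	Time     int64             `msgpack:"time"`
	Tags     map[string]string `msgpack:"tags"`
}

func main() {
	cfg := nsq.NewConfig()
	// topic and channel names are illustrative
	c, err := nsq.NewConsumer("metrics", "metric-tank", cfg)
	if err != nil {
		log.Fatal(err)
	}
	c.AddHandler(nsq.HandlerFunc(func(m *nsq.Message) error {
		var md MetricData
		if err := msgpack.Unmarshal(m.Body, &md); err != nil {
			return err // returning an error makes NSQ requeue the message
		}
		log.Printf("got %s=%v @ %d", md.Name, md.Value, md.Time)
		return nil
	}))
	if err := c.ConnectToNSQLookupd("127.0.0.1:4161"); err != nil {
		log.Fatal(err)
	}
	select {} // block forever
}
```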

An easy way to see how the pieces fit together and get everything up and running is the raintank-docker stack. It spins up our entire stack, including the Grafana 3 beta and Worldping, so an easy way to test data ingest is to add some Worldping endpoints, or to use fake_metrics_to_nsq, which uses the right format to ingest metrics into NSQ.

If you need any help, join our public Slack.

We plan to make installation easier, support consuming from Kafka, and of course add a carbon input listener. Over time I’d also like to take out more Python code and replace it with Go, which is faster and much easier to install.
